Download website from the WayBack Machine


I just noticed that this question of mine from a few years ago is still open. I couldn't find a suitable option beyond a generic crawler at the time, but several tools have since appeared on sites like GitHub. I haven't used any of them personally, but I would like to document one here for those still searching for a way to do this.

One example is hartator/wayback-machine-downloader, which appears to be platform-agnostic (it is distributed as a Ruby gem). It describes how it works as follows:

It will download the last version of every file present on Wayback Machine to ./websites/example.com/. It will also re-create a directory structure and auto-create index.html pages to work seamlessly with Apache and Nginx. All files downloaded are the original ones and not Wayback Machine rewritten versions. This way, URLs and links structure are the same as before.
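For anyone curious what that description amounts to, here is a minimal Python sketch of the same idea. It is not the tool's actual code (which is Ruby, and which I haven't read closely). It assumes the Wayback Machine's public CDX listing endpoint and the `id_` URL modifier, which returns the originally archived file rather than the rewritten page. All function names and the `websites` root folder default are made up for illustration.

```python
import json
from urllib.parse import urlencode, urlsplit
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine CDX server, which lists captures.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain):
    """Build a CDX query that lists successful captures under a domain."""
    params = {
        "url": domain + "/*",            # everything below the domain
        "output": "json",
        "fl": "timestamp,original",      # only the two fields used here
        "filter": "statuscode:200",
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def fetch_rows(domain):
    """Fetch [timestamp, original] rows (needs network access)."""
    with urlopen(cdx_query_url(domain)) as resp:
        data = json.load(resp)
    return data[1:]  # with output=json the first row is a header


def latest_captures(rows):
    """Keep only the newest timestamp for each original URL."""
    latest = {}
    for timestamp, original in rows:
        # Timestamps are 14-digit strings, so string comparison is enough.
        if original not in latest or timestamp > latest[original]:
            latest[original] = timestamp
    return latest


def raw_snapshot_url(timestamp, original):
    """URL of the archived file as originally served (the id_ modifier)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def local_path(original, root="websites"):
    """Map an original URL to a file path, adding index.html for folders."""
    parts = urlsplit(original)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"
    return f"{root}/{parts.hostname}{path}"
```

To use it, you would call `fetch_rows("example.com")`, pass the result through `latest_captures`, and then download each `raw_snapshot_url(...)` to the matching `local_path(...)`. For a real site, the existing tool is likely the better choice, since it presumably handles edge cases that this sketch ignores.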

Hope that helps someone who has the same problem I did many years ago. Going to mark as solved with this, unless someone has a better answer.


Author: Sanoo

User of many StackExchange sites, mainly Android Enthusiasts and Super User.

Updated on September 18, 2022

Comments

  • Sanoo
    Sanoo over 1 year

    I found an excellent website on the WayBack machine which currently doesn't work and the domain is for sale. I wanted to use it offline. I tried using WinHTTrack, but it only saves the homepage, because of the structure of the WayBack Machine.

    I am using Windows, and I would appreciate any help with downloading it.

    Thanks.

    • Keltari
      Keltari over 9 years
      The Internet Archive Wayback Machine at archive.org/web works just fine and is not for sale... where are you going?
    • InterLinked
      InterLinked over 4 years
      I know you said you're using Windows (as am I), but if you have access to a Linux server, or maybe the Subsystem for Windows 10 (which I haven't used), maybe you could give wget a look. I've used it before to recursively download all the pages and files from a website.
  • Sanoo
    Sanoo almost 10 years
    Is J-Spider only available for Linux? Isn't there anything for Windows?
  • Fazer87
    Fazer87 almost 10 years
    sourceforge.net/projects/openwebspider this works for windows (according to its reviews)
  • Sanoo
    Sanoo almost 10 years
    I downloaded it and got the following error:
    + Reading openwebspider.conf...OK
    - Database1: hosts
    - Server1: localhost
    - Username Database1: user
    - Database2: spiderdb
    - Server2: localhost
    - Username Database2: user
    - Database3: temptables
    - Server3: localhost
    - Username Database3: user
    -================-
    Failed to connect to database:hosts
    Error: Can't connect to MySQL server on 'localhost' (10061)
  • Trect
    Trect almost 4 years
    This works like a charm! Thanks yo