Download website from the WayBack Machine

mirroring httrack webarchive

5,779

I just noticed that this question from a few years ago is still open. At the time I couldn't find a suitable option beyond a generic crawler, but several tools have since appeared on sites like GitHub. I haven't used any of them personally, but I'd like to document one here for anyone still searching for a way to do this.

One example is hartator/wayback-machine-downloader, which appears to be platform-agnostic (it is distributed as a Ruby gem). It describes how it works as follows:

It will download the last version of every file present on Wayback Machine to ./websites/example.com/. It will also re-create a directory structure and auto-create index.html pages to work seamlessly with Apache and Nginx. All files downloaded are the original ones and not Wayback Machine rewritten versions. This way, URLs and links structure are the same as before.
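For anyone curious how the behaviour described above could be reproduced by hand, here is a minimal Python sketch. It is not the gem's actual code. It assumes the Wayback Machine's public CDX listing endpoint and the `id_` URL modifier (which returns the archived file without the Wayback Machine's rewriting); the function names and the `websites` root folder are my own choices for illustration.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlencode, urlsplit
from urllib.request import urlopen


def list_captures(domain):
    """Ask the CDX endpoint for (timestamp, original_url) pairs.

    Assumption: the JSON output starts with a header row of field names.
    """
    query = urlencode({
        "url": f"{domain}/*",
        "output": "json",
        "fl": "timestamp,original",
        "filter": "statuscode:200",
    })
    with urlopen(f"https://web.archive.org/cdx/search/cdx?{query}") as resp:
        rows = json.load(resp)
    return [tuple(row) for row in rows[1:]]  # skip the header row


def latest_snapshots(rows):
    """Keep only the newest timestamp for each original URL.

    Timestamps are 14-digit strings (YYYYMMDDhhmmss), so plain string
    comparison orders them chronologically.
    """
    latest = {}
    for timestamp, original in rows:
        if original not in latest or timestamp > latest[original]:
            latest[original] = timestamp
    return latest


def raw_snapshot_url(timestamp, original):
    """Build the URL of the unmodified archived file ("id_" modifier)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def local_path(original, root="websites"):
    """Map an original URL to a file path, adding index.html for folders."""
    parts = urlsplit(original)
    path = parts.path
    if not path or path.endswith("/"):
        path += "index.html"
    return str(PurePosixPath(root, parts.netloc, path.lstrip("/")))
```

To mirror a site with this, you would call `list_captures("example.com")`, pass the result through `latest_snapshots`, then fetch each `raw_snapshot_url(...)` and write the bytes to `local_path(...)`. The real tool handles many details this sketch ignores, such as query strings, retries, and rate limiting.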

Hope that helps someone who has the same problem I did many years ago. I'm going to mark this as solved unless someone has a better answer.

Sanoo

User of many StackExchange sites, mainly Android Enthusiasts and Super User.

Updated on September 18, 2022

Comments

Sanoo over 1 year

I found an excellent website on the Wayback Machine; the original site is no longer online and its domain is for sale. I want to use it offline. I tried WinHTTrack, but it only saves the homepage because of the way the Wayback Machine is structured.

I am using Windows, and I would appreciate any help with downloading it.

Thanks.
- Keltari over 9 years
  
  The Internet Archive Wayback Machine at archive.org/web works just fine and is not for sale... where are you going?
- InterLinked over 4 years
  
  I know you said you're using Windows (as am I), but if you have access to a Linux server, or maybe the Windows Subsystem for Linux on Windows 10 (which I haven't used), you could give wget a look. I've used it before to recursively download all the pages and files from a website.
Sanoo almost 10 years

Is J-Spider only available for Linux? Isn't there anything for Windows?
Fazer87 almost 10 years

sourceforge.net/projects/openwebspider works on Windows (according to its reviews).
Sanoo almost 10 years

I downloaded it and got the following error:

    + Reading openwebspider.conf...OK
    - Database1: hosts
    - Server1: localhost
    - Username Database1: user
    - Database2: spiderdb
    - Server2: localhost
    - Username Database2: user
    - Database3: temptables
    - Server3: localhost
    - Username Database3: user
    -================-
    Failed to connect to database:hosts
    Error: Can't connect to MySQL server on 'localhost' (10061)
Trect almost 4 years

This works like a charm! Thanks!