Download website from the WayBack Machine
I just noticed that this question I asked a few years ago is still open. I couldn't find a suitable option beyond a generic crawler at the time, but several tools have since appeared on sites like GitHub. I haven't used any of them personally, but I'd like to document one here for those still searching for a way to do this.
An example is hartator/wayback-machine-downloader, which appears to be platform agnostic (a Ruby .gem). It describes how it works as follows:
It will download the last version of every file present on Wayback Machine to ./websites/example.com/. It will also re-create a directory structure and auto-create index.html pages to work seamlessly with Apache and Nginx. All files downloaded are the original ones and not Wayback Machine rewritten versions. This way, URLs and links structure are the same as before.
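For anyone curious what that description amounts to, here is a minimal Python sketch of the same idea. It is not the gem's actual code. It assumes the Wayback Machine's CDX listing endpoint and the `id_` URL modifier (which, as far as I know, returns the original file rather than the rewritten page). All function names are my own, and it ignores query strings, file/directory name collisions, and rate limiting.

```python
import json
from pathlib import Path, PurePosixPath
from urllib.parse import urlencode, urlsplit
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine CDX server, which lists captures.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def list_captures(domain):
    """Ask the CDX server for (timestamp, original URL) pairs under a domain."""
    query = urlencode({
        "url": f"{domain}/*",
        "output": "json",
        "fl": "timestamp,original",
        "filter": "statuscode:200",
    })
    with urlopen(f"{CDX_ENDPOINT}?{query}") as resp:
        rows = json.load(resp)
    # With output=json the first row is a header naming the fields.
    return [tuple(row) for row in rows[1:]]


def latest_captures(rows):
    """Keep only the newest timestamp for each original URL.

    Timestamps are fixed-width digit strings (YYYYMMDDhhmmss), so plain
    string comparison orders them chronologically.
    """
    latest = {}
    for timestamp, original in rows:
        if original not in latest or timestamp > latest[original]:
            latest[original] = timestamp
    return latest


def raw_url(timestamp, original):
    """Build the URL of the unmodified capture (note the id_ suffix)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def local_path(original, root="websites"):
    """Map an original URL to a file path, adding index.html for directories."""
    parts = urlsplit(original)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"
    return str(PurePosixPath(root, parts.hostname, path.lstrip("/")))


def download(domain, root="websites"):
    """Fetch the latest capture of every file and recreate the directory tree."""
    for original, timestamp in latest_captures(list_captures(domain)).items():
        target = Path(local_path(original, root))
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(raw_url(timestamp, original)) as resp:
            target.write_bytes(resp.read())
```

Calling `download("example.com")` would then write files under `./websites/example.com/`. Only `list_captures` and `download` touch the network, so the other helpers can be tried out offline.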
Hope that helps someone who has the same problem I did many years ago. Going to mark as solved with this, unless someone has a better answer.
Sanoo
User of many StackExchange sites, mainly Android Enthusiasts and Super User.
Updated on September 18, 2022
Comments
-
Sanoo over 1 year
I found an excellent website on the WayBack machine which currently doesn't work and the domain is for sale. I wanted to use it offline. I tried using WinHTTrack, but it only saves the homepage, because of the structure of the WayBack Machine.
I am using Windows, and I would appreciate any help downloading it.
Thanks.
-
Keltari over 9 years
The Internet Archive Wayback Machine at archive.org/web works just fine and is not for sale... where are you going?
-
InterLinked over 4 years
I know you said you're using Windows (as am I), but if you have access to a Linux server, or maybe the Windows Subsystem for Linux on Windows 10 (which I haven't used), you could give wget a look. I've used it before to recursively download all the pages and files from a website.
-
Sanoo almost 10 years
Is J-Spider only available for Linux? Isn't there anything for Windows?
-
Fazer87 almost 10 years
sourceforge.net/projects/openwebspider this works for Windows (according to its reviews)
-
Sanoo almost 10 years
I downloaded it and got the following error:
    + Reading openwebspider.conf...OK
    - Database1: hosts
    - Server1: localhost
    - Username Database1: user
    - Database2: spiderdb
    - Server2: localhost
    - Username Database2: user
    - Database3: temptables
    - Server3: localhost
    - Username Database3: user
    -================-
    Failed to connect to database:hosts
    Error: Can't connect to MySQL server on 'localhost' (10061)
-
Trect almost 4 years
This works like a charm! Thanks yo