How to save a web page snapshot with all its elements (css, js, images, ...) into one file
Solution 1
Solution 2
Use wget in terminal
wget -p -k http://www.example.com/
It makes a local clone of the site's front-end assets (HTML, CSS, JS, SVG, etc.). However, it does not produce one file as asked; instead, it recreates the whole folder structure.
E.g., if the folder structure of www.example.com is
/css/*
/js/*
/index.html
then it will create the same structure locally.
Docs: https://www.gnu.org/software/wget/manual/wget.html
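Building on the wget command above, here is a minimal sketch of how the mirror-then-archive step could be scripted. The function name `snapshot` and the choice of zip as the single-file container are my own assumptions; the question asks for one file, and zipping the mirrored directory is as close as wget alone gets to the old .mht idea.

```shell
#!/bin/sh
# Hypothetical helper: mirror one page with its requisites, then pack
# the resulting directory tree into a single zip archive.
snapshot() {
  url="$1"   # page to archive, e.g. https://www.example.com/
  dir="$2"   # local directory to mirror into

  # -p: also fetch page requisites (CSS, JS, images)
  # -k: rewrite links so the local copy works offline
  # -E: append .html to pages served without an extension
  # -P: place everything under one directory
  wget -p -k -E -P "$dir" "$url" && zip -r "$dir.zip" "$dir"
}

# usage: snapshot https://www.example.com/ example-snapshot
```

Run from cron or a scheduler, this gives the regular, programmatic archiving the question asks for, at the cost of one zip per snapshot rather than a browser-openable single file.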
Solution 3
I think @reisio (+1) has you covered...
...But if only to plug a great free tool, I would point out the Firefox extension Save Complete, which does an admirable job of grabbing "complete" pages on an ad hoc basis. The output will be a single HTML file with an accompanying directory stuffed with all the resources - you can easily zip them up for archiving.
It's not without fault - I've had issues with corrupted .png files lately on OS X - but I use it frequently for building mockups off of live pages and it's a huge time-saver. (Also of note, it hasn't been updated for FF 4 yet, which is the sole reason I rolled back to 3.6.)
Vacilando
Updated on May 03, 2021

Comments
-
Vacilando, about 3 years ago
How is it possible to programmatically save a web page snapshot with all its elements (css, js, images, ...) into one file?
I need to archive some web pages regularly. However, just saving their HTML code is useless - not only because images are missing, but especially because the absence of CSS can turn today's pages into an unrecognizable mess.
I remember the .mht format that worked like this, but that required manual saving, and it was just a feature of IE. I believe there is an open-source solution that can achieve this programmatically, but despite hours of searching I cannot find it on the web.
-
Christian, about 13 years ago
How is this method automated, or even programmable?
-
peteorpeter, about 13 years ago
It's much more automated than manually collecting all the resources and migrating the references, etc. See this caveat: "on an ad hoc basis"? I'm not claiming it's the perfect solution, but might be useful to people trying to achieve a similar, semi-automated result. Also, for the sake of argument, you could script FF to automate this further: macscripter.net/viewtopic.php?id=21304. (Do you think all potentially helpful, but imperfect, solutions should be -1'ed? I'm resisting the urge to down-vote your own imperfect, yet potentially helpful answer. Spirit foul.)
-
Christian, about 13 years ago
Semi-perfect? It works, it's not browser-dependent, and it's more automated than trying to script Firefox! Are we back to the "viewable by Firefox only" era again, or something? My solution can be done with any language on any platform. Your solution seems to work in Firefox on a Mac only. Plus, firing up a browser just to do some text manipulation sounds ridiculously over-engineered.
-
peteorpeter, about 13 years ago
I'm not knocking your answer - for the record, it sounds like the cleanest solution to the question asked. My hackles were raised by your attitude, not your knowledge.
-
Christian, about 13 years ago
You could call it "overly defensive" if you want to.
-
Vacilando, over 12 years ago
We are looking for a way to do this programmatically.
-
nest, over 9 years ago
It doesn't download the JavaScript.
-
reisio, over 6 years ago
There isn't any JavaScript worth downloading that you wouldn't have loaded directly (and therefore saved directly). That said: you could do an ordinary httrack, without -%M, and then put that into an archive. With things like archivemount you can open them seamlessly, even though you don't need to. All easily scripted. Stack Overflow sucks.
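The httrack-then-archive approach from the comment above could be sketched roughly as follows. The function name `archive_site` and the tar.gz container are my own choices, and this assumes httrack is installed; httrack mirrors the site into a directory, and tar packs that directory into one file, which archivemount could later mount in place.

```shell
#!/bin/sh
# Hypothetical helper following the comment's recipe: mirror with
# httrack, then pack the mirror directory into a single tarball.
archive_site() {
  url="$1"   # site to mirror, e.g. https://www.example.com/
  out="$2"   # directory httrack should mirror into

  # httrack writes the whole mirror under $out;
  # tar then turns that tree into one archive file.
  httrack "$url" -O "$out" && tar czf "$out.tar.gz" "$out"
}

# usage: archive_site https://www.example.com/ ./example-mirror
# the resulting example-mirror.tar.gz can be mounted with archivemount
```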