How to download entire HTML of a webpage using javascript?
Solution 1
It should be possible using jQuery's ajax functions. JavaScript in a Firefox extension is not subject to the cross-origin restriction. Here are some tips for using jQuery in a Firefox extension:
Add the jQuery library to your extension's chrome/content/ directory.
Load jQuery in the window load event callback rather than including it in your browser overlay XUL. Otherwise it can cause conflicts (e.g. it can clobber a user's customized toolbar).
(function (loader) {
    loader.loadSubScript("chrome://ryebox/content/jquery-1.6.2.min.js");
})(Components.classes["@mozilla.org/moz/jssubscript-loader;1"]
    .getService(Components.interfaces.mozIJSSubScriptLoader));
Use "jQuery" instead of "$". I experienced weird behavior when using $ instead of jQuery (a conflict of some kind, I suppose).
Use jQuery(content.document) instead of jQuery(document) to access a page's DOM. In a Firefox extension "document" refers to the browser's XUL whereas "content.document" refers to the page's DOM.
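To illustrate the last tip, here is a minimal jQuery-free sketch of collecting the link URLs from a page's DOM. The function name collectLinkUrls is made up for this example, and it assumes the document you pass in (content.document in an overlay script) supports querySelectorAll:

```javascript
// Collect the unique absolute URLs of all links in a document.
// `doc` is assumed to be the page's DOM, e.g. content.document in an
// overlay script, not the browser's XUL document.
function collectLinkUrls(doc) {
  var seen = {};
  var urls = [];
  var anchors = doc.querySelectorAll("a[href]");
  for (var i = 0; i < anchors.length; i++) {
    // The .href property is the link target resolved to an absolute URL.
    var href = anchors[i].href;
    if (href && !Object.prototype.hasOwnProperty.call(seen, href)) {
      seen[href] = true;
      urls.push(href);
    }
  }
  return urls;
}
```

In an overlay script you would call it as collectLinkUrls(content.document) and then request each returned URL.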
I wrote a Firefox extension for getting bookmarks from my friend's bookmark site. It uses jQuery to fetch my bookmarks in a JSON response from his service, then creates a menu of those bookmarks so that I can easily access them. You can browse the source at https://github.com/erturne/ryebox
Solution 2
For JavaScript in general, the short answer is no, not unless all pages are within the same domain. JavaScript is limited by the same-origin policy, so for security reasons, you cannot do cross-domain requests like that.
However, as pointed out by Max and erturne in the comments, when JavaScript is written as part of an extension/add-on to the browser, the regular rules about the same-origin policy and cross-domain requests do not seem to apply, at least not for Firefox and Chrome. Therefore, using JavaScript to download the pages should be possible using an XMLHttpRequest, or using one of the wrapper methods included in your favorite JS library.
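A rough sketch of what such a request could look like with a plain XMLHttpRequest follows. The function name fetchHtml and its XhrCtor parameter are made up for this example; XhrCtor exists only so the constructor can be swapped out (for instance in a test) and is not part of any browser API:

```javascript
// Fetch a URL as text and hand the HTML to a callback.
// XhrCtor is a hypothetical hook for this sketch; when it is omitted,
// the environment's global XMLHttpRequest is used.
function fetchHtml(url, onSuccess, onError, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("GET", url, true); // third argument true = asynchronous
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) { return; } // 4 means the request is done
    if (xhr.status >= 200 && xhr.status < 300) {
      onSuccess(xhr.responseText); // the page's HTML as a string
    } else {
      onError(xhr.status);
    }
  };
  xhr.send(null);
}
```

You would call it along the lines of fetchHtml(someUrl, function (html) { /* use html */ }, function (status) { /* handle failure */ }), where someUrl is whatever page you want to download.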
If, like me, you prefer jQuery, you can have a look at jQuery's .load() method, which loads HTML from a given resource and injects it into an element that you specify.
Edit: Made some updates to my answer based on the comments about cross-domain requests made by add-ons.
Solution 3
You can do XMLHttpRequests (XHRs) if the combination scheme://domain:port of the target URL is the same as that of the page hosting the JavaScript that should fetch the HTML.
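As an illustration of that rule, here is a small sketch that compares the scheme, domain and port of two URLs. The name isSameOrigin is made up for this example, and it assumes an environment that provides the standard URL constructor:

```javascript
// Return true when both URLs share the same scheme, domain and port.
// Assumes the standard URL constructor is available.
function isSameOrigin(urlA, urlB) {
  var a = new URL(urlA);
  var b = new URL(urlB);
  // URL normalizes a scheme's default port to an empty string,
  // so "http://example.com:80/" and "http://example.com/" match.
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}
```

A page at http://example.com/index.html passes this check against http://example.com/dir/other.html, but not against https://example.com/ or http://example.com:8080/.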
Many JS frameworks give you easy XHR support (jQuery, Dojo, etc.). Example using Dojo:
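function getText() {
    // dojo.xhrGet returns a Deferred; return it so the caller can
    // attach further callbacks to receive the HTML.
    return dojo.xhrGet({
        url: "test/someHtml.html",
        handleAs: "text",
        load: function (response, ioArgs) {
            // The response is the HTML as a string; returning it
            // passes it on to the next callback in the chain.
            return response;
        },
        error: function (response, ioArgs) {
            // On failure, the response describes the error.
            return response;
        }
    });
}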
If you prefer writing your own XMLHttpRequest-handler, take a look here: http://www.w3schools.com/xml/xml_http.asp
B Faley
Updated on June 04, 2022
Comments
-
B Faley almost 2 years
Is it possible to download the entire HTML of a webpage using JavaScript given the URL? What I want to do is to develop a Firefox add-on to download the content of all the links found in the source of the current page of the browser.
Update: the URLs reside in the same domain.
-
leopic over 12 years
Is it similar to downthemall.net?
-
Nobita over 12 years
Couldn't you use XMLHttpRequest to fetch those pages?
-
Christofer Eliasson over 12 years
@Nobita Not as long as the resource resides on a different domain. XMLHttpRequest is restricted by the same-origin policy. However, it can be used as long as the requests are made within the same domain.
-
bezmax over 12 years
@ChristoferEliasson There is a 'firefox-addon' tag on this question. Are you sure a Firefox add-on can't request extra rights to load from a different domain? Chrome add-ons have such a capability.
-
Christofer Eliasson over 12 years
@Max Interesting point. I expected the browser to run all JavaScript under the same conditions, but I might be wrong there. Let's look into it.
-
B Faley over 12 years
Yes, I want to download the URLs within the same domain. Is it possible to make use of jQuery in a Firefox add-on?
-
Christofer Eliasson over 12 years
@Meysam jQuery is just plain JavaScript, so I can't see why you shouldn't be able to use it within your add-on.
-
erturne over 12 years
JavaScript in a Firefox extension is not subject to the cross-origin restriction.
-
Christofer Eliasson over 12 years
@erturne Great to know. Do you have any good references to share?
-
erturne over 12 years
@ChristoferEliasson I've never found a good comprehensive source for all the quirky stuff you'll run into when doing Firefox extension development. I've just learned it all by trial and error (e.g. a user called me to complain that my extension messed up their customized toolbar!). You can browse the source for one of my extensions at leapingmind.repositoryhosting.com/trac/leapingmind_ryebox/…