How to get the contents of a webpage in a shell variable?


Solution 1

You can use the wget command to download the page and read it into a variable:

content=$(wget google.com -q -O -)
echo $content

We use wget's -O option, which lets us specify the name of the file into which wget writes the page contents. Specifying - sends the dump to standard output, where we collect it into the variable content. The -q (quiet) option turns off wget's own status output.
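Note that echo $content (unquoted) lets the shell collapse the page's newlines and can even glob-expand characters like * in the HTML. As a minimal variation on the example above, quoting the expansion preserves the page exactly as downloaded:

# -q suppresses wget's status output; -O - writes the page to standard output
content=$(wget -q -O - google.com)

# unquoted: newlines are collapsed and glob characters may expand
echo $content

# quoted: the original formatting is preserved
echo "$content"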

You can use the curl command for this as well:

content=$(curl -L google.com)
echo $content

We need the -L option because the page we are requesting might have moved, in which case curl has to fetch it from the new location. The -L (or --location) option tells curl to follow such redirects.
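If you also want to detect failed downloads, curl's standard -s (silent) and -f (fail on HTTP errors) flags combine well with -L. A small sketch, still using google.com as a placeholder:

# -s: no progress meter, -L: follow redirects, -f: non-zero exit on HTTP errors
if content=$(curl -sfL google.com); then
    echo "$content"
else
    echo "download failed" >&2
fi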

Solution 2

There are many ways to get a page from the command line, but it also depends on whether you want the source code or the page as rendered:

If you need the source code:

with curl:

curl $url

with wget:

wget -O - $url

but if you want to get what you can see with a browser, lynx can be useful:

lynx -dump $url

I think you can find many solutions to this little problem; it may be worth reading the man pages for all of those commands. And don't forget to replace $url with your URL :) A hypothetical wrapper tying these together is sketched below.
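As a sketch of how these fit together (the script name fetch.sh and its -t flag are made up for this example); it double-quotes "$url", which is good practice whenever the URL comes from a variable:

#!/bin/sh
# fetch.sh -- print a page's source, or its text rendering with -t
mode=source
if [ "$1" = "-t" ]; then
    mode=text
    shift
fi
url=$1

if [ "$mode" = "text" ]; then
    lynx -dump "$url"    # what you would see in a browser
else
    curl -sL "$url"      # the raw source code
fi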

Good luck :)

Solution 3

There are the wget and curl commands.

With wget you end up with a downloaded file that you can use afterwards; with curl you can handle the stream directly.
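To make the file-versus-stream distinction concrete, a minimal sketch (page.html is an arbitrary file name chosen for this example):

# wget saves to a file; read it back whenever you need it
wget -q -O page.html google.com
content=$(cat page.html)

# curl writes to standard output by default, so the stream can be
# consumed directly, without an intermediate file
curl -sL google.com | grep -i '<title>'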



Solution 4

You can use curl or wget to retrieve the raw data, or you can use w3m -dump to get a nice text representation of a web page.

$ foo=$(w3m -dump http://www.example.com/); echo $foo
You have reached this web page by typing "example.com", "example.net", "example.org" or "example.edu" into your web browser. These domain names are reserved for use in documentation and are not available for registration. See RFC 2606, Section 3.
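As with Solution 1, the unquoted echo $foo flattens w3m's line breaks into one long line; quoting the variable keeps the dump's formatting:

foo=$(w3m -dump http://www.example.com/)
echo "$foo"    # preserves the line breaks produced by w3m -dump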

Solution 5

content=`wget -O - $url`
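The same one-liner with the $(...) syntax discussed in the comments, and with the URL double-quoted as the last comment suggests:

content=$(wget -q -O - "$url")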
Author: Aillyn (http://xkcd.com/162/)

Updated on July 21, 2022

Comments

  • Aillyn
    Aillyn almost 2 years

In Linux, how can I fetch a URL and get its contents into a variable in a shell script?

  • Jim Lewis
    Jim Lewis over 13 years
    @rjack: (But the article you linked to does make a pretty good case for the $(...) syntax.)
  • Dennis
    Dennis almost 12 years
This is a really neat trick. I invoke a shell script via a PHP script on a proxy server. When asked, the proxy server turns on expensive servers, which shut themselves off after 2 hours. I need wget's output on standard output to feed back into the Jenkins console log.
  • juggernauthk108
    juggernauthk108 over 7 years
I am yet to get this... can anybody demonstrate how, for example, to get an img tag into a variable for this link: www2.watchop.io/manga2/read/one-piece/1/4?
  • pyrocrasty
    pyrocrasty over 7 years
    @juggernaut1996: that should be a separate question. Briefly, you have to download the page, extract the src attribute of the correct element, then download that page. If you install tq, this command should do it: curl -s http://ww1.watchop.io/manga2/read/one-piece/1/4 | tq -j -a src "#imgholder a img" | xargs wget
  • Prasad Bonthu
    Prasad Bonthu almost 6 years
Wget version 1.14 is not accepting convert_links = on with the -O- option. It fails with the error "-k can be used together with -O only if outputting to a regular file." Is that expected?
  • Admin
    Admin about 3 years
If I were you, I'd double-quote the URLs.