wget or curl from stdin


Solution 1

What you need is xargs, which turns each URL read from stdin into a command-line argument for wget. E.g.

tail -f 1.log | xargs -n1 wget -O - -q
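
To preview what xargs will run before fetching anything, the same pipeline can be tried with echo placed in front of wget. This is only a dry-run sketch: the example.com URLs and the preview_commands function name are made up for illustration, and printf stands in for tail -f so that the input ends.

```shell
#!/bin/sh
# Dry run: "echo" prints each command instead of executing it,
# so nothing is downloaded.
# preview_commands is a hypothetical helper name, not part of wget or xargs.
preview_commands() {
    # -n1: pass at most one argument (one URL) to each invocation
    xargs -n1 echo wget -O - -q
}

# The example.com URLs are placeholders.
printf '%s\n' 'http://example.com/a' 'http://example.com/b' | preview_commands
```

Each output line corresponds to one separate wget invocation; removing echo makes xargs run them for real.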

Solution 2

Use xargs, which converts stdin into command-line arguments. With -L 1, wget is run once per input line.

tail 1.log | xargs -L 1 wget
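
The two solutions differ only in how xargs groups its input: -n1 starts one command per whitespace-separated argument, while -L 1 starts one command per input line. For URLs without spaces, one per line, the result should be the same. A small sketch, using echo in place of wget and made-up input:

```shell
#!/bin/sh
# Made-up input: the first line holds two words, the second holds one.
sample() {
    printf '%s\n' 'a b' 'c'
}

# One invocation per argument: prints "a", "b", "c" on separate lines.
per_argument=$(sample | xargs -n1 echo)

# One invocation per line: prints "a b", then "c".
per_line=$(sample | xargs -L 1 echo)

printf '%s\n---\n%s\n' "$per_argument" "$per_line"
```
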
Author by

maximdim


Updated on June 28, 2022

Comments

  • maximdim about 2 years

    I'd like to download web pages while supplying URLs from stdin. Essentially, one process continuously writes URLs to stdout or a file, and I want to pipe them to wget or curl. (Think of it as a simple web crawler if you want.)

    This seems to work fine:

    tail 1.log | wget -i - -O - -q

    But when I use 'tail -f', it doesn't work anymore (is it buffering, or is wget waiting for EOF?):

    tail -f 1.log | wget -i - -O - -q

    Could anybody provide a solution using wget, curl, or any other standard Unix tool? Ideally I don't want to restart wget in a loop; I'd rather keep it running, downloading URLs as they come.

  • pabouk - Ukraine stay strong almost 11 years
    With xargs, wget receives the URL as a parameter, so you do not need -i - anymore:

    tail -f 1.log | xargs -n1 wget -O - -q
  • Neil McGuigan almost 8 years
    This will start a new wget process per URL.
  • Silas S. Brown over 6 years
    If this is running on a shared machine, be aware that any other user can read your parameters using the "ps" command, so don't put passwords etc. in your URLs. If this might be a problem, use one of the solutions that does not involve turning stdin into parameters. (Admins with root access to the machine could of course still check which URLs you're fetching, but presumably you trust the admins more than you trust random other users.)
  • Silas S. Brown over 6 years
    As I commented on the other answer: if this is running on a shared machine, be aware that any other user can read your parameters using the "ps" command, so don't put passwords etc. in your URLs. If this might be a problem, use one of the solutions that does not involve turning stdin into parameters. (Admins with root access to the machine could of course still check which URLs you're fetching, but presumably you trust the admins more than you trust random other users.)