How to `wget` a list of URLs in a text file?

text wget

148,702

Solution 1

A quick `man wget` gives me the following:

[..]

-i file

--input-file=file

Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (Use ./- to read from a file literally named -.)

If this function is used, no URLs need be present on the command line. If there are URLs both on the command line and in an input file, those on the command lines will be the first ones to be retrieved. If --force-html is not specified, then file should consist of a series of URLs, one per line.

[..]

So: wget -i text_file.txt
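The `-` form of `-i` described in the man page excerpt means the list can be filtered before wget sees it. A minimal sketch, assuming only the `.gz` links in the list are wanted (the function name `fetch_gz_urls` is made up for illustration):

```shell
# Hypothetical helper: download only the .gz URLs from a list file.
fetch_gz_urls() {
  # grep keeps the lines ending in .gz; wget then reads those URLs from stdin.
  grep -E '\.gz$' "$1" | wget -i -
}

# Usage: fetch_gz_urls text_file.txt
```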

Solution 2

Try:

wget -i text_file.txt

(check man wget)

Solution 3

If you also want to preserve the original file name, try with:

wget --content-disposition --trust-server-names -i list_of_urls.txt

Solution 4

Run it in parallel with

cat text_file.txt | parallel --gnu "wget {}"
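Left to its defaults, GNU parallel picks the number of simultaneous jobs from the CPU count, which may be more connections than a server appreciates. A sketch that caps concurrency with `-j`, assuming GNU parallel is installed; the function name and the default of 4 jobs are arbitrary choices:

```shell
# Hypothetical helper: download URLs from a list, at most $2 at a time (default 4).
fetch_parallel() {
  # -j caps the number of simultaneous jobs; {} is replaced by each input line.
  parallel -j "${2:-4}" wget {} < "$1"
}

# Usage: fetch_parallel text_file.txt 8
```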

Solution 5

If you're on OpenWrt or using some old version of wget that doesn't give you the -i option:

#!/bin/bash
input="text_file.txt"
while IFS= read -r line
do
  wget "$line"
done < "$input"

Furthermore, if you don't have wget, you can use curl or whatever you use for downloading individual files.
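A rough curl equivalent of the loop above, assuming curl is available. `-O` saves each file under its remote name, `-L` follows redirects, and `-f` makes curl fail on HTTP errors instead of saving the error page. The function name is hypothetical:

```shell
# Hypothetical helper: fetch every URL in a list file with curl.
fetch_with_curl() {
  # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines so curl is not called with an empty argument.
    [ -n "$url" ] || continue
    curl -fLO "$url"
  done < "$1"
}

# Usage: fetch_with_curl text_file.txt
```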


ShanZhengYang

Updated on July 08, 2022

Comments

ShanZhengYang 6 months
Let's say I have a text file of hundreds of URLs in one location, e.g.
```
http://url/file_to_download1.gz
http://url/file_to_download2.gz
http://url/file_to_download3.gz
http://url/file_to_download4.gz
http://url/file_to_download5.gz
....
```
What is the correct way to download each of these files with wget? I suspect there's a command like wget -flag -flag text_file.txt
- Dave about 5 years
  
  Anybody end up here after trying to get US topos at nationalmap.gov?
- barlop over 2 years
  
  Besides wget -i, you'll want to add some switches so you don't get banned from the servers for hammering them, and so that a download that fails isn't retried for too long. -w, -t and -T may be of interest.
becko about 5 years

Is there a way to control the number of concurrent jobs?
Ricardo over 2 years

Check the answer below by @Yusef: cat text_file.txt | parallel --gnu "wget {}"
Ahmed Fasih 9 months

If Parallel's demand for citation is annoying, use xargs: cat text_file.txt | xargs -n10 -P4 wget. This tells xargs to call wget with 10 URLs and run 4 wget processes at a time. For a little bit nicer experience, here's what I do: cat text_file.txt | shuf | xargs -n10 -P4 wget --continue. This (1) shuffles the URLs so when you stop and restart, it's more likely to start downloading new files right away, and (2) it asks wget to continue partial downloads (you might get some if you Control-C while wget is downloading).
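The throttling switches barlop mentions (`-w`, `-t`, `-T`) can be combined with `-i`. A sketch with illustrative values; the numbers and the function name are assumptions, not recommendations from the thread:

```shell
# Hypothetical helper: download a URL list without hammering the server.
fetch_politely() {
  # -w 1  : wait 1 second between retrievals
  # -t 3  : give up on a URL after 3 tries
  # -T 30 : use a 30-second network timeout
  wget -w 1 -t 3 -T 30 -i "$1"
}

# Usage: fetch_politely text_file.txt
```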