Multithreaded downloading with shell script
Solution 1
Have a look at man xargs:
-P max-procs --max-procs=max-procs
Run up to max-procs processes at a time; the default is 1. If max-procs is 0, xargs will run as many processes as possible at a time.
Solution:
xargs -P 20 -n 1 wget -nv <urls.txt
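To see the parallelism without touching the network, the same invocation can be tried with echo standing in for wget (the urls.txt filename and example.com URLs are placeholders):

```shell
# Create a small sample URL list (stand-in data).
printf '%s\n' http://example.com/a http://example.com/b http://example.com/c > urls.txt

# -P 4: run up to 4 processes at once; -n 1: pass one URL per process.
# echo is used instead of wget so this is safe to run offline.
xargs -P 4 -n 1 echo GET < urls.txt
```

The output order may vary between runs, since the processes run concurrently.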
Solution 2
If you just want to grab each URL (regardless of number), then the answer is easy:
#!/bin/bash
URL_LIST="http://url1/ http://url2/"
for url in $URL_LIST ; do
    wget "${url}" >/dev/null &
done
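Since that loop backgrounds every wget at once, the script can exit before the downloads finish; a wait at the end blocks until all background jobs are done. A minimal sketch, with echo standing in for wget so it runs offline (the URLs are placeholders):

```shell
#!/bin/bash
URL_LIST="http://url1/ http://url2/"
for url in $URL_LIST ; do
    # Stand-in for: wget -nv "${url}" >/dev/null
    echo "downloading ${url}" &
done
wait   # block until every background job has finished
echo "all downloads finished"
```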
If you want to create only a limited number of pulls, say 10, then you would do something like this:
#!/bin/bash
URL_LIST="http://url1/ http://url2/"
function download() {
    # $1 is the slot number, $2 is the URL; the caller has already
    # claimed the slot's lock file.
    wget "${2}" >/dev/null
    rm -f "/tmp/dl-${1}.lck"
}
for url in $URL_LIST ; do
    while true ; do
        iter=0
        while [ $iter -lt 10 ] ; do
            if [ ! -f "/tmp/dl-${iter}.lck" ] ; then
                # Claim the slot before forking, so the next loop
                # iteration cannot grab the same one.
                touch "/tmp/dl-${iter}.lck"
                download "${iter}" "${url}" &
                break 2
            fi
            iter=$((iter + 1))
        done
        sleep 10s
    done
done
wait
Do note that I haven't actually tested this; I just banged it out in 15 minutes. But it should give you the general idea.
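On bash 4.3 and later the same cap can be enforced without lock files: count the running background jobs and use wait -n, which returns as soon as any one of them exits. A sketch under those assumptions, with sleep/echo standing in for wget and an arbitrary cap of 3:

```shell
#!/bin/bash
MAX_JOBS=3   # arbitrary concurrency cap for this sketch
for i in 1 2 3 4 5 6; do
    # While the cap is reached, wait for any one background job to exit.
    while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
        wait -n
    done
    # Stand-in for: wget -nv "${url}" >/dev/null
    { sleep 0.1; echo "job $i done"; } &
done
wait   # let the remaining jobs finish
echo "all jobs done"
```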
Solution 3
You could use something like puf which is designed for that sort of thing, or you could use wget/curl/lynx in combination with GNU parallel.
Author: synapse
Updated on September 17, 2022

Comments
- synapse (over 1 year): Let's say I have a file with lots of URLs and I want to download them in parallel using an arbitrary number of processes. How can I do it with bash?
- Richard June (about 13 years): Oh, that's very slick. I did not know about -P.
- Gordon Davisson (about 13 years): In case the original link vanishes, the recommended command (with the useless use of cat removed) is:
xargs -P 20 -n 1 wget -nv <urls.txt
- Latheeshwar Raj (about 13 years): Oh, also, unless you have separate ISPs or bandwidth limitations or something, you usually will not get any faster total download speed by doing it in parallel.
- Ole Tange (about 13 years): Which would look like this: cat urlfile | parallel -j50 wget