Bash wait for jobs and limit job count
Solution 1
You can remember the PID of each new child (check $! right after starting it). Periodically check how many of those children still exist (e.g. with kill -0); whenever the number goes down, spawn a new one, and so on. At the end, just run wait.
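That pattern can be sketched minimally as follows (an illustration only, not the full script below; MAX and the sleep jobs are placeholders for a real workload):

```shell
# Remember each child's PID via $!, poll with kill -0, wait at the end.
MAX=4
pids=()

running() {                       # count children that still exist
    local n=0 p
    for p in "${pids[@]}"; do
        kill -0 "$p" 2>/dev/null && n=$((n + 1))
    done
    echo "$n"
}

for i in 1 2 3 4 5 6 7 8; do
    while [ "$(running)" -ge "$MAX" ]; do
        sleep 0.1                 # back off until a slot frees up
    done
    sleep 0.3 &                   # stand-in for the real command
    pids+=("$!")
done
wait                              # block until every child exits
```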
Here is a script I wrote for the same reason:
#!/bin/bash
## Tries to run commands in parallel. Commands are read from STDIN one
## per line, or from a given file specified by -f.
## Author: E. Choroba

file='-'
proc_num=$(grep -c '^processor\b' /proc/cpuinfo)
prefix=$HOSTNAME-$USER-$$
sleep=10

children=()
names=()

if [[ $1 =~ ^--?h(elp)?$ ]] ; then
    cat <<HELP
Usage: ${0##*/} [-f file] [-n max-processes] [-p tmp-prefix] [-s sleep]
Defaults:
  STDIN for file
  $proc_num for max-processes (number of processors)
  $prefix for tmp-prefix
  $sleep for sleep interval
HELP
    exit
fi

function debug () {
    if ((DEBUG)) ; then
        echo "$@" >&2
    fi
}

function child_count () {
    debug Entering child_count "${children[@]}"
    child_count=0
    new_children=()
    for child in "${children[@]}" ; do
        debug Trying "$child"
        if kill -0 "$child" 2>/dev/null ; then
            debug ... exists
            let child_count++
            new_children+=("$child")
        fi
    done
    children=("${new_children[@]}")
    echo $child_count
    debug Leaving child_count "${children[@]}"
}

while getopts 'f:n:p:s:' arg ; do
    case $arg in
        f ) file=$OPTARG ;;
        n ) proc_num=$((OPTARG)) ;;
        p ) prefix=$OPTARG ;;
        s ) sleep=$OPTARG ;;
        * ) echo "Warning: unknown option $arg" >&2 ;;
    esac
done

i=0
while read -r line ; do
    debug Reading "$line"
    name=$prefix.$i
    let i++
    names+=("$name")
    # Note: child_count runs in a subshell here, so its pruning of the
    # children array does not persist; the count itself is still correct.
    while (($(child_count) >= proc_num)) ; do
        sleep $sleep
        debug Sleeping
    done
    eval "$line" 2>"$name.e" >"$name.o" &
    children+=($!)
    debug Running "${children[@]}"
done < <(cat "$file")

debug Loop ended
wait

cat "${names[@]/%/.o}"
cat "${names[@]/%/.e}" >&2
rm "${names[@]/%/.o}" "${names[@]/%/.e}"
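As an aside, on bash 4.3 or newer (which postdates this thread) the polling loop can be avoided entirely: wait -n returns as soon as any one background job exits. A minimal sketch, with sleep standing in for the real command:

```shell
max=2
launched=0
for job in 1 2 3 4 5; do
    # jobs -rp lists the PIDs of currently running background jobs
    while [ "$(jobs -rp | wc -l)" -ge "$max" ]; do
        wait -n               # bash 4.3+: return as soon as any job exits
    done
    sleep 0.3 &               # stand-in for the real command
    launched=$((launched + 1))
done
wait                          # wait for the remaining jobs
```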
Solution 2
From the linked question, tailored to your variation:
sed -n -e '/#/!{s,\\,/,g;p;}' files.m3u | xargs -d '\n' -I {} -P 4 \
    sh -c 'line=$1; file=${line##*/}; avconv -i "$line" "${file%.*}.wav"' avconv_sh {}
Again, GNU xargs or some version supporting -d and -P is required. Also beware of extra spaces at the beginning and end of lines in the input file: this snippet keeps them, which may cause problems.
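To see the -d/-P mechanics in isolation, here is a toy run of the same pattern (assumes GNU xargs; echo stands in for avconv, and the .mp3 names are made up):

```shell
# Feed three fake playlist lines through the same xargs pattern:
# one line per job, at most two jobs running at a time.
out=$(
    printf '%s\n' dir/a.mp3 dir/b.mp3 dir/c.mp3 |
        xargs -d '\n' -I {} -P 2 \
            sh -c 'line=$1; file=${line##*/}; echo "would convert ${file%.*}.wav"' demo_sh {} |
        sort
)
echo "$out"
```

Sorted, this prints one "would convert X.wav" line per input file.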
Author: Sky
Updated on September 18, 2022

Comments
-
Sky over 1 year
Possible Duplicate: Four tasks in parallel… how do I do that?
Suppose a loop invoking a command:

grep -v '#' < files.m3u | sed 's/\\\\/\/\//g' | sed 's/\\/\//g' | while read line ; do
    filename=$(basename "$line")
    avconv -i "$line" "${filename%.*}.wav"
done
Putting & after avconv will keep spawning avconv for each file. Now I want to do two things:
- I want to limit the number of processes spawned to 4
- When the loop is done, I want to wait for the last one to finish
-
choroba over 11 years: Your solution is far from optimal. While you are waiting for job 1, jobs 2, 3 and 4 might have finished...
-
jw013 over 11 years: Beware the race conditions that come with the approach of "checking if a process is alive by PID". There is no good way to detect the case where the process exits and its PID gets recycled for a new process. A lock file in a secure directory that the child removes on exit would be more robust. In the end, I'd just say don't reinvent the wheel (unless your wheel is really better).
-
Sky over 11 years: Yes, but the files will be approximately equal in size, so it should not differ that much in my case.
-
Alessio over 11 years: Yes, xargs' parallel execution option is perfect for this. It's (unfortunately) a little-known feature of GNU xargs. Here are some useful links describing the feature: tummy.com/journals/entries/jafo_20100418_235041 and spinellis.gr/blog/20090304/index.html
-
Alessio over 11 years: GNU parallel (gnu.org/software/parallel) is also a useful tool for this kind of job.
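For completeness, jw013's lock-file suggestion can be sketched like this (an illustration under assumed names; spawn is a hypothetical helper and sleep stands in for the real command):

```shell
# Each job gets a lock file in a private temp directory and removes it
# on exit; the parent counts lock files instead of checking PIDs, which
# avoids the PID-reuse race.
lockdir=$(mktemp -d) || exit 1
max=2

spawn() {                         # run "$@" in the background with a lock
    local lock
    lock=$(mktemp "$lockdir/job.XXXXXX")
    { "$@"; rm -f "$lock"; } &
}

for i in 1 2 3 4; do
    while [ "$(find "$lockdir" -type f | wc -l)" -ge "$max" ]; do
        sleep 0.1                 # back off until a slot frees up
    done
    spawn sleep 0.3               # stand-in for the real command
done
wait
rmdir "$lockdir"                  # empty once every job has cleaned up
```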