How to wait on all child (and grandchild, etc.) processes spawned by a script

Solution 1

You can use wait to wait for all the background processes started by userscript to complete. Since wait only works on children of the current shell, you'll need to source their script instead of running it as a separate process.

( source userscript; wait )

Sourcing the script in an explicit subshell should simulate starting a new process closely enough. If not, you can also background the subshell, which forces a new process to be started, then wait for it to complete.

( source userscript; wait ) & wait
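
As a minimal sketch of the first form, the following creates a throwaway stand-in for a non-blocking user script and sources it in a subshell. The file names and the contents of the stand-in script are assumptions made up for illustration; only the `( source ...; wait )` line comes from the answer.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a user-supplied, non-blocking script.
userscript=$(mktemp)
log=$(mktemp)
cat > "$userscript" <<EOF
sleep 1 && echo "first done"  >> "$log" &
sleep 2 && echo "second done" >> "$log" &
echo "script body finished"   >> "$log"
EOF

# Source the script in a subshell so that its background jobs are
# children of that subshell, then wait for all of them to finish.
( source "$userscript"; wait )

# By the time we get here, both background jobs have written their line.
lines=$(wc -l < "$log")
echo "lines written before continuing: $lines"
rm -f "$userscript" "$log"
```

Note that wait only covers the subshell's own children. If one of those children backgrounds further processes and exits, those grandchildren are not waited for.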

Solution 2

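ps --ppid $PID will list the direct child processes of the process with $PID. Grandchildren are not included, so you have to repeat the call for each child to walk the whole tree.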
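
For illustration, here is a small sketch that lists the direct children of the current shell. It assumes the procps version of ps found on most Linux systems (--ppid is not available everywhere); the two sleep commands are placeholders for real work.

```shell
#!/usr/bin/env bash
# Start two placeholder background jobs and remember their PIDs.
sleep 5 & a=$!
sleep 5 & b=$!

# -o pid= prints only the PID column, without a header line.
# $$ is the PID of this shell. The list may also contain the
# short-lived process that runs ps itself.
children=$(ps -o pid= --ppid "$$")
echo "$children"

# Clean up the placeholder jobs.
kill "$a" "$b"
```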

Solution 3

You can open a file descriptor that gets inherited by other processes, and then wait until it's no longer in use. This is a low-overhead method that usually works fine, though processes can work around it if they want (for example, by closing the descriptor):

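 foo=$(mktemp)
 # Take an exclusive lock on fd 9; theirscript and its descendants inherit the open fd.
 # A single-digit fd avoids exceeding the open-file limit (ulimit -n).
 ( flock -x 9; theirscript; ) 9> "$foo"
 # Blocks until every process holding that fd has exited or closed it.
 flock -x 0 < "$foo"
 rm "$foo"
 echo "The script and its subprocesses are done"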

Alternatively, you can follow all invoked processes using ptrace, for example with strace. This is easier, but it has some overhead and may not work when scripts invoke suid binaries:

strace -f -e none theirscript

Solution 4

You can use pgrep -P <parent_pid> to get a list of child processes. Example:

IFS=$'\n' read -ra CHILD_PROCS -d '' < <(exec pgrep -P "$1")

And to get the grand-children, simply do the same procedure on each child process.
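
A minimal sketch of that recursion might look like the following. The function name list_descendants is hypothetical (it is not taken from the blog mentioned in this answer), and the demo process tree is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PID of every descendant of $1,
# one per line, by calling pgrep -P recursively on each child.
list_descendants() {
    local pid child
    pid=$1
    for child in $(pgrep -P "$pid"); do
        echo "$child"
        list_descendants "$child"
    done
}

# Demo: a background subshell that starts its own child,
# which is a grandchild of this script.
( sleep 5 & wait ) &
top=$!
sleep 0.5   # give the subshell a moment to start its sleep

descendants=$(list_descendants "$top")
echo "descendants of $top:" $descendants

# Clean up the demo processes.
kill $descendants "$top" 2>/dev/null
```

This only sees processes that are still linked to the tree at the moment pgrep runs. A process whose parent has already exited is reparented and will not be found this way.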

Check out my blog: Bash functions to list and kill or send signals to process trees.

You can use one of those functions to properly list all processes spawned under one process. Each has its own method or order of sending signals to the processes.

The only limitation of those functions is that the processes still have to be connected to the tree and not orphaned. If you could somehow find a way to group your processes, then that might be your solution.
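
One possible way to group processes, sketched here as an assumption rather than as the method from the blog, is to start the script in its own session with setsid and poll that session with pgrep -s. This relies on the util-linux setsid and procps pgrep tools, and on the shell running without job control (the default in scripts), so that setsid does not fork and its PID becomes the session ID. The stand-in script is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a user script that backgrounds work and exits.
theirscript=$(mktemp)
log=$(mktemp)
cat > "$theirscript" <<EOF
(sleep 1; echo "background work done" >> "$log") &
EOF

# setsid starts the script in a new session. Descendants stay in that
# session, even after being orphaned, unless they start their own.
setsid sh "$theirscript" &
sid=$!        # assumption: setsid exec'd in place, so its PID is the session ID
wait "$sid"   # the script itself has exited; its orphans may still be running

# Poll until no process remains in that session.
while pgrep -s "$sid" > /dev/null; do
    sleep 0.2
done

result=$(cat "$log")
echo "$result"
rm -f "$theirscript" "$log"
```

A process that deliberately calls setsid itself leaves the session and escapes this check, so this is a grouping heuristic and not a guarantee.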

Solution 5

To simply answer the question that was asked: you could store the process ID of each script you're calling in the same variable:

echo "START"
first_program &
child_process_ids+="$! "
second_program &
child_process_ids+="$! "
echo $child_process_ids
echo "DONE"

$child_process_ids would just be a space-delimited string of process IDs. This answers the question asked, but what I would do is a bit different. I would call each script from a for loop, store its process ID, then wait on each one in another for loop and inspect each exit code individually. Using the same example, here's what it would look like.

echo "START"

scripts="first_program second_program"

for script in $scripts; do
    # Call the script and send it to the background
    "./$script" &

    # Store the PID of the script that was just sent to the background
    child_process_ids+="$! "
done

for child_process_id in $child_process_ids; do

    # Pass each PID to the wait command to retrieve its exit
    # code and store it in $rc
    wait "$child_process_id"
    rc=$?

    # Inspect each process's exit code
    if [ "$rc" -ne 0 ]; then
        echo "$child_process_id failed with an exit code of $rc"
    else
        echo "$child_process_id was successful"
    fi
done

Author: Ram

pro-g-Ram-er

Updated on July 29, 2022

Comments

  • Ram
    Ram almost 2 years

    Context:

    Users provide me with their custom scripts to run. These scripts can be of any sort, such as scripts that start multiple GUI programs or backend services. I have no control over how the scripts are written. They can be of the blocking type, i.e. execution waits until all the child processes (programs that are run sequentially) exit:

    #example of blocking script
    echo "START"
    first_program 
    second_program 
    echo "DONE"
    

    or of the non-blocking type, i.e. ones that fork child processes into the background and exit, something like:

    #example of non-blocking script
    echo "START"
    first_program &
    second_program &
    echo "DONE"
    

    What am I trying to achieve?

    User-provided scripts can be either of the above two types or a mix of both. My job is to run the script, wait until all the processes started by it exit, and then shut down the node. If it's the blocking type, the case is simple: get the PID of the script's process and wait until ps -ef | grep PID has no more entries. Non-blocking scripts are the ones giving me trouble.

    Is there a way I can get a list of the PIDs of all the child processes spawned by the execution of a script? Any pointers or hints will be highly appreciated.

  • Ram
    Ram over 10 years
    As soon as the main script exits (within microseconds, because it just needs to start two processes in the background), the child processes will be assigned init as their parent. No?
  • Ansgar Wiechers
    Ansgar Wiechers over 10 years
    Yes. If the parent is going to exit, you need to create the list of child PIDs before it does, e.g. in a file.
  • Ram
    Ram over 10 years
    Worked like a charm. Thanks @chepner
  • jmiserez
    jmiserez over 8 years
    It would be helpful if you could add the relevant pieces of code to this answer (provided you have the right to do so).
  • konsolebox
    konsolebox over 8 years
    @sideshowbarker Ok I made an update. I'm pretty sure the basic idea is enough to make this a valid answer. Or else the other answer here should also be made invalid.
  • gniourf_gniourf
    gniourf_gniourf over 8 years
    In the second snippet, do you really need a subshell? wouldn't { source userscript; wait; } & wait be enough? (& backgrounds the group…)
  • chepner
    chepner over 8 years
    The command group would probably be fine. I don't recall putting too much thought into it other than to background the first snippet.