wait child process but get error: 'pid is not a child of this shell'

Solution 1

Find the ID of the process you want to wait for and substitute it for 12345 in the script below. Adjust the script further to suit your requirements.

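#!/bin/sh
# PID of the (non-child) process to wait for.
PID=12345
# /proc/$PID exists only while the process is alive (Linux-specific).
while [ -e "/proc/$PID" ]
do
    echo "Process: $PID is still running" >> /home/parv/waitAndRun.log
    # Fractional seconds need a sleep that supports them (e.g. GNU coreutils).
    sleep .6
done
echo "Process $PID has finished" >> /home/parv/waitAndRun.log

# Run the follow-up script once the process has exited.
/usr/bin/waitingScript.sh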

http://iamparv.blogspot.in/2013/10/unix-wait-for-running-process-not-child.html
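
The same polling idea can be wrapped in a small function with a timeout. This is only a sketch: the name wait_for_pid and its timeout argument are hypothetical, and it uses kill -0 instead of /proc so it does not depend on Linux. Note that kill -0 also fails for a live process you are not allowed to signal, so this sketch would wrongly treat such a process as gone, and like any polling approach it cannot return the exit code of the process.

```shell
#!/bin/sh
# wait_for_pid PID [TIMEOUT_SECONDS]   (hypothetical helper, not a standard tool)
# Returns 0 once PID no longer exists, 1 if the timeout expires first.
wait_for_pid() {
    pid=$1
    remaining=${2:-60}
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Demo target: a short-lived background process.
sleep 2 &
target=$!
wait_for_pid "$target" 10
result=$?
echo "wait_for_pid returned $result"
```

A follow-up command can then be chained on success, for example wait_for_pid 12345 600 && /usr/bin/waitingScript.sh.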

Solution 2

Either your while loop or the for loop runs in a subshell, which is why you cannot await a child of the (parent, outer) shell.

Edit: this might happen if the while loop or for loop is actually

(a) in a (...) subshell group (or a {...} block that is itself piped or backgrounded), or (b) part of a pipeline (e.g. for ... done | somepipe)
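
A minimal sketch of the subshell effect described above, assuming a background sleep stands in for the real job. The variable names are illustrative. The pipeline inside the command substitution runs in a subshell, so wait there cannot see the job started by the top-level shell.

```shell
#!/bin/bash
# Start a background job in the top-level shell.
sleep 1 &
pid=$!

# Diagnostic: compare these values at top level and inside a subshell.
# (BASHPID and BASH_SUBSHELL are bash-specific, hence the fallbacks.)
echo "top level:   \$\$=$$ BASHPID=${BASHPID:-n/a} BASH_SUBSHELL=${BASH_SUBSHELL:-n/a}"
( echo "in subshell: \$\$=$$ BASHPID=${BASHPID:-n/a} BASH_SUBSHELL=${BASH_SUBSHELL:-n/a}" )

# Waiting inside a subshell fails: the job belongs to the parent shell.
# Being an element of a pipeline is one common way to end up in a subshell.
sub_status=$(echo go | { read -r line; wait "$pid" 2>/dev/null; echo $?; })

# Waiting in the shell that started the job works.
wait "$pid"
top_status=$?

echo "subshell wait status:  $sub_status"
echo "top-level wait status: $top_status"
```

The subshell status is non-zero because the wait fails there, while the top-level wait returns the real exit status of the job.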

Solution 3

If you're running this in a container of some sort, the condition apparently can be caused by a bug in bash that is easier to encounter in a containerized environment.

From my reading of the bash source (specifically see comments around RECYCLES_PIDS and CHILD_MAX in bash-4.2/jobs.c), it looks like in their effort to optimize their tracking of background jobs, they leave themselves vulnerable to PID aliasing (where a new process might obscure the status of an old one); to mitigate that, they prune their background process history (apparently as mandated by POSIX?). If you should happen to want to wait on a pruned process, the shell can't find it in the history and assumes this to mean that it never knew about it (i.e., that it "is not a child of this shell").
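
One possible way to avoid depending on the shell's remembered statuses is to have each job record its own exit status and then use a bare wait. This is an assumption on my part, not something taken from the bash source or proposed above; the helper run_job and the temporary status directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical workaround sketch: each job writes its exit status to a file,
# so the parent never has to call "wait PID" for an individual job.
status_dir=$(mktemp -d)

# run_job ID COMMAND [ARGS...]   (hypothetical helper)
run_job() {
    id=$1
    shift
    ( "$@"; echo $? > "$status_dir/$id" ) &
}

run_job 0 true
run_job 1 false
run_job 2 sh -c 'exit 3'

# A bare wait blocks until all background children have finished.
wait

failed=0
for f in "$status_dir"/*; do
    rc=$(cat "$f")
    echo "job ${f##*/} exited with status $rc"
    [ "$rc" -eq 0 ] || failed=$((failed + 1))
done
rm -rf "$status_dir"
echo "failed jobs: $failed"
```

The trade-off is one small file per job, in exchange for not having to look up any single PID in the shell's job history.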


henshao
Author by

henshao

Updated on July 04, 2020

Comments

  • henshao
    henshao almost 4 years

    I wrote a script to fetch data from HDFS in parallel, then I wait for the child processes in a for loop, but sometimes wait returns "pid is not a child of this shell"; other times it works fine. It's puzzling. I used "jobs -l" to list all the jobs running in the background, and I am sure these PIDs are child processes of the shell. I also used "ps aux" to make sure the PIDs have not been assigned to other processes. Here is my script.

    PID=()
    FILE=()
    let serial=0
    
    while read index_tar
    do
            echo $index_tar | grep index > /dev/null 2>&1
    
            if [[ $? -ne 0 ]]
            then
                    continue
            fi
    
            suffix=`printf '%03d' $serial`
            mkdir input/output_$suffix
            $HADOOP_HOME/bin/hadoop fs -cat $index_tar | tar zxf - -C input/output_$suffix \
                    && mv input/output_$suffix/index_* input/output_$suffix/index &
    
            PID[$serial]=$!
            FILE[$serial]=$index_tar
    
            let serial++
    
    done < file.list
    
    for((i=0;i<$serial;i++))
    do
            wait ${PID[$i]}
    
            if [[ $? -ne 0 ]]
            then
                    LOG "get ${FILE[$i]} failed, PID:${PID[$i]}"
                    exit -1
            else
                    LOG "get ${FILE[$i]} success, PID:${PID[$i]}"
            fi
    done
    
    • Kemin Zhou
      Kemin Zhou over 5 years
      A good question, I am getting exactly the same error. I launched 96 background jobs and waited for them. 4 of the 96 gave me the "pid 28991 (this number is the random child PID as an example) is not a child of this shell". I assume that the wait command is not foolproof. I will do some digging.
  • sehe
    sehe over 12 years
    You could check this line of thinking, nonetheless, e.g. printing $BASHPID, $$, $BASH_SUBSHELL in both locations (and at the toplevel of your script!)
  • Viet
    Viet over 6 years
    Smart trick! One can wait like this and then launch scripts after the process is finished.
  • Alexander Mills
    Alexander Mills about 6 years
    not so good, if every computer program used polling like this, it would be a very bad thing :)
  • avisheks
    avisheks about 6 years
    I'll not be able to get the exit code of child process
  • anarcat
    anarcat over 5 years
    neat. i can't believe there isn't a standard tool to do this anywhere... i've made a simpler shell script called waitpid that's basically this one-liner: while [ -e /proc/$1 ]; do sleep 1; done...
  • Clete2
    Clete2 about 2 years
    I think I'm running into this issue. Do you know if it will be fixed at all?
  • jhfrontz
    jhfrontz about 2 years
    @Clete2 it's apparently been this way for 8+ years and the [exacerbating] behavior is seemingly at least partially mandated by POSIX compliance. I wouldn't expect it to change anytime soon.