Forward SIGTERM to child in Bash

112,292

Solution 1

Try:

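#!/bin/bash

# Runs when the shell receives SIGTERM: pass the signal on to the server.
_term() {
  echo "Caught SIGTERM signal!"
  kill -TERM "$child" 2>/dev/null
}

trap _term SIGTERM

echo "Doing some initial work..."
# Start the server in the background so the shell can still react to signals.
/bin/start/main/server --nodaemon &

child=$!   # PID of the server
wait "$child"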

Normally, while bash is waiting for a foreground command, it does not run any trap handler until that command completes. Starting the server with & runs it in the background instead, with $! holding the server's PID (to be used with wait and kill). Calling wait then blocks until the process with that PID (the server) finishes or until a trapped signal arrives.

When the shell receives SIGTERM (or the server exits on its own), the wait call returns, either with the server's exit code or, if a trapped signal interrupted it, with 128 plus the signal number. If the shell received SIGTERM, it runs the _term function registered as the SIGTERM trap handler (in which we do any cleanup and manually propagate the signal to the server process using kill), and the script then exits.
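To watch this behaviour without a real server, here is a minimal sketch in which sleep 30 stands in for the server and a background subshell plays the part of whoever sends SIGTERM. The names got_term, first_status and child_status are made up for the illustration. It also adds a second wait so the child's own status can be inspected.

```shell
#!/bin/bash
# Sketch only: `sleep 30` stands in for the server.
got_term=0

_term() {
  got_term=1
  kill -TERM "$child" 2>/dev/null
}
trap _term TERM

sleep 30 &
child=$!

# Stand-in for an external supervisor: signal this shell after one second.
( sleep 1; kill -TERM "$$" ) &

first_status=0
wait "$child" || first_status=$?   # interrupted by the trapped signal: status above 128

child_status=0
wait "$child" || child_status=$?   # second wait collects the child's own status

echo "first wait: $first_status, child: $child_status, handler ran: $got_term"
```

The first status comes from the interrupted wait, not from the child; only the second wait reports how the child itself ended.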

Solution 2

Bash does not forward signals like SIGTERM to processes it is currently waiting on. If you want to end your script by segueing into your server (allowing it to handle signals and anything else, as if you had started the server directly), you should use exec, which will replace the shell with the process being opened:

#!/bin/bash
echo "Doing some initial work....";
exec /bin/start/main/server --nodaemon

If you need to keep the shell around for some reason (i.e., you need to do some cleanup after the server terminates), you should use a combination of trap, wait, and kill. See SensorSmith's answer (Solution 3).
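A quick way to convince yourself that exec replaces the shell instead of starting a child is to compare PIDs before and after. In this sketch, sh -c 'echo $$' is only a stand-in for the server, and the variable names are made up for the illustration.

```shell
#!/bin/bash
# The outer bash prints its PID, then execs a stand-in "server" that prints its own PID.
pids=$(bash -c 'echo "$$"; exec sh -c "echo \$\$"')

before=$(printf '%s\n' "$pids" | sed -n '1p')
after=$(printf '%s\n' "$pids" | sed -n '2p')

if [ "$before" = "$after" ]; then
  echo "same PID: the exec'd program took over the shell's process"
fi
```

Because the PID is unchanged, a SIGTERM sent to the script's PID reaches the server directly and nothing has to be forwarded.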

Solution 3

Andreas Veithen points out that if you do not need to return from the call (as in the OP's example), simply calling through exec is sufficient (@Stuart P. Bentley's answer). Otherwise, the "traditional" trap 'kill $CHILDPID' TERM (@cuonglm's answer) is a start, but the wait call returns as soon as the trap handler has run, which can still be before the child process actually exits. So an "extra" call to wait is advisable (@user1463361's answer).

While this is an improvement, it still has a race condition, which means that the process may never exit (unless the signaler retries sending the TERM signal). The window of vulnerability is between registering the trap handler and recording the child's PID.

The following eliminates that vulnerability (packaged in functions for reuse).

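# Call before starting the child: install the handler first.
prep_term()
{
    unset term_child_pid
    unset term_kill_needed
    trap 'handle_term' TERM INT
}

# Forward the signal, or remember it if the child's PID is not known yet.
handle_term()
{
    if [ "${term_child_pid}" ]; then
        kill -TERM "${term_child_pid}" 2>/dev/null
    else
        term_kill_needed="yes"
    fi
}

# Call right after starting the child in the background.
wait_term()
{
    term_child_pid=$!
    # A signal arrived before the PID was recorded: deliver it now.
    if [ "${term_kill_needed}" ]; then
        kill -TERM "${term_child_pid}" 2>/dev/null
    fi
    # The first wait may return early because of the trap...
    wait "${term_child_pid}" 2>/dev/null
    trap - TERM INT
    # ...so wait again until the child has really exited.
    wait "${term_child_pid}" 2>/dev/null
}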

# EXAMPLE USAGE
prep_term
/bin/something &
wait_term
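To see why the second wait matters, here is a self-contained sketch that repeats the three functions and runs them against a hypothetical child that takes a moment to shut down and then exits with status 7. The child command, the background signaler, and the rc variable are all made up for the illustration.

```shell
#!/bin/bash
prep_term() {
    unset term_child_pid
    unset term_kill_needed
    trap 'handle_term' TERM INT
}
handle_term() {
    if [ "${term_child_pid}" ]; then
        kill -TERM "${term_child_pid}" 2>/dev/null
    else
        term_kill_needed="yes"
    fi
}
wait_term() {
    term_child_pid=$!
    if [ "${term_kill_needed}" ]; then
        kill -TERM "${term_child_pid}" 2>/dev/null
    fi
    wait "${term_child_pid}" 2>/dev/null
    trap - TERM INT
    wait "${term_child_pid}" 2>/dev/null
}

prep_term

# Stand-in for an external signaler; started first so that $! below is the child.
( sleep 1; kill -TERM "$$" ) &

# Hypothetical child: handles TERM only after its current sleep ends, then exits 7.
sh -c 'trap "exit 7" TERM; while :; do sleep 1; done' &

rc=0
wait_term || rc=$?
echo "child exit status seen by the script: $rc"
```

In this run the first wait is cut short by the signal while the child is still shutting down; the second wait blocks until the child has really exited and reports its status. If the child had already been reaped by the first wait, the second wait would report an error instead, which is the caveat raised in the comments.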

Solution 4

The provided solution didn't work for me because the script's process was killed before the wait command actually finished. I found this article, http://veithen.github.io/2014/11/16/sigterm-propagation.html, and its last snippet works well in my case: an application started in OpenShift with a custom sh runner. The sh script is required because I need to be able to get thread dumps, which is impossible when the Java process has PID 1.

trap 'kill -TERM $PID' TERM INT
$JAVA_EXECUTABLE $JAVA_ARGS &
PID=$!
wait $PID
trap - TERM INT
wait $PID
EXIT_STATUS=$?
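
Once EXIT_STATUS is captured, a status above 128 usually means the child was killed by a signal. A small helper like the following can turn it into something readable for logs. This is only a sketch: describe_status is a made-up name, and it assumes the child never exits on its own with a status above 128, which a program is free to do.

```shell
#!/bin/bash
# Hypothetical helper: interpret an exit status collected with `wait`.
describe_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    # `kill -l` maps an exit status back to the signal name (143 -> TERM).
    echo "killed by SIG$(kill -l "$status")"
  else
    echo "exited with status $status"
  fi
}

describe_status 143
describe_status 0
```

In the runner script it would be called as describe_status "$EXIT_STATUS" after the final wait.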

Author: Lorenz

Updated on September 18, 2022

Comments

  • Lorenz
    Lorenz over 1 year

    I have a Bash script, which looks similar to this:

    #!/bin/bash
    echo "Doing some initial work....";
    /bin/start/main/server --nodaemon
    

    Now if the bash shell running the script receives a SIGTERM signal, it should also send a SIGTERM to the running server (which blocks, so no trap possible). Is that possible?

  • Mathias Begert
    Mathias Begert almost 10 years
    But exec replaces the shell with the given program, I am not clear on why the subsequent wait call is then needed?
  • cuonglm
    cuonglm almost 10 years
    @1_CR: wait is needed for our script to ... wait for the child process to finish. We want to be sure that our script only quits after the child process has terminated.
  • Andreas Veithen
    Andreas Veithen over 9 years
    I think that 1_CR's point is valid. Either you simply use exec /bin/start/main/server --nodaemon (in which case the shell process is replaced with the server process and you don't need to propagate any signals) or you use /bin/start/main/server --nodaemon &, but then exec is not really meaningful.
  • Andor
    Andor about 9 years
    @1_CR and @Andreas are correct - I've removed the exec from mid-script, so the Bash process will remain around the server process for cleanup on signal.
  • Andor
    Andor about 9 years
    Also, exec is a perfectly reasonable solution for the problem as the original question was asked; I've submitted it as a separate answer, and clarified what this answer does instead.
  • cuonglm
    cuonglm about 9 years
    @StuartP.Bentley: The point here is that exec command & starts command in a subshell, and in that new shell exec replaces the shell with the main program. I don't remember when the last edit removed the exec part, making my explanation incorrect.
  • LeoRochael
    LeoRochael about 7 years
    If you want your shell script to terminate only after the child has terminated, then in the _term() function you should wait "$child" again. This might be necessary if you have some other supervising process waiting for the shell script to die before restarting it, or if you also trapped EXIT to do some cleanup and need it to run only after the child process has finished.
  • Sahil Chaudhary
    Sahil Chaudhary almost 7 years
    I think bash ignores SIGTERM only in interactive mode according to man and my local testing. Otherwise good answer.
  • Alexander Mills
    Alexander Mills over 5 years
    SURELY there is a FLAG we can set in Bash to this for us? For example, set -o forwardsignals, or whatever
  • Andor
    Andor over 5 years
    Excellent job - I've updated the link in my answer to point here (on top of this being a more comprehensive solution, I'm still a little irked that the StackExchange UI doesn't credit me in cuonglm's answer for fixing the script to actually do what it's supposed to and writing pretty much all the explanatory text after the OP who didn't even understand made a few minor re-edits).
  • Andor
    Andor over 5 years
    @AlexanderMills Read the other answers. Either you're looking for exec, or you want to set up traps.
  • SensorSmith
    SensorSmith over 5 years
    @StuartP.Bentley, thanks. I was surprised assembling this required two (not accepted) answers and an external reference, and then I had to run down the race condition. I will upgrade my references to links as what little additional kudos I can give.
  • Alexander Mills
    Alexander Mills over 5 years
    thanks @StuartP.Bentley I missed that until you mentioned it
  • Igor Bukanov
    Igor Bukanov over 5 years
    The script as written is racy. If TERM is sent right after bash starts the background job but before child=$!, child is not set and kill reports an error. To fix this, the trap handler should just use $!.
  • SensorSmith
    SensorSmith about 5 years
    @IgorBukanov and what happens if TERM is sent right BEFORE bash starts the background job? See my answer for a full solution.
  • Torsten Bronger
    Torsten Bronger over 4 years
    Is this solution Bash-only?
  • SensorSmith
    SensorSmith over 4 years
    @TorstenBronger it should be portable, but I haven't tested it under anything but Bash. I did not use any deliberate Bashisms (no 'function' keyword, no double brace conditionals, no fancy tricks in the output redirection, and the trap syntax is Posix).
  • Torsten Bronger
    Torsten Bronger over 3 years
    However, this only works if the child is guaranteed not to exit (on purpose or on error). Then, the second wait throws an error.
  • SensorSmith
    SensorSmith over 3 years
    @TorstenBronger re-tested under Ubuntu 18.04 Bash 4.4.20 (not my original target) and get Bash debug-ish output with line # and "Terminated", but when the child had NOT exited prior to the trap (odd). It might be legal for the PID to be forgotten after the first wait, but the 2nd wait IS necessary on some systems, so no good answer. (Exit code was still available in this test.) I edited to redirect the "error" output to null for when/systems on which it happens.
  • Torsten Bronger
    Torsten Bronger over 3 years
    I saw no other way but to assume that the child never exits with 143 or 130 (unless signal was sent). I documented it cleanly, and recommended to wrap it in a subshell if it does.
  • Torsten Bronger
    Torsten Bronger over 3 years
    At gist.github.com/bronger/… you see what was necessary in my case (zsh). It still does not cover all edge cases, but they might be considered programming errors anyway.