Default exit code when process is terminated?


Solution 1

Processes can call the _exit() system call (on Linux, see also exit_group()) with an integer argument to report an exit code to their parent. Though it's an integer, only the 8 least significant bits are available to the parent (the exception is when the parent retrieves that code with waitid() or a SIGCHLD handler, though not on Linux).

The parent will typically do a wait() or waitpid() to get the status of their child as an integer (though waitid() with somewhat different semantics can be used as well).

On Linux and most Unices, if the process terminated normally, bits 8 to 15 of that status number will contain the exit code as passed to exit(). If not, then the 7 least significant bits (0 to 6) will contain the signal number and bit 7 will be set if a core was dumped.

perl's $? for instance contains that number as set by waitpid():

$ perl -e 'system q(kill $$); printf "%04x\n", $?'
000f # killed by signal 15
$ perl -e 'system q(kill -ILL $$); printf "%04x\n", $?'
0084 # killed by signal 4 and core dumped
$ perl -e 'system q(exit $((0xabc))); printf "%04x\n", $?'
bc00 # terminated normally, 0xbc the lowest 8 bits of the status

Bourne-like shells also make the exit status of the last run command available in their own $? variable. However, it does not directly contain the number returned by waitpid(), but a transformation of it, and that transformation differs between shells.

What's common between all shells is that $? contains the lowest 8 bits of the exit code (the number passed to exit()) if the process terminated normally.

Where it differs is when the process is terminated by a signal. In all cases, and that's required by POSIX, the number will be greater than 128. POSIX doesn't specify what the value may be. In practice though, in all Bourne-like shells that I know, the lowest 7 bits of $? will contain the signal number. But, where n is the signal number,

  • in ash, zsh, pdksh, bash and the Bourne shell, $? is 128 + n. That means that in those shells, if you get a $? of 129, you don't know whether the process exited with exit(129) or was killed by signal 1 (HUP on most systems). The rationale is that shells, when they exit themselves, by default return the exit status of the last command run. Keeping $? at 255 or below lets the shell pass that value on unchanged as its own exit status:

    $ bash -c 'sh -c "kill \$\$"; printf "%x\n" "$?"'
    bash: line 1: 16720 Terminated              sh -c "kill \$\$"
    8f # 128 + 15
    $ bash -c 'sh -c "kill \$\$"; exit'; printf '%x\n' "$?"
    bash: line 1: 16726 Terminated              sh -c "kill \$\$"
    8f # here that 0x8f is from an exit(143) done by bash. Though it's
       # not from a killed process, that does tell us that probably
       # something was killed by a SIGTERM
    
  • in ksh93, $? is 256 + n. That means that from the value of $? you can differentiate between a killed and a non-killed process. Newer versions of ksh, upon exit, if $? was greater than 255, kill themselves with the same signal in order to report the same exit status to their parent. While that sounds like a good idea, it means that ksh will generate an extra core dump (potentially overwriting the other one) if the process was killed by a core-generating signal:

    $ ksh -c 'sh -c "kill \$\$"; printf "%x\n" "$?"'
    ksh: 16828: Terminated
    10f # 256 + 15
    $ ksh -c 'sh -c "kill -ILL \$\$"; exit'; printf '%x\n' "$?"
    ksh: 16816: Illegal instruction(coredump)
    Illegal instruction(coredump)
    104 # 256 + 4, ksh did indeed kill itself so as to report the same
        # exit status as sh. Older versions of `ksh93` would have returned
        # 4 instead.
    

    Where you could even say there's a bug is that ksh93 kills itself even if $? comes from a return 257 done by a function:

    $ ksh -c 'f() { return "$1"; }; f 257; exit'
    zsh: hangup     ksh -c 'f() { return "$1"; }; f 257; exit'
    # ksh kills itself with a SIGHUP so as to report a 257 exit status
    # to its parent
    
  • yash. yash offers a compromise. It returns 256 + 128 + n. That means we can also differentiate between a killed process and one that terminated properly. And upon exiting, it will report 128 + n without having to kill itself, which avoids the side effects that can have.

    $ yash -c 'sh -c "kill \$\$"; printf "%x\n" "$?"'
    18f # 256 + 128 + 15
    $ yash -c 'sh -c "kill \$\$"; exit'; printf '%x\n' "$?"
    8f  # that's from an exit(143), yash was not killed
    

To get the signal from the value of $?, the portable way is to use kill -l:

$ /bin/kill 0
Terminated
$ kill -l "$?"
TERM

(for portability, you should never use signal numbers, only signal names)
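
As a rough sketch of how that could be wrapped up for a script, the helper below (describe_status is a hypothetical name, not something any shell provides) asks kill -l for a signal name whenever the status is above 128. It inherits the ambiguity described above: in shells that use 128 + n, a status of 143 may also come from a plain exit(143).

```shell
# describe_status N
# Hypothetical helper: interprets a Bourne-like shell's $? value.
# Assumption: kill -l fails for numbers that do not map to a signal.
describe_status() {
    if [ "$1" -gt 128 ] && name=$(kill -l "$1" 2>/dev/null); then
        echo "status $1: possibly killed by SIG$name"
    else
        echo "status $1: exited normally"
    fi
}

# The child shell sends SIGTERM to itself; || keeps the failure from
# aborting a script that runs under set -e.
sh -c 'kill -TERM $$' || describe_status "$?"
sh -c 'exit 3' || describe_status "$?"
```

Because the signal name comes from kill -l rather than from arithmetic on the number, the same helper should also cope with the 256 + n and 256 + 128 + n conventions.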

On the non-Bourne fronts:

  • csh/tcsh and fish behave the same as the Bourne shell, except that the status is in $status instead of $? (note that zsh also sets $status, in addition to $?, for compatibility with csh).
  • rc: the exit status is in $status as well, but when killed by a signal, that variable contains the name of the signal (like sigterm or sigill+core if a core was generated) instead of a number, which is yet another proof of the good design of that shell.
  • es. the exit status is not a variable. If you care for it, you run the command as:

    status = <={cmd}
    

    which will return a number or sigterm or sigsegv+core like in rc.

Maybe for completeness, we should mention zsh's $pipestatus and bash's $PIPESTATUS arrays that contain the exit status of the components of the last pipeline.
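
A small illustration for the bash case, run through bash -c so that it does not depend on which shell you are typing in (the zsh equivalent would use $pipestatus and is not shown here):

```shell
# PIPESTATUS holds one exit status per component of the last pipeline.
bash -c 'false | true; echo "pipeline components: ${PIPESTATUS[*]}"'
# prints: pipeline components: 1 0

# $? alone only reflects the last component (true).
bash -c 'false | true; echo "last status: $?"'
# prints: last status: 0
```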

And also for completeness, when it comes to shell functions and sourced files, by default functions return with the exit status of the last command run, but can also set a return status explicitly with the return builtin. And we see some differences here:

  • bash and mksh (since R41, a regression^Wchange apparently introduced intentionally) will truncate the number (positive or negative) to 8 bits. So for instance return 1234 will set $? to 210, return -- -1 will set $? to 255.
  • zsh and pdksh (and derivatives other than mksh) allow any signed 32 bit decimal integer (-2^31 to 2^31-1) (and truncate the number to 32 bits).
  • ash and yash allow any positive integer from 0 to 2^31-1 and return an error for any number out of that.
  • ksh93, for return 0 to return 320, sets $? as is, but for anything else truncates to 8 bits. Beware, as already mentioned, that returning a number between 256 and 320 could cause ksh to kill itself upon exit.
  • rc and es allow returning anything even lists.
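
To see the bash behaviour from the first bullet, something like the following should do. It only covers bash; the other shells listed above would give different results, as described.

```shell
# f is a throwaway function that returns whatever number it is given.
bash -c 'f() { return "$1"; }; f 1234; echo "return 1234 -> $?"'
# prints: return 1234 -> 210   (1234 modulo 256)

bash -c 'f() { return "$1"; }; f 257; echo "return 257 -> $?"'
# prints: return 257 -> 1
```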

Also note that some shells also use special values of $?/$status to report some error conditions that are not the exit status of a process, like 127 or 126 for command not found or not executable (or syntax error in a sourced file)...
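
A quick way to observe those two values, as a sketch: it assumes that no command called no_such_command_xyz exists on your PATH and that mktemp is available (it is common but not required by POSIX).

```shell
# Command not found: the shell itself reports 127.
sh -c 'no_such_command_xyz' 2>/dev/null || echo "not found: $?"

# Found but not executable: mktemp creates the file without execute
# permission, so trying to run it fails with 126.
tmp=$(mktemp)
sh -c '"$1"' sh "$tmp" 2>/dev/null || echo "not executable: $?"
rm -f "$tmp"
```

In neither case did a process call exit(127) or exit(126); the shell made the number up to describe why it could not run anything.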

Solution 2

When a process exits, it returns an integer value to the operating system. On most unix variants, this value is taken modulo 256: everything but the 8 low-order bits is ignored. The status of a child process is returned to its parent through a 16-bit integer in which

  • bits 0–6 (the 7 low-order bits) are the signal number that was used to kill the process, or 0 if the process exited normally;
  • bit 7 is set if the process was killed by a signal and dumped core;
  • bits 8–15 are the process's exit code if the process exited normally, or 0 if the process was killed by a signal.
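
The layout above can be sketched in shell arithmetic. decode_wait_status is a hypothetical helper; its input is a raw wait status such as the number perl puts in $?, not a shell's $?, and it assumes the common encoding described in the list (POSIX itself does not guarantee these bit positions).

```shell
# decode_wait_status STATUS
# Hypothetical helper: splits a raw 16-bit wait status into its fields.
decode_wait_status() {
    status=$1
    sig=$(( status & 0x7f ))          # bits 0-6: signal number, 0 if none
    core=$(( (status >> 7) & 1 ))     # bit 7: set if a core was dumped
    code=$(( (status >> 8) & 0xff ))  # bits 8-15: exit code
    if [ "$sig" -eq 0 ]; then
        echo "exited with code $code"
    elif [ "$core" -eq 1 ]; then
        echo "killed by signal $sig (core dumped)"
    else
        echo "killed by signal $sig"
    fi
}

decode_wait_status "$((0x000f))"   # killed by signal 15
decode_wait_status "$((0x0084))"   # killed by signal 4 (core dumped)
decode_wait_status "$((0xbc00))"   # exited with code 188
```

The three sample values are the ones printed by the perl one-liners in Solution 1.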

The status is returned by the wait system call or one of its siblings. POSIX does not specify the exact encoding of the exit status and signal number; it only provides

  • a way to tell whether the exit status corresponds to a signal or to a normal exit;
  • a way to access the exit code, if the process exited normally;
  • a way to access the signal number, if the process was killed by a signal.

Strictly speaking, there's no exit code when a process is killed by a signal: what there is instead is an exit status.

In a shell script, the exit status of a command is reported via the special variable $?. This variable encodes the exit status in an ambiguous way:

  • If the process exited normally then $? is its exit status.
  • If the process was killed by a signal then $? is 128 plus the signal number on most systems. POSIX only mandates that $? is greater than 128 in this case; ksh93 adds 256 instead of 128. I've never seen a unix variant that did anything other than add a constant to the signal number.

Thus in a shell script you cannot tell conclusively whether a command was killed by a signal or exited with a status code greater than 128, except with ksh93. It is very rare for programs to exit with status codes greater than 128, in part because programmers avoid it due to the $? ambiguity.

SIGINT is signal 2 on most unix variants, thus $? is 128+2=130 for a process that was killed by SIGINT. You'll see 129 for SIGHUP, 137 for SIGKILL, etc.
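
One way to check these numbers on your own system is to have a child shell send each signal to itself and look at what the parent sees. This is only a sketch: the numbers printed depend on the shell running the loop (128 + n in bash, for instance) and on the platform's signal numbering, which is why the signal name is also looked up with kill -l.

```shell
for sig in HUP INT KILL TERM; do
    # \$\$ is left for the child sh to expand, so it signals itself.
    # || keeps the failure from aborting a script that uses set -e.
    sh -c "kill -$sig \$\$" 2>/dev/null || status=$?
    echo "SIG$sig: \$? = $status, kill -l says $(kill -l "$status")"
done
```

On a typical Linux system with bash this would be expected to report 129, 130, 137 and 143.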

Solution 3

That depends on your shell. From the bash(1) man page, SHELL GRAMMAR section, Simple Commands subsection:

The return value of a simple command is [...] 128+n if the command is terminated by signal n.

Since SIGINT on your system is signal number 2, the return value is 130 when it is run under Bash.

Solution 4

This seems to be the right place to mention that SVr4 introduced waitid() in 1989, but no important program seems to use it so far. waitid() makes it possible to retrieve the full 32 bits of the exit() code.

About 2 months ago, I rewrote the wait/job control part of the Bourne Shell to use waitid() instead of waitpid(). This was done in order to remove the limitation that masks the exit code with 0xFF.

The waitid() interface is much cleaner than previous wait() implementations, except for the cwait() call from UNOS from 1980.

You may be interested to read the man page at:

http://schillix.sourceforge.net/man/man1/bosh.1.html

and check the section "Parameter Substitution", currently starting at page 8.

The new .sh.* variables have been introduced for the waitid() interface. This interface no longer has the ambiguous meanings known from the numbers in $?, and it makes interfacing much easier.

Note that you need a POSIX-compliant waitid() to be able to use this feature. Mac OS X and Linux currently don't offer this; there, waitid() is emulated on top of the waitpid() call, so on a non-POSIX platform you will still only get 8 bits from the exit code.

In short: .sh.status is the numerical exit code, .sh.code is the numerical exit reason.

For better portability, there are .sh.codename, the textual version of the exit reason (e.g. "DUMPED"), and .sh.termsig, the signal name for the signal that terminated the process.

For better usage, there are two non-exit-related .sh.codename values: "NOEXEC" and "NOTFOUND" that are used when a program cannot be launched at all.

FreeBSD fixed their waitid() kernel bug within 20 hours of my report; Linux has not yet started on a fix. I hope that, 26 years after the introduction of this feature, which is now in POSIX, all operating systems will soon support it correctly.

Author: Cory Klein

Updated on September 18, 2022

Comments

  • Cory Klein
    Cory Klein almost 2 years

    When a process is killed with a handle-able signal like SIGINT or SIGTERM but it does not handle the signal, what will be the exit code of the process?

    What about for unhandle-able signals like SIGKILL?

    From what I can tell, killing a process with SIGINT likely results in exit code 130, but would that vary by kernel or shell implementation?

    $ cat myScript
    #!/bin/bash
    sleep 5
    $ ./myScript
    <ctrl-c here>
    $ echo $?
    130
    

    I'm not sure how I would test the other signals...

    $ ./myScript &
    $ killall myScript
    $ echo $?
    0  # duh, that's the exit code of killall
    $ killall -9 myScript
    $ echo $?
    0  # same problem
    
    • Admin
      Admin over 10 years
      your killall myScript works, hence the return of the killall (and not of the script!) is 0. You could place a kill -x $$ [x being the signal number, and $$ usually expanded by the shell to that script's PID (works in sh, bash, ...)] inside the script and then test what its exit code was.
    • Admin
      Admin about 7 years
      comment about the semi-question: Don't put myScript in the background. (omit &). Send the signal from another shell process (in another terminal), then you can use $? after myScript has ended.
  • Cory Klein
    Cory Klein over 10 years
    How in the world do you find this, or even know where to look? I bow before your genius.
  • Ignacio Vazquez-Abrams
    Ignacio Vazquez-Abrams over 10 years
    @CoryKlein: Experience, mostly. Oh, and you'll likely want the signal(7) man page as well.
  • Stéphane Chazelas
    Stéphane Chazelas over 10 years
    A lot better worded and more to the point than mine even if it says in essence the same things. You may want to clarify that $? is for Bourne-like shells only. See also yash for a different (but still POSIX) behaviour. Also as per POSIX+XSI (Unix), a kill -2 "$pid" will send a SIGINT to the process, but the actual signal number may not be 2, so $? will not necessarily be 128+2 (or 256+2 or 384+2), though kill -l "$?" will return INT, which is why I would advise for portability not to refer to the numbers themselves.
  • n611x007
    n611x007 about 10 years
    an exit code to their parent and to get the *status* of their child. you've added emphasis on "status". Is exit code and *status* the same? Case yes, what's the origin of having two names? Case not same, could you give definition/reference of status?
  • Stéphane Chazelas
    Stéphane Chazelas about 10 years
    There are 3 numbers here. The exit code: the number passed to exit(). The exit status: the number obtained by waitpid() which includes the exit code, signal number and whether there was a core dumped. And the number that some shells make available in one of their special variables ($?, $status) that is a transformation of the exit status in such a way that it does contain the exit code in case there was a normal termination, but also carries signal information if the process was killed (that one is also generally called exit status). That is all explained in my answer.
  • n611x007
    n611x007 about 10 years
    I see, thank you! I definitely appreciate this explicit note of the distinction here. These expressions regarding the exit are used so interchangeably in some places that it is worth making it. Does the shell variable variant even have a (general) name? So I'd suggest clearing it up explicitly before going into details on the shells. I'd suggest inserting the explanation (from your comment) after your first or second paragraph.
  • Ciro Santilli Путлер Капут 六四事
    Ciro Santilli Путлер Капут 六四事 almost 9 years
    @StéphaneChazelas thanks for confirming. Agree that it is widely followed, and also hope that POSIX will standardize it. Just checking ;-)
  • Rui F Ribeiro
    Rui F Ribeiro over 6 years
    cool stuff; do you know if I have include files in C with those constants by chance? +1
  • Rui F Ribeiro
    Rui F Ribeiro over 6 years
    @CoryKlein Why have you not selected this as the correct answer?
  • JdeBP
    JdeBP almost 6 years
    A related answer is unix.stackexchange.com/a/453432/5132 .
  • cuonglm
    cuonglm over 5 years
    @StéphaneChazelas Hi Stephane, the link to gmane thread is dead, can you point to other one?