Using “reserved” codes for exit status of shell scripts

Solution 1

There have been several attempts to standardize the meanings of process exit codes. In addition to the one you mention, I know of:

  • the BSDs have sysexits.h which defines meanings for values from 64 on up.

  • GNU grep documents that exit code 0 means at least one match was found, 1 means no matches were found, and 2 means an I/O error occurred; this convention is obviously also useful for other programs for which the distinction between "nothing went wrong but I didn't find anything" and "an I/O error occurred" is meaningful.

  • Many implementations of the C library function system use exit code 127 to indicate the program doesn't exist or failed to start.

  • On Windows, NTSTATUS codes (which are inconveniently scattered all over the 32-bit number space) may be used as exit codes, particularly the ones that indicate a process was terminated due to catastrophic misbehavior (e.g. STATUS_STACK_OVERFLOW).

You can't count on any given program obeying any particular one of these conventions. The only reliable rule is that exit code 0 is success and anything else is some sort of failure. (Note that C89's EXIT_SUCCESS is not guaranteed to have the value zero; however, exit(0) is required to behave identically to exit(EXIT_SUCCESS) even if the values are not the same.)
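As a minimal sketch of how a caller might tell these cases apart, the following uses the grep convention from the list above. The helper name classify_grep and its messages are made up for illustration; only grep's documented 0/1/2 statuses are assumed.

```shell
#!/bin/sh
# classify_grep PATTERN FILE: hypothetical helper that reports which of
# grep's three documented outcomes occurred.
classify_grep() {
    status=0
    grep -q -- "$1" "$2" 2>/dev/null || status=$?
    case $status in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;      # 2 or anything unexpected
    esac
}

tmp=$(mktemp) || exit 1
printf 'alpha\nbeta\n' > "$tmp"
classify_grep alpha "$tmp"            # a line matches
classify_grep gamma "$tmp"            # nothing matches
classify_grep alpha "$tmp.missing"    # the file cannot be opened
rm -f "$tmp"
```

Treating every status other than 0 and 1 as an error keeps the caller safe when a program only loosely follows the convention.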

Solution 2

No exit code has a special meaning, but the value in $? may have a special meaning.

The problem is the way the Bourne Shell and ksh93 handle exit codes and error situations and forward them to the shell variable $?. Contrary to what you list, only the following values for $? have a special meaning:

  • 126 Could not execute the binary even though it exists
  • 127 The specified binary does not exist
  • 128 exit status was == 0 but some unspecified problem exists
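The first two values can be observed with a short experiment. This is only a sketch: the temporary directory and the file names in it are made up for the demonstration.

```shell
#!/bin/sh
# Create a file that exists but has no execute permission.
tmpdir=$(mktemp -d) || exit 1
printf '#!/bin/sh\necho hi\n' > "$tmpdir/not_executable"
chmod 644 "$tmpdir/not_executable"

status_noexec=0
"$tmpdir/not_executable" 2>/dev/null || status_noexec=$?    # exists, cannot be executed

status_missing=0
"$tmpdir/does_not_exist" 2>/dev/null || status_missing=$?   # no such file

echo "not executable: $status_noexec"
echo "missing:        $status_missing"
rm -rf "$tmpdir"
```

A POSIX-conforming shell should report 126 for the first command and 127 for the second.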

In addition, there is an unspecified shell and platform-specific range of $? codes > 128 that is reserved for a program that was interrupted by a signal:

  • Bourne Shell, bash, and ksh88 use 128 + signal number
  • ksh93 uses 256 + signal number.

Other values do not give problems as they may be distinguished from the shell-special $? values.
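A rough sketch of decoding such a value, assuming the 128 + signal number encoding (so it does not apply to ksh93). The helper name describe_status is hypothetical.

```shell
#!/bin/sh
# describe_status STATUS: hypothetical helper; assumes the
# "128 + signal number" encoding of $?.
describe_status() {
    if [ "$1" -gt 128 ] && [ "$1" -lt 256 ]; then
        echo "killed by signal $(( $1 - 128 ))"
    else
        echo "exited with status $1"
    fi
}

# The child shell terminates itself with SIGTERM.
sh -c 'kill -TERM $$' || describe_status $?
# The child shell exits normally with status 3.
sh -c 'exit 3' || describe_status $?
```

The decoding is a heuristic: a program can also call exit with a value above 128 itself, and $? alone cannot distinguish that from a signal death.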

In particular, the values 1 and 2 are not used for special conditions; they are just exit codes used by builtin commands, which could behave the same way if they were not builtins. So it seems that the bash scripting guide you linked to is not a good manual, as it just lists codes used by bash without saying whether a specific code is a special value that should be avoided in your own scripts.

Newer versions of the Bourne Shell use waitid() instead of waitpid() to wait for the program to exit and waitid() (introduced 1989 for SVr4) uses a better syscall interface (similar to what UNOS used in 1980 already).

Newer Bourne Shell versions report the exit reason in ${.sh.code} / ${.sh.codename}, separately from the exit code in ${.sh.status} / ${.sh.termsig} (see http://schillix.sourceforge.net/man/man1/bosh.1.html). The exit code is therefore not overloaded with special states and, because the shell uses waitid(), the Bourne Shell now supports returning all 32 bits of the exit code, not just the low 8 bits.

BTW: be careful not to exit(256) or similar from a C-program or shell script, as this results in $? being interpreted as 0 in a classic shell.
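The wrap-around is plain arithmetic on the low 8 bits, which the sketch below reproduces. The helper name seen_by_parent is made up, and the last two lines are only an experiment: POSIX leaves the behavior of the shell's exit unspecified for values above 255.

```shell
#!/bin/sh
# seen_by_parent N: hypothetical helper that computes the status a
# classic wait()-based shell reports when a child calls exit(N).
seen_by_parent() {
    echo $(( $1 & 255 ))
}

seen_by_parent 256    # low 8 bits are all zero
seen_by_parent 257    # only the lowest bit survives

# Try it for real in a subshell; the result depends on the shell in use.
status=0
( exit 256 ) || status=$?
echo "subshell status: $status"
```

A failure reported with exit 256 therefore looks like success to the caller, which is why such values are best kept in the range 0 to 255.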

Solution 3

For shell scripting, I sometimes source the shell equivalent of sysexits.h, extended with shell-reserved exit codes (prefixed with S_EX_), which I've named exit.sh.

It's basically:

EX_OK=0 # successful termination 
EX__BASE=64     # base value for error messages 
EX_USAGE=64     # command line usage error 
EX_DATAERR=65   # data format error 
EX_NOINPUT=66   # cannot open input 
EX_NOUSER=67    # addressee unknown 
EX_NOHOST=68    # host name unknown 
EX_UNAVAILABLE=69       # service unavailable 
EX_SOFTWARE=70  # internal software error 
EX_OSERR=71     # system error (e.g., can't fork) 
EX_OSFILE=72    # critical OS file missing 
EX_CANTCREAT=73 # can't create (user) output file 
EX_IOERR=74     # input/output error 
EX_TEMPFAIL=75  # temp failure; user is invited to retry 
EX_PROTOCOL=76  # remote error in protocol 
EX_NOPERM=77    # permission denied 
EX_CONFIG=78    # configuration error 
EX__MAX=78      # maximum listed value 

#System errors
S_EX_ANY=1      #Catchall for general errors
S_EX_SH=2       #Misuse of shell builtins (according to Bash documentation); seldom seen
S_EX_EXEC=126   #Command invoked cannot execute         Permission problem or command is not an executable
S_EX_NOENT=127  #"command not found"    illegal_command Possible problem with $PATH or a typo
S_EX_INVAL=128  #Invalid argument to exit       exit 3.14159    exit takes only integer args in the range 0 - 255 (see first footnote)
#128+n  Fatal error signal "n"  kill -9 $PPID of script $? returns 137 (128 + 9)
#255*   Exit status out of range        exit -1 exit takes only integer args in the range 0 - 255
S_EX_HUP=129
S_EX_INT=130
#...

And can be generated with:

#!/bin/sh
src=/usr/include/sysexits.h
echo "# Generated from \"$src\"" 
echo "# Please inspect the source file for more detailed descriptions"
echo
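# Keep only the EX_* definitions; [0-9] is used because sed has no \d digit class
< "$src" sed -En 's/^#define[[:space:]]+(EX_[A-Z_]+)[[:space:]]+([0-9]+)/\1=\2/p' | sed 's:/\*:#:; s:\*/::'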
cat<<'EOF'

#System errors
S_EX_ANY=1  #Catchall for general errors
S_EX_SH=2   #Misuse of shell builtins (according to Bash documentation); seldom seen
S_EX_EXEC=126   #Command invoked cannot execute     Permission problem or command is not an executable
S_EX_NOENT=127  #"command not found"    illegal_command Possible problem with $PATH or a typo
S_EX_INVAL=128  #Invalid argument to exit   exit 3.14159    exit takes only integer args in the range 0 - 255 (see first footnote)
#128+n  Fatal error signal "n"  kill -9 $PPID of script $? returns 137 (128 + 9)
#255*   Exit status out of range    exit -1 exit takes only integer args in the range 0 - 255
EOF
$(which kill) -l |tr ' ' '\n'| awk '{ printf "S_EX_%s=%s\n", $0, 128+NR; }'

I don't use it much, though, but what I do use is a shell function that maps exit codes back to their names. I've named it exit2str. Assuming you've named the above exit.sh generator exit.sh.sh, the code for exit2str can be generated with the following script (exit2str.sh.sh):

#!/bin/sh
echo '
exit2str(){
  case "$1" in'
./exit.sh.sh | sed -nEe's|^(S_)?EX_(([^_=]+_?)+)=([0-9]+).*|\4) echo "\1\2";;|p'
echo "
  esac
}"
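For orientation, the generated function should look roughly like the excerpt below. This is a hand-written sketch covering only a handful of codes, not actual generator output; the signal entries in particular depend on what kill -l prints on your platform.

```shell
# Abridged, hand-written excerpt of what exit2str.sh might contain.
exit2str(){
  case "$1" in
    0) echo "OK";;
    64) echo "USAGE";;
    1) echo "S_ANY";;
    126) echo "S_EXEC";;
    127) echo "S_NOENT";;
    130) echo "S_INT";;
  esac
}

exit2str 127    # prints S_NOENT
exit2str 64     # prints USAGE
```

Codes without an entry print nothing, so the caller can fall back to showing the bare number.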

I use this in the PS1 of my interactive shell so that after each command I run, I can see its exit status and its string form (if it does have a known string form):

[15:58] pjump@laptop:~ 
(0=OK)$ 
[15:59] pjump@laptop:~ 
(0=OK)$ fdsaf
fdsaf: command not found
[15:59] pjump@laptop:~ 
(127=S_NOENT)$ sleep
sleep: missing operand
Try 'sleep --help' for more information.
[15:59] pjump@laptop:~ 
(1=S_ANY)$ sleep 100
^C
[15:59] pjump@laptop:~ 
(130=S_INT)$ sleep 100
^Z
[1]+  Stopped                 sleep 100
[15:59] pjump@laptop:~ 
(148=S_TSTP)$

To get these, you need a sourceable file for the exit2str function:

$ ./exit2str.sh.sh > exit2str.sh #Place this somewhere in your PATH

and then use it in your ~/.bashrc to save and translate the exit code on each command prompt and display it in your prompt (PS1):

    # ...
    . exit2str.sh
    PROMPT_COMMAND='lastStatus=$(st="$?"; echo -n "$st"; str=$(exit2str "$st") && echo "=$str"); # ...'
    PS1="$PS1"'\n($lastStatus)\$'
    # ...

It's quite handy for observing how some programs follow the exit code conventions and some don't, for learning about the conventions, or just for seeing what's going on more readily. Having used it for some time, I can say that many system-oriented shell scripts do follow the conventions. EX_USAGE is particularly common; the other codes, not so much. I try to follow the conventions from time to time, although there's always $S_EX_ANY (1) for lazy people (I am one).

Solution 4

The best reference I could find was this: http://tldp.org/LDP/abs/html/exitcodes.html

According to this:

1 is a general catchall for errors, and I've always seen it used for user defined errors.

2 is for misuse of shell builtins, such as a syntax error.

To answer your question directly: your script will be fine using the reserved error codes. It will function as expected, assuming you handle each error based on its code (1, 2, or 3).

However, it would possibly be confusing if you encounter anyone who knows and uses the reserved error codes, which seems quite rare.

Another option is to echo an error message if there is one and then exit, assuming your script follows the Unix convention of "no news is good news" and echoes nothing on success.

if [ $? -ne 0 ];then
    echo "Error type"
    exit 1
fi
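A slightly fuller sketch of the same idea, assuming you want the message on stderr and the original status passed on to the caller rather than collapsed to 1. The wrapper name run_or_report is hypothetical.

```shell
#!/bin/sh
# run_or_report CMD [ARGS...]: hypothetical wrapper that stays silent on
# success and, on failure, reports on stderr and returns the command's
# own exit status.
run_or_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "error: '$*' failed with status $status" >&2
    fi
    return "$status"
}

run_or_report true                                    # prints nothing
run_or_report sh -c 'exit 3' || echo "caller saw $?"
```

Writing the message to stderr keeps stdout clean for scripts that parse the output.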

Solution 5

As long as you document your exit codes so that you remember them a year from now, when you have to come back and tweak the script, you'll be fine. The idea of "reserved exit codes" doesn't really apply anymore, other than to say it's customary to use 0 as a success code and anything else as a failure code.
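One way to do that, sketched with the exit codes listed in the question: describe them in a header comment and refer to them by name. The variable names and the check_args function are made up for illustration.

```shell
#!/bin/sh
# Exit codes used by this script (hypothetical example):
#   0  success
#   1  incorrect hostname
#   2  invalid arguments specified
#   3  insufficient user privileges
E_OK=0
E_HOSTNAME=1
E_ARGS=2
E_PRIVS=3

# check_args returns E_ARGS unless exactly one argument is given.
check_args() {
    [ "$#" -eq 1 ] || return "$E_ARGS"
    return "$E_OK"
}

rc=0
check_args one two || rc=$?
echo "check_args returned $rc"
```

Named constants keep the header comment and the code from drifting apart when a code is added later.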

Author by

Anthony Geoghegan

Updated on September 18, 2022

Comments

  • Anthony Geoghegan
    Anthony Geoghegan almost 2 years

    I recently came across this list of Exit Codes With Special Meanings from the Advanced Bash-Scripting Guide. They refer to these codes as being reserved and recommend that:

    According to the above table, exit codes 1-2, 126-165, and 255 have special meanings, and should therefore be avoided for user-specified exit parameters.

    A while ago, I wrote a script which used the following exit status codes:

    • 0 - success
    • 1 - incorrect hostname
    • 2 - invalid arguments specified
    • 3 - insufficient user privileges

    When I wrote the script I wasn’t aware of any special exit codes so I simply started at 1 for the first error condition, and incremented the exit status for each successive error type.

    I wrote the script with the intention that at a later stage it could be called by other scripts (which could check for the non-zero exit codes). I haven’t actually done that yet; so far I’ve only run the script from my interactive shell (Bash) and I was wondering what / if any problems could be caused by using my custom exit codes. How relevant/important is the recommendation from the Advanced Bash-Scripting Guide?

    I couldn’t find any corroborating advice in the Bash documentation; its section on Exit Status simply lists the exit codes used by Bash but doesn’t state that any of these are reserved or warn against using them for your own scripts/programs.

  • Random832
    Random832 over 8 years
    "as this results in $? being interpreted as 0 in a classic shell." - or any shell, since it's actually not possible at the operating system level to pass more than 8 bits from exit() - the other bits of the wait status code are not controlled by the exit function; you can't fake a signal exit.
  • schily
    schily over 8 years
    You are mistaken. waitid() returns all 32 bits from the exit code since 1989. You may be testing on a broken Operating system like Linux, where the kernel early destroys the upper 24 bits of the exit value in the kernel.
  • schily
    schily over 8 years
    BTW: I made a bug report against FreeBSD and the Linux kernel for this waitid() bug around late May. The FreeBSD people fixed the problem within 20 hours, the Linux people are not interested in fixing their bug. ... and the Cygwin people says that they are bug by bug Linux compatible ;-)
  • Random832
    Random832 over 8 years
    This behavior is required by the Single Unix Specification. There's a 32-bit value, yes, but that value contains an 8-bit bitfield containing the low 8 bits of the value from _exit. Please link the FreeBSD bug report you are referring to, maybe I'm misunderstanding the issue you describe.
  • schily
    schily over 8 years
    You are mistaken again. The SUSv2 text was OK, but with SUSv3, there has been a bug introduced in the POSIX standard text that was fixed about a year ago and that will appear in issue 7 tc2 that is currently under review. As a hint: POSIX does not define things but rather describes features from existing implementations. As this feature is from SVr4, you need to check e.g. Solaris for the correct behavior.
  • schily
    schily over 8 years
    BTW: the full 32 bit exit code is also required to appear in the siginfo_t * parameter of the SIGCHLD signal handler.
  • Brian Rasmussen
    Brian Rasmussen over 8 years
    The OP tagged the question with bash and mentioned Bash in the text of the question. Bash is a Bourne-derived shell. It does not support ${.sh.} variables. It is true, however, that you say "Bourne" and not "Bourne-derived" (although you do include ksh93).
  • zwol
    zwol over 8 years
    This answer appears to be very specific to your particular variant of some SVR4-derived Unix. Please be clearer about what is portable and what isn't, keeping in mind that there is no such thing as "the" Bourne shell, unless you mean the one that was in V7.
  • schily
    schily over 8 years
    @Dennis Williamson I tried to be as widely based as possible. This is why I mentioned Bourne Shell and ksh93. The OP used the tag /shell-script, which means something portable. This is why my answer includes the full spectrum of possibilities. BTW: bash is not Bourne derived, but it tried to become a Bourne Shell clone in its early days and then copied features from ksh88 and ksh93.
  • schily
    schily over 8 years
    @zwol Why do you believe that my answer that covers the whole spectrum of implementations is specific to a particular implementation? It seems that you never had access to a real UNIX system. If you did, you would know that most UNIX systems include a real Bourne Shell. The latest AIX even comes with a Bourne Shell from 1986 or before. Since OpenSolaris came out in June 2005, a recent version of the Bourne Shell (e.g. with jobcontrol) has become OSS and later was made highly portable - it even works on Cygwin and is noticeable faster than the Cygwin defaultshell. It now has a history editor.
  • zwol
    zwol over 8 years
    On the contrary, I believe it is you who are understating the range of variation here, especially historical variation. You make it sound like /bin/sh can be relied on to behave consistently wrt these special exit codes cross-platform, which is not true. (I do not care whether any particular system's /bin/sh can be said to be a "real Bourne shell". Much more important to know that none of this is in POSIX, and that most of the things you cite as "real Unix systems" don't provide a POSIX-compliant /bin/sh anyway.)
  • philippe lhardy
    philippe lhardy over 8 years
    @zwol I would take both of your answers as complementary, with different points of view. With your combined answers, I think most return code values are covered.
  • Anthony Geoghegan
    Anthony Geoghegan over 8 years
    Thanks. It was difficult to choose one answer over the others, but I'm accepting this one since it answered my question while also providing a wide flavour of the different exit codes in use (with relevant links): it deserves more than the 3 upvotes it currently has.
  • PSkocik
    PSkocik over 8 years
    I do wonder if there's anything like a mapping between an errno code and an exit code to use if the error reported with that errno code results in an error exit. I might need to come up with some reasonable mapping.
  • Anthony Geoghegan
    Anthony Geoghegan over 8 years
    Wow! I wasn't expecting such an elaborate answer. I'll definitely try that out as a good way to see how different commands behave. Thanks.
  • zwol
    zwol over 8 years
    @schily This will be my last word on the subject, but I feel it is necessary to clarify one point: I never said "/bin/sh should be a POSIX shell". What I said was "[various historical Unix systems] don't provide a POSIX-compliant /bin/sh," which is a fact. Your answer would be fine if it admitted that the various shells you talk about are not necessarily /bin/sh on any given system, and therefore someone trying to write a portable shell script cannot rely on any of the behavior you describe.
  • schily
    schily over 8 years
    @zwol What you wrote does not help anybody - it rather confuses people. What I wrote in my answer includes hints to the variations so people who carefully read my answer can write portable scripts. BTW: even on Linux, you cannot rely on a specific shell being installed as /bin/sh. This is why I explained all possible behavior and did not only comment on the behavior of bash.