waitpid - WIFEXITED returning 0 although child exited normally
On Unix and Linux systems, the status returned from wait or waitpid (or any of the other wait variants) has this structure:
bits   meaning
0-6    signal number that caused the child to exit,
       or 0177 if the child stopped / continued,
       or zero if the child exited without a signal
7      1 if core dumped, else 0
8-15   low 8 bits of the value passed to _exit/exit or returned by main,
       or the signal that caused the child to stop/continue
(Note that POSIX doesn't define the bits, just macros, but these are the bit definitions used by at least Linux, Mac OS X/iOS, and Solaris. Also note that waitpid only returns for stop events if you pass it the WUNTRACED flag and for continue events if you pass it the WCONTINUED flag.)
So a status of 11 means the child was terminated by signal 11, which is SIGSEGV (again, the number is not fixed by POSIX, but 11 is the conventional value).
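The decoding described above can be sketched with the POSIX macros, which avoids depending on the bit layout. This is only an illustration: the names `child_fate` and `decode_status` are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/wait.h>

/* Hypothetical classification of a wait status (names are assumptions). */
enum child_fate {
    CHILD_EXITED,     /* normal exit; *value is the exit code        */
    CHILD_SIGNALED,   /* killed by a signal; *value is the signal    */
    CHILD_STOPPED,    /* stopped; *value is the stopping signal      */
    CHILD_CONTINUED,  /* resumed by SIGCONT; *value is left at 0     */
    CHILD_UNKNOWN
};

/* Decode a status filled in by wait/waitpid using only POSIX macros. */
enum child_fate decode_status(int status, int *value)
{
    assert(value != NULL);
    *value = 0;

    if (WIFEXITED(status)) {
        *value = WEXITSTATUS(status);
        return CHILD_EXITED;
    }
    if (WIFSIGNALED(status)) {
        *value = WTERMSIG(status);
        return CHILD_SIGNALED;
    }
    if (WIFSTOPPED(status)) {
        *value = WSTOPSIG(status);
        return CHILD_STOPPED;
    }
#ifdef WIFCONTINUED
    if (WIFCONTINUED(status))
        return CHILD_CONTINUED;
#endif
    return CHILD_UNKNOWN;
}
```

On a system that uses the layout above, `decode_status(11, &v)` should report `CHILD_SIGNALED` with `v == 11`, and `decode_status(3 << 8, &v)` should report `CHILD_EXITED` with `v == 3`.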
Either your program is passing invalid arguments to execv (which is a C library wrapper around execve or some other kernel-specific call), or the child runs differently when you execv it than when you run it from the shell or gdb.
If you are on a system that supports strace, run your (parent) program under strace -f to see whether execv is causing the signal.
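Inside the parent, a reported exec failure can at least be separated from a signal death. A minimal sketch, assuming the child exits with 127 when `execv` returns (a shell convention borrowed here, not something the original program does); the name `run_and_wait` is hypothetical. It cannot tell a crash inside the `execv` wrapper from a crash in the new program, so strace -f remains the tool for that.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec `path` with `argv`, wait for the child.
 * Stores the raw wait status in *status.
 * Returns 0 on success, -1 if fork or waitpid fails. */
int run_and_wait(const char *path, char *const argv[], int *status)
{
    assert(path != NULL && argv != NULL && status != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execv(path, argv);
        /* Only reached if execv itself failed. 127 is an assumed
         * convention so the parent can recognise this case. */
        perror("execv");
        _exit(127);
    }

    pid_t res;
    do {
        res = waitpid(pid, status, 0);   /* retry if interrupted */
    } while (res == -1 && errno == EINTR);

    return res == pid ? 0 : -1;
}
```

With this helper, `WIFEXITED(status) && WEXITSTATUS(status) == 127` points at a failed exec (or a program that happens to exit with 127), while `WIFSIGNALED(status)` means the child was killed by `WTERMSIG(status)`.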
Andreas Grapentin
Updated on June 04, 2020
Comments
-
Andreas Grapentin almost 4 years
I have been writing a program that spawns a child process and calls waitpid to wait for the termination of the child process. The code is below:

    // fork & exec the child
    pid_t pid = fork();
    if (pid == -1) {
        // here is error handling code that is **not** triggered
    }
    if (!pid) {
        // binary_invocation is an array of the child process program and its arguments
        execv(args.binary_invocation[0], (char * const*)args.binary_invocation);
        // here is some error handling code that is **not** triggered
    } else {
        int status = 0;
        pid_t res = waitpid(pid, &status, 0);
        // here I see res being a positive integer > 0
        // and status being 11, which means WIFEXITED(status) is 0.
        // this triggers a warning in my program's output.
    }

The manpage of waitpid states for WIFEXITED:
    WIFEXITED(status) returns true if the child terminated normally, that is, by calling exit(3) or _exit(2), or by returning from main().
Which I interpret to mean it should return an integer != 0 on success, which is not happening in the execution of my program, since I observe WIFEXITED(status) == 0.
However, executing the same program from the command line results in $? == 0, and starting it from gdb results in:
    [Inferior 1 (process 31934) exited normally]
The program behaves normally, except for the triggered warning, which makes me think that something else is going on here that I am missing.
EDIT:
As suggested below in the comments, I checked whether the child is terminated via segfault, and indeed, WIFSIGNALED(status) returns 1, and WTERMSIG(status) returns 11, which is SIGSEGV.
What I don't understand, though, is why a call via execv would fail with a segfault while the same call via gdb or a shell would succeed.
EDIT2:
The behaviour of my application heavily depends on the behaviour of the child process, in particular on a file the child writes in a function declared __attribute__ ((destructor)). After the waitpid call returns, this file exists and is generated correctly, which means the segfault occurs somewhere in another destructor, or somewhere outside of my control.
-
rob mayoff about 10 years
Status 11 means the child received signal 11, SIGSEGV. A non-signal exit is 256 times the low 8 bits of the value passed to _exit or exit or returned by main. If you are on a platform (like Linux) that has strace, use it (with the -f flag) to see whether the child gets the signal due to a bad call to execv, or after a successful exec.
-
Andreas Grapentin about 10 years
@robmayoff you are right! I was not aware that the status variable holds the exit status of the spawned process as well as the signal id. Thanks for pointing that out!
-
Dabo about 10 years
"What I don't understand though, is why a call via execv would fail with a segfault" ... what does args.binary_invocation look like? Where does it come from? Do you create it?
-
Andreas Grapentin about 10 years
@Dabo yes, args.binary_invocation is a NULL-terminated array of char pointers that are the name of the child application and its arguments. I have verified that the array is correct.
-
Andreas Grapentin about 10 years
@robmayoff I found the reason for the segfault thanks to your comment: the issue was that my application altered the environment for the child process, which I did not reproduce in my independent tests, which is why the segfault was hidden outside of the exec environment. I would like to give you credit for that, because you sent me in the right direction. So, if you would like to make your comments into an answer, I would gladly accept it :)
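Because the crash depended on the environment the parent had set up, a standalone reproduction has to hand the child that same environment. A minimal sketch, assuming the environment is passed explicitly through `execve`; the helper name `run_with_env` is hypothetical and not taken from the original program.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with `argv` and exactly the environment
 * in `envp` (a NULL-terminated array of "NAME=value" strings).
 * Stores the raw wait status in *status.
 * Returns 0 on success, -1 if fork or waitpid fails. */
int run_with_env(const char *path, char *const argv[],
                 char *const envp[], int *status)
{
    assert(path != NULL && argv != NULL && envp != NULL && status != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execve(path, argv, envp);
        _exit(127);   /* assumed convention: exec itself failed */
    }

    return waitpid(pid, status, 0) == pid ? 0 : -1;
}
```

Running the same command once with the parent's altered environment and once with an unmodified one, then comparing the two statuses, is one way to check whether the environment is what triggers the signal.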