Linux, waitpid, WNOHANG and zombies

c++ c fork wait waitpid

10,569

Solution 1

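A common solution to #2 (checking whether the child's execvp succeeded) is to open a pipe before the fork(), mark its write end close-on-exec, and write to it in the child only if the exec returns. In the parent, a successful read means the exec failed; a read that returns 0 (end of file) means the exec succeeded, so the write end was closed automatically and the write never took place.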

// ignoring all errors except from execvp...
int execpipe[2];
pipe(execpipe);
fcntl(execpipe[1], F_SETFD, fcntl(execpipe[1], F_GETFD) | FD_CLOEXEC);
if(fork() == 0)
{
    close(execpipe[0]);
    execvp(...); // on success, never returns
    write(execpipe[1], &errno, sizeof(errno));
    // doesn't matter what you exit with
    _exit(0);
}
else
{
    close(execpipe[1]);
    int childErrno;
    if(read(execpipe[0], &childErrno, sizeof(childErrno)) == sizeof(childErrno))
    {
        // exec failed, now we have the child's errno value
        // e.g. ENOENT
    }
}

This lets the parent unambiguously know whether the exec was successful, and as a byproduct what the errno value was if unsuccessful.

If the exec succeeded, the child process may still fail with a nonzero exit code; examining the status from waitpid with the WEXITSTATUS macro gives you that condition as well.
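The fragment above ignores every error except the one from execvp and never reaps a child whose exec failed. A minimal self-contained sketch of the same pipe technique follows; the helper name spawn_checked and its return convention are assumptions made for illustration, not part of the original answer.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork and exec argv[0].
 * Returns the child's pid if the exec succeeded, or -1 with *exec_errno
 * set to the reason if pipe, fcntl, fork or exec failed. */
pid_t spawn_checked(char *const argv[], int *exec_errno)
{
    assert(argv != NULL && argv[0] != NULL && exec_errno != NULL);

    int execpipe[2];
    if (pipe(execpipe) == -1) {
        *exec_errno = errno;
        return -1;
    }

    /* Close-on-exec: a successful exec closes the write end automatically. */
    int flags = fcntl(execpipe[1], F_GETFD);
    if (flags == -1 || fcntl(execpipe[1], F_SETFD, flags | FD_CLOEXEC) == -1) {
        *exec_errno = errno;
        close(execpipe[0]);
        close(execpipe[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        *exec_errno = errno;
        close(execpipe[0]);
        close(execpipe[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: exec returns only on failure. */
        close(execpipe[0]);
        execvp(argv[0], argv);
        int err = errno;
        ssize_t ignored = write(execpipe[1], &err, sizeof err);
        (void)ignored;
        _exit(127);
    }

    /* Parent: read() yields 0 (EOF) if the exec succeeded,
     * or the child's errno value if it failed. */
    close(execpipe[1]);
    int child_errno = 0;
    ssize_t n;
    do {
        n = read(execpipe[0], &child_errno, sizeof child_errno);
    } while (n == -1 && errno == EINTR);
    close(execpipe[0]);

    if (n == (ssize_t)sizeof child_errno) {
        waitpid(pid, NULL, 0);      /* reap the failed child: no zombie */
        *exec_errno = child_errno;
        return -1;
    }
    return pid;
}
```

When spawn_checked returns a pid, the caller still has to call waitpid on it at some point, either blocking for a foreground command or with WNOHANG for a background one.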

NOTE: Calling waitpid with the WNOHANG flag does not block. It returns 0 while the child is still running, so you may need to poll until it returns the child's pid.
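A rough sketch of such polling, assuming two hypothetical helpers (child_finished and wait_by_polling, neither from the original answer) and an arbitrary 10 ms delay between checks:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: check one child without blocking.
 * Returns 1 once the child has terminated (*status is filled in),
 * 0 while it is still running, and -1 on error (errno is set). */
int child_finished(pid_t pid, int *status)
{
    assert(pid > 0);

    pid_t r = waitpid(pid, status, WNOHANG);
    if (r == 0)
        return 0;       /* the child exists but has not terminated yet */
    if (r == pid)
        return 1;
    return -1;          /* e.g. ECHILD: not our child, or already reaped */
}

/* Hypothetical helper: poll every 10 ms until the child terminates.
 * A real main loop would do other work here instead of sleeping. */
int wait_by_polling(pid_t pid, int *status)
{
    const struct timespec delay = { 0, 10 * 1000 * 1000 };
    int r;
    while ((r = child_finished(pid, status)) == 0)
        nanosleep(&delay, NULL);
    return r;
}
```

If the parent has nothing else to do while it waits, a plain blocking waitpid(pid, &status, 0) is simpler than this loop.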

Solution 2

An exec call doesn't return at all if it succeeds, because it replaces the current process image with another one. If it does return, an error has occurred:

execvp(...);
/* exec failed: exit the child process here with an error.
   _exit() skips flushing stdio buffers inherited from the parent. */
_exit(errno);

To let the parent process know whether exec failed, read the status of the child process:

waitpid(pid, &status, WNOHANG);

Then apply the WEXITSTATUS(status) macro, which is meaningful only when WIFEXITED(status) is true. From the man page:

WEXITSTATUS(status) returns the exit status of the child. This consists of the least significant 8 bits of the status argument that the child specified in a call to exit(3) or _exit(2) or as the argument for a return statement in main()

Note the last clause: if exec succeeds and runs the command, you get the exit status of that command's main() function. In other words, you can't reliably tell the difference between a failed exec and a failed command this way, so it depends on whether that matters to you.
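To make the ambiguity concrete, here is a sketch of a foreground run that reports everything through the exit status. The helper name run_and_wait is hypothetical, and the values 127 (exec failed) and 128 + N (killed by signal N) are conventions borrowed from shells, not something the answer prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] and block until it finishes.
 * Returns the command's exit status (0-255), 127 if the exec itself
 * failed, 128 + N if the child was killed by signal N, or -1 if fork
 * or waitpid failed. A command that exits with 127 on its own is
 * indistinguishable from a failed exec. */
int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);      /* returns only on failure */
        _exit(127);
    }

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1)
        return -1;

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Running a nonexistent program and running `sh -c "exit 127"` both produce 127 here, which is exactly the case where the pipe technique from Solution 1 is needed.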

Another issue:

if the '&' sign does not exist at the end of the command, then the parent needs to wait for the child to execute.

You need to call wait() on the child process at some point in your program, regardless of the &, to avoid leaving the child process in a zombie state.

Note: WNOHANG means that waitpid() returns immediately if no child has changed state, i.e. it does not block. If that is not what you want, use wait(); otherwise call waitpid() as part of your main loop.
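For commands started with '&', one way to fold this into a shell's main loop is to reap whatever has already terminated each time around. The sketch below assumes a hypothetical helper, reap_background_children, that is not part of the original answer.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap children that have already terminated,
 * without blocking. Stores up to max reaped pids in pids[] and
 * returns how many were stored. */
int reap_background_children(pid_t *pids, int max)
{
    assert(max == 0 || pids != NULL);

    int n = 0;
    while (n < max) {
        /* -1 means "any child". With WNOHANG, waitpid returns 0 if
         * children exist but none has terminated yet, and -1 (ECHILD)
         * if there are no children at all. */
        pid_t pid = waitpid(-1, NULL, WNOHANG);
        if (pid <= 0)
            break;
        pids[n++] = pid;
    }
    return n;
}
```

Calling this once per prompt keeps background jobs from lingering as zombies, and the returned pids can be used to report which jobs finished.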

Author by

m1o2

.Net Developer at CodeValue.

Updated on June 12, 2022

Comments

m1o2 almost 2 years
I need to be able to:
1. fork a process and make it execvp (I did that)
2. check if the child process execvp was successful (don't know how)
3. check if the child process finished (having problems)
I'm forking a process and I don't have any way to check whether the child's execvp worked or failed. If it failed, I need to be able to know that. Currently I'm using
```
-1 != waitpid( pid, &status, WNOHANG )
```
But it seems that if the exec in the child process (pid) fails, waitpid does not return -1.

How could I check that? I read the waitpid man page, but it isn't clear to me; maybe my English isn't good enough.

EDIT: in order to explain more:
I'm building my own terminal for homework. I get a command string as input, let's say "ls", and then I have to execute the command.
After the fork, the child calls execvp to execute the command (after I parse the string), and the parent needs to check whether there was a '&' at the end of the command.
If there is no '&' at the end of the command, the parent needs to wait for the child to finish executing.

So I need to know whether the execvp failed. If it didn't fail, the parent uses waitpid to wait for the child to finish its execution. If it failed, the parent will not wait for the child.