How do you use dup2 and fork together?

First off, let's fix your code so that it works, and add a tiny bit more error checking while we're at it. Replace the bottom bit (the else if branch) with:

else if ( pid == 0 ) {
        int fd1[ 2 ];
        if ( pipe( fd1 ) < 0 ) {
            cerr << "Err creating pipe in child" << endl;
            _exit( 1 );
        }
        cout << "Inside child" << endl;

        if ( (pid = fork()) > 0 ) {
            // Child: make the read end of the pipe its standard input
            if (dup2( fd1[ 0 ] , 0 ) < 0) {
              cerr << "Err dup2 in child" << endl;
            }
            close( fd1[ 0 ] );
            close( fd1[ 1 ] ); // important; see below
            // Note: /usr/bin, not /bin
            // The argument list must end with a null pointer of type char *
            execlp( "/usr/bin/wc", "wc", "-l", (char *) NULL );
            cerr << "Err execing in child" << endl;
        }
        else if ( pid == 0 ) {
            cout << "Inside grand child" << endl;
            // Grandchild: make the write end of the pipe its standard output
            if (dup2( fd1[ 1 ] , 1 ) < 0) {
              cerr << "Err dup2 in gchild" << endl;
            }
            close( fd1[ 0 ] );
            close( fd1[ 1 ] );
            execlp( "/bin/ps", "ps", "-A", (char *) NULL );
            cerr << "Err execing in grandchild" << endl;
        }
}
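
If you want to try the same plumbing outside your program, here is a rough, self-contained sketch of it. Everything named here is my own invention for illustration: run_pipeline is a hypothetical helper, it takes the two commands as argument arrays, and it uses execvp (which searches PATH) rather than execlp with a full path. It also adds a second pipe, which the code above does not have, so that the caller can capture what the consumer prints. As above, the child becomes the consumer (wc) and the grandchild becomes the producer (ps).

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): runs `producer | consumer`
// and returns whatever the consumer wrote to its standard output.
// Returns an empty string if the pipe or the first fork cannot be set up.
std::string run_pipeline(const char* const producer[],
                         const char* const consumer[]) {
    int result[2];                       // consumer's stdout -> caller
    if (pipe(result) < 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(result[0]);
        close(result[1]);
        return "";
    }

    if (pid == 0) {                      // child: will become the consumer
        close(result[0]);
        int fd1[2];                      // producer's stdout -> consumer's stdin
        if (pipe(fd1) < 0) _exit(126);

        pid_t gpid = fork();
        if (gpid < 0) _exit(126);

        if (gpid == 0) {                 // grandchild: will become the producer
            if (dup2(fd1[1], STDOUT_FILENO) < 0) _exit(126);
            close(fd1[0]);
            close(fd1[1]);
            close(result[1]);
            execvp(producer[0], const_cast<char* const*>(producer));
            _exit(127);                  // only reached if exec failed
        }

        if (dup2(fd1[0], STDIN_FILENO) < 0) _exit(126);
        if (dup2(result[1], STDOUT_FILENO) < 0) _exit(126);
        close(fd1[0]);
        close(fd1[1]);                   // without this the consumer never sees EOF
        close(result[1]);
        execvp(consumer[0], const_cast<char* const*>(consumer));
        _exit(127);                      // only reached if exec failed
    }

    // Parent: keep only the read end of the result pipe and drain it.
    close(result[1]);
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(result[0], buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<std::size_t>(n));
    }
    close(result[0]);
    waitpid(pid, nullptr, 0);
    return out;
}
```

Calling it with the arrays { "ps", "-A", nullptr } and { "wc", "-l", nullptr } should hand back the line count as text. Each array starts with the program name because that first entry becomes argv[0] of the new program.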

Now, your questions:

  • Question: Where do I redirect it to? I know it should be one of the file descriptors, but where should it be redirected so that wc can process it?

    The file descriptors 0, 1, and 2 are special in Unix in that they are the standard input, standard output, and standard error. wc reads from standard input, so it will read whatever you have duped onto descriptor 0.

  • Question: How does wc receive the output? Through an execlp parameter? Or does the operating system check one of the file descriptors?

    In general, after a process has had its image replaced by exec, it still has all the open file descriptors it had before exec. (Except for those descriptors with the close-on-exec flag, FD_CLOEXEC, set, but ignore that for now.) Therefore, if you dup2 something to 0, then wc will read it.

  • Question: Which one of these is closed and left open for wc to receive and process ps's output?

    As shown above, you can close both ends of the pipe in both child and grandchild, and that'll be fine. In fact, standard practice would recommend that you do that. However, the only truly necessary close line in this specific example is the one I comment as "important" - that's closing the write end of the pipe in the child.

    The idea is this: both child and grand-child have both ends of the pipe open when they start. Now, through dup2 we've connected wc to the read end of the pipe. wc is going to keep reading from that pipe until all descriptors on the write end of the pipe are closed, at which point it'll see end-of-file and stop. Now, in the grand-child, we can get away with not closing anything because ps -A isn't going to do anything with any of the descriptors but write to descriptor 1, and after ps -A finishes spitting out stuff about some processes it'll exit, closing everything it had. In the child, we don't really need to close the read descriptor stored in fd1[0] because wc isn't going to try to read from anything but descriptor 0. However, we do need to close the write end of the pipe in the child, because otherwise wc itself would still hold an open write descriptor, the pipe would never be completely closed, and wc would sit there waiting for an end-of-file that never comes.

    As you can see, the reasoning for why we didn't really need any of the close lines except the one marked "important" depends on the details of how wc and ps are going to behave, so the standard practice is to completely close the end of a pipe you aren't using, and to keep the end you are using open through only one descriptor. Since you're using dup2 in both processes, that means four close statements as above.
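
To see the end-of-file rule in isolation, here is a small single-process sketch. The function name reader_sees_eof and its parameter are made up for this illustration, and the extra write descriptor is created with dup to stand in for the copy that wc would inherit in the child. The read end is switched to non-blocking mode purely so the demo returns instead of hanging the way wc would.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical demo: returns true if a read on the pipe reports end-of-file
// (read() == 0), false if the pipe is merely empty but a writer still exists.
bool reader_sees_eof(bool close_leftover_writer) {
    int fd[2];
    if (pipe(fd) < 0) return false;

    int leftover = dup(fd[1]);            // a second descriptor for the write end
    if (leftover < 0) {
        close(fd[0]);
        close(fd[1]);
        return false;
    }

    bool wrote = (write(fd[1], "x\n", 2) == 2);
    close(fd[1]);                         // the "real" writer is finished
    if (close_leftover_writer) close(leftover);

    // Non-blocking, so an empty pipe gives EAGAIN instead of blocking forever.
    int flags = fcntl(fd[0], F_GETFL);
    fcntl(fd[0], F_SETFL, flags | O_NONBLOCK);

    char buf[16];
    ssize_t n = read(fd[0], buf, sizeof buf);   // consumes the two data bytes
    bool got_data = (n == 2);
    n = read(fd[0], buf, sizeof buf);           // 0 means EOF, -1 means "try again"
    bool eof = wrote && got_data && (n == 0);

    close(fd[0]);
    if (!close_leftover_writer) close(leftover);
    return eof;
}
```

With the leftover descriptor closed, the second read returns 0 and the function reports end-of-file. With it left open, the read fails with EAGAIN, which is the non-blocking equivalent of wc waiting forever.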

EDIT: Updated the arguments to execlp.

Author: ShrimpCrackers

Updated on June 15, 2022

Comments

  • ShrimpCrackers
    ShrimpCrackers almost 2 years

    I'm taking an operating systems course and I'm having a hard time understanding how input is redirected with dup2 when you have forks. I wrote this small program to try to get a sense of it, but I wasn't successful in passing the output of a grand-child to a child. I am trying to mimic the Unix command: ps -A | wc -l. I'm new to Unix, but I believe this should count the lines of the list of running processes I get. So my output should be a single number.

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <iostream>
    
    using namespace std;
    
    int main( int argc, char *argv[] )  {
    
        char *searchArg = argv[ 1 ];
        pid_t pid;
    
        if ( ( pid = fork() ) > 0 ) {
            wait( NULL );
            cout << "Inside parent" << endl;
        } 
        else if ( pid == 0 ) {
                int fd1[ 2 ];
                pipe( fd1 );
                cout << "Inside child" << endl;
    
                if ( pid = fork() > 0 ) {
                    dup2( fd1[ 0 ], 0 );
                    close( fd1[ 0 ] );
                    execlp( "/bin/wc", "-l", NULL );
                }
                else if ( pid == 0 ) {
                    cout << "Inside grand child" << endl;
                    execlp( "/bin/ps", "-A", NULL );
                }
            }
        return 0;
    }
    

    I don't have it in the code above, but here is my guess on how things should go down:

    • We need to redirect standard output of command ps -A (whatever is usually printed to the screen, correct?) so that the wc -l command can use it to count the lines.
    • This standard output can be redirected using dup2, like dup2( ?, 1 ) which means redirect standard output to ?. Then you close ?.

    Question: Where do I redirect it to? I know it should be one of the file descriptors, but where should it be redirected so that wc can process it?

    • wc somehow receives the standard output.

    Question: How does wc receive the output? Through an execlp parameter? Or does the operating system check one of the file descriptors?

    • Execute wc -l.

    Which one of these is closed and left open for wc to receive and process ps's output? I keep thinking this needs to be thought of backwards since ps needs to give its output to wc...but that doesn't seem to make sense since both child and grand-child are being processed in parallel.

    pipe dream

  • ShrimpCrackers
    ShrimpCrackers over 12 years
    Daniel, thanks for the clear explanation! A big help. I am getting output now, which is wc showing me the line, word, and character counts. Shouldn't the "-l" argument have shown me only the line count? Also, a related question: I'm accessing a Linux machine through a terminal via PuTTY. When I execute ps -A | wc -l on it, I get the result 145. I assume that means there are a total of 145 processes that are likely running. When I execute my program, I get 5 lines. When you execute ps -A through execlp in a program, does it only show the parent and its children?
  • ShrimpCrackers
    ShrimpCrackers over 12 years
    Never mind, I found out that the arguments to execlp(...) correspond to argv[0], argv[1], argv[2], ...
  • Daniel Martin
    Daniel Martin over 12 years
    Ah, right, oops. Yeah, it should be execlp( "/usr/bin/wc", "wc", "-l", NULL ); and similarly for ps.
  • ShrimpCrackers
    ShrimpCrackers over 12 years
    Daniel, how would you pass from one pipe to another pipe?
  • Duc Nguyen
    Duc Nguyen over 3 years
    I believe the first if should be if ( (pid = fork()) > 0 ).
  • Daniel Martin
    Daniel Martin over 3 years
    Ah, yes, it should. The old code happened to work by accident because I never tested a case running out of resources where fork() would return less than zero. (so pid wound up always being 1 or 0, but because of how I then use pid that didn't matter)
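
The precedence issue from the last two comments can be sketched without forking at all. The function names below are hypothetical, and the fork result is passed in as a plain parameter so the three cases (parent, child, failure) can be tried safely. Because > binds tighter than =, writing pid = fork() > 0 stores the result of the comparison, not the value fork returned.

```cpp
#include <sys/types.h>

// Hypothetical stand-in for `pid = fork() > 0`:
// the comparison happens first, so pid ends up as 0 or 1.
pid_t assign_without_parens(pid_t fork_result) {
    pid_t pid;
    pid = fork_result > 0;               // parsed as pid = (fork_result > 0)
    return pid;
}

// Hypothetical stand-in for `(pid = fork()) > 0`:
// the assignment happens first, so pid keeps the real return value.
pid_t assign_with_parens(pid_t fork_result) {
    pid_t pid;
    bool in_parent = (pid = fork_result) > 0;
    (void)in_parent;                     // only pid matters for this demo
    return pid;
}
```

Without the parentheses, a failed fork (-1) is indistinguishable from being in the child (0), which is why the unparenthesized version only appeared to work as long as fork never failed.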