Python program using os.pipe and os.fork() issue

python pipe fork

16,025

Solution 1

Are you using read() without specifying a size, or treating the pipe as an iterator (for line in f)? If so, that's probably the source of your problem: read() is defined to read until end of file before returning, rather than returning whatever is currently available. On a pipe, end of file only arrives once every copy of the write end has been closed, so the call will block until the child calls close().

In the example code linked to, this is OK: the parent is acting in a blocking manner, and just using the child for isolation purposes. If you want the parent to continue while the child is still writing, then either use non-blocking I/O as in the code you posted (but be prepared to deal with half-complete data), or read in chunks (e.g. r.read(size) or r.readline()), which will block only until a specific size or line has been read. (You'll still need to call flush in the child.)
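The difference between readline() and a bare read() can be seen in a small sketch. It is written for Python 3 (an assumption; the thread itself predates it), and the function name `compare_reads` and the `delay` parameter are made up for illustration. The child writes with os.write(), which bypasses Python-level buffering, so no flush is needed on that side here.

```python
import os
import time


def compare_reads(delay=0.5):
    """Return (first_line, rest, seconds_for_readline, seconds_for_read)."""
    r_fd, w_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: write one line, pause, write another, then close
        os.close(r_fd)
        os.write(w_fd, b"first line\n")
        time.sleep(delay)  # the write end stays open during this pause
        os.write(w_fd, b"second line\n")
        os.close(w_fd)
        os._exit(0)  # leave without running the parent's remaining code

    os.close(w_fd)  # parent must drop its copy of the write end to ever see EOF
    with os.fdopen(r_fd, "r") as r:
        start = time.monotonic()
        first = r.readline()  # returns as soon as one full line is available
        t_line = time.monotonic() - start
        start = time.monotonic()
        rest = r.read()  # returns only at EOF, i.e. after the child closes
        t_read = time.monotonic() - start
    os.waitpid(pid, 0)
    return first, rest, t_line, t_read
```

Calling `compare_reads()` and printing the two timings should show readline() coming back almost immediately while read() waits for roughly the child's pause, though exact numbers depend on scheduling.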

It looks like treating the pipe as an iterator uses a further buffer as well, so "for line in r:" may not give you what you want if you need each line to be consumed immediately. It may be possible to disable this, but just specifying 0 for the buffer size in fdopen doesn't seem sufficient.

Here's some sample code that should work:

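# Python 2 syntax (print statement, print >>file).
import os, time

r,w=os.pipe()
# Wrap the raw descriptors in unbuffered file objects.
r,w=os.fdopen(r,'r',0), os.fdopen(w,'w',0)

pid = os.fork()
if pid:          # Parent
    w.close()    # Close our copy of the write end so readline() can see EOF
    while 1:
        data=r.readline()   # Blocks only until one full line arrives
        if not data: break  # Empty string means the child closed the pipe
        print "parent read: " + data.strip()
else:           # Child
    r.close()
    for i in range(10):
        print >>w, "line %s" % i
        w.flush()           # Push each line to the parent immediately
        time.sleep(1)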
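For readers on Python 3, a rough port of the sample above might look like the following. This is a sketch under assumptions, not the answerer's code: the wrapper function `run_demo` and its `count` and `delay` parameters are invented here so the loop can be shortened, and the unbuffered `0` argument is dropped because Python 3 does not allow unbuffered text-mode files. The explicit flush() does the same job.

```python
import os
import time


def run_demo(count=10, delay=1.0):
    """Fork a child that writes `count` lines; return what the parent read."""
    r_fd, w_fd = os.pipe()
    r, w = os.fdopen(r_fd, "r"), os.fdopen(w_fd, "w")

    pid = os.fork()
    if pid:  # Parent
        w.close()  # drop our copy of the write end so EOF can be seen
        lines = []
        while True:
            data = r.readline()
            if not data:  # empty string: child closed its end
                break
            print("parent read: " + data.strip())
            lines.append(data.strip())
        r.close()
        os.waitpid(pid, 0)  # reap the child
        return lines
    else:  # Child
        r.close()
        for i in range(count):
            print("line %s" % i, file=w)
            w.flush()  # without this the line sits in the child's buffer
            time.sleep(delay)
        w.close()
        os._exit(0)  # exit the child without returning into the caller
```

With the defaults, `run_demo()` should print one "parent read: line N" message per second, as the Python 2 version does.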

Solution 2

Using

fcntl.fcntl(readPipe, fcntl.F_SETFL, os.O_NONBLOCK)

before invoking read() solved both problems. The read() call no longer blocks, and the data appears after just a flush() on the writing end.
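A slightly more defensive variant of the same idea is sketched below for Python 3 (an assumption). Passing os.O_NONBLOCK alone to F_SETFL replaces the descriptor's existing status flags, so this version reads them first and ORs the new flag in. The helper names `set_nonblocking` and `try_read` are made up for illustration, and the reads go through os.read() on the raw descriptor.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to fd while keeping its other status flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def try_read(fd, size=4096):
    """Return available bytes, b"" at EOF, or None if nothing is ready yet."""
    try:
        return os.read(fd, size)
    except BlockingIOError:  # EAGAIN: the pipe is open but currently empty
        return None
```

The three return values matter in practice: None means "try again later", while an empty bytes object means the writer has closed and no more data will come. A caller would typically poll `try_read` in a loop or combine it with select.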

Solution 3

I see you have solved the problem of blocking i/o and buffering.

A note if you decide to try a different approach: subprocess is a replacement for the fork/exec idiom. That doesn't seem to be what you're doing: you have just a fork (no exec) and are exchanging data between the two processes. In this case the multiprocessing module (in Python 2.6+) would be a better fit.
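As a rough idea of what the multiprocessing route could look like, here is a Python 3 sketch. The function names `child` and `collect` are hypothetical, and the "fork" start method is requested explicitly to stay close to the original fork-based design (it is only available on Unix-like systems).

```python
import multiprocessing as mp


def child(conn, count):
    """Child side: send a few messages, then close the connection."""
    for i in range(count):
        conn.send("line %s" % i)  # objects are pickled; no manual flush needed
    conn.close()


def collect(count=3):
    """Parent side: start the child and gather everything it sends."""
    ctx = mp.get_context("fork")  # assumption: a platform that supports fork
    recv_conn, send_conn = ctx.Pipe(duplex=False)  # (receive end, send end)
    proc = ctx.Process(target=child, args=(send_conn, count))
    proc.start()
    send_conn.close()  # parent drops its copy so recv() can raise EOFError
    lines = []
    while True:
        try:
            lines.append(recv_conn.recv())  # blocks until one message arrives
        except EOFError:  # every copy of the send end has been closed
            break
    proc.join()
    return lines
```

Because Connection.send() and recv() work on whole messages, the buffering and partial-read questions from the raw pipe version largely go away.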


Author by

Paradox

Updated on July 23, 2022

Comments

Paradox almost 2 years

I've recently needed to write a script that performs an os.fork() to split into two processes. The child process becomes a server process and passes data back to the parent process using a pipe created with os.pipe(). The child closes the 'r' end of the pipe and the parent closes the 'w' end of the pipe, as usual. I convert the returns from pipe() into file objects with os.fdopen.

The problem I'm having is this: The process successfully forks, and the child becomes a server. Everything works great and the child dutifully writes data to the open 'w' end of the pipe. Unfortunately the parent end of the pipe does two strange things:
A) It blocks on the read() operation on the 'r' end of the pipe.
B) It fails to read any data that was put on the pipe unless the 'w' end is entirely closed.

I immediately thought that buffering was the problem and added pipe.flush() calls, but these didn't help.

Can anyone shed some light on why the data doesn't appear until the writing end is fully closed? And is there a strategy to make the read() call non blocking?

This is my first Python program that forked or used pipes, so forgive me if I've made a simple mistake.
- user1066101 almost 15 years
  
  Why aren't you using the subprocess module?
- Paradox almost 15 years
  
  I thought the subprocess module was for invoking a command. I'm forking with two branches of code, one for the child and one for the parent.
- Charlie Martin almost 15 years
  
  You can do pretty much anything with the subprocess module. The old version works fine, but it's very UNIX-looking. Subprocess is a little more clearly mapped to Windows.
- Paradox almost 15 years
  
  I'll look into using subprocess in the future. I found a solution that works for my fork() strategy. Thanks!
Charlie Martin almost 15 years

The "parent" vs "child" thing is part of the essential semantics of starting a subprocess. One is the subprocess, and the other isn't.
user1066101 almost 15 years

While it's true that fork creates a parent and a child, it isn't essential for creating a subprocess. OpenVMS does not work that way. The subprocess module is far simpler than this fork malarkey.
Paradox almost 15 years

This module looks very interesting. Thanks, I'll check it out.
Daniel Pryden almost 14 years

Don't think of fork() as being the equivalent of CreateProcess() in Windows, or the equivalent in VMS, which is basically what the subprocess module is modeled after. fork() is much more like starting a new thread, except that the thread happens to have a different process space (and so you need to communicate with it via pipes instead of shared memory). Using the subprocess module you need to run through the process initialization (such as parsing config files or command-line arguments) twice, while with fork() you don't. As such, fork() can be much more efficient.
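The point about not repeating initialization can be illustrated with a small Python 3 sketch. The `config` dictionary and the function name are hypothetical stand-ins for whatever expensive startup work a real program does before forking.

```python
import os


def fork_with_inherited_state():
    """Show that a forked child already has the parent's initialized state."""
    # Hypothetical stand-in for parsing config files or command-line arguments.
    config = {"port": 8080}

    r_fd, w_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: config is already in memory, nothing is re-parsed
        os.close(r_fd)
        os.write(w_fd, str(config["port"]).encode())
        os.close(w_fd)
        os._exit(0)

    os.close(w_fd)
    with os.fdopen(r_fd, "rb") as r:
        data = r.read()  # blocking until EOF is fine here: the child exits at once
    os.waitpid(pid, 0)
    return int(data)
```

A program launched through subprocess would instead start from scratch and have to rebuild `config` itself, or receive it through arguments, the environment, or a pipe.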
Daniel Pryden almost 14 years

+1 for mentioning the difference between fork() (what the OP is trying to do here) and the fork/exec idiom encapsulated by the subprocess module, which is something completely different.
mdaoust almost 8 years

Nice! Also, consider replacing while 1: ... with for data in iter(r.readline, ""):
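A minimal Python 3 sketch of that suggestion follows; the helper name `read_lines` is made up. The two-argument form of iter() keeps calling r.readline until it returns the sentinel "", which is what readline() yields at end of file.

```python
import os


def read_lines(r):
    """Collect stripped lines from a text-mode file object until EOF."""
    lines = []
    for data in iter(r.readline, ""):  # stops when readline() returns ""
        lines.append(data.strip())
    return lines


if __name__ == "__main__":
    r_fd, w_fd = os.pipe()
    os.write(w_fd, b"line 0\nline 1\n")
    os.close(w_fd)  # closing the write end lets the reader reach EOF
    with os.fdopen(r_fd, "r") as r:
        print(read_lines(r))
```

A blank line in the input arrives as "\n" rather than "", so it does not end the loop early.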