How to avoid a Broken Pipe error when printing a large amount of formatted data?

Solution 1

head reads from the script's stdout through the pipe, then closes its end once it has the lines it wants. print then fails because internally it writes to sys.stdout, which is now connected to a closed pipe.
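To see the mechanism in isolation, here is a small sketch that reproduces the error without head. It assumes a POSIX system and Python 3; the function name write_to_closed_pipe is made up for illustration. It creates a pipe, closes the reading end (standing in for head exiting), and then writes to the other end.

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose reader is gone; return the resulting errno, or None."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader goes away, much like head exiting early
    try:
        os.write(write_fd, b"pid0\tuid0\tpname0\n")
    except OSError as exc:
        # Python ignores SIGPIPE by default, so the failed write shows up
        # as an exception carrying errno.EPIPE instead of killing the process.
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if write_to_closed_pipe() == errno.EPIPE:
    print("write failed with EPIPE (broken pipe)")
```

On Linux, errno.EPIPE is 32, the same number that appears in the "[Errno 32] Broken pipe" message.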

You can simply catch the IOError and exit silently:

import sys

try:
    for pid, uid, pname in data:
        # Parenthesized form works on both Python 2 and Python 3
        print(template.format(pid, uid, pname))
except IOError:
    # stdout is closed, no point in continuing
    # Attempt to close them explicitly to prevent cleanup problems:
    try:
        sys.stdout.close()
    except IOError:
        pass
    try:
        sys.stderr.close()
    except IOError:
        pass

Solution 2

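The behavior you are seeing is linked to the buffered output implementation in Python 3. The problem can be avoided by using the -u option or by setting the environment variable PYTHONUNBUFFERED=x. As the transcript below shows, the exception is still raised, but the extra "Exception ignored" message at shutdown goes away. See the man page for more information on -u.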

$ python2.7 testprint.py | echo

Exc: <type 'exceptions.IOError'>
$ python3.5 testprint.py | echo

Exc: <class 'BrokenPipeError'>
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
$ python3.5 -u testprint.py | echo

Exc: <class 'BrokenPipeError'>
$ export PYTHONUNBUFFERED=x
$ python3.5 testprint.py | echo

Exc: <class 'BrokenPipeError'>
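If changing how the interpreter is launched is not an option, a similar effect can be approximated from inside the script by flushing after every line, so that a broken pipe surfaces at the print call where it can be caught. The following is a minimal sketch, assuming Python 3.3 or later (for print's flush argument and for BrokenPipeError); the function name emit_unbuffered is hypothetical, and this is not identical to what -u does internally.

```python
import sys


def emit_unbuffered(lines, out=None):
    """Print each line with an immediate flush; return how many lines got through."""
    out = sys.stdout if out is None else out
    written = 0
    try:
        for line in lines:
            # flush=True pushes the line out right away, so a closed pipe
            # raises here rather than at interpreter shutdown.
            print(line, file=out, flush=True)
            written += 1
    except BrokenPipeError:
        # The reader (for example head) has gone away; stop producing output.
        pass
    return written
```

Flushing on every line costs some throughput on large outputs, which is the same trade-off the -u option makes.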

Solution 3

In general, I try to catch the most specific error I can get away with. In this case it is BrokenPipeError:

import sys

try:
    # I usually call a function here that generates all my output:
    for pid, uid, pname in data:
        print(template.format(pid, uid, pname))
except BrokenPipeError:  # available in Python 3.3+
    pass  # Ignore. Something like head is truncating output.
finally:
    sys.stderr.close()

If this is at the end of execution, I find I only need to close sys.stderr. If I don't close sys.stderr, I'll get a BrokenPipeError but without a stack trace.

This seems to be the minimum fix for writing tools that output to pipelines.
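A variation on this idea, following the pattern given in the Python signal module documentation, flushes inside the try block and then points stdout at os.devnull so the flush at interpreter exit cannot fail a second time. This is a sketch under assumptions rather than the answerer's own code: the names print_rows and main and the sample data are made up for illustration, and it requires Python 3.3+.

```python
import os
import sys


def print_rows(rows, template, out):
    """Write formatted rows to out; return False if the reader closed the pipe."""
    try:
        for row in rows:
            out.write(template.format(*row) + "\n")
        # Flush inside the try block so a broken pipe surfaces here,
        # not later during interpreter shutdown.
        out.flush()
    except BrokenPipeError:
        return False
    return True


def main():
    # Hypothetical sample data shaped like the question's (pid, uid, pname) tuples.
    data = [("pid%d" % i, "uid%d" % i, "pname%d" % i) for i in range(1000)]
    if not print_rows(data, "{0}\t{1}\t{2}", sys.stdout):
        # Python flushes the standard streams at exit; redirect stdout to
        # os.devnull so that final flush cannot raise BrokenPipeError again.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())
        sys.exit(1)
```

Calling main() at the end of a script and piping the result into head should then end quietly, with stderr left open for real error messages.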

Solution 4

I had this problem with Python 3 and debug logging piped into head as well. If your script talks to the network or does file I/O, simply dropping IOErrors is not a good solution. Despite the mentions here, I was not able to catch BrokenPipeError for some reason.

I found a blog post about restoring the default signal handler for SIGPIPE: http://newbebweb.blogspot.com/2012/02/python-head-ioerror-errno-32-broken.html

In short, you add the following to your script before the bulk of the output:

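if log.isEnabledFor(logging.DEBUG):  # optional; log is the script's logging.Logger
    # Restore the default SIGPIPE action (terminate the process quietly).
    # Python normally ignores SIGPIPE and raises an exception instead.
    from signal import signal, SIGPIPE, SIG_DFL
    signal(SIGPIPE, SIG_DFL)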

This seems to happen with head but not with other programs such as grep; as mentioned, head stops reading and closes the pipe once it has the lines it wants. If you don't often use head with the script, it may not be worth worrying about.
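For scripts that may also run on platforms without SIGPIPE (Windows, for instance), the same idea can be wrapped in a small guard. This is a sketch only; the helper name restore_default_sigpipe is hypothetical and not part of any library.

```python
import signal


def restore_default_sigpipe():
    """Restore the OS default SIGPIPE action; return True if it was applied."""
    if not hasattr(signal, "SIGPIPE"):
        # No SIGPIPE on this platform, so there is nothing to restore.
        return False
    # With SIG_DFL the process is terminated by the signal when it writes
    # to a closed pipe, instead of Python raising an exception.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
    return True
```

Call it once near the start of the script, from the main thread. Note that the Python signal documentation advises against this approach for programs that write to sockets, because the process would then also exit without warning whenever a connection is interrupted mid-write.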

Author: Thanasis Petsas

Updated on October 15, 2020

Comments

  • Thanasis Petsas over 3 years

    I am trying to print a formatted list of tuples to stdout. For this, I use the str.format method. Everything works fine, but when I pipe the output to the head command to see the first lines, an IOError occurs.

    Here is my code:

    # creating the data
    data = []
    for i in range(0, 1000):
      pid = 'pid%d' % i
      uid = 'uid%d' % i
      pname = 'pname%d' % i
      data.append( (pid, uid, pname) )
    
    # find the length of the longest string for each field
    pids, uids, pnames = zip(*data)
    max_pid = max(len(s) for s in pids)
    max_uid = max(len(s) for s in uids)
    max_pname = max(len(s) for s in pnames)
    
    # my template for the formatted strings
    template = "{0:%d}\t{1:%d}\t{2:%d}" % (max_pid, max_uid, max_pname)
    
    # print the formatted output to stdout
    for pid, uid, pname in data:
      print template.format(pid, uid, pname)
    

    And here is the error I get after running the command: python myscript.py | head

    Traceback (most recent call last):
      File "lala.py", line 16, in <module>
        print template.format(pid, uid, pname)
    IOError: [Errno 32] Broken pipe
    

    Can anyone help me on this?

    I tried to put print in a try-except block to handle the error, but after that there was another message in the console:

    close failed in file object destructor:
    sys.excepthook is missing
    lost sys.stderr
    

    I also tried to flush the data immediately with consecutive sys.stdout.write and sys.stdout.flush calls, but nothing happened.