Handle a blocking function call in Python

12,343

Solution 1

First of all - the use of signals should be avoided at all cost:

1) It may lead to a deadlock. SIGALRM may reach the process BEFORE the blocking syscall (imagine super-high load in the system!) and the syscall will not be interrupted. Deadlock.

2) Playing with signals may have some nasty non-local consequences. For example, syscalls in other threads may be interrupted which usually is not what you want. Normally syscalls are restarted when (not a deadly) signal is received. When you set up a signal handler it automatically turns off this behavior for the whole process, or thread group so to say. Check 'man siginterrupt' on that.

Believe me - I met two problems before and they are not fun at all.

In some cases the blocking can be avoided explicitely - I strongly recommend using select() and friends (check select module in Python) to handle blocking writes and reads. This will not solve blocking open() call, though.

For that I've tested this solution and it works well for named pipes. It opens in a non-blocking way, then turns it off and uses select() call to eventually timeout if nothing is available.

import sys, os, select, fcntl

f = os.open(sys.argv[1], os.O_RDONLY | os.O_NONBLOCK)

flags = fcntl.fcntl(f, fcntl.F_GETFL, 0)
fcntl.fcntl(f, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)

r, w, e = select.select([f], [], [], 2.0)

if r == [f]:
    print 'ready'
    print os.read(f, 100)
else:
    print 'unready'

os.close(f)

Test this with:

mkfifo /tmp/fifo
python <code_above.py> /tmp/fifo (1st terminal)
echo abcd > /tmp/fifo (2nd terminal)

With some additional effort select() call can be used as a main loop of the whole program, aggregating all events - you can use libev or libevent, or some Python wrappers around them.

When you can't explicitely force non-blocking behavior, say you just use an external library, then it's going to be much harder. Threads may do, but obviously it is not a state-of-the-art solution, usually being just wrong.

I'm afraid that in general you can't solve this in a robust way - it really depends on WHAT you block.

Solution 2

IIUC, each top_block has a stop method. So you actually can run the top_block in a thread, and issue a stop if the timeout has arrived. It would be better if the top_block's wait() also had a timeout, but alas, it doesn't.

In the main thread, you then need to wait for two cases: a) the top_block completes, and b) the timeout expires. Busy-waits are evil :-), so you should use the thread's join-with-timeout to wait for the thread. If the thread is still alive after the join, you need to stop the top_run.

Solution 3

You can set a signal alarm that will interrupt your call with a timeout:

http://docs.python.org/library/signal.html

signal.alarm(1) # 1 second

my_blocking_call()
signal.alarm(0)

You can also set a signal handler if you want to make sure it won't destroy your application:

def my_handler(signum, frame):
    pass

signal.signal(signal.SIGALRM, my_handler)

EDIT: What's wrong with this piece of code ? This should not abort your application:

import signal, time

def handler(signum, frame):
    print "Timed-out"

def foo():
    # Set the signal handler and a 5-second alarm
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(3)

    # This open() may hang indefinitely
    time.sleep(5)
    signal.alarm(0)          # Disable the alarm


foo()
print "hallo"

The thing is:

  1. The default handler for SIGALRM is to abort the application, if you set your handler then it should no longer stop the application.

  2. Receiving a signal usually interrupts system calls (then unblocks your application)

Solution 4

The easy part of your question relates to the signal handling. From the perspective of the Python runtime a signal which has been received while the interpreter was making a system call is presented to your Python code as an OSError exception with an errno attributed corresponding to errno.EINTR

So this probably works roughly as you intended:

    #!/usr/bin/env python
    import signal, os, errno, time

    def handler(signum, frame):
            # print 'Signal handler called with signal', signum
            #raise IOError("Couldn't open device!")
            print "timed out"
            time.sleep(3)


    def foo():
        # Set the signal handler and a 5-second alarm
        signal.signal(signal.SIGALRM, handler)

        try:
            signal.alarm(3)
            # This open() may hang indefinitely
            fd = os.open('/dev/ttys0', os.O_RDWR)
        except OSError, e:
            if e.errno != errno.EINTR:
                raise e
        signal.alarm(0)          # Disable the alarm

    foo()
    print "hallo"

Note I've moved the import of time out of the function definition as it seems to be poor form to hide imports in that way. It's not at all clear to me why you're sleeping in your signal handler and, in fact, it seems like a rather bad idea.

The key point I'm trying to make is that any (non-ignored) signal will interrupt your main line of Python code execution. Your handler will be invoked with arguments indicating which signal number triggered the execution (allowing for one Python function to be used for handling many different signals) and a frame object (which could be used for debugging or instrumentation of some sort).

Because the main flow through the code is interrupted it's necessary for you to wrap that code in some exception handling in order to regain control after such events have occurred. (Incidentally if you're writing code in C you'd have the same concern; you have to be prepared for any of your library functions with underlying system calls to return errors and handle -EINTR in the system errno by looping back to retry or branching to some alternative in your main line (such as proceeding to some other file, or without any file/input, etc).

As others have indicated in their responses to your question, basing your approach on SIGALARM is likely to be fraught with portability and reliability issues. Worse, some of these issues may be race conditions that you'll never encounter in your testing environment and may only occur under conditions that are extremely hard to reproduce. The ugly details tend to be in cases of re-entrancy --- what happens if signals are dispatched during execution of your signal handler?

I've used SIGALARM in some scripts and it hasn't been an issue for me, under Linux. The code I was working on was suitable to the task. It might be adequate for your needs.

Your primary question is difficult to answer without knowing more about how this Gnuradio code behaves, what sorts of objects you instantiate from it, and what sorts of objects they return.

Glancing at the docs to which you've linked, I see that they don't seem to offer any sort of "timeout" argument or setting that could be used to limit blocking behavior directly. In the table under "Controlling Flow Graphs" I see that they specifically say that .run() can execute indefinitely or until SIGINT is received. I also note that .start() can start threads in your application and, it seems, returns control to your Python code line while those are running. (That seems to depend on the nature of your flow graphs, which I don't understand sufficiently).

It sounds like you could create your flow graphs, .start() them, and then (after some time processing or sleeping in your main line of Python code) call the .lock() method on your controlling object (tb?). This, I'm guessing, puts the Python representation of the state ... the Python object ... into a quiescent mode to allow you to query the state or, as they say, reconfigure your flow graph. If you call .run() it will call .wait() after it calls .start(); and .wait() will apparently run until either all blocks "indicate they are done" or until you call the object's .stop() method.

So it sounds like you want to use .start() and neither .run() nor .wait(); then call .stop() after doing any other processing (including time.sleep()).

Perhaps something as simple as:

    tb = send_seq_2.top_block()
    tb.start()
    time.sleep(endtime - time.time())
    tb.stop()
    seq1_sent = True
    tb = send_seq_2.top_block()
    tb.start()
    seq2_sent = True

.. though I'm suspicious of my time.sleep() there. Perhaps you want to do something else where you query the tb object's state (perhaps entailing sleeping for smaller intervals, calling its .lock() method, and accessing attributes that I know nothing about and then calling its .unlock() before sleeping again.

Solution 5

if not seq1_sent:
        tb = send_seq_2.top_block()
        tb.Run(True)
        seq1_sent = True
        if time.time() < endtime:
            break

If the 'if time.time() < endtime:' then you will break out of the loop and the seq2_sent stuff will never be hit, maybe you mean 'time.time() > endtime' in that test?

Share:
12,343

Related videos on Youtube

wishi
Author by

wishi

Updated on May 07, 2022

Comments

  • wishi
    wishi about 2 years

    I'm working with the Gnuradio framework. I handle flowgraphs I generate to send/receive signals. These flowgraphs initialize and start, but they don't return the control flow to my application:

    I imported time

    while time.time() < endtime:
            # invoke GRC flowgraph for 1st sequence
            if not seq1_sent:
                tb = send_seq_2.top_block()
                tb.Run(True)
                seq1_sent = True
                if time.time() < endtime:
                    break
    
            # invoke GRC flowgraph for 2nd sequence
            if not seq2_sent:
                tb = send_seq_2.top_block()
                tb.Run(True)
                seq2_sent = True
                if time.time() < endtime:
                    break
    

    The problem is: only the first if statement invokes the flow-graph (that interacts with the hardware). I'm stuck in this. I could use a Thread, but I'm unexperienced how to timeout threads in Python. I doubt that this is possible, because it seems killing threads isn't within the APIs. This script only has to work on Linux...

    How do you handle blocking functions with Python properly - without killing the whole program. Another more concrete example for this problem is:

    import signal, os
    
    def handler(signum, frame):
            # print 'Signal handler called with signal', signum
            #raise IOError("Couldn't open device!")
            import time
            print "wait"
            time.sleep(3)
    
    
    def foo():
        # Set the signal handler and a 5-second alarm
        signal.signal(signal.SIGALRM, handler)
        signal.alarm(3)
    
        # This open() may hang indefinitely
        fd = os.open('/dev/ttys0', os.O_RDWR)
        signal.alarm(0)          # Disable the alarm
    
    
    foo()
    print "hallo"
    

    How do I still get print "hallo". ;)

    Thanks, Marius

  • mripard
    mripard over 13 years
    Every CPU intensive task in threads should be avoided though, because of the GIL.
  • Martin v. Löwis
    Martin v. Löwis over 13 years
    That doesn't meet his requirements. If the timeout has passed, join fails - but he actually would want to abort the thread, which isn't possible.
  • wishi
    wishi over 13 years
    hmh, it seems when I do that I lose the whole program instead of just ending the blocked calls.
  • Antoine Pelisse
    Antoine Pelisse over 13 years
    Have you tried my piece of code ? Does it abort your application ?
  • Tommy
    Tommy over 5 years
    did you mean 3-second alarm?