Keyboard Interrupts with Python's multiprocessing Pool
Solution 1
This is a Python bug. When waiting for a condition in threading.Condition.wait(), KeyboardInterrupt is never sent. Repro:
import threading
cond = threading.Condition(threading.Lock())
cond.acquire()
cond.wait(None)
print "done"
The KeyboardInterrupt exception won't be delivered until wait() returns, and it never returns, so the interrupt never happens. KeyboardInterrupt should almost certainly interrupt a condition wait.
Note that this doesn't happen if a timeout is specified; cond.wait(1) will receive the interrupt immediately. So, a workaround is to specify a timeout. To do that, replace
results = pool.map(slowly_square, range(40))
with
results = pool.map_async(slowly_square, range(40)).get(9999999)
or similar.
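A runnable sketch of this workaround (Python 3 syntax; the worker function, pool size, and sleep durations are illustrative, not from the original answer):

```python
import multiprocessing
import time

def slowly_square(i):
    time.sleep(0.1)
    return i * i

def go():
    pool = multiprocessing.Pool(4)
    try:
        # map_async(...).get(timeout) waits on a Condition WITH a timeout,
        # so Ctrl-C is delivered promptly instead of hanging forever.
        results = pool.map_async(slowly_square, range(8)).get(9999999)
    except KeyboardInterrupt:
        pool.terminate()
        pool.join()
        raise
    pool.close()
    pool.join()
    return results

if __name__ == "__main__":
    print(go())
```

The huge timeout is never expected to fire; it exists only to switch the internal wait into its interruptible, timed form.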
Solution 2
From what I have recently found, the best solution is to set up the worker processes to ignore SIGINT altogether, and confine all the cleanup code to the parent process. This fixes the problem for both idle and busy worker processes, and requires no error handling code in your child processes.
import signal
import multiprocessing
...
def init_worker():
    signal.signal(signal.SIGINT, signal.SIG_IGN)
...
def main():
    pool = multiprocessing.Pool(size, init_worker)
    try:
        ...
    except KeyboardInterrupt:
        pool.terminate()
        pool.join()
Explanation and full example code can be found at http://noswap.com/blog/python-multiprocessing-keyboardinterrupt/ and http://github.com/jreese/multiprocessing-keyboardinterrupt respectively.
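Filling in the skeleton above, a self-contained sketch of this approach (Python 3 syntax; the square worker, pool size, and timeout are placeholders for your own work function):

```python
import multiprocessing
import signal
import time

def init_worker():
    # Workers ignore SIGINT entirely; only the parent reacts to Ctrl-C.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def square(i):
    time.sleep(0.05)
    return i * i

def main():
    pool = multiprocessing.Pool(4, initializer=init_worker)
    try:
        # map_async + get keeps the parent's wait interruptible.
        results = pool.map_async(square, range(8)).get(60)
    except KeyboardInterrupt:
        pool.terminate()
        pool.join()
        raise
    pool.close()
    pool.join()
    return results

if __name__ == "__main__":
    print(main())
```

Because the children never see SIGINT, all cleanup lives in the parent's single except block, which is the point of this solution.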
Solution 3
For some reason, only exceptions that inherit from the base Exception class are handled normally (KeyboardInterrupt derives from BaseException instead). As a workaround, you may re-raise your KeyboardInterrupt as an Exception instance:
from multiprocessing import Pool
import time

class KeyboardInterruptError(Exception): pass

def f(x):
    try:
        time.sleep(x)
        return x
    except KeyboardInterrupt:
        raise KeyboardInterruptError()

def main():
    p = Pool(processes=4)
    try:
        print 'starting the pool map'
        print p.map(f, range(10))
        p.close()
        print 'pool map complete'
    except KeyboardInterrupt:
        print 'got ^C while pool mapping, terminating the pool'
        p.terminate()
        print 'pool is terminated'
    except Exception, e:
        print 'got exception: %r, terminating the pool' % (e,)
        p.terminate()
        print 'pool is terminated'
    finally:
        print 'joining pool processes'
        p.join()
        print 'join complete'
    print 'the end'

if __name__ == '__main__':
    main()
Normally you would get the following output:
starting the pool map
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
pool map complete
joining pool processes
join complete
the end
So if you hit ^C, you will get:

starting the pool map
got ^C while pool mapping, terminating the pool
pool is terminated
joining pool processes
join complete
the end
Solution 4
The voted answer does not tackle the core issue but a similar side effect.

Jesse Noller, the author of the multiprocessing library, explained how to correctly deal with CTRL+C when using multiprocessing.Pool in an old blog post.
import signal
from multiprocessing import Pool

def initializer():
    """Ignore CTRL+C in the worker process."""
    signal.signal(signal.SIGINT, signal.SIG_IGN)

pool = Pool(initializer=initializer)

try:
    pool.map(perform_download, downloads)
except KeyboardInterrupt:
    pool.terminate()
    pool.join()
Solution 5
Usually this simple structure works for Ctrl-C on Pool, as was stated in a few similar posts:

import signal

def signal_handle(_signal, frame):
    print "Stopping the Jobs."

signal.signal(signal.SIGINT, signal_handle)
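The handler above only prints a message; to actually stop the pool it also has to terminate the outstanding workers. A minimal Python 3 sketch (the work function, pool size, and cleanup steps are illustrative, not from the original answer):

```python
import multiprocessing
import signal
import sys
import time

def work(i):
    time.sleep(0.05)
    return i * i

def main():
    pool = multiprocessing.Pool(2)

    def signal_handle(_signal, frame):
        # Illustrative cleanup: kill outstanding jobs, then exit.
        print("Stopping the Jobs.")
        pool.terminate()
        pool.join()
        sys.exit(1)

    # Must be installed from the main thread of the parent process.
    signal.signal(signal.SIGINT, signal_handle)
    results = pool.map_async(work, range(6)).get(60)
    pool.close()
    pool.join()
    return results

if __name__ == "__main__":
    print(main())
```

Note that on a fork-based start method the children inherit this handler too, so in practice you may still want to combine it with the SIG_IGN initializer from the other answers.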
Updated on August 30, 2021

Comments
-
Fragsworth almost 3 years
How can I handle KeyboardInterrupt events with Python's multiprocessing Pools? Here is a simple example:

from multiprocessing import Pool
from time import sleep
import sys

def slowly_square(i):
    sleep(1)
    return i*i

def go():
    pool = Pool(8)
    try:
        results = pool.map(slowly_square, range(40))
    except KeyboardInterrupt:
        # **** THIS PART NEVER EXECUTES. ****
        pool.terminate()
        print "You cancelled the program!"
        sys.exit(1)
    print "\nFinally, here are the results: ", results

if __name__ == "__main__":
    go()
When running the code above, the KeyboardInterrupt gets raised when I press ^C, but the process simply hangs at that point and I have to kill it externally. I want to be able to press ^C at any time and cause all of the processes to exit gracefully.
-
Tiago Albineli Motta almost 7 years: I solved my problem using psutil, you can see the solution here: stackoverflow.com/questions/32160054/…
-
Fragsworth almost 15 years: I tried this, and it doesn't actually terminate the entire set of jobs. It terminates the currently-running jobs, but the script still assigns the remaining jobs in the pool.map call as if everything is normal.
-
Joseph Garvin over 14 years: Is this bug in the official Python tracker anywhere? I'm having trouble finding it but I'm probably just not using the best search terms.
-
Andrey Vlasovskikh about 14 years: It seems that this is not a complete solution. If a KeyboardInterrupt arrives while multiprocessing is performing its own IPC data exchange then the try..catch will not be activated (obviously).
-
Andrey Vlasovskikh about 14 years: This bug has been filed as Issue 8296: bugs.python.org/issue8296
-
Alexander Ljungberg over 13 years: Here's a hack which fixes pool.imap() in the same manner, making Ctrl-C possible when iterating over imap. Catch the exception and call pool.terminate() and your program will exit. gist.github.com/626518
-
Ryan C. Thompson over 12 years: This doesn't quite fix things. Sometimes I get the expected behavior when I press Control+C, other times not. I'm not sure why, but it looks like maybe the KeyboardInterrupt is received by one of the processes at random, and I only get the correct behavior if the parent process is the one that catches it.
-
bboe over 12 years: Hi John. Your solution doesn't accomplish the same thing as my, yes unfortunately complicated, solution. It hides behind the time.sleep(10) in the main process. If you were to remove that sleep, or if you wait until the process attempts to join on the pool, which you have to do in order to guarantee the jobs are complete, then you still suffer from the same problem: the main process doesn't receive the KeyboardInterrupt while it is waiting on the pool join operation.
-
jreese over 12 years: In the case where I used this code in production, the time.sleep() was part of a loop that would check the status of each child process, and then restart certain processes on a delay if necessary. Rather than a join() that would wait on all processes to complete, it would check on them individually, ensuring that the master process stayed responsive.
-
bboe over 12 years: So it was more a busy wait (maybe with small sleeps between checks) that polled for process completion via another method rather than join? If that's the case, perhaps it would be better to include this code in your blog post, since you can then guarantee that all the workers have completed before attempting to join.
-
MarioVilas about 11 years: This would have to be done on each of the worker processes as well, and may still fail if the KeyboardInterrupt is raised while the multiprocessing library is initializing.
-
Walter about 11 years: The trick with .get(999999) slows everything down somehow. See below for the link to bryceboe.com with a solution that works.
-
Walter about 11 years: Works like a charm. It's a clean solution and not some kind of hack (/me thinks). BTW, the trick with .get(99999) as proposed by others hurts performance badly.
-
krethika over 10 years: This is OK, but you may lose track of errors that occur. Returning the error with a stacktrace might work so the parent process can tell that an error occurred, but it still doesn't exit immediately when the error occurs.
-
Paul Price about 10 years: I've not noticed any performance penalty from using a timeout, though I have been using 9999 instead of 999999. The exception is when an exception that doesn't inherit from the Exception class is raised: then you have to wait until the timeout is hit. The solution to that is to catch all exceptions (see my solution).
-
Cerin about 10 years: This doesn't work. Only the children are sent the signal. The parent never receives it, so pool.terminate() never gets executed. Having the children ignore the signal accomplishes nothing. @Glenn's answer solves the problem.
-
Andy MacKinlay almost 10 years: My version of this is at gist.github.com/admackin/003dd646e5fadee8b8d6 ; it doesn't call .join() except on interrupt - it simply manually checks the result of .apply_async() using AsyncResult.ready() to see if it is ready, meaning we've cleanly finished.
-
Paul Price over 9 years: I've not noticed any performance penalty, but in my case the function is fairly long-lived (hundreds of seconds).
-
Ant6n about 9 years: I've tried to use this workaround - and the keyboard interrupt does give me back control of the REPL. But the other spawned processes in the background are not properly terminated; they seem to randomly re-run somehow.
-
trcarden over 8 years: @Cerin I was trying to confirm that this solution breaks down somewhere and I found this: win.tue.nl/~aeb/linux/lk/lk-10.html#ss10.2. I believe that if the signal is sent to the process group then the signal will be sent to the leader as well as the children. If so, then ignoring the signal in all but the leader would make for a pretty nice solution.
-
Bernhard over 8 years: You could replace raise KeyboardInterruptError with a return. You just have to make sure that the child process ends as soon as the KeyboardInterrupt is received. The return value seems to be ignored; in main the KeyboardInterrupt is still received.
-
gaborous over 7 years: It works for me; you just have to make sure to put the signal-ignoring code only in the children's initialization...
-
szx almost 7 years: This doesn't work for me with Python 3.6.1 on Windows. I get tons of stack traces and other garbage when I do Ctrl-C, i.e. the same as without such a workaround. In fact none of the solutions I've tried from this thread seem to work...
-
benathon almost 7 years: I've found that ProcessPoolExecutor also has the same issue. The only fix I was able to find was to call os.setpgrp() from inside the future.
-
noxdafox almost 7 years: Sure, the only difference is that ProcessPoolExecutor does not support initializer functions. On Unix, you could leverage the fork strategy by disabling the sighandler on the main process before creating the Pool and re-enabling it afterwards. In pebble, I silence SIGINT on the child processes by default. I am not aware of the reason they don't do the same with the Python Pools. In the end, the user could re-set the SIGINT handler in case he/she wants to hurt himself/herself.
-
Paul Price over 6 years: This solution seems to prevent Ctrl-C from interrupting the main process as well.
-
noxdafox over 6 years: I just tested on Python 3.5 and it works. What version of Python are you using? What OS?
-
Code Doggo about 6 years: I just figured this out as well! I honestly think this is the best solution for a problem like this. The accepted solution forces map_async onto the user, which I don't particularly like. In many situations, like mine, the main thread needs to wait for the individual processes to finish. This is one of the reasons why map exists!
-
Code Doggo about 6 years: This actually isn't the case anymore, at least from my eyes and experience. If you catch the keyboard exception in the individual child processes and catch it once more in the main process, then you can continue using map and all is good. @Linux Cli Aik provided a solution below that produces this behavior. Using map_async is not always desired if the main thread depends on the results from the child processes.
-
Akos Lukacs almost 5 years: Jehejj, it's still not fixed in 2019. Like doing IO in parallel is a novel idea :/
-
eMTy almost 4 years: Glorious and complete example.
-
Thomas almost 4 years: It's tracked here now: bugs.python.org/issue22393
-
michaelvdnest almost 4 years: Excellent example.
-
Raf almost 4 years: Hi from 2020 ... this works nicely for imap_unordered as well.
-
amball over 3 years: Thank you. I'm trying to figure out how this generalizes to multiple arguments. In particular, why do you pass [value] rather than value in jobs[value] = pool.apply_async(input_function, [value])?
-
Bruce Lamond almost 3 years: Confirmed this works as expected on Python 3.7.7 on Windows. Thanks for posting!
-
2080 over 2 years: Would it be possible to have interrupted processes return an intermediate result instead?