When should we call multiprocessing.Pool.join?
No, you don't have to call it, but it's probably a good idea if you aren't going to use the pool anymore.
Reasons for calling pool.close or pool.join are well said by Tim Peters in this SO post:
As to Pool.close(), you should call that when - and only when - you're never going to submit more work to the Pool instance. So Pool.close() is typically called when the parallelizable part of your main program is finished. Then the worker processes will terminate when all work already assigned has completed.
It's also excellent practice to call Pool.join() to wait for the worker processes to terminate. Among other reasons, there's often no good way to report exceptions in parallelized code (exceptions occur in a context only vaguely related to what your main program is doing), and Pool.join() provides a synchronization point that can report some exceptions that occurred in worker processes that you'd otherwise never see.
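The close-then-join pattern described in the quote can be sketched roughly as follows. This is only a minimal illustration: run_jobs is a hypothetical helper name, and math.factorial stands in for whatever worker function you actually use.

```python
import math
from multiprocessing import Pool


def run_jobs(values, processes=2):
    """Submit all work up front, then close() and join() the pool.

    run_jobs is a hypothetical helper; math.factorial is only a
    stand-in for a real worker function.
    """
    pool = Pool(processes=processes)
    # Hand all the work to the pool without blocking.
    async_result = pool.map_async(math.factorial, values)
    # No more work will be submitted to this pool.
    pool.close()
    # Wait for the worker processes to finish and exit.
    pool.join()
    # Every task has completed by now, so get() returns without waiting.
    return async_result.get()


if __name__ == "__main__":
    print(run_jobs([3, 4, 5]))  # [6, 24, 120]
```

Any work that does not depend on the pool's results could run between close() and join(), since close() itself does not block.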
hch
Updated on July 27, 2022
Comments
-
hch over 1 year
I am using multiprocessing.Pool.imap_unordered as follows:
    from multiprocessing import Pool

    pool = Pool()
    for mapped_result in pool.imap_unordered(mapping_func, args_iter):
        # do some additional processing on mapped_result
        ...
Do I need to call pool.close or pool.join after the for loop?
-
RSHAP over 6 years
Is it better to call one before the other?
-
Bamcclur over 6 years
It seems that people like to call pool.close() first and pool.join() second. This allows you to add work between the pool.close() and pool.join() calls that doesn't need to wait for the pool to finish executing.
-
Bogd over 6 years
Just to add to @Bamcclur's comment: it's not just a good idea to call pool.close() first, it's actually mandatory. From the docs: One must call close() or terminate() before using join().
-
agdhruv over 4 years
@Bogd But why is it mandatory? Could you answer this question, please?
-
Whip about 4 years
An answer to agdhruv's question would be awesome!
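On the "why is it mandatory" question, a small experiment may help. As far as I understand CPython's implementation, worker processes keep waiting for new tasks until the pool is closed or terminated, so a join() on a pool that is still running could never return. Instead of hanging, join() raises an error. The sketch below assumes that behaviour: join_before_close is a hypothetical helper name, and the exception type is an assumption that depends on the Python version (ValueError in recent releases, AssertionError in older ones, so both are caught).

```python
from multiprocessing import Pool


def join_before_close():
    """Return the error raised by join() on a pool that is still running.

    Hypothetical helper for illustration only.
    """
    pool = Pool(processes=1)
    try:
        # Invalid: the pool still accepts work, so its workers never exit.
        pool.join()
    except (ValueError, AssertionError) as exc:
        return exc
    finally:
        # The required order: close() (or terminate()) first, then join().
        pool.close()
        pool.join()
    return None


if __name__ == "__main__":
    print(repr(join_before_close()))
```

After close(), the workers exit once the already-submitted tasks are finished, which gives join() something finite to wait for.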