Determining when a ThreadPool has finished processing a queue
- Call queue.task_done() after each task is processed.
- Then you can call queue.join() to block the main thread until all tasks have been completed.
- To terminate the worker threads, put a sentinel (e.g. None) in the queue, and have foobar_task break out of the while-loop when it receives the sentinel.
- I think this is easier to implement with threading.Threads than with a ThreadPool.

import random
import time
import threading
import logging
import Queue

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

sentinel = None
queue = Queue.Queue()
num_threads = 5

def foobar_task(queue):
    while True:
        n = queue.get()
        logger.info('task called: {n}'.format(n=n))
        if n is sentinel:
            break
        n = random.random()
        if n > .25:
            logger.info("task appended to queue")
            queue.put(n)
        queue.task_done()

# set up initial queue
for i in range(num_threads):
    queue.put(i)

threads = [threading.Thread(target=foobar_task, args=(queue,))
           for n in range(num_threads)]
for t in threads:
    t.start()

queue.join()
for i in range(num_threads):
    queue.put(sentinel)

for t in threads:
    t.join()
logger.info("threads are closed")
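
For anyone on Python 3 (where the Queue module is named queue), here is a rough sketch of the same task_done()/join()/sentinel pattern. It is not the code above verbatim: the names worker, run and SENTINEL are made up for illustration, and the random re-queueing is replaced by an assumed deterministic workload (each positive n queues one follow-up task n - 1) so the result is predictable.

```python
import queue
import threading

SENTINEL = None  # hypothetical marker that tells a worker to exit


def worker(tasks, results, lock):
    """Process items from the queue until the sentinel arrives."""
    while True:
        n = tasks.get()
        if n is SENTINEL:
            tasks.task_done()
            break
        # Assumed workload: each positive n queues one follow-up task n - 1.
        if n > 0:
            tasks.put(n - 1)
        with lock:
            results.append(n)
        # Mark the item done only after its follow-up has been queued,
        # so join() cannot return while new work is still on the way.
        tasks.task_done()


def run(initial, num_threads=3):
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()
    for n in initial:
        tasks.put(n)
    threads = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    tasks.join()             # blocks until every queued item is task_done()
    for _ in threads:
        tasks.put(SENTINEL)  # one sentinel per worker thread
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run([2, 1, 0]))
```

The ordering inside worker matters: because a follow-up task is put on the queue before task_done() is called for its parent, the queue's count of unfinished items never drops to zero while more work is still going to appear.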

del
Updated on June 04, 2022

Comments

del (almost 2 years):
I am trying to implement a thread pool that processes a task queue using ThreadPool and Queue. It begins with an initial queue of tasks, and then each of the tasks may also push additional tasks onto the task queue. The problem is I don't know how to block until the queue is empty and the thread pool has finished processing, but still check the queue and submit any new tasks to the thread pool that were pushed onto the queue. I can't simply call ThreadPool.join(), because I need to keep the pool open for new tasks.

For example:

from multiprocessing.pool import ThreadPool
from Queue import Queue
from random import random
import time
import threading

queue = Queue()
pool = ThreadPool()
stdout_lock = threading.Lock()

def foobar_task():
    with stdout_lock:
        print "task called"
    if random() > .25:
        with stdout_lock:
            print "task appended to queue"
        queue.put(foobar_task)
    time.sleep(1)

# set up initial queue
for n in range(5):
    queue.put(foobar_task)

# run the thread pool
while not queue.empty():
    task = queue.get()
    pool.apply_async(task)

with stdout_lock:
    print "pool is closed"
pool.close()
pool.join()

This outputs:

pool is closed
task called
task appended to queue
task called
task appended to queue
task called
task appended to queue
task called
task appended to queue
task called
task appended to queue

This exits the while loop before the foobar_tasks have appended to the queue, so the appended tasks are never submitted to the thread pool. I can't find any way to determine if the thread pool still has any active worker threads. I tried the following:

while not queue.empty() or any(worker.is_alive() for worker in pool._pool):
    if not queue.empty():
        task = queue.get()
        pool.apply_async(task)
    else:
        with stdout_lock:
            print "waiting for worker threads to complete..."
        time.sleep(1)

But it seems that worker.is_alive() always returns true, so this goes into an infinite loop.

Is there a better way to do this?
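
A small sketch of why the is_alive() check never turns false, assuming Python 3 and CPython's ThreadPool: the pool's worker threads are long-lived and sit waiting for the next task, so they stay alive while idle until the pool is closed or terminated. The helper name idle_workers_alive is made up for illustration, and pool._pool is a private attribute (the same one used in the loop above), so it may change between versions.

```python
from multiprocessing.pool import ThreadPool


def idle_workers_alive(num_workers=2):
    """Return True if every pool worker is still alive while the pool is idle."""
    pool = ThreadPool(num_workers)
    try:
        # apply() blocks until the task has finished, so the pool is idle afterwards.
        pool.apply(lambda: None)
        # _pool is a private implementation detail, not a documented API.
        return all(w.is_alive() for w in pool._pool)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(idle_workers_alive())
```

So worker liveness says whether the pool is still open, not whether it is busy, which is why it cannot be used to detect that all tasks have finished.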