Determining when a ThreadPool has finished processing a queue

  1. Call queue.task_done() after each task is processed.
  2. Then call queue.join() to block the main thread until all tasks have been completed.
  3. To terminate the worker threads, put one sentinel (e.g. None) per worker in the queue, and have foobar_task break out of the while-loop when it receives the sentinel.
  4. I think this is easier to implement with threading.Thread objects than with a ThreadPool.

import random
import time
import threading
import logging
import Queue  # Python 2 module name; in Python 3 the module is called "queue"

logger=logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

sentinel=None
queue = Queue.Queue()
num_threads = 5

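def foobar_task(queue):
    while True:
        n = queue.get()
        if n is sentinel:
            # the sentinel is not a task; mark it done and exit this worker
            queue.task_done()
            break
        logger.info('task called: {n}'.format(n=n))
        n = random.random()
        if n > .25:
            # a task may push a follow-up task onto the same queue
            logger.info("task appended to queue")
            queue.put(n)
        # mark the task finished only after any follow-up has been queued,
        # so queue.join() cannot return while work is still pending
        queue.task_done()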

# set up initial queue
for i in range(num_threads):
    queue.put(i)

threads=[threading.Thread(target=foobar_task,args=(queue,))
         for n in range(num_threads)]
for t in threads:
    t.start()

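# block until every task, including follow-ups, has been marked done
queue.join()
# all work is finished; send one sentinel per worker so each thread exits
for i in range(num_threads):
    queue.put(sentinel)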

for t in threads:
    t.join()
logger.info("threads are closed")
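
For readers who would rather keep a ThreadPool, as in the question, one possible approach is to count outstanding tasks yourself, much as queue.join() does with task_done(). The following is only a sketch under stated assumptions: it is written for Python 3, and TaskTracker, submit, wait and run_demo are hypothetical names introduced for illustration, not part of multiprocessing.pool.

```python
import threading
from multiprocessing.pool import ThreadPool


class TaskTracker:
    """Hypothetical helper: counts tasks submitted but not yet finished."""

    def __init__(self, pool):
        self._pool = pool
        self._pending = 0
        self._cond = threading.Condition()

    def submit(self, func, *args):
        # Count the task before handing it to the pool, so wait() can never
        # see zero while work is still on its way.
        with self._cond:
            self._pending += 1
        self._pool.apply_async(self._run, (func, args))

    def _run(self, func, args):
        try:
            func(*args)
        finally:
            # Runs even if func raised, so the count cannot get stuck.
            with self._cond:
                self._pending -= 1
                if self._pending == 0:
                    self._cond.notify_all()

    def wait(self):
        # Block until every submitted task, including follow-ups, is done.
        with self._cond:
            while self._pending:
                self._cond.wait()


def run_demo(depth=3, initial=5):
    pool = ThreadPool(4)
    tracker = TaskTracker(pool)
    done = []
    done_lock = threading.Lock()

    def task(level):
        with done_lock:
            done.append(level)
        if level < depth:
            # The follow-up is counted before this task is counted as finished.
            tracker.submit(task, level + 1)

    for _ in range(initial):
        tracker.submit(task, 0)
    tracker.wait()   # returns only after the follow-up chains have ended
    pool.close()
    pool.join()
    return len(done)
```

Because a task submits its follow-up before its own count is decremented, the pending count cannot reach zero while new work is still being produced, which is the same ordering the worker above relies on when it calls queue.put() before queue.task_done().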

del
Author by

del

Updated on June 04, 2022

Comments

  • del
    del almost 2 years

    I am trying to implement a thread pool that processes a task queue using ThreadPool and Queue. It begins with an initial queue of tasks, and then each of the tasks may also push additional tasks onto the task queue. The problem is I don't know how to block until the queue is empty and the thread pool has finished processing, but still check the queue and submit any new tasks to the thread pool that were pushed onto the queue. I can't simply call ThreadPool.join(), because I need to keep the pool open for new tasks.

    For example:

    from multiprocessing.pool import ThreadPool
    from Queue import Queue
    from random import random
    import time
    import threading
    
    queue = Queue()
    pool = ThreadPool()
    stdout_lock = threading.Lock()
    
    def foobar_task():
        with stdout_lock: print "task called" 
        if random() > .25:
            with stdout_lock: print "task appended to queue"
            queue.put(foobar_task)
        time.sleep(1)
    
    # set up initial queue
    for n in range(5):
        queue.put(foobar_task)
    
    # run the thread pool
    while not queue.empty():
        task = queue.get() 
        pool.apply_async(task)
    
    with stdout_lock: print "pool is closed"
    pool.close()
    pool.join()
    

    This outputs:

    pool is closed
    task called
    task appended to queue
    task called
    task appended to queue
    task called
    task appended to queue
    task called
    task appended to queue
    task called
    task appended to queue
    

    This exits the while loop before the foobar_tasks have appended to the queue, so the appended tasks are never submitted to the thread pool. I can't find any way to determine if the thread pool still has any active worker threads. I tried the following:

    while not queue.empty() or any(worker.is_alive() for worker in pool._pool):
        if not queue.empty():
            task = queue.get() 
            pool.apply_async(task)
        else:   
            with stdout_lock: print "waiting for worker threads to complete..."
            time.sleep(1)
    

    But worker.is_alive() seems to always return True, so this goes into an infinite loop.
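
A likely explanation is that ThreadPool workers do not exit when they run out of work; they stay alive waiting for the next task until the pool is closed or terminated, so is_alive() says nothing about whether a task is currently running. The sketch below, written for Python 3, illustrates this. Note that pool._pool is a private attribute (used here only because the snippet above uses it), and idle_workers_alive is a hypothetical name chosen for illustration.

```python
import time
from multiprocessing.pool import ThreadPool


def idle_workers_alive(num_workers=3):
    pool = ThreadPool(num_workers)
    try:
        # Run one short task and wait for it, so the pool has no work left.
        pool.apply_async(time.sleep, (0.01,)).get()
        # pool._pool is private and may change between Python versions.
        return [worker.is_alive() for worker in pool._pool]
    finally:
        pool.close()
        pool.join()
```

Every entry in the returned list is expected to be True even though no task is running, which is why polling is_alive() cannot serve as a "pool is busy" test.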

    Is there a better way to do this?