Filling a queue and managing multiprocessing in Python

暗喜 2020-11-28 10:15

I'm having this problem in Python:

  • I have a queue of URLs that I need to check from time to time
  • if the queue fills up, I need to process each item in it
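A minimal sketch of that batching pattern, assuming a hypothetical check_url function as a stand-in for the real check:

```python
import multiprocessing


def check_url(url):
    # hypothetical stand-in: a real check would make an HTTP request
    return url.startswith("http")


def drain_and_check(url_queue, count):
    # once `count` URLs have accumulated, pull them all off and check each one;
    # queue.get() blocks, so this also waits for any still-in-flight puts
    return [check_url(url_queue.get()) for _ in range(count)]
```

For example, after putting "http://a", "http://b" and "ftp://c" on the queue, drain_and_check(q, 3) returns [True, True, False].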
2 Answers
  • 2020-11-28 10:26

    Added some code (submitting "None" to the queue) to shut down the worker processes cleanly, and added code to close and join the_queue and the_pool:

    import multiprocessing
    import os
    import time
    
    NUM_PROCESSES = 20
    NUM_QUEUE_ITEMS = 20  # so really 40, because hello and world are processed separately
    
    
    def worker_main(queue):
        print(os.getpid(), "working")
        while True:
            item = queue.get(block=True)  # block until an item is available in the queue
            if item is None:
                break
    
            print(os.getpid(), "got", item)
            time.sleep(1) # simulate a "long" operation
    
    
    def main():
        the_queue = multiprocessing.Queue()
        the_pool = multiprocessing.Pool(NUM_PROCESSES, worker_main, (the_queue,))

        for i in range(NUM_QUEUE_ITEMS):
            the_queue.put("hello")
            the_queue.put("world")
        
        for i in range(NUM_PROCESSES):
            the_queue.put(None)
    
        # prevent adding anything more to the queue and wait for queue to empty
        the_queue.close()
        the_queue.join_thread()
    
        # prevent adding anything more to the process pool and wait for all processes to finish
        the_pool.close()
        the_pool.join()
    
    if __name__ == '__main__':
        main()
    
  • 2020-11-28 10:29

    You can use the blocking behaviour of the queue to spawn multiple processes at startup (using multiprocessing.Pool) and let them sleep until some data is available on the queue to process. If you're not familiar with that, you can "play" with this simple program:

    import multiprocessing
    import os
    import time
    
    the_queue = multiprocessing.Queue()
    
    
    def worker_main(queue):
        print(os.getpid(), "working")
        while True:
            item = queue.get(True)  # blocking get: wait until an item is available
            print(os.getpid(), "got", item)
            time.sleep(1) # simulate a "long" operation

    the_pool = multiprocessing.Pool(3, worker_main,(the_queue,))
    #                           don't forget the comma here  ^
    
    for i in range(5):
        the_queue.put("hello")
        the_queue.put("world")
    
    
    time.sleep(10)
    

    Originally tested with Python 2.7.3 on Linux; the listing above uses Python 3 print syntax.

    This will spawn 3 processes (in addition to the parent process). Each child executes the worker_main function: a simple loop that gets a new item from the queue on each iteration. Workers block if nothing is ready to process.

    At startup, all 3 processes sleep until the queue is fed with some data. When data becomes available, one of the waiting workers gets the item and starts to process it. After that, it tries to get another item from the queue, waiting again if nothing is available.
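As a variation on the sentinel ("None") approach from the first answer, workers can instead exit once the queue has been idle for a while, using the timeout argument of Queue.get. This is a sketch under the assumption that a bounded idle period means no more work is coming; process_all and worker_main are illustrative names, not from the original answers:

```python
import multiprocessing
import queue  # only needed for the queue.Empty exception


def worker_main(task_queue, result_queue):
    while True:
        try:
            # give up after 1 second of inactivity instead of waiting for a sentinel
            item = task_queue.get(timeout=1)
        except queue.Empty:
            break
        result_queue.put(item.upper())  # stand-in for a "long" operation


def process_all(items, num_workers=3):
    task_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()
    for item in items:
        task_queue.put(item)
    workers = [
        multiprocessing.Process(target=worker_main, args=(task_queue, result_queue))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    # collect exactly one result per submitted item before joining,
    # so the workers' queue buffers are drained and join() cannot deadlock
    results = [result_queue.get() for _ in items]
    for w in workers:
        w.join()
    return results
```

The timeout trades a small shutdown delay for not having to know the worker count when enqueuing sentinels; the sentinel approach remains the more deterministic of the two.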
