python Pool with worker Processes

后端 未结 3 2029
北恋
北恋 2020-12-04 12:33

I am trying to use a worker Pool in python using Process objects. Each worker (a Process) does some initialization (takes a non-trivial amount of time), gets passed a serie

相关标签:
3条回答
  • 2020-12-04 13:02

    Since python 3.3 you can use starmap, also for using multiple arguments AND getting back the results in a very simplistic syntax:

    import multiprocessing
    
    nb_cores = multiprocessing.cpu_count()
    
    def caps(nb, letter):
        print('Exec nb:', nb)
        return letter.upper()
    
    if __name__ == '__main__':
    
        multiprocessing.freeze_support() # for Windows, also requires to be in the statement: if __name__ == '__main__'
    
        input_data = ['a','b','c','d','e','f','g','h']
        input_order = [1,2,3,4,5,6,7,8,9]
    
        with multiprocessing.Pool(processes=nb_cores) as pool: # auto closing workers
            results = pool.starmap(caps, zip(input_order, input_data))
    
        print(results)
    
    0 讨论(0)
  • 2020-12-04 13:08

    I would suggest that you use a Queue for this.

    class Worker(Process):
        def __init__(self, queue):
            super(Worker, self).__init__()
            self.queue = queue
    
        def run(self):
            print('Worker started')
            # do some initialization here
    
            print('Computing things!')
            for data in iter(self.queue.get, None):
                # Use data
    

    Now you can start a pile of these, all getting work from a single queue

    request_queue = Queue()
    for i in range(4):
        Worker(request_queue).start()
    for data in the_real_source:
        request_queue.put(data)
    # Sentinel objects to allow clean shutdown: 1 per worker.
    for i in range(4):
        request_queue.put(None) 
    

    That kind of thing should allow you to amortize the expensive startup cost across multiple workers.

    0 讨论(0)
  • 2020-12-04 13:08

    initializer expects an arbitrary callable that does initilization e.g., it can set some globals, not a Process subclass; map accepts an arbitrary iterable:

    #!/usr/bin/env python
    import multiprocessing as mp
    
    def init(val):
        print('do some initialization here')
    
    def compute(data):
        print('Computing things!')
        return data * data
    
    def produce_data():
        yield -100
        for i in range(10):
            yield i
        yield 100
    
    if __name__=="__main__":
      p = mp.Pool(initializer=init, initargs=('arg',))
      print(p.map(compute, produce_data()))
    
    0 讨论(0)
提交回复
热议问题