Using Python's multiprocessing.pool.map to manipulate the same integer

慢半拍i 2021-02-06 12:46

Problem

I'm using Python's multiprocessing module to execute functions asynchronously. What I want to do is be able to track the overall progress of the map call across all the worker processes.

1 Answer
  •  梦毁少年i
    2021-02-06 12:54

    You could use a shared Value:

    import multiprocessing as mp
    
    def add_print(num):
        """
        Increment the shared counter; see
        https://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing
        """
        # Hold the lock for both the increment and the read, so no
        # increments are lost and the printed progress count is consistent.
        with lock:
            total.value += 1
            print(total.value)
    
    def setup(t, l):
        # Runs once in each worker process: stash the shared Value and
        # Lock in module globals so add_print can reach them.
        global total, lock
        total = t
        lock = l
    
    if __name__ == "__main__":
        total = mp.Value('i', 0)   # shared integer counter, starts at 0
        lock = mp.Lock()
        nums = range(20)
        pool = mp.Pool(initializer=setup, initargs=[total, lock])
        pool.map(add_print, nums)
        pool.close()
        pool.join()
    

    The Pool's initializer runs setup once in each worker subprocess. setup binds total and lock as globals in that worker, so add_print can access them when the worker executes it.
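    If you prefer not to pass a separate Lock, a multiprocessing.Value created with the default lock=True already carries its own recursive lock, reachable via get_lock(). Here is a minimal sketch of the same pattern using that built-in lock (an illustration, not part of the answer above):

    import multiprocessing as mp
    
    def add_print(num):
        # get_lock() returns the RLock bundled with the shared Value,
        # so no separate mp.Lock() needs to be passed to the workers.
        with total.get_lock():
            total.value += 1
            print(total.value)
    
    def setup(t):
        global total
        total = t
    
    if __name__ == "__main__":
        total = mp.Value('i', 0)
        with mp.Pool(initializer=setup, initargs=[total]) as pool:
            pool.map(add_print, range(20))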

    Note that the number of processes should not exceed the number of CPUs your machine has; if it does, the excess subprocesses just wait for a CPU to become available. So don't use processes=20 unless you have 20 or more CPUs. If you don't supply a processes argument, multiprocessing detects the number of CPUs available and spawns a pool with that many workers for you. The number of tasks (e.g. the length of nums) usually greatly exceeds the number of CPUs. That's fine; the tasks are queued, and each one is processed by a worker as soon as that worker becomes available.
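    As a small illustration of that queueing (a sketch, using an arbitrary builtin as the task), you can size the pool explicitly with mp.cpu_count() and still map 20 inputs over it:

    import multiprocessing as mp
    
    if __name__ == "__main__":
        n_workers = mp.cpu_count()   # one worker per available CPU
        print(f"{n_workers} workers will share 20 queued tasks")
        with mp.Pool(processes=n_workers) as pool:
            # The 20 inputs are chunked and queued; each worker pulls the
            # next chunk as soon as it finishes its current one.
            results = pool.map(abs, range(-10, 10))
        print(results)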
