Python multiprocessing is taking much longer than single processing

礼貌的吻别 2021-01-04 07:59

I am performing some large computations on 3 different numpy 2D arrays sequentially. The arrays are huge, 25000x25000 each. Each computation takes significant time so I deci

2 Answers
  • 2021-01-04 08:11

    Here is an example using np.memmap and Pool. You can set the number of tasks (nop) and the number of workers (now) independently. With Pool you don't control the task queue yourself; if you need that control, use multiprocessing.Queue:

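    from multiprocessing import Pool

    import numpy as np

    def mysum(array_file_name, col1, col2, shape):
        # Each worker opens the file itself; only the name and bounds are pickled.
        # dtype is given explicitly: np.memmap defaults to uint8, which would
        # truncate the random floats in [0, 1) to 0.
        a = np.memmap(array_file_name, dtype=np.float64, shape=shape, mode='r+')
        a[:, col1:col2] = np.random.random((shape[0], col2-col1))
        ans = a[:, col1:col2].sum()
        a.flush()  # write the changes back to disk
        del a
        return ans

    if __name__ == '__main__':
        nop = 1000 # number of tasks (column chunks)
        now = 3 # number of workers
        p = Pool(now)
        array_file_name = 'test.array'
        shape = (25000, 25000)  # size from the question; about 5 GB as float64
        a = np.memmap(array_file_name, dtype=np.float64, shape=shape, mode='w+')
        del a
        # Integer division: slice bounds must be ints in Python 3.
        cols = [[shape[1]*i//nop, shape[1]*(i+1)//nop] for i in range(nop)]
        results = []
        for c1, c2 in cols:
            r = p.apply_async(mysum, args=(array_file_name, c1, c2, shape))
            results.append(r)
        p.close()
        p.join()

        final_result = sum([r.get() for r in results])
        print(final_result)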
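
    If you do want to manage the queue yourself, a minimal sketch with multiprocessing.Queue could look like the following. The helper names (chunk_bounds, column_sum, worker, queued_sum) and the small demo array are illustrative assumptions, not part of the example above; the workers only read the memmap here.

```python
import os
import tempfile
from multiprocessing import Process, Queue

import numpy as np


def chunk_bounds(n_cols, n_tasks):
    """Split range(n_cols) into n_tasks contiguous (start, stop) pairs."""
    return [(n_cols * i // n_tasks, n_cols * (i + 1) // n_tasks)
            for i in range(n_tasks)]


def column_sum(path, shape, start, stop):
    """Open the memmap read-only and sum columns start..stop-1."""
    a = np.memmap(path, dtype=np.float64, mode='r', shape=shape)
    return float(a[:, start:stop].sum())


def worker(path, shape, tasks, results):
    """Pull (start, stop) tasks until the None sentinel arrives."""
    for start, stop in iter(tasks.get, None):
        results.put(column_sum(path, shape, start, stop))


def queued_sum(path, shape, n_tasks=8, n_workers=3):
    """Sum the whole array using an explicit task queue and result queue."""
    tasks, results = Queue(), Queue()
    procs = [Process(target=worker, args=(path, shape, tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    bounds = chunk_bounds(shape[1], n_tasks)
    for b in bounds:
        tasks.put(b)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker so each one exits
    total = sum(results.get() for _ in bounds)
    for p in procs:
        p.join()
    return total


if __name__ == '__main__':
    # Small demo array (an assumption, chosen so the sketch runs quickly).
    shape = (200, 100)
    path = os.path.join(tempfile.mkdtemp(), 'demo.array')
    a = np.memmap(path, dtype=np.float64, mode='w+', shape=shape)
    a[:] = 1.0
    a.flush()
    del a
    print(queued_sum(path, shape))
```

    Each worker keeps pulling column ranges until it sees a None sentinel, so you decide what goes on the queue and in what order, which Pool does not let you do.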
    

    You can often get better performance with shared-memory parallel processing, when it is possible. See this related question:

    • Shared-memory objects in python multiprocessing
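
    As a rough illustration of the shared-memory idea, here is a minimal sketch using multiprocessing.shared_memory (Python 3.8+). This is one possible approach and not necessarily the one described in the linked question; the names attach, row_sum and shared_sum, and the demo array size, are assumptions made for the example.

```python
from multiprocessing import Pool, shared_memory

import numpy as np


def attach(name, shape, dtype):
    """Attach to an existing block and view it as an ndarray (no copy)."""
    shm = shared_memory.SharedMemory(name=name)
    return shm, np.ndarray(shape, dtype=dtype, buffer=shm.buf)


def row_sum(args):
    """Worker: sum rows start..stop-1 of the shared array."""
    name, shape, dtype, start, stop = args
    shm, a = attach(name, shape, dtype)
    try:
        return float(a[start:stop].sum())
    finally:
        del a        # drop the view before closing the mapping
        shm.close()


def shared_sum(data, n_workers=3, n_tasks=6):
    """Copy data into shared memory once, then sum it in parallel."""
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    try:
        a = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
        a[:] = data  # the only copy; workers attach by name
        del a
        n = data.shape[0]
        # Only the block name, shape, dtype string and bounds are pickled.
        tasks = [(shm.name, data.shape, data.dtype.str,
                  n * i // n_tasks, n * (i + 1) // n_tasks)
                 for i in range(n_tasks)]
        with Pool(n_workers) as pool:
            return sum(pool.map(row_sum, tasks))
    finally:
        shm.close()
        shm.unlink()  # the creating process frees the block


if __name__ == '__main__':
    data = np.ones((600, 400))
    print(shared_sum(data))
```

    The array is copied into the shared block once, and the workers receive only a few small values, so nothing large is pickled per task.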
  • 2021-01-04 08:26

    My problem appears to be resolved. I was calling multiprocessing.pool.map_async from inside a Django module, and my worker function was a method of the class itself. That was the problem. As far as I understand, the worker function has to be pickled and sent to the subprocess, and subprocesses do not share memory with the parent, so there is no live instance of the class on the other side. (In Python 2, bound methods cannot be pickled at all; in Python 3 the whole instance is pickled along with the method, which can fail or be slow.) That is probably why the function was never called.

    I moved the function out of the class and put it in the same file at module level, just before the class definition. It worked, and I got a moderate speedup too.

    One more thing for anyone facing the same problem: do not read large arrays in the parent and pass them between processes. Pickling and unpickling take a lot of time, so you get a slowdown rather than a speedup. Read the arrays inside the subprocess itself.

    And if possible, use numpy.memmap arrays; they are quite fast.
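
    A minimal sketch of that layout, assuming the arrays are stored as .npy files: the worker lives at module level, and only file paths are sent to the subprocesses. The names process_file and Analyzer are hypothetical stand-ins for my actual code, and np.load(..., mmap_mode='r') is used here as one way to get a memory-mapped array.

```python
import os
import tempfile
from multiprocessing import Pool

import numpy as np


def process_file(path):
    """Module-level worker: picklable by name, loads its own data."""
    a = np.load(path, mmap_mode='r')  # memory-mapped, read-only
    return float(a.sum())


class Analyzer(object):
    """Hypothetical stand-in for the class that used to own the worker."""

    def __init__(self, paths):
        self.paths = list(paths)

    def run(self, n_workers=3):
        with Pool(n_workers) as pool:
            # Only short path strings are pickled, never the arrays.
            async_result = pool.map_async(process_file, self.paths)
            return async_result.get()


if __name__ == '__main__':
    # Small demo files (an assumption, so the sketch runs quickly).
    tmp = tempfile.mkdtemp()
    paths = []
    for i in range(3):
        p = os.path.join(tmp, 'arr%d.npy' % i)
        np.save(p, np.full((100, 100), float(i)))
        paths.append(p)
    print(Analyzer(paths).run())
```

    The class still decides what to process, but the function that runs in the subprocess no longer depends on an instance of it.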
