Multiprocessing.Pool makes Numpy matrix multiplication slower

感动是毒 2020-11-27 20:05

So, I am playing around with multiprocessing.Pool and Numpy, but it seems I missed some important point. Why is the pool version much slower?

8 answers
  • 2020-11-27 20:58

    Since you mention that you have a lot of files, I would suggest the following solution:

    • Make a list of filenames.
    • Write a function that takes a single filename as its input parameter and loads and processes that file.
    • Use Pool.map() to apply the function to the list of files.

    Since every instance now loads its own file, the only data passed around are filenames, not (potentially large) numpy arrays.
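
    A minimal sketch of this pattern (the filename list, np.load, and the
    per-file processing are placeholders for whatever your real data needs):

    import numpy as np
    from multiprocessing import Pool

    def process_file(filename):
        # Each worker opens its own file, so only the short filename
        # string crosses the process boundary, never a large array.
        data = np.load(filename)  # placeholder: load however your files require
        return data.sum()         # placeholder: your actual processing

    if __name__ == '__main__':
        filenames = ['data_%d.npy' % i for i in range(16)]  # hypothetical names
        with Pool(4) as pool:
            results = pool.map(process_file, filenames)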

  • 2020-11-27 21:03

    Measuring arithmetic throughput is a very difficult task: basically your test case is too simple, and I see many problems.

    First, you are testing integer arithmetic: is there a special reason for that? With floating point you get results that are comparable across many different architectures.

    Second, matrix = matrix*matrix overwrites the input parameter (matrices are passed by reference, not by value), so each sample ends up working on different data...
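
    To make the by-reference point concrete, here is a tiny demo (names are
    purely illustrative): an in-place update such as *= modifies the caller's
    array, which is exactly why the script below works on a copy of its input.

    import numpy as np

    def square_inplace(m):
        m *= m          # mutates the caller's array in place

    def square_copy(m):
        m = m.copy()    # private copy: the caller's array is untouched
        m *= m
        return m

    a = np.arange(4.0)
    square_inplace(a)
    print(a)            # [0. 1. 4. 9.] -- the original was overwritten

    b = np.arange(4.0)
    square_copy(b)
    print(b)            # [0. 1. 2. 3.] -- the original is preserved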

    Last, tests should be conducted over a wider range of problem sizes and numbers of workers, in order to grasp general trends.

    So here is my modified test script:

    import numpy as np
    from timeit import timeit
    from multiprocessing import Pool
    
    def mmul(matrix):
        # Work on a private copy so that every timing run sees
        # identical input data.
        mymatrix = matrix.copy()
        for i in range(100):
            mymatrix *= mymatrix  # element-wise multiply, in place
        return mymatrix
    
    if __name__ == '__main__':
    
        for n in (16, 32, 64):
            matrices = []
            for i in range(n):
                matrices.append(np.random.random_sample(size=(1000, 1000)))
    
            stmt = 'from __main__ import mmul, matrices'
            print('testing with', n, 'matrices')
            print('base', end=' ')
            # list() forces the lazy map iterator to actually run
            print('%5.2f' % timeit('r = list(map(mmul, matrices))',
                                   setup=stmt, number=1))
    
            stmt = 'from __main__ import mmul, matrices, pool'
            for i in (1, 2, 4, 8, 16):
                pool = Pool(i)
                print('%4d' % i, end=' ')
                print('%5.2f' % timeit('r = pool.map(mmul, matrices)',
                                       setup=stmt, number=1))
                pool.close()
                pool.join()
    

    and my results:

    $ python test_multi.py 
    testing with 16 matrices
    base  5.77
       1  6.72
       2  3.64
       4  3.41
       8  2.58
      16  2.47
    testing with 32 matrices
    base 11.69
       1 11.87
       2  9.15
       4  5.48
       8  4.68
      16  3.81
    testing with 64 matrices
    base 22.36
       1 25.65
       2 15.60
       4 12.20
       8  9.28
      16  9.04
    

    [UPDATE] I ran this example at home on a different computer, and obtained a consistent slow-down:

    testing with 16 matrices
    base  2.42
       1  2.99
       2  2.64
       4  2.80
       8  2.90
      16  2.93
    testing with 32 matrices
    base  4.77
       1  6.01
       2  5.38
       4  5.76
       8  6.02
      16  6.03
    testing with 64 matrices
    base  9.92
       1 12.41
       2 10.64
       4 11.03
       8 11.55
      16 11.59
    

    I have to confess that I do not know who is to blame (numpy, python, compiler, kernel)...
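
    One thing worth ruling out (my guess, not something these numbers prove)
    is the threading of the BLAS/LAPACK library behind numpy: if it already
    uses all cores, the pool workers just fight over them. Pinning the
    libraries to one thread before importing numpy is a quick check:

    import os
    # These must be set before numpy is imported; which variable matters
    # depends on the backend (OpenBLAS, MKL, plain OpenMP builds, ...).
    os.environ['OMP_NUM_THREADS'] = '1'
    os.environ['OPENBLAS_NUM_THREADS'] = '1'
    os.environ['MKL_NUM_THREADS'] = '1'

    import numpy as np

    Even if that changes nothing, note that an element-wise multiply on
    1000x1000 arrays does very little arithmetic per byte moved, so a few
    workers can saturate memory bandwidth on their own; that would also
    explain the plateau in the timings above.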
