Multiprocessing.Pool makes Numpy matrix multiplication slower

感动是毒 2020-11-27 20:05

So, I am playing around with multiprocessing.Pool and Numpy, but it seems I missed some important point. Why is the pool version much slower?

8 answers
  • 2020-11-27 20:58

    Since you mention that you have a lot of files, I would suggest the following solution:

    • Make a list of filenames.
    • Write a function that takes a single filename as its input parameter and loads and processes that file.
    • Use Pool.map() to apply the function to the list of files.

    Since every instance now loads its own file, the only data passed around are filenames, not (potentially large) numpy arrays.
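
    A minimal sketch of this pattern (the filename list, np.load, and the
    per-file processing are placeholders for whatever your real data needs):

    import numpy as np
    from multiprocessing import Pool

    def process_file(filename):
        # Each worker opens its own file, so only the short filename
        # string crosses the process boundary, never a large array.
        data = np.load(filename)  # placeholder: load however your files require
        return data.sum()         # placeholder: your actual processing

    if __name__ == '__main__':
        filenames = ['data_%d.npy' % i for i in range(16)]  # hypothetical names
        with Pool(4) as pool:
            results = pool.map(process_file, filenames)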

  • 2020-11-27 21:03

    Measuring arithmetic throughput is a very difficult task: basically your test case is too simple, and I see many problems.

    First, you are testing integer arithmetic: is there a special reason for that? With floating point you get results that are comparable across many different architectures.

    Second, matrix = matrix*matrix overwrites the input parameter (matrices are passed by reference, not by value), so each sample ends up working on different data...
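
    To make the by-reference point concrete, here is a tiny demo (names are
    purely illustrative): an in-place update such as *= modifies the caller's
    array, which is exactly why the script below works on a copy of its input.

    import numpy as np

    def square_inplace(m):
        m *= m          # mutates the caller's array in place

    def square_copy(m):
        m = m.copy()    # private copy: the caller's array is untouched
        m *= m
        return m

    a = np.arange(4.0)
    square_inplace(a)
    print(a)            # [0. 1. 4. 9.] -- the original was overwritten

    b = np.arange(4.0)
    square_copy(b)
    print(b)            # [0. 1. 2. 3.] -- the original is preserved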

    Last, tests should be conducted over a wider range of problem sizes and numbers of workers, in order to grasp general trends.

    So here is my modified test script:

    import numpy as np
    from timeit import timeit
    from multiprocessing import Pool
    
    def mmul(matrix):
        # Work on a private copy so that every timing run sees
        # identical input data.
        mymatrix = matrix.copy()
        for i in range(100):
            mymatrix *= mymatrix  # element-wise multiply, in place
        return mymatrix
    
    if __name__ == '__main__':
    
        for n in (16, 32, 64):
            matrices = []
            for i in range(n):
                matrices.append(np.random.random_sample(size=(1000, 1000)))
    
            stmt = 'from __main__ import mmul, matrices'
            print('testing with', n, 'matrices')
            print('base', end=' ')
            # list() forces the lazy map iterator to actually run
            print('%5.2f' % timeit('r = list(map(mmul, matrices))',
                                   setup=stmt, number=1))
    
            stmt = 'from __main__ import mmul, matrices, pool'
            for i in (1, 2, 4, 8, 16):
                pool = Pool(i)
                print('%4d' % i, end=' ')
                print('%5.2f' % timeit('r = pool.map(mmul, matrices)',
                                       setup=stmt, number=1))
                pool.close()
                pool.join()
    

    and my results:

    $ python test_multi.py 
    testing with 16 matrices
    base  5.77
       1  6.72
       2  3.64
       4  3.41
       8  2.58
      16  2.47
    testing with 32 matrices
    base 11.69
       1 11.87
       2  9.15
       4  5.48
       8  4.68
      16  3.81
    testing with 64 matrices
    base 22.36
       1 25.65
       2 15.60
       4 12.20
       8  9.28
      16  9.04
    

    [UPDATE] I ran this example at home on a different computer, and obtained a consistent slow-down:

    testing with 16 matrices
    base  2.42
       1  2.99
       2  2.64
       4  2.80
       8  2.90
      16  2.93
    testing with 32 matrices
    base  4.77
       1  6.01
       2  5.38
       4  5.76
       8  6.02
      16  6.03
    testing with 64 matrices
    base  9.92
       1 12.41
       2 10.64
       4 11.03
       8 11.55
      16 11.59
    

    I have to confess that I do not know who is to blame (numpy, python, compiler, kernel)...
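
    One thing worth ruling out (my guess, not something these numbers prove)
    is the threading of the BLAS/LAPACK library behind numpy: if it already
    uses all cores, the pool workers just fight over them. Pinning the
    libraries to one thread before importing numpy is a quick check:

    import os
    # These must be set before numpy is imported; which variable matters
    # depends on the backend (OpenBLAS, MKL, plain OpenMP builds, ...).
    os.environ['OMP_NUM_THREADS'] = '1'
    os.environ['OPENBLAS_NUM_THREADS'] = '1'
    os.environ['MKL_NUM_THREADS'] = '1'

    import numpy as np

    Even if that changes nothing, note that an element-wise multiply on
    1000x1000 arrays does very little arithmetic per byte moved, so a few
    workers can saturate memory bandwidth on their own; that would also
    explain the plateau in the timings above.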
