So, I am playing around with multiprocessing.Pool and Numpy, but it seems I missed some important point. Why is the pool version much slower?
Since you mention that you have a lot of files, I would suggest the following solution: write a function that opens and processes a single file, then use Pool.map() to apply that function to the list of files. Since every worker now loads its own file, the only data passed around are filenames, not (potentially large) numpy arrays.
Measuring arithmetic throughput is a very difficult task: basically your test case is too simple, and I see many problems.
First, you are testing integer arithmetic: is there a special reason? With floating point you get results that are comparable across many different architectures.
Second, matrix = matrix*matrix overwrites the input parameter (numpy arrays are passed by reference, not by value), so each timing sample has to work on different data.
Last, tests should be conducted over a wider range of problem sizes and numbers of workers, in order to grasp general trends.
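The second point can be demonstrated in isolation; the function names below are just for illustration:

```python
import numpy as np

def square_inplace(m):
    m *= m            # in-place: mutates the array the caller passed in
    return m

def square_copy(m):
    out = m.copy()    # private copy: the caller's array is untouched
    out *= out
    return out

a = np.array([2.0, 3.0])
square_inplace(a)
print(a)   # the original array now holds the squared values

b = np.array([2.0, 3.0])
square_copy(b)
print(b)   # unchanged
```

This is why the benchmark below works on a copy inside the worker function.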
So here is my modified test script:
import numpy as np
from timeit import timeit
from multiprocessing import Pool

def mmul(matrix):
    # work on a copy so the caller's matrices are left untouched
    mymatrix = matrix.copy()
    for i in range(100):
        mymatrix *= mymatrix
    return mymatrix

if __name__ == '__main__':

    for n in (16, 32, 64):
        matrices = []
        for i in range(n):
            matrices.append(np.random.random_sample(size=(1000, 1000)))

        # serial baseline
        stmt = 'from __main__ import mmul, matrices'
        print 'testing with', n, 'matrices'
        print 'base',
        print '%5.2f' % timeit('r = map(mmul, matrices)', setup=stmt, number=1)

        # pool versions with an increasing number of workers
        stmt = 'from __main__ import mmul, matrices, pool'
        for i in (1, 2, 4, 8, 16):
            pool = Pool(i)
            print "%4d" % i,
            print '%5.2f' % timeit('r = pool.map(mmul, matrices)', setup=stmt, number=1)
            pool.close()
            pool.join()
and my results:
$ python test_multi.py
testing with 16 matrices
base 5.77
1 6.72
2 3.64
4 3.41
8 2.58
16 2.47
testing with 32 matrices
base 11.69
1 11.87
2 9.15
4 5.48
8 4.68
16 3.81
testing with 64 matrices
base 22.36
1 25.65
2 15.60
4 12.20
8 9.28
16 9.04
[UPDATE] I ran this example at home on a different computer, obtaining a consistent slow-down:
testing with 16 matrices
base 2.42
1 2.99
2 2.64
4 2.80
8 2.90
16 2.93
testing with 32 matrices
base 4.77
1 6.01
2 5.38
4 5.76
8 6.02
16 6.03
testing with 64 matrices
base 9.92
1 12.41
2 10.64
4 11.03
8 11.55
16 11.59
I have to confess that I do not know who is to blame (numpy, python, compiler, kernel)...
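One thing worth checking before assigning blame is how many cores are actually available, and which BLAS numpy was built against (a multithreaded BLAS can mean the serial "base" run is not really single-threaded); a quick diagnostic sketch:

```python
import multiprocessing
import numpy as np

# how many workers can actually run concurrently on this machine
print('CPU count:', multiprocessing.cpu_count())

# the build configuration shows which linear-algebra libraries numpy uses
np.show_config()
```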