The reason your example is not performing well is that you are timing two totally different things.

In your list comprehension, you map f onto each element of li.

In the second case, you split your li list into jobs chunks and then apply your function f jobs times, once to each of those chunks. Inside f, n * 100 now receives a chunk about a quarter the size of your original list and "multiplies" it by 100; that is, it uses the sequence-repetition operator, so it creates a new list 100 times the size of the chunk:
>>> chunk = [1,2,3]
>>> chunk * 10
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
>>>
So basically, you are comparing apples to oranges.
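To make that concrete, here's a rough sketch of what the chunked version presumably does versus what was probably intended (the names li and chunks, and the way the chunks are built, are assumptions, since your chunking code isn't reproduced here):

def f(n):
    return n * 100

li = list(range(8))
chunks = [li[i::4] for i in range(4)]   # four chunks, just for illustration

# What the chunked version effectively does: f receives a whole *list*,
# so n * 100 repeats that list 100 times (sequence repetition).
repeated = [f(chunk) for chunk in chunks]

# What was presumably intended: apply f to each *element* of each chunk.
mapped = [[f(n) for n in chunk] for chunk in chunks]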
However, multiprocessing already comes with an out-of-the-box mapping utility. Here is a better comparison, in a script called foo.py:
import time
import multiprocessing as mp

def f(x):
    return x * 100

if __name__ == '__main__':
    data = list(range(1000000))

    start = time.time()
    [f(i) for i in data]
    stop = time.time()
    print(f"List comprehension took {stop - start} seconds")

    start = time.time()
    with mp.Pool(4) as pool:
        result = pool.map(f, data)
    stop = time.time()
    print(f"Pool.map took {stop - start} seconds")
Now, here are some actual performance results:
(py37) Juans-MBP:test_mp juan$ python foo.py
List comprehension took 0.14193987846374512 seconds
Pool.map took 0.2513458728790283 seconds
(py37) Juans-MBP:test_mp juan$
For this very trivial function, the cost of the inter-process communication will always be higher than the cost of computing the function serially, so you won't see any gains from multiprocessing. A much less trivial function, however, can see gains.
Here's a contrived way to make the work less trivial: simply sleep for a microsecond before multiplying:
import time
import multiprocessing as mp

def f(x):
    time.sleep(0.000001)
    return x * 100

if __name__ == '__main__':
    data = list(range(1000000))

    start = time.time()
    [f(i) for i in data]
    stop = time.time()
    print(f"List comprehension took {stop - start} seconds")

    start = time.time()
    with mp.Pool(4) as pool:
        result = pool.map(f, data)
    stop = time.time()
    print(f"Pool.map took {stop - start} seconds")
And now, you see gains commensurate with the number of processes:
(py37) Juans-MBP:test_mp juan$ python foo.py
List comprehension took 13.175776720046997 seconds
Pool.map took 3.1484851837158203 seconds
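(Concretely: 13.18 s over 10^6 calls is about 13 µs per call serially, and 13.18 / 3.15 ≈ 4.2, so four workers buy roughly a 4x speedup.)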
Note that, on my machine, a single multiplication takes orders of magnitude less time than a microsecond (about 10 nanoseconds):
>>> import timeit
>>> timeit.timeit('100*3', number=int(1e6))*1e-6
1.1292944999993892e-08
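One last aside, not from the timings above: when the per-item work really is this tiny, Pool.map's optional chunksize argument is worth experimenting with, since it controls how many items get batched into each message sent to a worker. Here's a minimal sketch of the first benchmark with an explicit chunksize (the value 10000 is just a guess to try; I haven't timed this variant):

import time
import multiprocessing as mp

def f(x):
    return x * 100

if __name__ == '__main__':
    data = list(range(1000000))

    start = time.time()
    with mp.Pool(4) as pool:
        # chunksize batches this many items into each task sent to a worker,
        # so fewer (but larger) messages cross the process boundary
        result = pool.map(f, data, chunksize=10000)
    stop = time.time()
    print(f"Pool.map with chunksize=10000 took {stop - start} seconds")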