Question
I'm using multiprocessing.Pool in Python on Ubuntu 12.04, and I'm running into a curious problem: when I call map_async on my Pool, it spawns 8 processes, but they all struggle for dominance over a single core of my 8-core machine. The exact same code uses both cores of my MacBook Pro and all four cores of my other Ubuntu 12.04 desktop (as measured with htop in all cases).
My code is too long to post in full, but the important part is:
P = multiprocessing.Pool()
# map the wrapper over (self, index) pairs and block until all results are in
results = P.map_async(unwrap_self_calc_timepoint, zip([self]*self.xLen, xrange(self.xLen))).get(99999999999)
P.close()
P.join()
ipdb.set_trace()
where unwrap_self_calc_timepoint is a wrapper function that passes the necessary self argument to an instance method, based on the advice of this article.
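For context, such a wrapper typically just unpacks the (instance, index) pair and calls the bound method. A minimal sketch, assuming a hypothetical method name calc_timepoint (the actual function is not shown in the question):

def unwrap_self_calc_timepoint(arg):
    # arg is one (instance, index) pair produced by zip([self]*n, xrange(n))
    obj, i = arg
    return obj.calc_timepoint(i)  # calc_timepoint is a placeholder name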
All three computers are using Python 2.7.3, and I don't really know where to start in hunting down why that one Ubuntu computer is acting up. Any help as to how to begin narrowing the problem down would be helpful. Thank you!
Answer 1:
I had the same problem; in my case the solution was to tell Linux to use all of the processors instead of only one. Try adding the following two lines at the beginning of your code:
import os
# reset this process's CPU affinity so it is allowed to run on any core
os.system("taskset -p 0xfffff %d" % os.getpid())
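As a side note, on Python 3.4+ (Linux only) the same reset can be done without shelling out to taskset, using os.sched_setaffinity. This is a sketch, not part of the original answer:

import os
# clear any inherited CPU-affinity restriction so the process may run on every core
os.sched_setaffinity(0, range(os.cpu_count()))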
Answer 2:
This seems to be a fairly common issue between numpy and certain Linux distributions. I haven't had any luck using taskset near the start of the program, but it does do the trick when used in the code to be parallelized:
import multiprocessing as mp
import numpy as np
import os

def something(arg):
    # undo the affinity mask inherited from the parent process
    # (numpy/BLAS may have pinned it to one core at import time)
    os.system("taskset -p 0xfffff %d" % os.getpid())
    X = np.random.randn(5000, 2000)
    Y = np.random.randn(2000, 5000)
    Z = np.dot(X, Y)
    return Z.mean()

pool = mp.Pool(processes=10)
out = pool.map(something, np.arange(20))
pool.close()
pool.join()
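A likely explanation, hedged: some numpy builds are linked against a BLAS (e.g. OpenBLAS/GotoBLAS) that sets the process's CPU affinity when the library is loaded, which is why resetting the affinity inside the worker helps. If that is the cause, OpenBLAS can also be told not to touch affinity at all via an environment variable set before numpy is imported; a sketch under that assumption:

import os
# Assumption: affinity is being pinned by an OpenBLAS-linked numpy.
# OPENBLAS_MAIN_FREE=1 asks OpenBLAS not to set CPU affinity; it must be set
# before numpy (and therefore the BLAS library) is imported.
os.environ["OPENBLAS_MAIN_FREE"] = "1"

import numpy as np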
Source: https://stackoverflow.com/questions/12592018/multiprocessing-pool-processes-locked-to-a-single-core