Question
I am confused as to what _number_left is supposed to return. I assumed it was the number of tasks remaining in the pool, but it does not appear to provide the correct value in my code. For example, if I have a pool of 10 workers counting to the number 1000, I would expect result._number_left to count down from 1000. However, it only tells me I have 40 left until the code is nearly finished. Am I missing something here?
Code:
import multiprocessing
import time

def do_something(x):
    print(x)
    time.sleep(1)  # simulate one second of work per item
    return

def main():
    pool = multiprocessing.Pool(10)
    result = pool.map_async(do_something, [x for x in range(1000)])
    while not result.ready():
        # _number_left is a private attribute of the result object
        print("num left: {}".format(result._number_left))
        result.wait(timeout=1)

if __name__ == "__main__":
    main()
Answer 1:
First, _number_left is an undocumented private attribute of an undocumented class. There's no reason you should expect it to have any particular meaning.
If you look at the source for the undocumented MapResult class, you can see how it's used in CPython 3.6 in particular.
First, it gets initialized:
self._number_left = length//chunksize + bool(length % chunksize)
So, already, it's pretty clear that it's never going to be the length of your iterable; it's going to be the expected number of chunks needed to map the whole iterable. Then, it counts down from there whenever _set is called, which… well, that's pretty complicated, but it's clearly not once per value.
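To make the 40 in the question concrete, here is a minimal sketch of the arithmetic. The initialization formula is the one quoted above; the default chunksize heuristic is what CPython's Pool.map_async applies, as far as I can tell, when no chunksize is passed. The function names default_chunksize and initial_number_left are made up for illustration and are not part of multiprocessing.

```python
def default_chunksize(length, num_workers):
    """Approximate the default chunksize heuristic (assumes length > 0).

    Hypothetical helper: it mirrors the divmod-by-4x-workers rule that
    CPython's Pool.map_async appears to use when chunksize is None.
    """
    chunksize, extra = divmod(length, num_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def initial_number_left(length, chunksize):
    """Number of chunks, using the initialization formula quoted above."""
    return length // chunksize + bool(length % chunksize)


if __name__ == "__main__":
    chunksize = default_chunksize(1000, 10)
    print(chunksize)                             # 25 items per chunk
    print(initial_number_left(1000, chunksize))  # 40 chunks
    print(initial_number_left(1000, 1))          # 1000 chunks when chunksize=1
```

So with 1000 items and 10 workers the counter starts at 40 chunks, not 1000 tasks, which matches what the question observed.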
At any rate, whatever you're trying to do, there probably is a way to actually do it, without peeking at private attributes and guessing at what they might mean. For example, if you just want to get some progress, but can't use imap_unordered because you need the results in an ordered list at the end, it's pretty easy to build an ordered list out of it: just pass enumerate(iterable) in, modify or wrap func to return the index along with the value, and then sort the results that come back.
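The enumerate-and-sort idea above can be sketched roughly as follows. This is one possible shape under stated assumptions, not a definitive implementation: square, indexed_square, and map_with_progress are hypothetical names, and the only pool method used is the documented imap_unordered.

```python
import multiprocessing


def square(x):
    """Stand-in for the real work function (hypothetical)."""
    return x * x


def indexed_square(pair):
    """Wrap the work function so the index travels with the result."""
    index, value = pair
    return index, square(value)


def map_with_progress(pool, indexed_func, iterable, report=print):
    """Report progress per finished item, then return results in input order."""
    items = list(iterable)
    total = len(items)
    collected = []
    finished = pool.imap_unordered(indexed_func, enumerate(items))
    for done, pair in enumerate(finished, 1):
        collected.append(pair)
        report("num left: {}".format(total - done))
    collected.sort(key=lambda p: p[0])  # restore the original order by index
    return [value for _, value in collected]


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = map_with_progress(pool, indexed_square, range(20))
    print(results)
```

Because progress is counted as results arrive, the count goes down once per item regardless of how the pool chunks the work internally.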
Answer 2:
Turns out I needed to add chunksize=1 to my map_async call. Answer found here.
New Code:
import multiprocessing
import time

def do_something(x):
    print(x)
    time.sleep(1)  # simulate one second of work per item
    return

def main():
    pool = multiprocessing.Pool(10)
    # chunksize=1 makes each item its own chunk, so the counter starts at 1000
    result = pool.map_async(do_something, [x for x in range(1000)], chunksize=1)
    while not result.ready():
        print("num left: {}".format(result._number_left))
        result.wait(timeout=1)

if __name__ == "__main__":
    main()
Source: https://stackoverflow.com/questions/49807345/multiprocessing-pool-mapresult-number-left-not-giving-result-i-would-expect