multiprocessing.pool.MapResult._number_left not giving result I would expect

Posted by 我怕爱的太早我们不能终老 on 2020-02-06 07:34:48

Question


I am confused about what _number_left is supposed to return. I assumed it was the number of tasks remaining in the pool, but it does not appear to give the correct value in my code. For example, if I have a pool of 10 workers counting to 1000, I would expect result._number_left to count down from 1000. However, it only reports 40 left until the code is almost finished. Am I missing something here?

Code:

import multiprocessing
import time


def do_something(x):
    print(x)
    time.sleep(1)
    return


def main():
    pool = multiprocessing.Pool(10)

    result = pool.map_async(do_something, [x for x in range(1000)])
    while not result.ready():
        print("num left: {}".format(result._number_left))
        result.wait(timeout=1)


if __name__ == "__main__":
    main()

Answer 1:


First, _number_left is an undocumented private attribute of an undocumented class. There's no reason you should expect it to have any particular meaning.

If you look at the source for the undocumented MapResult class, you can see how it's used in CPython 3.6 in particular.

First, it gets initialized:

self._number_left = length//chunksize + bool(length % chunksize)

So, already, it's pretty clear that it's not going to be the length of your iterable unless chunksize happens to be 1; it's going to be the expected number of chunks needed to map the whole iterable. Then it counts down from there whenever _set is called, which happens as each chunk's results come back, not once per value.
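To make the arithmetic concrete, here is a small sketch of where the 40 in the question could come from. The initialization expression is the one quoted above. The default chunk size is an assumption based on how CPython's Pool appears to choose one when chunksize is omitted (the iterable length divided by four times the number of workers, rounded up). Both function names are hypothetical and not part of multiprocessing.

```python
def default_chunksize(length, workers):
    # Assumption: mirrors the default CPython seems to use when
    # chunksize is not passed to map_async.
    chunksize, extra = divmod(length, workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def initial_number_left(length, chunksize):
    # Same expression as the MapResult initialization quoted above:
    # the number of chunks, rounding up for a partial final chunk.
    return length // chunksize + bool(length % chunksize)


print(default_chunksize(1000, 10))     # 25 items per chunk
print(initial_number_left(1000, 25))   # 40 chunks
print(initial_number_left(1000, 1))    # 1000 chunks, one per item
```

Under that assumption, 1000 items on 10 workers are split into 40 chunks of 25, which matches the value the question reports.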

At any rate, whatever you're trying to do, there probably is a way to actually do it, without peeking at private attributes and guessing at what they might mean. For example, if you just want to get some progress, but can't use imap_unordered because you need the results in an ordered list at the end, it's pretty easy to build an ordered list out of it: just pass enumerate(iterable) in, modify or wrap func to return the index along with the value, and then sort the results that come back.
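The enumerate-and-sort idea can be sketched roughly as follows. This is only one way to do it, not code from the answer: ordered_map_with_progress, _call_with_index, and square are hypothetical names, and the only library calls used are Pool and imap_unordered. The wrapper lives at module level so that it can be pickled and sent to worker processes.

```python
import multiprocessing


def _call_with_index(task):
    # task is (func, index, value); return the index with the result
    # so the original order can be restored afterwards.
    func, index, value = task
    return index, func(value)


def ordered_map_with_progress(pool, func, iterable, report=print):
    """Map func over iterable, report progress per item, return ordered results."""
    items = list(iterable)
    tasks = ((func, i, v) for i, v in enumerate(items))
    indexed = []
    # imap_unordered yields each result as soon as it is finished,
    # so progress can be reported once per item.
    for done, pair in enumerate(pool.imap_unordered(_call_with_index, tasks), 1):
        indexed.append(pair)
        report("num left: {}".format(len(items) - done))
    indexed.sort(key=lambda pair: pair[0])  # restore input order
    return [result for _, result in indexed]


def square(x):
    # Hypothetical stand-in for the real work function.
    return x * x


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = ordered_map_with_progress(pool, square, range(20))
    print(results)
```

Because progress is counted from results actually received, this does not depend on any private attribute of the result object.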




Answer 2:


Turns out I needed to add chunksize=1 to my map_async call, so that each chunk holds exactly one task and _number_left counts individual tasks. Answer found here.

New Code:

import multiprocessing
import time


def do_something(x):
    print(x)
    time.sleep(1)
    return


def main():
    pool = multiprocessing.Pool(10)

    result = pool.map_async(do_something, [x for x in range(1000)], chunksize=1)
    while not result.ready():
        print("num left: {}".format(result._number_left))
        result.wait(timeout=1)


if __name__ == "__main__":
    main()


Source: https://stackoverflow.com/questions/49807345/multiprocessing-pool-mapresult-number-left-not-giving-result-i-would-expect
