Question
Let's say I have a program
import othermodule, concurrent.futures
pool = concurrent.futures.ProcessPoolExecutor()
and then I want to say
fut = pool.submit(othermodule.foo, 5)
print(fut.result())
The official docs say I need to guard these latter two statements with if __name__ == '__main__'. It's not hard to do, I would just like to know why. foo lives in othermodule, and it knows that (foo.__module__ == 'othermodule'). And 5 is a literal int. Both can be pickled and unpickled without any reference to the module that created the pool. I see no reason why ProcessPoolExecutor has to import that module on the other side.
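To make the pickling claim concrete, here is a minimal sketch. It assumes json.dumps from the standard library as a stand-in for othermodule.foo (which is hypothetical here), and it only demonstrates how pickle treats a module-level function; it says nothing about what ProcessPoolExecutor does internally.

```python
import json
import pickle

# json.dumps is a stand-in for othermodule.foo (an assumption for this sketch).
func = json.dumps

# A module-level function is pickled as a reference (module name plus
# qualified name), not as bytecode.
payload = pickle.dumps(func)
print(func.__module__, func.__qualname__)
print(payload)

# Unpickling imports the named module and looks the attribute up again,
# so the very same function object comes back.
restored = pickle.loads(payload)
print(restored is func)

# The argument 5 round-trips on its own as well.
arg = pickle.loads(pickle.dumps(5))
```

Nothing in the payload names the module that called pickle.dumps, which is the observation the paragraph above relies on.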
My model is this: you start another Python process, pickle othermodule.foo and 5, and send them pickled through some IPC method (Queue, Pipe, whatever). The other process unpickles them (importing othermodule of course, to find foo's code), and calls foo(5), sending the result back (again through pickle and some IPC). Obviously my model is wrong, but I would like to know where it is wrong.
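The model described above can be sketched as follows. This is only an illustration of the mental model, not of how ProcessPoolExecutor is actually implemented: both ends of the pipe live in one process so the snippet runs without spawning anything, json.dumps again stands in for the hypothetical othermodule.foo, and the names parent_end and worker_end are made up for this example.

```python
import json
import multiprocessing
import pickle

# Both ends of the pipe live in this one process; worker_end plays the
# role of the "other process" in the model (a simplification).
parent_end, worker_end = multiprocessing.Pipe()

# Parent side: pickle the callable and its arguments, then send the bytes.
# json.dumps stands in for othermodule.foo (an assumption).
parent_end.send_bytes(pickle.dumps((json.dumps, (5,))))

# "Worker" side: unpickle (this imports the function's module if needed),
# call the function, and send the pickled result back.
func, args = pickle.loads(worker_end.recv_bytes())
worker_end.send_bytes(pickle.dumps(func(*args)))

# Parent side again: receive and unpickle the result.
result = pickle.loads(parent_end.recv_bytes())
print(result)

parent_end.close()
worker_end.close()
```

At no step does this exchange need the module that created the pipe, which is why the requirement to guard the submitting code looks surprising under this model.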
Is the only reason perhaps that on Unix this is solved by forking __main__, so on Windows (where fork doesn't really work) they did the closest imitation of the procedure, instead of the closest imitation of the intent? In that case, could it be fixed on Windows?
(Yes, I know about "Why does Python's multiprocessing module import __main__ when starting a new process on Windows?". In my opinion it answers a slightly different question. But you can try to use its answer to explain to me why the answer to this question must be the same.)
Source: https://stackoverflow.com/questions/38801229/why-processpoolexecutor-on-windows-needs-main-guard-when-submitting-function