I'm trying to use ProcessPoolExecutor but it fails. Here is an example (computing the greatest common divisor of two numbers) of a failed use. I don't understand what the m
Change your code to look like this, and it will work:
from time import time
from concurrent.futures import ProcessPoolExecutor


def gcd(pair):
    a, b = pair
    low = min(a, b)
    for i in range(low, 0, -1):
        if a % i == 0 and b % i == 0:
            return i


numbers = [(1963309, 2265973), (2030677, 3814172),
           (1551645, 2229620), (2039045, 2020802)]


def main():
    start = time()
    pool = ProcessPoolExecutor(max_workers=3)
    results = list(pool.map(gcd, numbers))
    end = time()
    print('Took %.3f seconds' % (end - start))


if __name__ == '__main__':
    main()
On systems where worker processes are created with fork() this isn't necessary, because your script is imported just once, and each process started by ProcessPoolExecutor already has a copy of the objects in your global namespace, such as the gcd function. Once forked, the workers go through a bootstrap process in which they start running their target function (in this case a worker loop that accepts jobs from the process pool executor), and they never return to the code in your main module from which they were forked.
By contrast, if you're using spawn-based processes, which are the default on Windows and macOS, a new process must be started from scratch for each worker, and each one must re-import your module. However, if your module does something like create a ProcessPoolExecutor directly in the module body, without guarding it with if __name__ == '__main__':, then there is no way for the workers to import your module without also starting a new ProcessPoolExecutor. So the error you're getting is essentially protecting you from creating an infinite process bomb.
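To see why the guard matters without starting any processes, here is a minimal sketch that imitates the re-import. It assumes (as a CPython implementation detail, not something the docs quoted here promise) that a spawned worker re-runs your main file under the module name `__mp_main__`. The `SCRIPT` text and the `run_as` helper are made up for the illustration.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical stand-in for your script: one unguarded statement, one guarded.
SCRIPT = '''\
events = ["module body ran"]
if __name__ == "__main__":
    events.append("guarded block ran")
'''


def run_as(module_name):
    """Execute SCRIPT under the given module name and return what it recorded."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "script.py"
        path.write_text(SCRIPT)
        # run_path executes the file and returns its resulting globals.
        return runpy.run_path(str(path), run_name=module_name)["events"]


if __name__ == "__main__":
    # What `python script.py` sees: both statements run.
    print(run_as("__main__"))
    # What a re-importing worker would see (assumed name): only the module body.
    print(run_as("__mp_main__"))
```

Anything in the module body runs in both cases, so an unguarded ProcessPoolExecutor there would be created again inside every worker. Only the guarded block is skipped on re-import.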
This is mentioned in the docs for ProcessPoolExecutor:
The __main__ module must be importable by worker subprocesses. This means that ProcessPoolExecutor will not work in the interactive interpreter.
But they don't really make clear why that is or what it means for the __main__ module to be "importable". When you write a simple script in Python and run it like python foo.py, your script foo.py is loaded with a module name of __main__, as opposed to a module named foo, which you get if you import foo. For it to be "importable" in this case really means importable without major side effects like spawning new processes.
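The naming difference is easy to check for yourself. The sketch below writes a throwaway foo.py that only records its own `__name__`, then loads it both ways. `runpy.run_path` is used here as a stand-in for running `python foo.py`, and the `observed_names` helper is just a name chosen for the example.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path


def observed_names():
    """Return the __name__ that foo.py sees when run as a script and when imported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "foo.py"
        path.write_text("seen_name = __name__\n")

        # Roughly what `python foo.py` does: execute the file as __main__.
        as_script = runpy.run_path(str(path), run_name="__main__")["seen_name"]

        # Roughly what `import foo` does: load the same file as module "foo".
        spec = importlib.util.spec_from_file_location("foo", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        as_import = module.seen_name

    return as_script, as_import


if __name__ == "__main__":
    print(observed_names())
```

The same file reports `__main__` in the first case and `foo` in the second, which is why code behind the `if __name__ == '__main__':` guard runs only when the file is executed directly.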