I am trying to use multiprocessing for the first time. So I thought I would make a very simple test example which factors 100 different numbers.
from multiprocessing import Pool
from primefac import factorint

N = 10**30
L = range(N, N + 100)
pool = Pool()
pool.map(factorint, L)
I came here because my unittest raised

AssertionError: daemonic processes are not allowed to have children

This was because I used multiprocessing and did not close and join the pool properly; after calling close and join, everything is fine now.
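For reference, a minimal sketch of what "close and join the pool properly" looks like (the worker function here is just a placeholder, not the actual code from my tests):

from multiprocessing import Pool

def work(x):
    return x * x

if __name__ == '__main__':
    pool = Pool()
    try:
        results = pool.map(work, range(10))
    finally:
        pool.close()  # stop accepting new work
        pool.join()   # wait for the worker processes to exit
    print(results)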
The problem appears to be that primefac uses its own multiprocessing.Pool. Unfortunately, while PyPI is down, I can't find the source to the module, but I did find various forks on GitHub, like this one, and they all have multiprocessing code.

So, your apparently simple example isn't all that simple, because it's importing and running non-simple code.
By default, all Pool processes are daemonic, so you can't create more child processes from inside another Pool. Usually, attempting to do so is a mistake.
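Here's a minimal sketch, unrelated to primefac, that reproduces the same assertion: the outer pool's workers are daemonic, so the nested Pool() call inside them fails.

from multiprocessing import Pool

def inner(x):
    return x * x

def outer(n):
    # This runs inside a daemonic pool worker, so creating another Pool
    # here raises:
    #   AssertionError: daemonic processes are not allowed to have children
    with Pool(2) as p:
        return p.map(inner, range(n))

if __name__ == '__main__':
    with Pool(2) as p:
        print(p.map(outer, [3, 4]))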
If you really do want to multiprocess the factors even though some of them are going to multiprocess their own work (quite possibly adding more contention overhead without adding any parallelism), then you just have to subclass Pool and override that, as explained in the related question that you linked.
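For concreteness, that kind of workaround might look roughly like the sketch below. It's untested against primefac, and the names NoDaemonProcess, NoDaemonContext, and NestablePool are mine, not from the linked question.

import multiprocessing
import multiprocessing.pool

class NoDaemonProcess(multiprocessing.Process):
    # Always report the process as non-daemonic, so it is allowed
    # to have children of its own.
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass

class NoDaemonContext(type(multiprocessing.get_context())):
    Process = NoDaemonProcess

class NestablePool(multiprocessing.pool.Pool):
    # multiprocessing.Pool is just a factory function; the actual class
    # lives in multiprocessing.pool, which is what we subclass here.
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super().__init__(*args, **kwargs)

You would then create NestablePool() instead of Pool() in your top-level code.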
But the simplest thing is to just not use multiprocessing here, if primefac is already using your cores efficiently. (If you need quasi-concurrency, getting answers as they come in instead of getting them in sequence, I suppose you could do that with a thread pool, but I don't think there's any advantage to that here; you're not using imap_unordered or explicit AsyncResult anywhere.)
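If you did want that quasi-concurrency, a sketch of the thread-pool variant might look like this (it assumes primefac.factorint as in your example; the heavy lifting would still happen in primefac's own processes):

from multiprocessing.pool import ThreadPool
from primefac import factorint

def factor(n):
    # Pair each result with its input, since imap_unordered yields
    # results in completion order rather than input order.
    return n, factorint(n)

if __name__ == '__main__':
    N = 10**30
    with ThreadPool(4) as pool:
        for n, factors in pool.imap_unordered(factor, range(N, N + 100)):
            print(n, factors)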
Alternatively, if it's not using all of your cores most of the time, only doing so for the "tricky remainders" at the end of factoring some numbers, while you've got 7 cores sitting idle 60% of the time… then you probably want to prevent primefac from using multiprocessing at all. I don't know if the module has a public API for doing that. If so, of course, just use it. If not… well, you may have to subclass or monkeypatch some of its code or, at worst, monkeypatch its import of multiprocessing, and that may not be worth doing.
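Just to show the shape of that last, ugly option, here is a purely hypothetical sketch. It assumes primefac keeps a module-level reference named multiprocessing that its factoring code looks up at call time; you'd have to read the source to see whether that's actually true.

import multiprocessing.dummy
import primefac

# multiprocessing.dummy exposes the same Pool/Process API backed by
# threads, so primefac would no longer spawn real child processes.
# Whether this works depends entirely on how primefac imports and uses
# multiprocessing internally.
primefac.multiprocessing = multiprocessing.dummy  # hypothetical attribute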
The ideal solution would probably be to refactor primefac to push the "tricky remainder" jobs onto the same pool you're already using. But that's probably by far the most work, and not that much more benefit.
As a side note, this isn't your problem, but you should have a __main__ guard around your top-level code, like this:
from multiprocessing import Pool
from primefac import factorint

if __name__ == '__main__':
    N = 10**30
    L = range(N, N + 100)
    pool = Pool()
    pool.map(factorint, L)
Otherwise, when run with the spawn or forkserver start methods (and notice that spawn is the only one available on Windows), each pool process is going to try to create another pool of children. So, if you run your code on Windows, you would hit an error much like this same assertion (on current Python it's a RuntimeError telling you to add exactly this guard), as multiprocessing's way of protecting you from accidentally forkbombing your system.
This is explained under "Safe importing of main module" in the "Programming guidelines" section of the multiprocessing docs.