I am trying to create a class than can run a separate process to go do some work that takes a long time, launch a bunch of these from a main module and then wait for them all to
Instead of attempting to send a method itself (which is impractical), try sending a name of a method to execute.
Provided that each worker runs the same code, it's a matter of a simple getattr(self, task_name)
.
I'd pass tuples (task_name, task_args)
, where task_args
were a dict to be directly fed to the task method:
next_task_name, next_task_args = self.task_q.get()
if next_task_name:
task = getattr(self, next_task_name)
answer = task(**next_task_args)
...
else:
# poison pill, shut down
break
So, the problem was that I was assuming that Python was doing some sort of magic that is somehow different from the way that C++/fork() works. I somehow thought that Python only copied the class, not the whole program into a separate process. I seriously wasted days trying to get this to work because all of the talk about pickle serialization made me think that it actually sent everything over the pipe. I knew that certain things could not be sent over the pipe, but I thought my problem was that I was not packaging things up properly.
This all could have been avoided if the Python docs gave me a 10,000 ft view of what happens when this module is used. Sure, it tells me what the methods of multiprocess module does and gives me some basic examples, but what I want to know is what is the "Theory of Operation" behind the scenes! Here is the kind of information I could have used. Please chime in if my answer is off. It will help me learn.
When you run start a process using this module, the whole program is copied into another process. But since it is not the "__main__
" process and my code was checking for that, it doesn't fire off yet another process infinitely. It just stops and sits out there waiting for something to do, like a zombie. Everything that was initialized in the parent at the time of calling multiprocess.Process() is all set up and ready to go. Once you put something in the multiprocess.Queue or shared memory, or pipe, etc. (however you are communicating), then the separate process receives it and gets to work. It can draw upon all imported modules and setup just as if it was the parent. However, once some internal state variables change in the parent or separate process, those changes are isolated. Once the process is spawned, it now becomes your job to keep them in sync if necessary, either through a queue, pipe, shared memory, etc.
I threw out the code and started over, but now I am only putting one extra function out in the ProcessWorker
, an "execute" method that runs a command line. Pretty simple. I don't have to worry about launching and then closing a bunch of processes this way, which has caused me all kinds of instability and performance issues in the past in C++. When I switched to launching processes at the beginning and then passing messages to those waiting processes, my performance improved and it was very stable.
BTW, I looked at this link to get help, which threw me off because the example made me think that methods were being transported across the queues: http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html The second example of the first section used "next_task()" that appeared (to me) to be executing a task received via the queue.
REF: https://stackoverflow.com/a/14179779
Answer on Jan 6 at 6:03 by David Lynch is not factually correct when he says that he was misled by http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html.
The code and examples provided are correct and work as advertised. next_task()
is executing a task received via the queue -- try and understand what the Task.__call__()
method is doing.
In my case what, tripped me up was syntax errors in my implementation of run()
. It seems that the sub-process will not report this and just fails silently -- leaving things stuck in weird loops! Make sure you have some kind of syntax checker running e.g. Flymake/Pyflakes in Emacs.
Debugging via multiprocessing.log_to_stderr()
F helped me narrow down the problem.