Deadlock with big object in multiprocessing.Queue

笑着哭i submitted on 2021-01-28 04:14:44


When you put a large enough object into a multiprocessing.Queue, the program seems to hang at odd places. Consider this minimal example:

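import multiprocessing

def dump_dict(queue, size):
  queue.put({x: x for x in range(size)})
  print("Dump finished")

if __name__ == '__main__':
  SIZE = int(1e5)
  queue = multiprocessing.Queue()
  process = multiprocessing.Process(target=dump_dict, args=(queue, SIZE))
  # Remaining lines reconstructed from the description below: start, join, read.
  process.start()
  process.join()  # hangs here once SIZE is big enough
  print(len(queue.get()))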

If the SIZE parameter is small enough (<= 1e4, at least in my case), the whole program runs without a problem, but once SIZE is big enough, the program hangs. When searching for an explanation, e.g. "python multiprocessing - process hangs on join for large queue", I have only ever seen the general answer "you need to consume from the queue". What seems odd is that the program actually prints Dump finished, i.e. it reaches the line after putting the object into the queue. Furthermore, using Queue.put_nowait instead of Queue.put made no difference.

Finally, if you use Process.join(1) instead of Process.join(), the whole program finishes with the complete dictionary in the queue (i.e. with SIZE = int(1e5), the print(len(..)) line prints 100000).

Can somebody give me a little bit more insight into this?


You need to call queue.get() in the parent before process.join() to prevent the deadlock. The queue spawns a feeder thread on its first queue.put(), and the MainThread in your worker process joins this feeder thread before exiting. So the worker process won't exit until the result has been flushed completely into the (OS pipe) buffer. But your result is too big to fit into that buffer, and your parent doesn't read from the queue until the worker has exited, which results in a deadlock.
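As a minimal sketch of that ordering, the example from the question could be rearranged as below. The helper name collect is hypothetical and only there to keep the example self-contained; the important part is that queue.get() comes before process.join().

```python
import multiprocessing


def dump_dict(queue, size):
    # put() only hands the object over to the queue's feeder thread.
    queue.put({x: x for x in range(size)})
    print("Dump finished")


def collect(size):
    """Hypothetical helper: start the worker, read its result, then join it."""
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=dump_dict, args=(queue, size))
    process.start()
    result = queue.get()  # drain the pipe first so the worker's feeder thread can finish
    process.join()        # the worker is now able to exit, so this returns
    return result


if __name__ == '__main__':
    print(len(collect(int(1e5))))
```

With this order the parent reads while the worker is still alive, so the size of the object no longer matters.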

You see the output of print("Dump finished") because the actual sending happens in the feeder thread; queue.put() itself just appends to a collections.deque within the worker process as an intermediate step.
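The following sketch illustrates that behaviour inside a single process, assuming the object is far larger than the OS pipe buffer. The function name put_then_get is made up for illustration. put() returns even though nothing has read from the pipe yet, and the later get() in the same process is what lets the feeder thread finish writing.

```python
import multiprocessing


def put_then_get(size):
    """Hypothetical demo: put a large dict and read it back in the same process."""
    queue = multiprocessing.Queue()
    big = {x: x for x in range(size)}

    # Returns without waiting for a reader: the object is only buffered
    # and the feeder thread does the pickling and the pipe write.
    queue.put(big)

    # Reaching this line shows that put() did not block. Reading from the
    # pipe here is what allows the feeder thread to complete its write.
    received = queue.get()
    return len(received)


if __name__ == '__main__':
    print(put_then_get(int(1e5)))
```

If the get() call were left out, the feeder thread would stay blocked on the full pipe, and the process would wait for it at exit, which is the same situation the worker in the question ends up in.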

