I need help in understanding multiprocessing.Queue. The problem I'm facing is that getting results from queue.get(...) are hilariously behind compared
I ran into this problem too. I was sending large numpy arrays (~300MB), and mp.Queue.get() was very slow.
After looking into the Python 2.7 source code of mp.Queue, I found that the slowest part (on Unix-like systems) is _conn_recvall() in socket_connection.c, but I did not dig any deeper.
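To get a rough feel for how the transfer time grows with payload size, something like the following can be used. It is only a sketch: time_roundtrip is a name made up for this example, a zero-filled bytes object stands in for the numpy array, and the put and get both happen in one process, so the numbers only approximate a real producer/consumer setup.

```python
import multiprocessing as mp
import time


def time_roundtrip(nbytes):
    """Return (seconds, received_length) for one put/get of nbytes bytes.

    Hypothetical helper for illustration; not part of any library.
    """
    q = mp.Queue()
    payload = bytes(nbytes)           # zero-filled stand-in for a large array
    start = time.perf_counter()
    q.put(payload)                    # a feeder thread pickles it and writes to the pipe
    received = q.get()                # blocks until the whole payload has been read back
    elapsed = time.perf_counter() - start
    return elapsed, len(received)


if __name__ == "__main__":
    for size in (1 << 10, 1 << 20, 1 << 24):
        seconds, n = time_roundtrip(size)
        print(f"{n:>10} bytes: {seconds:.4f} s")
```

The absolute timings depend on the machine and Python version, so no expected output is given here.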
To work around the problem, I built an experimental package, FMQ.
This project is inspired by the use of multiprocessing.Queue (mp.Queue). mp.Queue is slow for large data items because of the speed limitation of the pipe (on Unix-like systems).
With mp.Queue handling the inter-process transfer, FMQ implements a stealer thread, which steals an item from the mp.Queue as soon as one is available and puts it into a Queue.Queue. The consumer process can then fetch the data from the Queue.Queue immediately.
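A minimal sketch of the stealer-thread idea, written for Python 3 (where Queue.Queue is named queue.Queue). This is not FMQ's actual code: start_stealer and the use of None as an end-of-stream marker are assumptions made for illustration.

```python
import multiprocessing as mp
import queue
import threading


def start_stealer(mp_queue):
    """Start a daemon thread that drains mp_queue into a local queue.Queue.

    Hypothetical helper; call it once inside the consumer process.
    """
    local_queue = queue.Queue()

    def steal():
        while True:
            item = mp_queue.get()     # blocks on the pipe read and unpickling
            local_queue.put(item)     # cheap in-process handoff to the consumer
            if item is None:          # assumed end-of-stream marker
                break

    threading.Thread(target=steal, daemon=True).start()
    return local_queue
```

In the consumer process, one would call local = start_stealer(q) once and then use local.get() wherever q.get() was used before. The slow pipe read then overlaps with the consumer's own computation instead of happening only when the consumer asks for the next item.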
The speed-up is based on the assumption that both the producer and consumer processes are compute-intensive (thus multiprocessing is necessary) and the data is large (e.g. >50 227x227 images). Otherwise, mp.Queue with multiprocessing or Queue.Queue with threading is good enough.
fmq.Queue can be used just like an mp.Queue.
Note that there are still some Known Issues, as this project is at an early stage.