Python multiprocessing performance

旧巷少年郎 2021-02-20 04:16

This should be my third and final question regarding my attempts to increase performance on some statistical analysis that I am doing with Python. I have 2 versions of my code (

3 Answers
春和景丽 2021-02-20 05:17

As for the last part of your question, the Python docs essentially say that multiprocessing.Lock is a clone of threading.Lock. Acquire calls on locks can take a long time because, if the lock is already held, the call blocks until the lock is released. This becomes a problem when multiple processes compete for access to the same data, as in your code. Because I can't view your pastebin, I can only guess at exactly what's going on, but most likely your processes are holding the lock for long stretches, which stops the other processes from running even when there is plenty of free CPU time. This shouldn't be affected by the GIL, since the GIL only constrains multithreaded applications, not multiprocess ones.

So, how to fix this? My guess is that you have a lock protecting your shared array that stays held while a process does intensive calculations that take a relatively long time, barring access for the other processes, which then block in their lock.acquire() calls. Assuming you have enough RAM, I strongly endorse the answer that suggests storing a copy of the array in each process's address space. Just note that passing large data structures through map can cause unexpected bottlenecks, since it requires pickling and unpickling.
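    To make the "don't hold the lock while computing" point concrete, here is a minimal sketch. Since I can't see your actual code, the worker function, the chunk boundaries, and the squaring operation are all placeholders for your real statistical calculation; the shared data is assumed to be a flat array of doubles:

    ```python
    import multiprocessing as mp
    import numpy as np

    def worker(shared_arr, lock, start, stop):
        # Hold the lock only long enough to copy the slice out...
        with lock:
            local = np.frombuffer(shared_arr.get_obj())[start:stop].copy()
        # ...do the expensive math on the private copy, lock-free...
        result = local ** 2 + 1.0  # placeholder for the real calculation
        # ...then re-acquire the lock only for the short write-back.
        with lock:
            np.frombuffer(shared_arr.get_obj())[start:stop] = result

    if __name__ == "__main__":
        n = 1_000_000
        shared_arr = mp.Array('d', n)  # shared buffer visible to all processes
        lock = mp.Lock()               # one explicit lock guarding the array
        chunk = n // 4
        procs = [mp.Process(target=worker,
                            args=(shared_arr, lock, i * chunk, (i + 1) * chunk))
                 for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
    ```

    The point is that lock.acquire() then bounds only the cheap copy-in and copy-out steps, so the processes spend their time computing in parallel instead of blocking on each other.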
