python multiprocessing vs threading for cpu bound work on windows and linux

后端未结

关注

 5  1386

So I knocked up some test code to see how the multiprocessing module would scale on cpu bound work compared to threading. On linux I get the performance increase that I\'d e

相关标签:

5条回答

不要未来只要你来

2020-12-04 11:17

Currently, your counter() function is not modifying much state. Try changing counter() so that it modifies many pages of memory. Then run a cpu bound loop. See if there is still a large disparity between linux and windows.

I'm not running python 2.6 right now, so I can't try it myself.

0 讨论(0)
发布评论:

提交评论
- 加载中...

梦如初夏

2020-12-04 11:24

The python documentation for multiprocessing blames the lack of os.fork() for the problems in Windows. It may be applicable here.

See what happens when you import psyco. First, easy_install it:

C:\Users\hughdbrown>\Python26\scripts\easy_install.exe psyco
Searching for psyco
Best match: psyco 1.6
Adding psyco 1.6 to easy-install.pth file

Using c:\python26\lib\site-packages
Processing dependencies for psyco
Finished processing dependencies for psyco

Add this to the top of your python script:

import psyco
psyco.full()

I get these results without:

serialrun took 1191.000 ms
parallelrun took 3738.000 ms
threadedrun took 2728.000 ms

I get these results with:

serialrun took 43.000 ms
parallelrun took 3650.000 ms
threadedrun took 265.000 ms

Parallel is still slow, but the others burn rubber.

Edit: also, try it with the multiprocessing pool. (This is my first time trying this and it is so fast, I figure I must be missing something.)

@print_timing
def parallelpoolrun(reps):
    pool = multiprocessing.Pool(processes=4)
    result = pool.apply_async(counter, (reps,))

Results:

C:\Users\hughdbrown\Documents\python\StackOverflow>python  1289813.py
serialrun took 57.000 ms
parallelrun took 3716.000 ms
parallelpoolrun took 128.000 ms
threadedrun took 58.000 ms

0 讨论(0)

情深已故

2020-12-04 11:27

Processes are much more lightweight under UNIX variants. Windows processes are heavy and take much more time to start up. Threads are the recommended way of doing multiprocessing on windows.

0 讨论(0)
发布评论:

提交评论
- 加载中...
长发绾君心

2020-12-04 11:37

Just starting the pool takes a long time. I have found in 'real world' programs if I can keep a pool open and reuse it for many different processes,passing the reference down through method calls (usually using map.async) then on Linux I can save a few percent but on Windows I can often halve the time taken. Linux is always quicker for my particular problems but even on Windows I get net benefits from multiprocessing.

0 讨论(0)
发布评论:

提交评论
- 加载中...
无人共我

2020-12-04 11:39

It's been said that creating processes on Windows is more expensive than on linux. If you search around the site you will find some information. Here's one I found easily.

0 讨论(0)
发布评论:

提交评论
- 加载中...