Python multiprocessing vs threading for CPU-bound work on Windows and Linux

臣服心动 2020-12-04 10:45

So I knocked up some test code to see how the multiprocessing module would scale on CPU-bound work compared to threading. On Linux I get the performance increase that I'd e

5 answers
  • Currently, your counter() function does not modify much state. Try changing counter() so that it modifies many pages of memory, then run a CPU-bound loop. See if there is still a large disparity between Linux and Windows.

    I'm not running Python 2.6 right now, so I can't try it myself.
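
One way to try this suggestion, as a rough sketch only. The question's counter() is not shown here, so `counter_touching_memory`, `n_pages` and `PAGE_SIZE` are hypothetical names, and 4096 bytes is an assumed page size. The loop stays CPU bound but also writes one byte into every page of a buffer on each pass.

```python
PAGE_SIZE = 4096  # assumed page size; the real value depends on the platform


def counter_touching_memory(reps, n_pages=1024):
    """CPU-bound loop that also writes to every page of a large buffer."""
    buf = bytearray(n_pages * PAGE_SIZE)
    total = 0
    for i in range(reps):
        # Write one byte per page so that each page is modified on every pass.
        for offset in range(0, len(buf), PAGE_SIZE):
            buf[offset] = i & 0xFF
        total += i
    return total, buf


if __name__ == "__main__":
    total, buf = counter_touching_memory(reps=100, n_pages=256)
    print(total, len(buf))
```

Swapping this in for counter() in the serial, threaded and multiprocessing runs would show whether the gap between the two platforms changes once each worker touches a lot of memory.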

  • 2020-12-04 11:24

    The Python documentation for multiprocessing blames the lack of os.fork() for the problems on Windows. It may be applicable here.
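
To see which situation a given machine is in, something like the following may help. It is a sketch that assumes Python 3.4 or later (the question used 2.6, which predates `multiprocessing.get_start_method()`); `describe_platform` is a made-up helper name. On Windows the start method is "spawn", which launches a fresh interpreter that re-imports the main module instead of forking the parent.

```python
import multiprocessing
import os


def describe_platform():
    """Return (has_fork, start_method) for the current interpreter."""
    has_fork = hasattr(os, "fork")
    start_method = multiprocessing.get_start_method()
    return has_fork, start_method


if __name__ == "__main__":
    # The guard matters when there is no fork: a spawned child re-imports
    # this module, and unguarded top-level code would run again in the child.
    has_fork, start_method = describe_platform()
    print("os.fork available:", has_fork)
    print("start method:", start_method)
```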

    See what happens when you import psyco. First, easy_install it:

    C:\Users\hughdbrown>\Python26\scripts\easy_install.exe psyco
    Searching for psyco
    Best match: psyco 1.6
    Adding psyco 1.6 to easy-install.pth file
    
    Using c:\python26\lib\site-packages
    Processing dependencies for psyco
    Finished processing dependencies for psyco
    

    Add this to the top of your python script:

    import psyco
    psyco.full()
    

    I get these results without psyco:

    serialrun took 1191.000 ms
    parallelrun took 3738.000 ms
    threadedrun took 2728.000 ms
    

    I get these results with psyco:

    serialrun took 43.000 ms
    parallelrun took 3650.000 ms
    threadedrun took 265.000 ms
    

    Parallel is still slow, but the others burn rubber.

    Edit: also, try it with the multiprocessing pool. (This is my first time trying this and it is so fast, I figure I must be missing something.)

    @print_timing
    def parallelpoolrun(reps):
        pool = multiprocessing.Pool(processes=4)
        result = pool.apply_async(counter, (reps,))
    

    Results:

    C:\Users\hughdbrown\Documents\python\StackOverflow>python  1289813.py
    serialrun took 57.000 ms
    parallelrun took 3716.000 ms
    parallelpoolrun took 128.000 ms
    threadedrun took 58.000 ms
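
One possible explanation for the very low parallelpoolrun figure: `apply_async()` returns an `AsyncResult` immediately, and the function above never waits on it, so the timer may only be covering pool creation and task submission. It also submits a single task, so only one of the four workers would be busy. The sketch below waits for the results with `get()` and submits one task per worker. The `counter` body is a stand-in, since the original is not shown, and the timing code replaces the `@print_timing` decorator, which is also not shown.

```python
import multiprocessing
import time


def counter(reps):
    """Stand-in for the question's counter(); the real body is not shown."""
    total = 0
    for i in range(reps):
        total += i
    return total


def parallelpoolrun(reps, processes=4):
    """Run counter() once per worker and wait for every result."""
    with multiprocessing.Pool(processes=processes) as pool:
        pending = [pool.apply_async(counter, (reps,)) for _ in range(processes)]
        # get() blocks until the worker has finished; without it the
        # timing would only cover task submission.
        return [result.get() for result in pending]


if __name__ == "__main__":
    start = time.perf_counter()
    results = parallelpoolrun(1000000)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print("parallelpoolrun took %.3f ms" % elapsed_ms)
```

Timed this way, the figure includes the work itself, so it is a fairer comparison against serialrun and threadedrun.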
    
  • 2020-12-04 11:27

    Processes are much more lightweight under UNIX variants. Windows processes are heavy and take much more time to start up. Threads are the recommended way of doing multiprocessing on Windows.
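
The start-up cost can be measured directly rather than taken on trust. This is a minimal sketch, assuming Python 3; `noop` and `time_start_join` are hypothetical helper names. It starts and joins a worker that does nothing, so the elapsed time is mostly creation and teardown overhead.

```python
import multiprocessing
import threading
import time


def noop():
    """Do nothing, so the timing is dominated by start-up cost."""


def time_start_join(worker_cls, n=10):
    """Return the average seconds needed to start and join one worker."""
    start = time.perf_counter()
    for _ in range(n):
        worker = worker_cls(target=noop)
        worker.start()
        worker.join()
    return (time.perf_counter() - start) / n


if __name__ == "__main__":
    for cls in (threading.Thread, multiprocessing.Process):
        print("%s: %.3f ms" % (cls.__name__, time_start_join(cls) * 1000))
```

Running the same script on both operating systems gives a per-worker figure that can be compared with the total run times reported in the question.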

  • 2020-12-04 11:37

    Just starting the pool takes a long time. I have found in 'real world' programs that if I can keep a pool open and reuse it for many different jobs, passing the reference down through method calls (usually using map_async), then on Linux I can save a few percent, but on Windows I can often halve the time taken. Linux is always quicker for my particular problems, but even on Windows I get net benefits from multiprocessing.
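
The reuse pattern described here might look roughly like this. It is a sketch, not the answerer's actual code: `square` and `run_batches` are hypothetical names, and the batches are toy data. The point is that the pool is created once and handed to whatever needs it.

```python
import multiprocessing


def square(x):
    """Hypothetical stand-in for a real unit of work."""
    return x * x


def run_batches(pool, batches):
    """Submit every batch to the same pool and collect the results."""
    pending = [pool.map_async(square, batch) for batch in batches]
    return [result.get() for result in pending]


if __name__ == "__main__":
    # The pool is created once and passed down, so the start-up cost
    # is paid a single time rather than once per batch.
    with multiprocessing.Pool(processes=4) as pool:
        print(run_batches(pool, [[1, 2, 3], [4, 5, 6]]))
```

Because `run_batches` takes the pool as an argument, callers further down the stack never construct their own pool, which is where the saving on Windows would come from.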

  • 2020-12-04 11:39

    It's been said that creating processes on Windows is more expensive than on Linux. If you search around the site you will find some information. Here's one I found easily.
