Parallelism in Python isn't working right

前端未结

I was developing an app on GAE using Python 2.7. An AJAX call requests some data from an API, and a single request could take ~200 ms. However, when I open two browsers and make

3 Answers

猫巷女王i

2021-01-02 18:38

David Beazley gave a talk about this issue at PyCon 2010. As others have already stated, for some tasks, using threads, especially on a multi-core machine, can be slower than running the same task in a single thread. The problem, Beazley found, was that threads running on multiple cores end up in a "GIL battle", contending with each other for the interpreter lock.
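To see the effect for yourself, a minimal sketch along the following lines may help. It times the same CPU-bound loop run twice in sequence and then in two threads. The function names (`count`, `run_sequential`, `run_threaded`) and the loop size are illustrative choices, not taken from Beazley's talk, and the timings you get will depend on your machine and Python version.

```python
import threading
from timeit import default_timer as timer


def count(n):
    """Pure-Python CPU-bound loop; the running thread holds the GIL."""
    i = 0
    while i < n:
        i += 1
    return i


def run_sequential(n, repeats=2):
    """Run the loop `repeats` times, one after another, in the main thread."""
    start = timer()
    for _ in range(repeats):
        count(n)
    return timer() - start


def run_threaded(n, workers=2):
    """Run the loop once in each of `workers` threads, started together."""
    threads = [threading.Thread(target=count, args=(n,)) for _ in range(workers)]
    start = timer()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timer() - start


if __name__ == "__main__":
    n = 5000000  # arbitrary size, chosen only so the loop takes measurable time
    print("sequential: %.2fs" % run_sequential(n))
    print("threaded:   %.2fs" % run_threaded(n))
```

On a standard CPython build with the GIL, the threaded run should not be expected to come out meaningfully faster than the sequential one, and it may be slower.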

To avoid GIL contention, you may get better results by running the tasks in separate processes instead of separate threads. The multiprocessing module provides a convenient way to do that, especially since the multiprocessing API is very similar to the threading API.

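from __future__ import print_function  # same print calls work on Python 2.7 and 3

import datetime as dt
import multiprocessing as mp


def work():
    """CPU-bound busy loop; prints the process name with start and end times."""
    start = dt.datetime.now()
    print(mp.current_process().name, start)
    i = 0
    while i < 100000000:
        i += 1
    end = dt.datetime.now()
    print(mp.current_process().name, end, end - start)


if __name__ == '__main__':
    # Baseline: a single worker process on its own.
    print("single process:")
    p1 = mp.Process(target=work)
    p1.start()
    p1.join()

    # Two worker processes started back to back; each has its own interpreter and GIL.
    print("multi process:")
    p1 = mp.Process(target=work)
    p1.start()
    p2 = mp.Process(target=work)
    p2.start()
    p1.join()
    p2.join()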

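yields output like the following. Each of the two concurrent processes finishes in about 8 seconds, the same as the single process, so the two ran in parallel rather than taking turns: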

single process:
Process-1 2011-12-06 12:34:20.611526
Process-1 2011-12-06 12:34:28.494831 0:00:07.883305
multi process:
Process-3 2011-12-06 12:34:28.497895
Process-2 2011-12-06 12:34:28.503433
Process-2 2011-12-06 12:34:36.458354 0:00:07.954921
Process-3 2011-12-06 12:34:36.546656 0:00:08.048761

PS. As zeekay pointed out in the comments, the GIL battle is only severe for CPU-bound tasks. It should not be a problem for IO-bound tasks.
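The IO-bound case can be sketched as follows. Here `time.sleep` stands in for a blocking network call (an assumption made for illustration; a real API request would replace it), and `fake_io` and `run_io_threads` are hypothetical names. Blocking calls like this release the GIL while they wait, so the threads can overlap.

```python
import threading
import time
from timeit import default_timer as timer


def fake_io(delay):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)


def run_io_threads(workers, delay):
    """Start `workers` threads that each wait `delay` seconds; return total elapsed time."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(workers)]
    start = timer()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timer() - start


if __name__ == "__main__":
    # Five simulated 200 ms requests issued from five threads.
    print("elapsed: %.2fs" % run_io_threads(5, 0.2))
```

If the waits overlap as expected, the elapsed time should be close to one delay rather than the sum of all five.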

自闭症患者

2021-01-02 18:44

The CPython interpreter will not allow more than one thread to execute Python bytecode at a time. Read about the GIL: http://wiki.python.org/moin/GlobalInterpreterLock

So certain tasks cannot be done concurrently in an efficient way in CPython with threads.

If you want to do things in parallel in GAE, then start them in parallel with separate requests.

Also, you may want to consult the Python parallel processing wiki: http://wiki.python.org/moin/ParallelProcessing

星月不相逢

2021-01-02 18:50

I would look at where the time is going. Suppose, for example, the server can only answer one query every 200 ms. Then there's nothing you can do on the client side: you'll only get one reply every 200 ms, because that's all the server can provide.
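As a rough illustration of that scenario, the sketch below models a backend that answers one query at a time and measures how long several concurrent clients take in total. `SerialServer` and `time_concurrent_queries` are hypothetical names invented for this example. The lock plus `time.sleep` only simulates a serialized server and says nothing about how the real API behaves.

```python
import threading
import time
from timeit import default_timer as timer


class SerialServer(object):
    """Hypothetical backend that can only answer one query at a time."""

    def __init__(self, service_time):
        self.service_time = service_time
        self._lock = threading.Lock()

    def query(self):
        # Only one caller may be served at once; the others wait their turn.
        with self._lock:
            time.sleep(self.service_time)


def time_concurrent_queries(server, clients):
    """Issue one query per client thread at the same time; return total elapsed time."""
    threads = [threading.Thread(target=server.query) for _ in range(clients)]
    start = timer()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timer() - start


if __name__ == "__main__":
    server = SerialServer(0.2)  # 200 ms per query, as in the example above
    print("2 clients: %.2fs" % time_concurrent_queries(server, 2))
```

With a backend like this, adding client-side parallelism cannot reduce the total time below the number of queries multiplied by the service time, so timing the calls is a way to tell whether the bottleneck is on the server.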
