Question
Is there a way to profile a Python process's usage of the GIL? Basically, I want to find out what percentage of the time the GIL is held. The process is single-threaded.
My motivation is that I have some code written in Cython which uses nogil. Ideally, I would like to run it in a multi-threaded process, but in order to know whether that could be a good idea, I need to know if the GIL is free a significant amount of the time.
I found this related question, from 8 years ago. The sole answer there is "No". Hopefully, things have changed since then.
Answer 1:
Completely by accident, I found a tool which does just this: gil_load.
It was actually published after I posted the question.
Well done, @chrisjbillington.
>>> import sys, math
>>> import gil_load
>>> gil_load.init()
>>> gil_load.start(output = sys.stdout)
>>> for x in range(1, 1000000000):
...     y = math.log(x**math.pi)
[2017-03-15 08:52:26] GIL load: 0.98 (0.98, 0.98, 0.98)
[2017-03-15 08:52:32] GIL load: 0.99 (0.99, 0.99, 0.99)
[2017-03-15 08:52:37] GIL load: 0.99 (0.99, 0.99, 0.99)
[2017-03-15 08:52:43] GIL load: 0.99 (0.99, 0.99, 0.99)
[2017-03-15 08:52:48] GIL load: 1.00 (1.00, 1.00, 1.00)
[2017-03-15 08:52:52] GIL load: 1.00 (1.00, 1.00, 1.00)
<...>
>>> import sys, math
>>> import gil_load
>>> gil_load.init()
>>> gil_load.start(output = sys.stdout)
>>> for x in range(1, 1000000000):
...     with open('/dev/null', 'a') as f:
...         print(math.log(x**math.pi), file=f)
[2017-03-15 08:53:59] GIL load: 0.76 (0.76, 0.76, 0.76)
[2017-03-15 08:54:03] GIL load: 0.77 (0.77, 0.77, 0.77)
[2017-03-15 08:54:09] GIL load: 0.78 (0.78, 0.78, 0.78)
[2017-03-15 08:54:13] GIL load: 0.80 (0.80, 0.80, 0.80)
[2017-03-15 08:54:19] GIL load: 0.81 (0.81, 0.81, 0.81)
[2017-03-15 08:54:23] GIL load: 0.81 (0.81, 0.81, 0.81)
[2017-03-15 08:54:28] GIL load: 0.81 (0.81, 0.81, 0.81)
[2017-03-15 08:54:33] GIL load: 0.80 (0.80, 0.80, 0.80)
<...>
Answer 2:
If you are wondering how many times the GIL is taken, you can use gdb breakpoints. For example:
> cat gil_count_example.py
import sys
from threading import Thread

def worker():
    k = 0
    for j in range(10000000):
        k += j
    return

num_threads = int(sys.argv[1])
threads = []
for i in range(num_threads):
    t = Thread(target=worker)
    t.start()
    threads.append(t)
for t in threads:
    t.join()
For Python 3.x, break on take_gil:
> cgdb --args python3 gil_count_example.py 8
(gdb) b take_gil
(gdb) ignore 1 100000000
(gdb) r
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00007ffff7c85f10 in take_gil at Python-3.4.3/Python/ceval_gil.h:208
        breakpoint already hit 1886 times
For Python 2.x, break on PyThread_acquire_lock:
> cgdb --args python2 gil_count_example.py 8
(gdb) b PyThread_acquire_lock
(gdb) ignore 1 100000000
(gdb) r
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00000039bacfd410
        breakpoint already hit 1584561 times
An efficient poor man's profiler can also be used to profile the wall time spent in functions. I use https://github.com/knielsen/knielsen-pmp:
> ./get_stacktrace --max=100 --freq=10 `/sbin/pidof python2`
...
292 71.92% sem_wait:PyThread_acquire_lock

> ./get_stacktrace --max=100 --freq=10 `/sbin/pidof python3`
...
557 77.68% pthread_cond_timedwait:take_gil
Answer 3:
I don't know of such a tool.
But there are some heuristics that can help you guess whether going multithreaded would help. As you probably know, the GIL is released during IO operations and during some calls into native code, especially in third-party native modules. If you don't have much code like that, then multithreading is not likely to help you.
If you do have IO/native code, then you'd probably have to just try it out. Depending on the code base converting the whole thing to take advantage of multiple threads might be a lot of work, so you might want to instead try to apply multithreading to parts where you know IO/native code is getting called, and measuring to see if you get any improvements.
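One way to do that measuring is a small before/after benchmark. Below is a minimal sketch (not from the answer; io_task and the timings are illustrative), using time.sleep as a stand-in for any blocking call that releases the GIL:

```python
import time
from threading import Thread

def io_task():
    # time.sleep releases the GIL while waiting, like most blocking IO calls
    time.sleep(0.2)

# Serial baseline: four waits back to back, roughly 0.8 s total
start = time.perf_counter()
for _ in range(4):
    io_task()
serial = time.perf_counter() - start

# Threaded: the four waits overlap because the GIL is released during them
start = time.perf_counter()
threads = [Thread(target=io_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"serial: {serial:.2f}s, threaded: {threaded:.2f}s")
```

If you replace io_task with the part of your real code you suspect releases the GIL and the threaded time barely improves, that is a strong hint the GIL is held most of the time there.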
Depending on your use case, multiprocessing could work for cases that are primarily CPU bound. Multiprocessing does add overhead, so it is typically a good approach for CPU-bound tasks that last a relatively long time (several seconds or longer).
Source: https://stackoverflow.com/questions/42378491/profiling-the-gil