Why does gprof significantly underestimate the program's running time?


I would recommend dropping gprof and switching to oprofile. Any profiler that inserts instrumentation into your program inherently affects its performance in ways that can skew or invalidate the results. With oprofile you don't have to build your program with profiling support or link against special profiling-enabled libraries; it works statistically, sampling the instruction pointer (with kernel assistance) and using the sample counts to estimate how much time was spent in each function.
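If you go that route, the workflow is roughly the following (a minimal sketch, assuming a reasonably recent oprofile with the operf front end; myprog is a placeholder for your own binary):

    operf ./myprog          # samples the run statistically; no special build or recompilation needed
    opreport --symbols      # summarizes the collected samples per function

Because the binary is not instrumented, the measured run stays much closer to the program's normal behavior than a gprof build does.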

Mike Dunlavey

First, to answer your question: gprof only counts time your program spends actually running on the CPU. Time spent blocked is not counted at all, so if anything else is running on the machine "at the same time," or if your program does any I/O, that time simply disappears from gprof's totals, and you will see a discrepancy against the wall-clock time.

gprof is really only useful for a very restricted class of programs; it has a long list of issues, and several of them are on display here.

First of all, profiling a program that completes in 2.3 seconds is a little bit ludicrous. You really need a long-running program to get a good measurement of program hot spots, etc. But I digress...

To answer your first question, time (the command-line utility) times the execution of the whole process, including the profiling instrumentation itself. When you build with profiling enabled (gcc -pg), the program writes a gmon.out file containing the execution-time statistics every time you run it; the hard work of profiling is therefore done while your program runs. The instrumentation tries hard to separate out its own impact on the time accounting, and in this case it seems that the profiling itself accounted for 2.34 - 1.18 = 1.16 s of the runtime reported by time. The gprof program essentially just analyzes and reformats the statistics stored in the gmon.out file. To be clear: the real profiling happens during your program's run, not during the gprof run.
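A minimal sketch of that workflow, for reference (the program name and optimization flags are placeholders; adapt them to your build):

    gcc -pg -O2 -o myprog myprog.c        # build with gprof instrumentation
    time ./myprog                          # the instrumented run does the profiling work and writes gmon.out
    gprof ./myprog gmon.out > report.txt   # gprof only post-processes the statistics in gmon.out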

Finally, the gprof output answers your second question directly. It samples your program's execution every 1/100 of a second and credits that whole 0.01 s to whichever function happened to be executing when the sample was taken. Since your program's runtime is not an exact multiple of 0.01 s and each function is credited a whole number of samples, the figures need not add up to exactly 100%; on a run totaling only about 1.18 s, a single sample is already nearly 1%, so a discrepancy of a sample or two is enough to push the sum past 100%. Again, it must be emphasized that profiling a program that runs this quickly is quite error prone, and the effect would shrink on a longer run, which collects more samples. Also, gprof's accounting for its own overhead is imperfect, which might further contribute to the seemingly ludicrous 101.33% number.

It's a statistical process, and it's not perfect; take the results with a grain of salt.

Good luck!
