Why does gprof significantly underestimate the program's running time?


I would recommend dropping gprof and switching to oprofile. Any profiler that inserts instrumentation into your program inherently affects its performance in ways that can skew or invalidate the results. With oprofile you don't have to build your program with profiling support or link against special profiling-enabled libraries; it works statistically, sampling the instruction pointer (with kernel assistance) and using the sample counts to estimate how much time was spent in each function.
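If you go that route, the workflow is roughly the following (a minimal sketch, assuming a reasonably recent oprofile with the operf front end; myprog is a placeholder for your own binary):

    operf ./myprog          # samples the run statistically; no special build or recompilation needed
    opreport --symbols      # summarizes the collected samples per function

Because the binary is not instrumented, the measured run stays much closer to the program's normal behavior than a gprof build does.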

Mike Dunlavey

First, to answer your question: gprof only counts time your program spends actually running on the CPU. Time spent blocked is not counted at all, so if anything else is running on the machine "at the same time," or if your program does any I/O, that time simply disappears from gprof's totals, and you will see a discrepancy against the wall-clock time.

gprof is really only useful for a very restricted class of programs; it has a long list of issues, and several of them are on display here.

First of all, profiling a program that completes in 2.3 seconds is a little bit ludicrous. You really need a long-running program to get a good measurement of program hot spots, etc. But I digress...

To answer your first question, time (the command-line utility) times the execution of the whole process, including the profiling instrumentation itself. When you build with profiling enabled (gcc -pg), the program writes a gmon.out file containing the execution-time statistics every time you run it; the hard work of profiling is therefore done while your program runs. The instrumentation tries hard to separate out its own impact on the time accounting, and in this case it seems that the profiling itself accounted for 2.34 - 1.18 = 1.16 s of the runtime reported by time. The gprof program essentially just analyzes and reformats the statistics stored in the gmon.out file. To be clear: the real profiling happens during your program's run, not during the gprof run.
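A minimal sketch of that workflow, for reference (the program name and optimization flags are placeholders; adapt them to your build):

    gcc -pg -O2 -o myprog myprog.c        # build with gprof instrumentation
    time ./myprog                          # the instrumented run does the profiling work and writes gmon.out
    gprof ./myprog gmon.out > report.txt   # gprof only post-processes the statistics in gmon.out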

Finally, the gprof output answers your second question directly. It samples your program's execution every 1/100 of a second and credits that whole 0.01 s to whichever function happened to be executing when the sample was taken. Since your program's runtime is not an exact multiple of 0.01 s and each function is credited a whole number of samples, the figures need not add up to exactly 100%; on a run totaling only about 1.18 s, a single sample is already nearly 1%, so a discrepancy of a sample or two is enough to push the sum past 100%. Again, it must be emphasized that profiling a program that runs this quickly is quite error prone, and the effect would shrink on a longer run, which collects more samples. Also, gprof's accounting for its own overhead is imperfect, which might further contribute to the seemingly ludicrous 101.33% number.

It's a statistical process, and it's not perfect; take the results with a grain of salt.

Good luck!
