In the process of using gprof to profile a C++ program I've written, I've noticed that the vast majority of execution time is spent in the function "frame_dummy". More preci
There's a very good explanation here: http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html . But I'm not sure why your program would spend so much time in frame_dummy, or why it would get called so many times.
Perhaps the debug info in your binary is corrupt in some way, or is getting misread by gprof? Or gprof might be getting confused by MPI? Here's something to try: run your program in gdb and set a breakpoint on the frame_dummy function. See whether it really gets called 24 million times and, if it does, what it's getting called from.
Also, can you confirm that this is the frame_dummy in crtbegin.o, and not some other frame_dummy?
Here's the source for frame_dummy in crtbegin.c -- by my reading of the code, it should only get called once.
Also, I'm assuming that your program runs and produces the correct result? (In particular, if there's a memory bug in your program, then you can get some pretty odd behavior.)
I encountered the same issue. Here is my output from gprof:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 52.00     16.27    16.27   204000     0.08     0.08  frame_dummy
 47.46     31.12    14.85   418000     0.04     0.07  f2
  0.51     31.28     0.16    21800     0.01     1.42  f1
  0.03     31.29     0.01     1980     0.01    14.21  f5
In my case, it got resolved when I compiled with gcc -Os instead of gcc -O3:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 53.12     22.24    22.24   200000     0.11     0.11  f4
 45.65     41.36    19.11   598000     0.03     0.03  f2
  0.69     41.65     0.29    20000     0.01     1.45  f3
  0.45     41.84     0.19    39800     0.00     0.32  f1
  0.10     41.88     0.04                             evaluate
That is, gprof mistook f4 for frame_dummy.
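If you want to confirm this kind of misattribution without relying on gprof's symbol lookup, one option is to count calls and accumulate time by hand in the suspect function and compare the result with the "calls" column. The following is only a minimal sketch: CallStats, stats_registry, ScopedProbe, calls_of and print_stats are hypothetical helper names, the body of f4 is a made-up stand-in (the real f4 isn't shown in this thread), and the registry is not thread-safe.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

// Accumulated call count and wall-clock time for one instrumented function.
struct CallStats {
    std::uint64_t calls = 0;
    std::chrono::steady_clock::duration total{};
};

// Registry keyed by function name (hypothetical helper; single-threaded only).
std::map<std::string, CallStats>& stats_registry() {
    static std::map<std::string, CallStats> registry;
    return registry;
}

// RAII probe: counts one call when constructed, adds elapsed time when destroyed.
class ScopedProbe {
public:
    explicit ScopedProbe(const char* name)
        : stats_(stats_registry()[name]),
          start_(std::chrono::steady_clock::now()) {
        ++stats_.calls;
    }
    ~ScopedProbe() {
        stats_.total += std::chrono::steady_clock::now() - start_;
    }
    ScopedProbe(const ScopedProbe&) = delete;
    ScopedProbe& operator=(const ScopedProbe&) = delete;

private:
    CallStats& stats_;  // std::map references stay valid across later inserts
    std::chrono::steady_clock::time_point start_;
};

// Stand-in for the f4 from the profile above; the body is an assumption.
double f4(double x) {
    ScopedProbe probe("f4");
    double acc = 0.0;
    for (int i = 1; i <= 100; ++i) acc += x / i;
    return acc;
}

// Calls recorded so far for `name` (0 if the function was never entered).
std::uint64_t calls_of(const std::string& name) {
    auto it = stats_registry().find(name);
    return it == stats_registry().end() ? 0 : it->second.calls;
}

// Print one line per instrumented function, for comparison with gprof's table.
void print_stats() {
    for (const auto& entry : stats_registry()) {
        const double ms =
            std::chrono::duration<double, std::milli>(entry.second.total).count();
        std::printf("%-12s calls=%llu total=%.3f ms\n", entry.first.c_str(),
                    static_cast<unsigned long long>(entry.second.calls), ms);
    }
}
```

Call print_stats() at the end of the run. If the count for f4 matches the number gprof reports for frame_dummy (204000 in the -O3 profile above), that would support the idea that the samples were simply attributed to the wrong symbol. Keep in mind that the probe itself adds overhead and may change what the optimizer inlines, so treat the timings as rough.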