I have a model code on which kcachegrind/callgrind reports strange results. It is kind of dispatcher function. The dispatcher is called from 4 places; each call says, which
Yes, this is limitation of callgrind format. It doesn't store full trace; it only stores parent-child calls information.
There is a google-perftools project with pprof/libprofiler.so CPU profiler, http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html . libprofiler.so
can get profile with calltraces and it will store every trace event with full backtrace. pprof
is converter of libprofile's output to graphic formats or to callgrind format. In full view the result will be the same as in kcachegrind; but if you will focus on some function, e.g. do_1 using pprof's option focus; it will show accurate calltree when focused on function.