How to profile my C++ application on Linux

北海茫月 2021-01-04 04:27

I would like to profile my C++ application on Linux. Specifically, I want to find out how much time it spends on CPU processing versus how much time it spends blocked on I/O or sitting idle.

9 answers
  • 2021-01-04 05:01

    If your app simply runs "flat out" (i.e. it's either using CPU or waiting for I/O) until it exits, and there are no other processes competing with it, just run time myapp (or /usr/bin/time myapp, whose output differs slightly from that of the shell builtin).

    This will get you something like:

    real    0m1.412s
    user    0m1.288s
    sys     0m0.056s
    

    In this case, user+sys (kernel) time accounts for almost all of the real time, and there's just 0.068s unaccounted for (probably time spent initially loading the app and its supporting libs).

    However, if you were to see:

    real    0m5.732s
    user    0m1.144s
    sys     0m0.078s
    

    then your app spent 4.51s (real minus user minus sys) not consuming CPU and was presumably blocked on I/O, which is the information I think you're looking for.

    However, where this simple analysis technique breaks down is:

    • Apps that wait on a timer/clock or other external stimulus (e.g. event-driven GUI apps). This technique can't distinguish time spent waiting on the clock from time spent waiting on disk/network.
    • Multithreaded apps, where user and sys times are summed across all threads (and can therefore exceed real time), so the numbers need more careful interpretation.
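The same real/user/sys arithmetic can also be done from inside the program, around just the region of interest. The following is a minimal sketch, assuming a POSIX system with getrusage; the names TimeSplit, measure and demo_blocked_wait are hypothetical helpers introduced for illustration, and a 200 ms sleep stands in for a blocking I/O call. It carries the same caveats as time: RUSAGE_SELF sums CPU time over all threads, and the "off-CPU" remainder cannot tell a timer wait from a disk or network wait.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <sys/resource.h>  // getrusage (POSIX)
#include <sys/time.h>      // timeval

// How a measured interval divides up, mirroring the output of `time`.
struct TimeSplit {
    double real_s;  // wall-clock seconds
    double user_s;  // CPU seconds spent in user mode
    double sys_s;   // CPU seconds spent in the kernel
    // Time not accounted for by CPU use: blocked on I/O, sleeping, or descheduled.
    double off_cpu_s() const { return real_s - (user_s + sys_s); }
};

static double to_seconds(const timeval& tv) {
    return static_cast<double>(tv.tv_sec) + tv.tv_usec / 1e6;
}

// Hypothetical helper: run `work` and report real, user and sys time for it.
// CPU times are process-wide (all threads), as with `time`.
template <typename Work>
TimeSplit measure(Work&& work) {
    rusage before{}, after{};
    getrusage(RUSAGE_SELF, &before);
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    getrusage(RUSAGE_SELF, &after);

    TimeSplit s{};
    s.real_s = std::chrono::duration<double>(t1 - t0).count();
    s.user_s = to_seconds(after.ru_utime) - to_seconds(before.ru_utime);
    s.sys_s  = to_seconds(after.ru_stime) - to_seconds(before.ru_stime);
    assert(s.real_s >= 0.0);  // steady_clock never goes backwards
    return s;
}

// Illustration only: a sleep stands in for a blocking read or network wait.
TimeSplit demo_blocked_wait() {
    return measure([] {
        std::this_thread::sleep_for(std::chrono::milliseconds(200));
    });
}
```

Calling demo_blocked_wait() should report a real time of roughly 0.2s with user and sys close to zero, so nearly all of the interval shows up in off_cpu_s(). Wrapping a CPU-bound loop in measure should give the opposite picture.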
  • 2021-01-04 05:03

    I can recommend valgrind's callgrind tool in conjunction with KCacheGrind for visualization. KCacheGrind makes it pretty easy to see where the hotspots are.

    Note: It has been a long time since I used it, so I'm not sure whether you can get I/O wait time out of it. Perhaps by combining it with iostat or pidstat you'll be able to see where all the time was spent.

  • 2021-01-04 05:06

    See this post.

    And this post.

    Basically, between the time the program starts and when it finishes, it has a call stack. During I/O, the stack terminates in a system call. During computation, it terminates in a typical instruction.

    Either way, if you can sample the stack at random wall-clock times, you can see exactly why it's spending that time.

    The only remaining point is that thousands of samples might give a sense of confidence, but they won't tell you much more than 10 or 20 samples will.
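One way to take such wall-clock stack samples from inside the process is sketched below. It is an illustration under stated assumptions, not the method the linked posts describe: it assumes Linux with glibc (backtrace and backtrace_symbols_fd from execinfo.h), uses a fixed-interval ITIMER_REAL timer rather than truly random times, and the names start_stack_sampler, stop_stack_sampler and samples_taken are hypothetical. backtrace is not formally async-signal-safe, so this is for experimentation only. In a multithreaded process the signal lands on an arbitrary thread.

```cpp
#include <cassert>
#include <execinfo.h>  // backtrace, backtrace_symbols_fd (glibc-specific)
#include <signal.h>    // sigaction, SIGALRM
#include <sys/time.h>  // setitimer, ITIMER_REAL
#include <unistd.h>    // write, STDERR_FILENO

namespace {

volatile sig_atomic_t g_samples = 0;  // number of stack samples taken so far

// Signal handler: dump the call stack of whatever was interrupted.
// During I/O the top frames are a system call wrapper; during
// computation they are ordinary application code.
void on_sample(int) {
    void* frames[64];
    const int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    const char sep[] = "----\n";
    const ssize_t ignored = write(STDERR_FILENO, sep, sizeof sep - 1);
    (void)ignored;
    g_samples = g_samples + 1;
}

}  // namespace

// Hypothetical helper: sample the stack every `interval_ms` of wall-clock time.
bool start_stack_sampler(int interval_ms) {
    assert(interval_ms > 0);

    // Call backtrace once up front so glibc loads its unwinder here
    // rather than inside the signal handler.
    void* warmup[1];
    backtrace(warmup, 1);

    struct sigaction sa{};
    sa.sa_handler = on_sample;
    sa.sa_flags = SA_RESTART;  // restart interrupted system calls where possible
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, nullptr) != 0) return false;

    // ITIMER_REAL counts wall-clock time, so samples also arrive while blocked.
    itimerval tv{};
    tv.it_interval.tv_sec = interval_ms / 1000;
    tv.it_interval.tv_usec = (interval_ms % 1000) * 1000;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_REAL, &tv, nullptr) == 0;
}

// Disarm the timer; the handler stays installed but no longer fires.
void stop_stack_sampler() {
    itimerval off{};
    setitimer(ITIMER_REAL, &off, nullptr);
}

int samples_taken() { return g_samples; }
```

Calling start_stack_sampler(500) early in main and reading the 10 or 20 stacks that appear on stderr is enough to see whether most samples end in a system call or in application code. Building with -g -rdynamic makes the printed symbol names more useful.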

  • 2021-01-04 05:12

    You might want to check out Zoom, which is a lot more polished and full-featured than oprofile et al. It costs money ($199), but you can get a free 30-day evaluation licence.

  • 2021-01-04 05:13

    The lackey and/or helgrind tools in valgrind should allow you to do this.

  • 2021-01-04 05:17

    Check out oprofile. Also, for more system-level diagnostics, try systemtap.
