How to measure?

盖世英雄少女心 2020-12-07 05:13

When I do performance tuning, I first work at a high level and try to answer one question: is this CPU-bound or I/O-bound?
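One rough way to answer that first question is to compare wall-clock time with CPU time for the same call. The sketch below is only an illustration: `cpu_fraction` and `busy` are hypothetical names, and the idea that a ratio near 1 suggests CPU-bound work while a ratio near 0 suggests waiting (I/O, sleep, locks) is a rule of thumb, not a precise test.

```python
import time


def cpu_fraction(fn, *args, **kwargs):
    """Return (wall_seconds, cpu_seconds, cpu/wall ratio) for one call of fn."""
    wall_start = time.perf_counter()   # wall-clock time
    cpu_start = time.process_time()    # CPU time used by this process
    fn(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


def busy():
    # Hypothetical CPU-heavy work: pure arithmetic, no waiting.
    return sum(i * i for i in range(300_000))


if __name__ == "__main__":
    for label, call in (("busy loop", (busy,)), ("sleep", (time.sleep, 0.2))):
        wall, cpu, ratio = cpu_fraction(*call)
        print(f"{label}: wall={wall:.3f}s cpu={cpu:.3f}s ratio={ratio:.2f}")
```

Note that `time.process_time` counts CPU time across all threads of the process, so in a multithreaded program the ratio can exceed 1.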

Once I am sure it is CPU-bound, then I

2 answers
  • 2020-12-07 06:03

    Are you open to a different way of thinking about performance tuning?
    It does not look at I/O-bound vs. CPU-bound, hotspots, or timers.

    First, think about just one thread. The execution of a thread is much like a tree. There is a main function (the trunk). There are points when subroutines are called (branches). There are terminal instructions (leaves) and blocking calls like I/O (fruit). The total time the program takes is the sum of all the leaves and all the fruit.

    What you want to do is prune the tree, making it as light as possible, without killing it.

    What many people do is weigh (time) the whole thing, and then weigh parts of it, and so on, and hope to find hotspots (leafy branches) that maybe they could trim.

    Another way is: 1) Select some leaves or fruit at random. 2) From each leaf or fruit, paint a line along the branch it is on, all the way back to the trunk. 3) Take note of branches that have more than one line painted on them. 4) Ask "Do I need this branch?" If you can prune it, do so. You will eliminate the entire weight of the branch, and you did it without weighing it. Then start over.

    That's the idea behind random-pausing. There are certain kinds of problems it will not find, but most of them it will find, quickly, including any that timing threads can find.
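The four steps above can be sketched in Python. This is a minimal illustration, not the answerer's own tooling: in practice random pausing is usually done by hand with a debugger. Here `workload`, `slow_helper`, `random_pause_samples`, and `branches_on_multiple_samples` are hypothetical names, and the stack snapshots rely on `sys._current_frames()`, which is a CPython-specific helper.

```python
import random
import sys
import threading
import time
import traceback
from collections import Counter


def slow_helper():
    # Hypothetical wasteful branch: sorts the same range again on every call.
    return sum(sorted(range(2000), reverse=True))


def workload(stop):
    # Hypothetical program under test; runs until asked to stop.
    while not stop.is_set():
        slow_helper()


def random_pause_samples(target, n_samples=10, max_gap=0.02):
    """Take n_samples stack snapshots of a thread running target(stop)."""
    stop = threading.Event()
    worker = threading.Thread(target=target, args=(stop,))
    worker.start()
    samples = []
    try:
        while len(samples) < n_samples and worker.is_alive():
            time.sleep(random.uniform(0, max_gap))  # pause at a random moment
            frame = sys._current_frames().get(worker.ident)
            if frame is None:
                continue
            # "Paint a line" from the leaf back to the trunk: the call stack.
            samples.append([f.name for f in traceback.extract_stack(frame)])
    finally:
        stop.set()
        worker.join()
    return samples


def branches_on_multiple_samples(samples):
    # Count each function once per sample; keep those seen on more than one.
    counts = Counter(name for stack in samples for name in set(stack))
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    hits = branches_on_multiple_samples(random_pause_samples(workload))
    for name, n in sorted(hits.items(), key=lambda kv: -kv[1]):
        print(f"{name}: on {n} of the sampled stacks")
```

Any function that shows up on several samples is a candidate for step 4: ask whether that call is really needed, remove it if not, and then sample again.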

  • 2020-12-07 06:08

    1) Use cachegrind/callgrind/kcachegrind http://valgrind.org/info/tools.html#cachegrind

    These are pretty useful for analysing memory locality under a specific set of assumptions.

    2) Threading is really painful to profile correctly. Experiment with cpusets and process affinities; on modern NUMA systems this becomes critical quickly.
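As a small illustration of the affinity part, the sketch below pins the calling process to a single CPU while a function runs, then restores the original mask. It assumes Linux, where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`; `run_pinned` is a hypothetical helper name, and on platforms without these calls it simply runs the function unpinned. It does not cover cpusets or NUMA memory placement.

```python
import os


def run_pinned(fn, cpu=None):
    """Run fn() with the caller pinned to one CPU; return (result, cpu used).

    The second element is None where the affinity API is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return fn(), None                  # e.g. macOS or Windows
    original = os.sched_getaffinity(0)     # 0 refers to the caller
    chosen = min(original) if cpu is None else cpu
    os.sched_setaffinity(0, {chosen})
    try:
        return fn(), chosen
    finally:
        os.sched_setaffinity(0, original)  # always restore the old mask


if __name__ == "__main__":
    result, cpu = run_pinned(lambda: sum(range(1_000_000)))
    print(f"result={result}, pinned to CPU {cpu}")
```

Pinning the measured code to one CPU removes migration between cores as a source of timing noise, which makes repeated measurements easier to compare.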
