Question
I'm trying to profile a set of functions that implement the same algorithm in different ways. I've increased the number of times each function is run so that the total time spent in a single function is roughly 1 minute (to reveal performance differences).
Now, running the test several times produces baffling results. There is huge variability (±50%) between executions of the same function, which makes it nearly impossible to determine which function is the fastest (the goal of the test).
Is there something special I should take care of before running the tests, so that I get smoother measurements? Failing that, is running the test several times and computing the average for each function the way to go?
Answer 1:
There are lots of things to check!
First, make sure your functions are actually CPU-bound. If so, make sure you have all CPU throttling, turbo modes, and power-saving modes disabled (in BIOS) for the test. If you still have trouble, try pinning your process to a single core. Consider disabling hyper-threading as well.
The goal of all this is to make sure you get your code running hot on a single core without much interruption. If you're on Linux, you can remove a single core from the OS list of available cores and use that (so there is no chance of interference on that core).
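As a rough illustration of the pinning step, here is a minimal Python sketch. It assumes Linux, where os.sched_setaffinity is available; the helper name pin_to_core is made up for this example. It only restricts where the benchmark process may run. It does not keep other processes off that core, which is what removing the core from the scheduler (for example with the isolcpus kernel boot parameter) is for.

```python
import os


def pin_to_core(core):
    """Restrict the current process to a single CPU core (Linux only).

    `pin_to_core` is a hypothetical helper for this sketch, not a library API.
    Returns the resulting affinity set so the caller can verify it.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")

    allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {sorted(allowed)}")

    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Pick the lowest-numbered core we are currently allowed to use.
    chosen = min(os.sched_getaffinity(0))
    print("now pinned to:", pin_to_core(chosen))
```

Call pin_to_core once at the start of the benchmark script, before any timing begins, so that every measured run executes on the same core.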
Running the test several times is a good idea, but using the average (arithmetic mean) is not. Instead, use the median or minimum or some other measurement which won't be influenced by outliers. Usually, the occasional long test run can be thrown out entirely (unless you're building a real-time system!).
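A small Python sketch of that advice, with invented helper names (measure, summarize) and made-up sample timings: it times a function several times and reports the minimum and median next to the mean, so the effect of one slow run is visible.

```python
import statistics
import time


def measure(func, repeats=7):
    """Return the wall-clock duration in seconds of each of `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Summarize timings; min and median resist outliers, the mean does not."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "mean": statistics.mean(durations),
    }


if __name__ == "__main__":
    # Illustrative numbers only: four similar runs and one interrupted run.
    sample = [1.0, 1.1, 1.05, 1.02, 3.0]
    print(summarize(sample))

    # Timing a real (trivial) workload.
    print(summarize(measure(lambda: sum(range(100_000)))))
```

With the sample timings, the single 3.0 s run pulls the mean up to about 1.43 s, while the median stays at 1.05 s and the minimum at 1.0 s. When comparing implementations, compare their medians or minimums rather than their means.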
Source: https://stackoverflow.com/questions/25665090/getting-reliable-performance-measurements-for-short-bits-of-code