Question
I've developed a simple program and want to evaluate its runtime performance on a real machine, e.g. my MacBook. The source code goes:
#include <stdio.h>
#include <vector>
#include <ctime>

int main() {
    auto beg = std::clock();
    for (int i = 0; i < 1e8; ++i) {
    }
    auto end = std::clock();
    printf("CPU time used: %lf ms\n", 1000.0 * (end - beg) / CLOCKS_PER_SEC);
}
It was compiled with gcc at the default optimization level. Using a bash script, I ran it 1000 times on my MacBook and recorded the runtimes, as follows:
[130.000000, 136.000000): 0
[136.000000, 142.000000): 1
[142.000000, 148.000000): 234
[148.000000, 154.000000): 116
[154.000000, 160.000000): 138
[160.000000, 166.000000): 318
[166.000000, 172.000000): 139
[172.000000, 178.000000): 40
[178.000000, 184.000000): 11
[184.000000, 190.000000): 3
"[a, b): n" means that the measured runtime of the same program fell between a ms and b ms in n of the runs.
It's clear that the runtime varies considerably and does not look like a normal distribution. Could someone tell me what causes this and how I can measure the runtime correctly?
Thanks for responding to this question.
Answer 1:
Benchmarking is hard!
Short answer: use Google Benchmark.
Long answer: There are many things that will interfere with timings.
- Scheduling (the OS running other things instead of you)
- CPU Scaling (the OS deciding it can save energy by running slower)
- Memory contention (Something else using the memory when you want to)
- Bus contention (Something else talking to a device you want to talk to)
- Cache (The CPU holding on to a value to avoid having to use memory)
- CPU migration (The OS moving you from one CPU to another)
- Inaccurate clocks (Only CPU clocks are accurate to any degree, but they change if you migrate)
The only way to avoid these effects is to disable CPU scaling, run "cache-flush" functions (normally just touching a lot of memory before starting), run at high priority, and lock yourself to a single CPU. Even after all that, your timings will still be noisy, so the last step is simply to repeat the measurement many times and use the average.
This is why tools like Google Benchmark are probably your best bet.
video from CPPCon
Also available live online
Source: https://stackoverflow.com/questions/60001446/how-to-evaluate-a-programs-runtime