How to compare performance of two pieces of code

春和景丽 2020-12-29 11:25

I have a friendly competition with a couple of guys in the field of programming, and recently we have become very interested in writing efficient code. Our challenge was to try t

7 answers
  • 2020-12-29 12:19

    With inline assembly, you can use the rdtsc instruction to read the CPU's cycle counter: the low 32 bits go into eax and the high 32 bits into edx. If the code being measured is short, the eax value alone gives an approximate total cycle count. Once the count exceeds the maximum 32-bit value, eax wraps around and edx is incremented by one.

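    // Requires a 32-bit x86 build with MSVC: the __asm block syntax is not available in x64 builds.
    #include <cstdio>   // getchar
    #include <ctime>    // clock
    #include <iostream> // std::cin, std::cout

    int main()
    {
        unsigned int cpu_clk1a = 0; // low 32 bits (eax) before the loop
        unsigned int cpu_clk1b = 0; // high 32 bits (edx) before the loop
        unsigned int cpu_clk2a = 0; // low 32 bits (eax) after the loop
        unsigned int cpu_clk2b = 0; // high 32 bits (edx) after the loop
        int max = 0;
        std::cin >> max; // loop limit

        __asm
        {
            push eax
            push edx
            rdtsc    // gets the current CPU clock counter into eax and edx
            mov [cpu_clk1a], eax
            mov [cpu_clk1b], edx
            pop edx
            pop eax
        }

        long temp = 0;
        for (int i = 0; i < max; i++)
        {
            temp += clock(); // needed to defy optimization and actually measure something;
                             // even the smartest compiler cannot know what
                             // the clock would be
        }

        __asm
        {
            push eax
            push edx
            rdtsc    // gets the current CPU clock counter into eax and edx
            mov [cpu_clk2a], eax
            mov [cpu_clk2b], edx
            pop edx
            pop eax
        }

        // Unsigned subtraction stays correct even if the low word wrapped once.
        std::cout << (cpu_clk2a - cpu_clk1a) << std::endl;
        // If your loop takes more than ~4 billion CPU clocks (2^32),
        // use cpu_clk1b and cpu_clk2b as well.
        getchar(); // consumes the newline left behind by std::cin
        getchar(); // waits for a key press so the console stays open
        return 0;
    }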
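
    The listing's closing comment says to bring in cpu_clk1b and cpu_clk2b (the edx halves) once a measurement outgrows 32 bits. A minimal sketch of how the two halves could be joined into one 64-bit count is below; the helper names combine_tsc and elapsed_ticks are hypothetical and not part of the original answer.

```cpp
#include <cstdint>

// Hypothetical helper: join the two 32-bit halves that rdtsc returns
// (edx = high half, eax = low half) into a single 64-bit tick count.
constexpr std::uint64_t combine_tsc(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Hypothetical helper: ticks elapsed between two readings, each given as
// a (high, low) pair such as (cpu_clk1b, cpu_clk1a) and (cpu_clk2b, cpu_clk2a).
constexpr std::uint64_t elapsed_ticks(std::uint32_t start_high, std::uint32_t start_low,
                                      std::uint32_t end_high, std::uint32_t end_low) {
    return combine_tsc(end_high, end_low) - combine_tsc(start_high, start_low);
}
```

    With the variables from the listing, the call would look like elapsed_ticks(cpu_clk1b, cpu_clk1a, cpu_clk2b, cpu_clk2a), and the result should be printed as a 64-bit value.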
    

    Output: 74000 CPU cycles for 1000 iterations and 800000 CPU cycles for 10000 iterations on my machine, because clock() is time-consuming.

    CPU-cycle resolution on my machine: ~1000 cycles. So yes, you need more than several thousand additions/subtractions (fast instructions) for the measurement to be reasonably accurate.
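
    Because of that coarse resolution, one common approach is to repeat the code under test many times and divide the total by the repeat count. The sketch below does this with std::chrono::steady_clock instead of rdtsc, so it reports nanoseconds rather than cycles; average_ns_per_call is a hypothetical name, and the code being timed is assumed to have a side effect the optimizer cannot remove.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs `work` `iterations` times and returns the
// average wall-clock time per call in nanoseconds. The loop overhead is
// included in the result, so very cheap work is overestimated slightly.
template <typename Work>
double average_ns_per_call(Work work, std::uint64_t iterations) {
    assert(iterations > 0); // avoid dividing by zero below
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

    To compare two pieces of code, call it once per candidate with the same iteration count, for example with a lambda that writes to a volatile variable, and compare the two averages.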

    Assuming the CPU's working frequency is constant, 1000 CPU cycles is roughly 1 microsecond on a 1 GHz CPU. You should warm your CPU up before doing this.
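
    The cycle-to-time conversion is simple arithmetic, sketched below under the same constant-frequency assumption. The function name cycles_to_microseconds is hypothetical, and the frequency has to be supplied by the caller since the code does not detect it.

```cpp
#include <cstdint>

// Hypothetical helper: converts a cycle count to microseconds, assuming
// the CPU ran at a constant `frequency_hz` for the whole measurement.
constexpr double cycles_to_microseconds(std::uint64_t cycles, double frequency_hz) {
    return static_cast<double>(cycles) * 1e6 / frequency_hz;
}
```

    For a 1 GHz CPU, cycles_to_microseconds(1000, 1e9) gives 1.0, and the 74000 cycles reported above would correspond to 74 microseconds.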
