Why is the performance of a running program getting better over time?

Consider the following code:

#include 
#include 

using Time = std::chrono::high_resolution_clock;
using us = std::chrono::microsec


                      
2 Answers

北恋, 2021-02-19 21:48

What you are probably seeing is CPU frequency scaling (throttling). The CPU drops into a low-frequency state to save power when it isn't being heavily used.

Just before your program runs, the CPU clock speed is probably fairly low, since there is no significant load. Once your program starts, the busy loop increases the load and the clock speed ramps up until it reaches its maximum, so your measured times decrease.

If you run your program several times in a row, you'll probably see the times stay at the lower value after the first run.
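
One way to check this explanation is to keep the CPU busy for a short while before recording anything, so that the clock has a chance to ramp up first. The following is only a minimal sketch under that assumption: `time_busy_loop` and `measure_after_warmup` are hypothetical helper names, the aliases mirror the ones used in the question, and the warm-up duration is an arbitrary choice, not a value from this thread.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

using Time = std::chrono::high_resolution_clock;
using us = std::chrono::microseconds;

// Time one pass of the busy loop, in microseconds.
long long time_busy_loop(int n) {
    volatile int i;  // volatile keeps the empty loop from being optimized away
    auto begin = Time::now();
    for (i = 0; i < n; i = i + 1) {}
    auto end = Time::now();
    return std::chrono::duration_cast<us>(end - begin).count();
}

// Hypothetical helper: generate load for roughly `warmup` of wall-clock time,
// then record `reps` timed passes of the same loop.
std::vector<long long> measure_after_warmup(std::chrono::milliseconds warmup,
                                            int reps, int n) {
    assert(reps > 0 && n > 0);
    auto deadline = Time::now() + warmup;
    while (Time::now() < deadline) {
        time_busy_loop(n);  // result discarded: this pass only generates load
    }
    std::vector<long long> samples;
    samples.reserve(reps);  // no allocation inside the measurement loop
    for (int k = 0; k < reps; ++k) {
        samples.push_back(time_busy_loop(n));
    }
    return samples;
}
```

Calling `measure_after_warmup(std::chrono::milliseconds(500), 200, 1000000)` and comparing the first and last samples would show whether the downward trend survives the warm-up. If frequency scaling is the cause, the samples should be roughly flat; whether they are depends on the machine and its power settings.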
                                                                        
                                                        
            
            
              
                
谎友^, 2021-02-19 21:49

In your original experiment, there are too many variables that can affect the measurements:

- the use of your processor by other active processes (i.e. the scheduling done by your OS)
- whether your loop is optimized away or not
- console access and output buffering
- the initial state of your CPU (see the answer about throttling)

I must admit that I was very skeptical about your observations. I therefore wrote a small variant that stores the timings in a preallocated vector, to avoid I/O synchronisation effects:

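// Time and us are the aliases defined in the question; std::vector needs <vector>.
volatile int i, k;  // volatile keeps the empty inner loop from being optimized away
const int n = 1000000, kmax = 200, n_avg = 30;
std::vector<long> v(kmax, 0);  // preallocated, so nothing is printed while timing

for (k = 0; k < kmax; ++k) {
    auto begin = Time::now();
    for (i = 0; i < n; ++i);  // <-- remains thanks to volatile
    auto end = Time::now();
    auto dur = std::chrono::duration_cast<us>(end - begin).count();
    v[k] = dur;
}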


I then ran it several times on ideone (where, given how heavily the service is used, we can assume that the processor is under constant load on average). Your observations did indeed seem to be confirmed.

I guess that this could be related to branch prediction, which should improve as the pattern repeats.

However, I went on, updated the code slightly and added a loop to repeat the experiment several times. I then started to also get runs where your observation was not confirmed (i.e. the times were higher at the end). But it may also be that the many other processes running on ideone influence the branch prediction in a different manner.

So in the end, drawing any conclusion would require a more careful experiment, on a machine that runs this benchmark (and nothing else) for a couple of hours.
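
The updated code with the repetition loop is not shown in this answer, so the following is only a sketch of what such a repeated experiment might look like, not the code that was actually run. The function names (`run_experiment`, `mean`, `count_speedups`) and the choice of comparing the first and last quarter of each run are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

using Time = std::chrono::high_resolution_clock;
using us = std::chrono::microseconds;

// One experiment: kmax timed passes of an n-iteration busy loop.
std::vector<long long> run_experiment(int kmax, int n) {
    std::vector<long long> v(kmax, 0);  // preallocated: no I/O or allocation while timing
    volatile int i;                     // volatile keeps the empty loop alive
    for (int k = 0; k < kmax; ++k) {
        auto begin = Time::now();
        for (i = 0; i < n; i = i + 1) {}
        auto end = Time::now();
        v[k] = std::chrono::duration_cast<us>(end - begin).count();
    }
    return v;
}

// Mean of the samples in the half-open index range [first, last).
double mean(const std::vector<long long>& v, std::size_t first, std::size_t last) {
    assert(first < last && last <= v.size());
    return std::accumulate(v.begin() + first, v.begin() + last, 0.0) / (last - first);
}

// Hypothetical summary: in how many of n_avg experiments was the last
// quarter of the run faster on average than the first quarter?
int count_speedups(int n_avg, int kmax, int n) {
    assert(kmax >= 4);
    int faster = 0;
    for (int r = 0; r < n_avg; ++r) {
        auto v = run_experiment(kmax, n);
        std::size_t q = v.size() / 4;
        if (mean(v, v.size() - q, v.size()) < mean(v, 0, q)) ++faster;
    }
    return faster;
}
```

With the constants from the snippet above, `count_speedups(30, 200, 1000000)` returns a number between 0 and 30. A value close to 30 would support the observation that runs get faster over time, while a value near 15 would suggest the trend is mostly noise; on a shared machine the result can vary from one invocation to the next.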
                                                                        
                                                        
            
            
              
                