Faster implementation of sum (for Codility test)

鱼传尺愫 2021-02-04 11:47

How can the following simple implementation of sum be faster?

private long sum( int [] a, int begin, int end ) {
    if( a == null   ) {
        ret


      
      
        
22 answers

小鲜肉 (original poster) 2021-02-04 12:44

This won't help you with an O(n^2) algorithm, but you can optimize your sum.

At a previous company, we had Intel come by and give us optimization tips.  They had one non-obvious and somewhat cool trick.  Replace:

long r = 0; 
for( int i =  begin ; i < end ; i++ ) { 
   r+= a[i]; 
} 


with

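long r1 = 0, r2 = 0, r3 = 0, r4 = 0;
int i = begin;
// Main loop: four independent accumulators, four elements per iteration
for( ; i <= end - 4 ; i += 4 ) {
   r1 += a[i];
   r2 += a[i + 1];
   r3 += a[i + 2];
   r4 += a[i + 3];
}
// Tail: the 0 to 3 elements left over when (end - begin) isn't divisible by 4
for( ; i < end ; i++ ) {
   r1 += a[i];
}
long r = r1 + r2 + r3 + r4;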


Why this is faster:
  In the original implementation, the variable r is a bottleneck.  Every time through the loop, you pull a value from the array a in memory (which takes a few cycles) and add it to r, but those additions can't overlap, because the value of r in the next iteration of the loop depends on the value of r in this iteration.  In the second version, r1, r2, r3, and r4 are independent, so the processor can execute their updates in parallel (this is instruction-level parallelism within a single core, not hyper-threading).  Only at the very end do they come together.
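
To make the comparison concrete, here is a minimal, self-contained sketch of both versions. The class name UnrolledSum and the method names sumSimple and sumUnrolled are made up for illustration and are not from the original question. The tail loop is one possible way to handle ranges whose length isn't a multiple of 4. Whether the unrolled version is actually faster depends on the JVM and the hardware, since the JIT compiler may already unroll the simple loop, so it is worth measuring before relying on it.

```java
class UnrolledSum {
    // Baseline: one accumulator, so every addition depends on the previous one.
    public static long sumSimple(int[] a, int begin, int end) {
        long r = 0;
        for (int i = begin; i < end; i++) {
            r += a[i];
        }
        return r;
    }

    // Four independent accumulators; the half-open range is [begin, end).
    public static long sumUnrolled(int[] a, int begin, int end) {
        long r1 = 0, r2 = 0, r3 = 0, r4 = 0;
        int i = begin;
        // Main loop: runs only while a full group of four elements remains.
        for (; i <= end - 4; i += 4) {
            r1 += a[i];
            r2 += a[i + 1];
            r3 += a[i + 2];
            r4 += a[i + 3];
        }
        // Tail loop: handles the 0 to 3 leftover elements.
        for (; i < end; i++) {
            r1 += a[i];
        }
        return r1 + r2 + r3 + r4;
    }

    public static void main(String[] args) {
        int[] a = new int[10];
        for (int i = 0; i < a.length; i++) {
            a[i] = i + 1; // values 1..10
        }
        System.out.println(sumSimple(a, 0, a.length));   // prints 55
        System.out.println(sumUnrolled(a, 0, a.length)); // prints 55
        System.out.println(sumUnrolled(a, 3, 9));        // prints 39 (4+5+6+7+8+9)
    }
}
```

Both methods accumulate into long, so they return the same result for any valid range; only the order of the additions differs.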
    
             
                                                        
            
            
              
                