Why prefer start + (end - start) / 2 over (start + end) / 2 when calculating the middle of an array?


I've seen programmers use the formula

mid = start + (end - start) / 2

instead of using the simpler formula

mid = (start +


                      
4 answers

夕颜, 2020-11-29 17:41

There are three reasons.

First of all, start + (end - start) / 2 works even if you are using pointers, as long as end - start doesn't overflow¹.

int *start = ..., *end = ...;
int *mid = start + (end - start) / 2; // works as expected
int *mid = (start + end) / 2;         // type error, won't compile
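
To make the pointer case concrete, here is a minimal sketch of a binary search over a sorted int array that uses the pointer form of the midpoint. The function name find_sorted and the half-open range convention [start, end) are assumptions for illustration, not something from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: binary search over the half-open range [start, end)
   of a sorted int array. Returns a pointer to a matching element, or NULL. */
const int *find_sorted(const int *start, const int *end, int key)
{
    assert(start <= end);
    while (start < end) {
        /* end - start is a ptrdiff_t; adding half of it back to start is
           ordinary pointer arithmetic. start + end would not compile. */
        const int *mid = start + (end - start) / 2;
        if (*mid < key) {
            start = mid + 1;   /* key can only be to the right of mid */
        } else if (*mid > key) {
            end = mid;         /* key can only be to the left of mid */
        } else {
            return mid;
        }
    }
    return NULL;
}
```

With int a[] = {1, 3, 5, 7, 9}, calling find_sorted(a, a + 5, 7) returns a + 3, and searching for 4 returns NULL.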


Second of all, start + (end - start) / 2 won't overflow if start and end are large positive numbers.  With signed operands, overflow is undefined:

int start = 0x7ffffffe, end = 0x7fffffff;
int mid = start + (end - start) / 2; // works as expected
int mid = (start + end) / 2;         // overflow... undefined


(Note that end - start may overflow, but only if start < 0 or end < 0.)

Or with unsigned arithmetic, overflow is defined but gives you the wrong answer.  However, for unsigned operands, start + (end - start) / 2 will never overflow as long as end >= start.

unsigned start = 0xfffffffeu, end = 0xffffffffu;
unsigned mid = start + (end - start) / 2; // works as expected
unsigned mid = (start + end) / 2;         // mid = 0x7ffffffe
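
The unsigned case can be checked with fixed-width types, which removes the dependence on how wide unsigned happens to be. This is only a sketch; the function names mid_safe_u32 and mid_naive_u32 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Midpoint that cannot wrap, provided start <= end. */
uint32_t mid_safe_u32(uint32_t start, uint32_t end)
{
    assert(start <= end);
    return start + (end - start) / 2;
}

/* Naive midpoint: the sum is reduced modulo 2^32 before the division,
   so large inputs give a well-defined but wrong answer. The cast keeps
   the wrap-around explicit even if int is wider than 32 bits. */
uint32_t mid_naive_u32(uint32_t start, uint32_t end)
{
    return (uint32_t)(start + end) / 2;
}
```

For start = 0xfffffffe and end = 0xffffffff, mid_safe_u32 gives 0xfffffffe while mid_naive_u32 gives 0x7ffffffe, matching the comments above.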


Finally, you often want to round towards the start element.

int start = -3, end = 0;
int mid = start + (end - start) / 2; // -2, closer to start
int mid = (start + end) / 2;         // -1, surprise!
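
The rounding difference comes from C's integer division truncating toward zero (guaranteed since C99). A small sketch, with hypothetical function names, that puts the two formulas side by side:

```c
#include <assert.h>

/* end - start is non-negative when start <= end, so the division rounds
   down and the result leans toward start. Assumes end - start itself
   does not overflow. */
int mid_toward_start(int start, int end)
{
    assert(start <= end);
    return start + (end - start) / 2;
}

/* start + end may be negative, and a negative quotient is truncated
   toward zero, which can lean toward end instead. Assumes start + end
   does not overflow. */
int mid_toward_zero(int start, int end)
{
    return (start + end) / 2;
}
```

For the range -3..0 the first returns -2 and the second returns -1. For non-negative ranges such as 0..3 both return 1.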


Footnotes

¹ According to the C standard, if the result of pointer subtraction is not representable as a ptrdiff_t, then the behavior is undefined.  However, in practice, this requires allocating a char array using at least half the entire address space.
                                                                        
                                                        
            
            
              
                
被撕碎了的回忆, 2020-11-29 17:42

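start + (end - start) / 2 avoids a possible overflow in start + end. For example, with a 32-bit signed int, start = 2^30 and end = 2^30 + 2^20 give a sum larger than INT_MAX (2^31 - 1), while end - start = 2^20 stays small.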
                                                                        
                                                        
            
            
              
                
被撕碎了的回忆, 2020-11-29 17:46

To add to what others have already said, the first formula explains its meaning more clearly to those who are less mathematically minded:

mid = start + (end - start) / 2


reads as:


  mid equals start plus half of the length.


whereas:

mid = (start + end) / 2


reads as:


  mid equals half of start plus end


Which does not seem as clear as the first, at least when expressed like that.

As Kos pointed out, it can also be read as:


  mid equals the average of start and end


Which is clearer but still not, at least in my opinion, as clear as the first.
                                                                        
                                                        
            
            
              
                
天命终不由人, 2020-11-29 17:55

We can take a simple example to demonstrate this. Suppose that in a large array we want the midpoint of the range [1000, INT_MAX], where INT_MAX is the largest value an int can store. Adding even 1 to INT_MAX overflows: the behavior is undefined in C, and on typical two's-complement hardware the result wraps around to a negative value.

So start = 1000 and end = INT_MAX.

Using the formula: (start + end)/2, 

the mid-point will be 


  (1000 + INT_MAX)/2, where the sum overflows. On typical two's-complement hardware it wraps to INT_MIN + 999, so the result is (INT_MIN + 999)/2, which is negative and may cause a segmentation fault if we use it as an index.


But, using the formula, (start + (end-start)/2), we get:


  (1000 + (INT_MAX-1000)/2) = (1000 + INT_MAX/2 - 500) = (INT_MAX/2 + 500) which will not overflow.
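
The same example can be checked in code without ever executing the overflowing sum. This is a sketch under the assumption that both bounds are non-negative array indices; mid_safe and sum_would_overflow are illustrative names.

```c
#include <assert.h>
#include <limits.h>

/* Midpoint of [start, end] for 0 <= start <= end. No intermediate
   value exceeds end, so nothing can overflow. */
int mid_safe(int start, int end)
{
    assert(0 <= start && start <= end);
    return start + (end - start) / 2;
}

/* Returns 1 when start + end would exceed INT_MAX, i.e. when
   (start + end) / 2 would hit undefined behavior. The check itself
   is rearranged so that it never overflows. */
int sum_would_overflow(int start, int end)
{
    assert(0 <= start && 0 <= end);
    return start > INT_MAX - end;
}
```

Here mid_safe(1000, INT_MAX) equals INT_MAX/2 + 500, and sum_would_overflow(1000, INT_MAX) reports that the naive formula is not safe for this range.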

                                                                        
                                                        
            
            
              
                