Why don't C++ compilers optimize away reads and writes to struct data members as opposed to distinct local variables?

Asked by 离开以前 on 2020-12-31 18:35
I'm trying to create a local array of some POD values (e.g. double) with fixed max_size that is known at compile time, then read a runtime s

      
      
        
1 answer

小蘑菇 (original poster), 2020-12-31 19:03
              

            
            
                        
This is because void process_value(double& ref_value); accepts its argument by reference. The compiler/optimizer must assume aliasing, i.e. that process_value can change any memory reachable through the reference ref_value, and that includes the size member stored after the array.

Because the array and size are members of the same object, array_wrapper, the compiler assumes that process_value could cast the reference to the first element (on the first invocation) to a reference to the enclosing object, store it elsewhere, and then cast the object to unsigned char and read or replace its entire representation. As a result, the state of the object must be reloaded from memory after every call returns.

When size is a stand-alone object on the stack whose address is never passed anywhere, the compiler/optimizer can assume that nothing else holds a reference or pointer to it, so it caches the value in a register.
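The two situations can be sketched side by side. The question's code is not reproduced here, so the struct layout, the type name array_wrapper_t, the value of max_size, the two sum_... helpers and the body of process_value are all assumptions made for illustration. Because process_value is defined in the same file in this sketch, a compiler is free to inline it and generate identical code for both functions; to observe the difference, it would have to be declared only and defined in another translation unit.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t max_size = 8;  // assumed compile-time bound

// Assumed layout: the array and its runtime size share one object.
struct array_wrapper_t {
    double arr[max_size];
    std::size_t size;
};

// Hypothetical stand-in body; the real function is treated as opaque.
void process_value(double& ref_value) { ref_value *= 2.0; }

// Case 1: size is a member of the object whose element is passed by
// reference, so the optimizer may reload w.size after each opaque call.
double sum_with_member_size(std::size_t n) {
    assert(n <= max_size);
    array_wrapper_t w{};
    w.size = n;
    for (std::size_t i = 0; i < w.size; ++i) {
        w.arr[i] = static_cast<double>(i);
        process_value(w.arr[i]);  // a reference into w escapes here
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < w.size; ++i)
        sum += w.arr[i];
    return sum;
}

// Case 2: size is a distinct local whose address never escapes,
// so it can stay in a register across the calls.
double sum_with_local_size(std::size_t n) {
    assert(n <= max_size);
    double arr[max_size] = {};
    const std::size_t size = n;
    for (std::size_t i = 0; i < size; ++i) {
        arr[i] = static_cast<double>(i);
        process_value(arr[i]);  // only the array element escapes
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < size; ++i)
        sum += arr[i];
    return sum;
}
```

Both functions compute the same result; only the generated loads of size are expected to differ when process_value is opaque.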

In his talk "Optimizing the Emergent Structures of C++", Chandler Carruth explains why optimizers have difficulty with calls to functions that accept reference/pointer arguments. Use reference/pointer function arguments only when absolutely necessary.

If you would like to change the value, the more performant option is to pass it by value and return the result:

double process_value(double value);


And then:

array_wrapper.arr[i] = process_value(array_wrapper.arr[i]);


This change results in optimal assembly (the function appears as process_value2 in the compiled example):

.L23:
        movsd   xmm0, QWORD PTR [rbx]
        add     rbx, 8
        call    process_value2(double)
        movsd   QWORD PTR [rbx-8], xmm0
        cmp     rbx, rbp
        jne     .L23


Or:

for(double& val : array_wrapper.arr)
    val = process_value(val);
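For reference, here is a minimal self-contained sketch of the by-value variant. As before, the type name array_wrapper_t, the value of max_size, the helper sum_by_value and the doubling body of process_value are assumptions for illustration, not the original poster's code.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t max_size = 8;  // assumed compile-time bound

// Assumed layout, matching the members named in the answer.
struct array_wrapper_t {
    double arr[max_size];
    std::size_t size;
};

// Hypothetical body: takes and returns the value, so no address
// into the wrapper object is ever handed to the callee.
double process_value(double value) { return value * 2.0; }

double sum_by_value(std::size_t n) {
    assert(n <= max_size);
    array_wrapper_t array_wrapper{};
    array_wrapper.size = n;
    for (std::size_t i = 0; i < array_wrapper.size; ++i)
        array_wrapper.arr[i] = static_cast<double>(i);

    // The wrapper never escapes, so size can be kept in a register.
    for (std::size_t i = 0; i < array_wrapper.size; ++i)
        array_wrapper.arr[i] = process_value(array_wrapper.arr[i]);

    double sum = 0.0;
    for (std::size_t i = 0; i < array_wrapper.size; ++i)
        sum += array_wrapper.arr[i];
    return sum;
}
```

Note that a range-based for over the array visits all max_size elements, not only the first size of them, so the indexed loop is the closer match when only part of the array is in use.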

    
             
                                                        
            
            
              
                