C++: Structs slower to access than basic variables?


被撕碎了的回忆 2021-02-13 03:45

I found some code that had "optimization" like this:

void somefunc(SomeStruct param){
    float x = param.x; // param.x and x are both floats. supposedly this


      
      
        
9 answers

时光取名叫无心 (original poster) 2021-02-13 04:37

The real answer is given by Piotr. This one is just for fun.

I have tested it. This code:

float somefunc(SomeStruct param, float &sum){
    float x = param.x;
    float y = param.y;
    float z = param.z;
    float xyz = x * y * z;
    sum = x + y + z;
    return xyz;
}


And this code:

float somefunc(SomeStruct param, float &sum){
    float xyz = param.x * param.y * param.z;
    sum = param.x + param.y + param.z;
    return xyz;
}


Both versions generate identical assembly code when compiled with g++ -O2. With optimization turned off, though, they do generate different code. Here is the diff:

<   movl    -32(%rbp), %eax
<   movl    %eax, -4(%rbp)
<   movl    -28(%rbp), %eax
<   movl    %eax, -8(%rbp)
<   movl    -24(%rbp), %eax
<   movl    %eax, -12(%rbp)
<   movss   -4(%rbp), %xmm0
<   mulss   -8(%rbp), %xmm0
<   mulss   -12(%rbp), %xmm0
<   movss   %xmm0, -16(%rbp)
<   movss   -4(%rbp), %xmm0
<   addss   -8(%rbp), %xmm0
<   addss   -12(%rbp), %xmm0
---
>   movss   -32(%rbp), %xmm1
>   movss   -28(%rbp), %xmm0
>   mulss   %xmm1, %xmm0
>   movss   -24(%rbp), %xmm1
>   mulss   %xmm1, %xmm0
>   movss   %xmm0, -4(%rbp)
>   movss   -32(%rbp), %xmm1
>   movss   -28(%rbp), %xmm0
>   addss   %xmm1, %xmm0
>   movss   -24(%rbp), %xmm1
>   addss   %xmm1, %xmm0


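The lines marked < correspond to the version with the "optimization" variables. Without optimization, that version comes out longer (13 differing instructions against 11), because it first copies the three members into new stack slots and only then does the arithmetic on the copies. So the "optimized" version looks, if anything, slower than the one with no extra variables. This is to be expected, though: x, y and z are allocated on the stack, exactly like param. What's the point of allocating more stack variables to duplicate existing ones?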

If whoever wrote that "optimization" had known the language better, they would probably have declared those variables as register, but even that leaves the "optimized" version slightly slower and longer, at least with g++ on x86-64.
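
If you want to reproduce the comparison yourself, here is a minimal sketch that puts both variants in one translation unit. The layout of SomeStruct is an assumption (the question only says that param.x is a float), and the names with_copies and direct, as well as the file name in the comment, are made up for illustration.

```cpp
// compare.cpp (hypothetical file name)
// Inspect the generated assembly with:
//   g++ -O2 -S compare.cpp
//   g++ -O0 -S compare.cpp
#include <cassert>

// Assumed layout: three float members, as used by the code in this answer.
struct SomeStruct {
    float x;
    float y;
    float z;
};

// Variant with the local-copy "optimization".
float with_copies(SomeStruct param, float &sum) {
    float x = param.x;
    float y = param.y;
    float z = param.z;
    float xyz = x * y * z;
    sum = x + y + z;
    return xyz;
}

// Variant that reads the struct members directly.
float direct(SomeStruct param, float &sum) {
    float xyz = param.x * param.y * param.z;
    sum = param.x + param.y + param.z;
    return xyz;
}
```

Both functions compute the same product and sum, so any difference you see is purely in the generated code. Exact registers and stack offsets will likely depend on the compiler version and target, and putting each variant in its own file may make the diff easier to read.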
    
             
                                                        
            
            
              
                