C++: Structs slower to access than basic variables?

被撕碎了的回忆 2021-02-13 03:45

I found some code that had "optimization" like this:

void somefunc(SomeStruct param){
    float x = param.x; // param.x and x are both floats. supposedly this

9 answers
  •  时光取名叫无心
    2021-02-13 04:37

    The real answer is given by Piotr. This one is just for fun.

    I have tested it. This code:

    float somefunc(SomeStruct param, float &sum){
        float x = param.x;
        float y = param.y;
        float z = param.z;
        float xyz = x * y * z;
        sum = x + y + z;
        return xyz;
    }
    

    And this code:

    float somefunc(SomeStruct param, float &sum){
        float xyz = param.x * param.y * param.z;
        sum = param.x + param.y + param.z;
        return xyz;
    }
    

    These two versions generate identical assembly when compiled with g++ -O2. With optimization turned off they do produce different code, though. Here is the diff:

    <   movl    -32(%rbp), %eax
    <   movl    %eax, -4(%rbp)
    <   movl    -28(%rbp), %eax
    <   movl    %eax, -8(%rbp)
    <   movl    -24(%rbp), %eax
    <   movl    %eax, -12(%rbp)
    <   movss   -4(%rbp), %xmm0
    <   mulss   -8(%rbp), %xmm0
    <   mulss   -12(%rbp), %xmm0
    <   movss   %xmm0, -16(%rbp)
    <   movss   -4(%rbp), %xmm0
    <   addss   -8(%rbp), %xmm0
    <   addss   -12(%rbp), %xmm0
    ---
    >   movss   -32(%rbp), %xmm1
    >   movss   -28(%rbp), %xmm0
    >   mulss   %xmm1, %xmm0
    >   movss   -24(%rbp), %xmm1
    >   mulss   %xmm1, %xmm0
    >   movss   %xmm0, -4(%rbp)
    >   movss   -32(%rbp), %xmm1
    >   movss   -28(%rbp), %xmm0
    >   addss   %xmm1, %xmm0
    >   movss   -24(%rbp), %xmm1
    >   addss   %xmm1, %xmm0
    

    The lines marked < come from the version with the "optimization" variables. In the unoptimized build that version executes more instructions in the part that differs (13 versus 11), because it first copies each member into a new stack slot and then reads it back, so it is, if anything, slower than the one with no extra variables. That is to be expected: without optimization, x, y and z are allocated on the stack, exactly like param. What's the point of allocating more stack variables to duplicate existing ones?

    If whoever wrote that "optimization" had known the language better, they would probably have declared those variables register, but even that leaves the "optimized" version slightly slower and longer, at least with g++ on x86-64. (Note that register was deprecated in C++11 and removed as a storage class specifier in C++17.)
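
    For anyone who wants to repeat the comparison, here is a minimal sketch that puts both variants in one file. The question never shows SomeStruct, so the three-float layout below is an assumption, and the names somefunc_locals, somefunc_direct and check_variants_agree are made up for this example so that both versions can coexist.

```cpp
#include <cassert>

// Hypothetical layout: the question does not show SomeStruct,
// so three float members are assumed here.
struct SomeStruct {
    float x;
    float y;
    float z;
};

// Variant A: copy each member into a local before using it.
float somefunc_locals(SomeStruct param, float &sum) {
    float x = param.x;
    float y = param.y;
    float z = param.z;
    float xyz = x * y * z;
    sum = x + y + z;
    return xyz;
}

// Variant B: read the struct members directly.
float somefunc_direct(SomeStruct param, float &sum) {
    float xyz = param.x * param.y * param.z;
    sum = param.x + param.y + param.z;
    return xyz;
}

// Sanity check that both variants compute the same values.
// The inputs are small integers, so every intermediate result is
// exactly representable as a float and == is safe here.
void check_variants_agree() {
    SomeStruct p{2.0f, 3.0f, 4.0f};
    float sum_a = 0.0f;
    float sum_b = 0.0f;
    float prod_a = somefunc_locals(p, sum_a);
    float prod_b = somefunc_direct(p, sum_b);
    assert(prod_a == prod_b && prod_a == 24.0f);
    assert(sum_a == sum_b && sum_a == 9.0f);
}
```

    Compiling this file with g++ -O0 -S and again with g++ -O2 -S, then comparing the bodies of the two functions in the generated assembly, should let you check the claim above on your own compiler. The exact instructions will likely vary with the compiler version and target.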
