Is there any advantage to using pow(x,2) instead of x*x, with x double?

清酒与你 2020-12-03 04:18

Is there any advantage to using this code

double x;
double square = pow(x,2);

instead of this?

double x;
double square = x*x;


        
8 answers
  • 2020-12-03 04:44

    FWIW, with gcc-4.2 on MacOS X 10.6 and -O3 compiler flags,

    x = x * x;
    

    and

    y = pow(y, 2);
    

    result in the same assembly code:

    #include <cmath>
    
    void test(double& x, double& y) {
            x = x * x;
            y = pow(y, 2);
    }
    

    Assembles to:

        pushq   %rbp
        movq    %rsp, %rbp
        movsd   (%rdi), %xmm0
        mulsd   %xmm0, %xmm0
        movsd   %xmm0, (%rdi)
        movsd   (%rsi), %xmm0
        mulsd   %xmm0, %xmm0
        movsd   %xmm0, (%rsi)
        leave
        ret
    

    So as long as you're using a decent compiler, write whichever makes more sense for your application, but bear in mind that pow(x, 2) can never be faster than the plain multiplication.

  • 2020-12-03 04:49

    In C++11 there is one case where there is an advantage to using x * x over std::pow(x,2) and that case is where you need to use it in a constexpr:

    constexpr double  mySqr( double x )
    {
          return x * x ;
    }
    

    std::pow is not marked constexpr in C++11, so it cannot be used in a constexpr function.
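A minimal sketch of what the constexpr version buys you, assuming a C++11 compiler. `mySqr` is the function from this answer; `kArea` is a hypothetical name used only for illustration.

```cpp
// Sketch: a constexpr square can be evaluated by the compiler itself.
constexpr double mySqr(double x)
{
    return x * x;
}

// Checked at compile time; no code runs for this at program start.
static_assert(mySqr(3.0) == 9.0, "mySqr must be usable in a constant expression");

// kArea (hypothetical) is a compile-time constant initialised from the call.
constexpr double kArea = mySqr(2.5);
```

Replacing the body of `mySqr` with a call to std::pow is not guaranteed to compile as a constant expression under C++11, which is the point made above.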

    Otherwise, from a performance perspective, putting the following code into godbolt shows that these functions:

    #include <cmath>
    
    double  mySqr( double x )
    {
          return x * x ;
    }
    
    double  mySqr2( double x )
    {
          return std::pow( x, 2.0 );
    }
    

    generate identical assembly:

    mySqr(double):
        mulsd   %xmm0, %xmm0    # x, D.4289
        ret
    mySqr2(double):
        mulsd   %xmm0, %xmm0    # x, D.4292
        ret
    

    and we should expect similar results from any modern compiler.

    It is worth noting that gcc currently treats pow as a constexpr, but this is a non-conforming extension; it should not be relied on and will probably change in later releases of gcc.

  • 2020-12-03 04:55

    IMHO:

    • Code readability
    • Code robustness - it will be easier to change to pow(x, 6) later, and there may be a floating-point mechanism implemented for a specific processor, etc.
    • Performance - if there is a smarter and faster way to calculate this (using assembler or some kind of special trick), pow will do it. You won't. :)

    Cheers

  • 2020-12-03 04:57

    I would probably choose std::pow(x, 2) because it could make my code refactoring easier. And it would make no difference whatsoever once the code is optimized.

    Now, the two approaches are not identical. This is my test code:

    #include <cmath>
    
    double square_explicit(double x) {
      asm("### Square Explicit");
      return x * x;
    }
    
    double square_library(double x) {
      asm("### Square Library");  
      return std::pow(x, 2);
    }
    

    The asm("text"); call simply writes comments to the assembly output, which I produce using (GCC 4.8.1 on OS X 10.7.4):

    g++ example.cpp -c -S -std=c++11 -O[0, 1, 2, or 3]
    

    You don't need -std=c++11, I just always use it.

    First: when debugging (with zero optimization), the assembly produced is different; this is the relevant portion:

    # 4 "square.cpp" 1
        ### Square Explicit
    # 0 "" 2
        movq    -8(%rbp), %rax
        movd    %rax, %xmm1
        mulsd   -8(%rbp), %xmm1
        movd    %xmm1, %rax
        movd    %rax, %xmm0
        popq    %rbp
    LCFI2:
        ret
    LFE236:
        .section __TEXT,__textcoal_nt,coalesced,pure_instructions
        .globl __ZSt3powIdiEN9__gnu_cxx11__promote_2IT_T0_NS0_9__promoteIS2_XsrSt12__is_integerIS2_E7__valueEE6__typeENS4_IS3_XsrS5_IS3_E7__valueEE6__typeEE6__typeES2_S3_
        .weak_definition __ZSt3powIdiEN9__gnu_cxx11__promote_2IT_T0_NS0_9__promoteIS2_XsrSt12__is_integerIS2_E7__valueEE6__typeENS4_IS3_XsrS5_IS3_E7__valueEE6__typeEE6__typeES2_S3_
    __ZSt3powIdiEN9__gnu_cxx11__promote_2IT_T0_NS0_9__promoteIS2_XsrSt12__is_integerIS2_E7__valueEE6__typeENS4_IS3_XsrS5_IS3_E7__valueEE6__typeEE6__typeES2_S3_:
    LFB238:
        pushq   %rbp
    LCFI3:
        movq    %rsp, %rbp
    LCFI4:
        subq    $16, %rsp
        movsd   %xmm0, -8(%rbp)
        movl    %edi, -12(%rbp)
        cvtsi2sd    -12(%rbp), %xmm2
        movd    %xmm2, %rax
        movq    -8(%rbp), %rdx
        movd    %rax, %xmm1
        movd    %rdx, %xmm0
        call    _pow
        movd    %xmm0, %rax
        movd    %rax, %xmm0
        leave
    LCFI5:
        ret
    LFE238:
        .text
        .globl __Z14square_libraryd
    __Z14square_libraryd:
    LFB237:
        pushq   %rbp
    LCFI6:
        movq    %rsp, %rbp
    LCFI7:
        subq    $16, %rsp
        movsd   %xmm0, -8(%rbp)
    # 9 "square.cpp" 1
        ### Square Library
    # 0 "" 2
        movq    -8(%rbp), %rax
        movl    $2, %edi
        movd    %rax, %xmm0
        call    __ZSt3powIdiEN9__gnu_cxx11__promote_2IT_T0_NS0_9__promoteIS2_XsrSt12__is_integerIS2_E7__valueEE6__typeENS4_IS3_XsrS5_IS3_E7__valueEE6__typeEE6__typeES2_S3_
        movd    %xmm0, %rax
        movd    %rax, %xmm0
        leave
    LCFI8:
        ret
    

    But when you produce the optimized code (even at the lowest optimization level for GCC, meaning -O1) the code is just identical:

    # 4 "square.cpp" 1
        ### Square Explicit
    # 0 "" 2
        mulsd   %xmm0, %xmm0
        ret
    LFE236:
        .globl __Z14square_libraryd
    __Z14square_libraryd:
    LFB237:
    # 9 "square.cpp" 1
        ### Square Library
    # 0 "" 2
        mulsd   %xmm0, %xmm0
        ret
    

    So, it really makes no difference unless you care about the speed of unoptimized code.

    Like I said: it seems to me that std::pow(x, 2) more clearly conveys your intentions, but that is a matter of preference, not performance.

    And the optimization seems to hold even for more complex expressions. Take, for instance:

    double explicit_harder(double x) {
      asm("### Explicit, harder");
      return x * x - std::sin(x) * std::sin(x) / (1 - std::tan(x) * std::tan(x));
    }
    
    double implicit_harder(double x) {
      asm("### Library, harder");
      return std::pow(x, 2) - std::pow(std::sin(x), 2) / (1 - std::pow(std::tan(x), 2));
    }
    

    With -O1 (the lowest optimization level), the assembly is once again identical:

    # 14 "square.cpp" 1
        ### Explicit, harder
    # 0 "" 2
        call    _sin
        movd    %xmm0, %rbp
        movd    %rbx, %xmm0
        call    _tan
        movd    %rbx, %xmm3
        mulsd   %xmm3, %xmm3
        movd    %rbp, %xmm1
        mulsd   %xmm1, %xmm1
        mulsd   %xmm0, %xmm0
        movsd   LC0(%rip), %xmm2
        subsd   %xmm0, %xmm2
        divsd   %xmm2, %xmm1
        subsd   %xmm1, %xmm3
        movapd  %xmm3, %xmm0
        addq    $8, %rsp
    LCFI3:
        popq    %rbx
    LCFI4:
        popq    %rbp
    LCFI5:
        ret
    LFE239:
        .globl __Z15implicit_harderd
    __Z15implicit_harderd:
    LFB240:
        pushq   %rbp
    LCFI6:
        pushq   %rbx
    LCFI7:
        subq    $8, %rsp
    LCFI8:
        movd    %xmm0, %rbx
    # 19 "square.cpp" 1
        ### Library, harder
    # 0 "" 2
        call    _sin
        movd    %xmm0, %rbp
        movd    %rbx, %xmm0
        call    _tan
        movd    %rbx, %xmm3
        mulsd   %xmm3, %xmm3
        movd    %rbp, %xmm1
        mulsd   %xmm1, %xmm1
        mulsd   %xmm0, %xmm0
        movsd   LC0(%rip), %xmm2
        subsd   %xmm0, %xmm2
        divsd   %xmm2, %xmm1
        subsd   %xmm1, %xmm3
        movapd  %xmm3, %xmm0
        addq    $8, %rsp
    LCFI9:
        popq    %rbx
    LCFI10:
        popq    %rbp
    LCFI11:
        ret
    

    Finally: the x * x approach does not require including <cmath>, which would make your compilation ever so slightly faster, all else being equal.

  • 2020-12-03 04:58

    This question touches on one of the key weaknesses of most implementations of C and C++ regarding scientific programming. Having switched from Fortran to C about twenty years ago, and later to C++, I find this remains one of those sore spots that occasionally makes me wonder whether that switch was a good thing to do.

    The problem in a nutshell:

    • The easiest way to implement pow is Type pow(Type x, Type y) {return exp(y*log(x));}
    • Most C and C++ compilers take the easy way out.
    • Some might 'do the right thing', but only at high optimization levels.
    • Compared to x*x, the easy way out with pow(x,2) is extremely expensive computationally and loses precision.
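To make the cost of the "easy way out" concrete, here is a minimal sketch. `naive_pow` and `square` are hypothetical helpers written for illustration; no particular library is claimed to implement pow this way.

```cpp
#include <cmath>

// Hypothetical helper: the exp/log formulation described above.
// It makes two transcendental calls, each with its own rounding error,
// and std::log is undefined over the reals for negative x.
double naive_pow(double x, double y)
{
    return std::exp(y * std::log(x));
}

// Plain multiplication for comparison: one correctly rounded operation.
double square(double x)
{
    return x * x;
}
```

For a negative argument such as -3.0, `naive_pow(-3.0, 2.0)` yields NaN because the logarithm of a negative number is NaN, whereas `square(-3.0)` is exactly 9.0. For positive arguments the naive form is only approximately equal to the product.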

    Compare to languages aimed at scientific programming:

    • You don't write pow(x,y). These languages have a built-in exponentiation operator. That C and C++ have steadfastly refused to implement an exponentiation operator makes the blood of many scientific programmers boil. To some diehard Fortran programmers, this alone is reason to never switch to C.
    • Fortran (and other languages) are required to 'do the right thing' for all small integer powers, where small is any integer between -12 and 12. (The compiler is non-compliant if it can't 'do the right thing'.) Moreover, they are required to do so with optimization off.
    • Many Fortran compilers also know how to extract some rational roots without resorting to the easy way out.
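As a rough idea of what "doing the right thing" for integer powers can look like, here is a square-and-multiply sketch. `ipow` is a hypothetical helper; this is not a description of how any specific Fortran compiler lowers its exponentiation operator.

```cpp
// Hypothetical helper: x raised to an integer power n using only
// multiplications (plus one division when n is negative).
double ipow(double x, int n)
{
    // Work with a non-negative exponent; long long avoids overflow
    // when negating the most negative int.
    long long e = n;
    if (e < 0) e = -e;

    double result = 1.0;
    double base = x;
    while (e > 0) {
        if (e & 1) result *= base;  // this power of two is part of the exponent
        base *= base;               // x, x^2, x^4, ...
        e >>= 1;
    }
    return n < 0 ? 1.0 / result : result;
}
```

With this, `ipow(x, 2)` reduces to a single multiplication of x by itself, and an exponent of 6 needs only a handful of multiplications with no call to exp or log.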

    There is an issue with relying on high optimization levels to 'do the right thing'. I have worked for multiple organizations that have banned use of optimization in safety critical software. Memories can be very long (multiple decades long) after losing 10 million dollars here, 100 million there, all due to bugs in some optimizing compiler.

    IMHO, one should never use pow(x,2) in C or C++. I'm not alone in this opinion. Programmers who do use pow(x,2) typically get reamed big time during code reviews.

  • 2020-12-03 05:02

    Not only is x*x clearer, it will certainly be at least as fast as pow(x,2).
