Does a C/C++ compiler optimize constant divisions by a power-of-two value into shifts?

清歌不尽 2020-11-30 05:49

Question says it all. Does anyone know if the following...

size_t div(size_t value) {
    const size_t x = 64;
    return value / x;
}

...i

4 answers
  • 2020-11-30 06:06

    Even with g++ -O0 (yes, -O0!), this happens. Your function compiles down to:

    _Z3divm:
    .LFB952:
            pushq   %rbp
    .LCFI0:
            movq    %rsp, %rbp
    .LCFI1:
            movq    %rdi, -24(%rbp)
            movq    $64, -8(%rbp)
            movq    -24(%rbp), %rax
            shrq    $6, %rax
            leave
            ret
    

    Note the shrq $6, which is a right shift by 6 places.

    With -O1, the unnecessary junk is removed:

    _Z3divm:
    .LFB1023:
            movq    %rdi, %rax
            shrq    $6, %rax
            ret
    

    Results on g++ 4.3.3, x64.
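
To make the equivalence concrete, here is a minimal sketch that puts the question's division next to a hand-written shift. The names div_by_64, shift_by_6 and check_unsigned_equivalence are made up for illustration; for an unsigned operand the two always agree, which is why the compiler is free to emit shrq.

```cpp
#include <cassert>
#include <cstddef>

// The division as written in the question (hypothetical name).
std::size_t div_by_64(std::size_t value) {
    const std::size_t x = 64;
    return value / x;
}

// The hand-written equivalent: 64 == 1 << 6.
std::size_t shift_by_6(std::size_t value) {
    return value >> 6;
}

// Spot-check that both forms agree for a range of unsigned inputs.
void check_unsigned_equivalence() {
    for (std::size_t v = 0; v < 100000; ++v)
        assert(div_by_64(v) == shift_by_6(v));
}
```

Since the results are identical, writing value / 64 in the source costs nothing and says what is meant.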

  • 2020-11-30 06:08

    Yes, compilers generally generate very good code for such simple calculations. However, it is not clear to me why you insist specifically on "shifts". The optimal code for a given platform might easily turn out to be something other than a "shift".

    In general, the old and beaten-to-death idea that a "shift" is somehow the optimal way to implement power-of-two multiplications and divisions has very little practical relevance on modern platforms. It is a good way to illustrate the concept of "optimization" to newbies, but no more than that.

    Your original example is not really representative, because it uses an unsigned type, which greatly simplifies the implementation of the division operation. The "round towards zero" requirement of the C and C++ languages makes it impossible to do division with a mere shift if the operand is signed.
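
A small sketch of the signed case may help here. The function names are invented for illustration, and the shift version assumes that >> on a negative signed value is an arithmetic shift (guaranteed from C++20, implementation-defined but typical before that).

```cpp
#include <cassert>
#include <cstdint>

// Language-level division: the quotient is truncated toward zero.
std::int32_t trunc_div2(std::int32_t x) { return x / 2; }

// Plain shift. Assumption: arithmetic shift for negative values,
// which rounds toward negative infinity instead.
std::int32_t shift_div2(std::int32_t x) { return x >> 1; }

void check_rounding_difference() {
    assert(trunc_div2(7) == 3 && shift_div2(7) == 3);  // non-negative: same result
    assert(trunc_div2(-7) == -3);                      // rounded toward zero
    assert(shift_div2(-7) == -4);                      // rounded toward -infinity
    assert(trunc_div2(-8) == shift_div2(-8));          // exact multiples agree
}
```

So for signed operands the compiler has to add a fix-up around the shift, or use something else entirely.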

  • 2020-11-30 06:11

    Only when it can determine that the argument is non-negative. That's the case for your example, but ever since C99 specified round-towards-zero semantics for integer division, it has become harder to optimize division by powers of two into shifts, because the two give different results for negative arguments.

    In reaction to Michael's comment below, here is one way the division r = x / p of x by a known power of two p can indeed be translated by the compiler:

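    if (x < 0)
      x += p - 1;       /* bias negative dividends so the shift rounds toward zero */
    r = x >> log2(p);   /* arithmetic right shift; log2(p) is a compile-time constant */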
    

    Since the OP was asking whether he should think about these things, one possible answer would be "only if you know the dividend's sign better than the compiler or know that it doesn't matter if the result is rounded towards 0 or -∞".
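
The bias trick above can be written out as a runnable sketch. The name div_pow2 and the parameter k (the exponent, so p == 1 << k) are illustrative only, and the code again assumes an arithmetic right shift for negative values.

```cpp
#include <cassert>
#include <cstdint>

// x / (1 << k) without a divide instruction, for 0 <= k <= 30.
// Assumption: >> on a negative value is an arithmetic shift.
std::int32_t div_pow2(std::int32_t x, unsigned k) {
    const std::int32_t p = std::int32_t(1) << k;
    if (x < 0)
        x += p - 1;   // bias so the shift rounds toward zero
    return x >> k;
}

// Compare against the built-in division for a range of inputs.
void check_div_pow2() {
    for (unsigned k = 0; k <= 10; ++k)
        for (std::int32_t x = -5000; x <= 5000; ++x)
            assert(div_pow2(x, k) == x / (std::int32_t(1) << k));
}
```

If you know x is never negative, declaring it unsigned lets the compiler drop the branch and the add.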

  • 2020-11-30 06:20

    Most compilers will go even further than reducing division by powers of 2 into shifts - they'll often convert integer division by a constant into a series of multiplication, shift, and addition instructions to get the result instead of using the CPU's built-in divide instruction (if there even is one).

    For example, MSVC converts division by 71 to the following:

    // volatile int y = x / 71;
    
    8b 0c 24        mov ecx, DWORD PTR _x$[esp+8] ; load x into ecx
    
    b8 49 b4 c2 e6  mov eax, -423447479 ; magic happens starting here...
    f7 e9           imul ecx            ; edx:eax = x * 0xe6c2b449
    
    03 d1           add edx, ecx        ; edx = x + edx
    
    c1 fa 06        sar edx, 6          ; edx >>= 6 (with sign fill)
    
    8b c2           mov eax, edx        ; eax = edx
    c1 e8 1f        shr eax, 31         ; eax >>= 31 (no sign fill)
    03 c2           add eax, edx        ; eax += edx
    
    89 04 24        mov DWORD PTR _y$[esp+8], eax
    

    So, you get a divide by 71 with a multiply, a couple of shifts and a couple of adds.
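
For readers who prefer C++ to disassembly, the same sequence can be sketched roughly as follows. This is a transcription of the listing above, not compiler output; the names div71 and check_div71 are made up, and the code assumes an arithmetic right shift for negative signed values.

```cpp
#include <cassert>
#include <cstdint>

// Step-by-step transcription of the listing, one line per instruction.
std::int32_t div71(std::int32_t x) {
    const std::int64_t magic = -423447479;        // 0xE6C2B449 read as signed 32-bit
    const std::int64_t product = x * magic;       // imul ecx: full 64-bit product
    std::int32_t t = static_cast<std::int32_t>(product >> 32);  // edx = high half
    t += x;                                       // add edx, ecx
    t >>= 6;                                      // sar edx, 6
    const std::int32_t sign =
        static_cast<std::int32_t>(static_cast<std::uint32_t>(t) >> 31);  // shr eax, 31
    return t + sign;                              // add eax, edx
}

// Compare against the built-in division, including the extremes.
void check_div71() {
    for (std::int32_t x = -100000; x <= 100000; ++x)
        assert(div71(x) == x / 71);
    assert(div71(INT32_MAX) == INT32_MAX / 71);
    assert(div71(INT32_MIN) == INT32_MIN / 71);
}
```

The final add of the sign bit is what turns the shift's round-toward-negative-infinity into the round-toward-zero that the language requires.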

    For more details on what's going on, consult Henry Warren's "Hacker's Delight" book or the companion webpage:

    • http://www.hackersdelight.org/

    There's an online added chapter that provides some additional information about division by constants using multiplication/shift/add with magic numbers, and a page with a little JavaScript program that'll calculate the magic numbers you need.
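
As a sanity check on the constant in the listing above (and not a replacement for the book's algorithm), the multiplier for 71 with a shift of 6 works out to the ceiling of 2^38 / 71, truncated to 32 bits. The helper below is a hypothetical sketch that only reproduces this one relationship; picking the right shift amount for an arbitrary divisor is the part the book and the calculator page cover.

```cpp
#include <cassert>
#include <cstdint>

// ceil(2^(32 + shift) / divisor), keeping only the low 32 bits.
// Hypothetical helper; valid for divisor >= 1 and shift <= 31.
std::uint32_t magic_multiplier(std::uint32_t divisor, unsigned shift) {
    const std::uint64_t power = std::uint64_t(1) << (32 + shift);
    return static_cast<std::uint32_t>((power + divisor - 1) / divisor);
}

void check_magic_for_71() {
    // Same bit pattern as the -423447479 loaded into eax in the listing.
    assert(magic_multiplier(71, 6) == 0xE6C2B449u);
}
```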

    The companion site for the book is well worth reading (as is the book) - particularly if you're interested in bit-level micro optimizations.

    Another article that I discovered just now that discusses this optimization: http://blogs.msdn.com/devdev/archive/2005/12/12/502980.aspx
