What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?

后端 未结 27 2771
终归单人心
终归单人心 2020-11-22 03:35

If I have some integer n, and I want to know the position of the most significant bit (that is, if the least significant bit is on the right, I want to know the position of

27条回答
  •  感情败类
    2020-11-22 04:14

    Some overly complex answers here. The Debruin technique should only be used when the input is already a power of two, otherwise there's a better way. For a power of 2 input, Debruin is the absolute fastest, even faster than _BitScanReverse on any processor I've tested. However, in the general case, _BitScanReverse (or whatever the intrinsic is called in your compiler) is the fastest (on certain CPU's it can be microcoded though).

    If the intrinsic function is not an option, here is an optimal software solution for processing general inputs.

    u8  inline log2 (u32 val)  {
        u8  k = 0;
        if (val > 0x0000FFFFu) { val >>= 16; k  = 16; }
        if (val > 0x000000FFu) { val >>= 8;  k |= 8;  }
        if (val > 0x0000000Fu) { val >>= 4;  k |= 4;  }
        if (val > 0x00000003u) { val >>= 2;  k |= 2;  }
        k |= (val & 2) >> 1;
        return k;
    }
    

    Note that this version does not require a Debruin lookup at the end, unlike most of the other answers. It computes the position in place.

    Tables can be preferable though, if you call it repeatedly enough times, the risk of a cache miss becomes eclipsed by the speedup of a table.

    u8 kTableLog2[256] = {
    0,0,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
    5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
    6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
    6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7
    };
    
    u8 log2_table(u32 val)  {
        u8  k = 0;
        if (val > 0x0000FFFFuL) { val >>= 16; k  = 16; }
        if (val > 0x000000FFuL) { val >>=  8; k |=  8; }
        k |= kTableLog2[val]; // precompute the Log2 of the low byte
    
        return k;
    }
    

    This should produce the highest throughput of any of the software answers given here, but if you only call it occasionally, prefer a table-free solution like my first snippet.

提交回复
热议问题