Position of least significant bit that is set

时光取名叫无心 2020-11-22 08:46

I am looking for an efficient way to determine the position of the least significant bit that is set in an integer, e.g. for 0x0FF0 it would be 4.

A trivial impleme

23 Answers
  • 2020-11-22 09:04

    Here is one simple alternative, although computing a logarithm is relatively costly.

    if (n == 0)
      return 0;                      // no bit is set
    return (int)log2(n & -n) + 1;    // bit index starts from 1; log2 needs <math.h>
    
  • 2020-11-22 09:06

    Inspired by this similar post that involves searching for a set bit, I offer the following:

    unsigned GetLowestBitPos(unsigned value)
    {
       double d = value ^ (value - !!value); 
       return (((int*)&d)[1]>>20)-1023; 
    }
    

    Pros:

    • no loops
    • no branching
    • runs in constant time
    • handles value=0 by returning an otherwise-out-of-bounds result
    • only two lines of code

    Cons:

    • assumes little endianness as coded (can be fixed by changing the constants)
    • assumes that double is a real*8 IEEE float (IEEE 754)

    Update: As pointed out in the comments, a union is a cleaner implementation (for C, at least) and would look like:

    unsigned GetLowestBitPos(unsigned value)
    {
        union {
            int i[2];
            double d;
        } temp = { .d = value ^ (value - !!value) };
        return (temp.i[1] >> 20) - 1023;
    }
    

    This assumes 32-bit ints with little-endian storage for everything (think x86 processors).
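
The same exponent trick can also be written without reading the double through an int pointer or a union member. The following is a minimal sketch, assuming a 64-bit IEEE 754 double whose byte order matches that of uint64_t; the function name lowest_bit_pos_via_double is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* 0-based position of the lowest set bit, read from the exponent of a double.
   Assumption: double is a 64-bit IEEE 754 value with the same byte order
   as uint64_t. For value == 0 the result is out of range, as in the
   versions above. */
unsigned lowest_bit_pos_via_double(uint32_t value)
{
    /* value ^ (value - 1) sets every bit up to and including the lowest set
       bit, so the result lies in [2^k, 2^(k+1)) and its exponent is k. */
    double d = (double)(value ^ (value - !!value));
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);            /* copy the raw bit pattern */
    return (unsigned)((bits >> 52) & 0x7FFu) - 1023u;  /* remove the bias */
}
```

For 0x0FF0 the intermediate value is 0x1F (31.0), whose unbiased exponent is 4.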

  • 2020-11-22 09:08

    This is in regard to @Anton Tykhyy's answer.

    Here is my C++11 constexpr implementation, which does away with casts and removes a warning on VC++17 by truncating a 64-bit result to 32 bits:

    #include <cstdint>

    constexpr uint32_t DeBruijnSequence[32] =
    {
        0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
        31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
    };
    constexpr uint32_t ffs ( uint32_t value )
    {
        // ~value + 1u negates in unsigned arithmetic; negating a signed cast
        // overflows for 0x80000000, which is not allowed in a constant expression.
        return  DeBruijnSequence[ 
            (( ( value & ( ~value + 1u ) ) * 0x077CB531ULL ) & 0xFFFFFFFF)
                >> 27];
    }
    

    To get around the issue of 0x1 and 0x0 both returning 0 you can do:

    constexpr uint32_t ffs ( uint32_t value )
    {
        // 32 marks "no bit set"; ~value + 1u negates without signed overflow.
        return (!value) ? 32 : DeBruijnSequence[ 
            (( ( value & ( ~value + 1u ) ) * 0x077CB531ULL ) & 0xFFFFFFFF)
                >> 27];
    }
    

    but if the compiler can't or won't evaluate the call at compile time, the extra check adds a couple of cycles to the calculation.

    Finally, if you are interested, here is a list of static asserts that check the code does what it is intended to do:

    static_assert (ffs(0x1) == 0, "Find First Bit Set Failure.");
    static_assert (ffs(0x2) == 1, "Find First Bit Set Failure.");
    static_assert (ffs(0x4) == 2, "Find First Bit Set Failure.");
    static_assert (ffs(0x8) == 3, "Find First Bit Set Failure.");
    static_assert (ffs(0x10) == 4, "Find First Bit Set Failure.");
    static_assert (ffs(0x20) == 5, "Find First Bit Set Failure.");
    static_assert (ffs(0x40) == 6, "Find First Bit Set Failure.");
    static_assert (ffs(0x80) == 7, "Find First Bit Set Failure.");
    static_assert (ffs(0x100) == 8, "Find First Bit Set Failure.");
    static_assert (ffs(0x200) == 9, "Find First Bit Set Failure.");
    static_assert (ffs(0x400) == 10, "Find First Bit Set Failure.");
    static_assert (ffs(0x800) == 11, "Find First Bit Set Failure.");
    static_assert (ffs(0x1000) == 12, "Find First Bit Set Failure.");
    static_assert (ffs(0x2000) == 13, "Find First Bit Set Failure.");
    static_assert (ffs(0x4000) == 14, "Find First Bit Set Failure.");
    static_assert (ffs(0x8000) == 15, "Find First Bit Set Failure.");
    static_assert (ffs(0x10000) == 16, "Find First Bit Set Failure.");
    static_assert (ffs(0x20000) == 17, "Find First Bit Set Failure.");
    static_assert (ffs(0x40000) == 18, "Find First Bit Set Failure.");
    static_assert (ffs(0x80000) == 19, "Find First Bit Set Failure.");
    static_assert (ffs(0x100000) == 20, "Find First Bit Set Failure.");
    static_assert (ffs(0x200000) == 21, "Find First Bit Set Failure.");
    static_assert (ffs(0x400000) == 22, "Find First Bit Set Failure.");
    static_assert (ffs(0x800000) == 23, "Find First Bit Set Failure.");
    static_assert (ffs(0x1000000) == 24, "Find First Bit Set Failure.");
    static_assert (ffs(0x2000000) == 25, "Find First Bit Set Failure.");
    static_assert (ffs(0x4000000) == 26, "Find First Bit Set Failure.");
    static_assert (ffs(0x8000000) == 27, "Find First Bit Set Failure.");
    static_assert (ffs(0x10000000) == 28, "Find First Bit Set Failure.");
    static_assert (ffs(0x20000000) == 29, "Find First Bit Set Failure.");
    static_assert (ffs(0x40000000) == 30, "Find First Bit Set Failure.");
    static_assert (ffs(0x80000000) == 31, "Find First Bit Set Failure.");
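
The lookup table is not arbitrary: entry h holds the bit position i for which the top five bits of (1 << i) * 0x077CB531 equal h. The sketch below, in plain C, rebuilds the table from the constant and uses it; the names build_debruijn_table and lowest_bit_pos_debruijn are illustrative and not part of the answer above.

```c
#include <stdint.h>

/* Fill table[] so that table[hash(1 << i)] == i for every bit position i,
   where hash(x) is the top 5 bits of the 32-bit product x * 0x077CB531. */
void build_debruijn_table(uint8_t table[32])
{
    for (unsigned i = 0; i < 32; ++i) {
        uint32_t isolated = (uint32_t)1 << i;   /* value with a single set bit */
        uint32_t hash = (uint32_t)(isolated * 0x077CB531u) >> 27;
        table[hash] = (uint8_t)i;
    }
}

/* 0-based position of the lowest set bit; value must be nonzero. */
unsigned lowest_bit_pos_debruijn(uint32_t value, const uint8_t table[32])
{
    uint32_t isolated = value & (~value + 1u);  /* keep only the lowest set bit */
    return table[(uint32_t)(isolated * 0x077CB531u) >> 27];
}
```

Because the constant is a De Bruijn sequence, the 32 possible single-bit inputs produce 32 distinct hashes, so every table slot is written exactly once.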
    
  • 2020-11-22 09:09

    Why not use a binary search? It always completes after 5 comparisons (assuming a 4-byte int). The positions returned below are 1-based:

    if (0x0000FFFF & value) {
        if (0x000000FF & value) {
            if (0x0000000F & value) {
                if (0x00000003 & value) {
                    if (0x00000001 & value) {
                        return 1;
                    } else {
                        return 2;
                    }
                } else {
                    if (0x00000004 & value) {
                        return 3;
                    } else {
                        return 4;
                    }
                }
            } else { ...
        } else { ...
    } else { ...
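
The elided branches follow the same pattern. As a compact alternative, the same halving search can be written with shifts instead of nested branches. This is a sketch, not the answer's original code: the function name is made up, and returning 0 for a zero input is an added convention.

```c
#include <stdint.h>

/* 1-based position of the lowest set bit, found by halving the search
   range five times. Returns 0 when value == 0 (an assumed convention). */
unsigned lowest_bit_binary_search(uint32_t value)
{
    unsigned pos = 1;

    if (value == 0)
        return 0;
    /* If the low half is empty, discard it and remember how far we shifted. */
    if ((value & 0x0000FFFFu) == 0) { value >>= 16; pos += 16; }
    if ((value & 0x000000FFu) == 0) { value >>= 8;  pos += 8;  }
    if ((value & 0x0000000Fu) == 0) { value >>= 4;  pos += 4;  }
    if ((value & 0x00000003u) == 0) { value >>= 2;  pos += 2;  }
    if ((value & 0x00000001u) == 0) { pos += 1; }
    return pos;
}
```

For 0x0FF0 only the 4-bit step shifts, so the function returns 5, the 1-based counterpart of the 0-based position 4 from the question.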
    
  • 2020-11-22 09:09

    According to the Chess Programming BitScan page and my own measurements, subtract and xor is faster than negate and mask.

    (Note that if you are going to count the trailing zeros in 0, the method as I have it returns 63, whereas negate and mask returns 0.)
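
To make the two names concrete, here is a small sketch of the value each method feeds into the De Bruijn multiplication. The function names are illustrative and do not come from the BitScan page.

```c
#include <stdint.h>

/* "Negate and mask": keeps only the lowest set bit of v.
   ~v + 1u is the two's-complement negation in unsigned arithmetic. */
uint64_t isolate_lowest_bit(uint64_t v)
{
    return v & (~v + 1u);
}

/* "Subtract and xor": sets every bit from bit 0 up to and including
   the lowest set bit of v. */
uint64_t mask_through_lowest_bit(uint64_t v)
{
    return v ^ (v - 1u);
}
```

For 0x0FF0 the first yields 0x10 and the second 0x1F. For 0 they yield 0 and all ones respectively, which is why the two methods land on different table entries for a zero input.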

    Here is a 64-bit subtract and xor:

    #include <stdint.h>

    uint64_t v;       // find the number of trailing zeros in 64-bit v 
    int r;            // result goes here
    static const int MultiplyDeBruijnBitPosition[64] = 
    {
      0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61,
      54, 58, 35, 52, 50, 42, 21, 44, 38, 32, 29, 23, 17, 11, 4, 62,
      46, 55, 26, 59, 40, 36, 15, 53, 34, 51, 20, 43, 31, 22, 10, 45,
      25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63
    };
    // The product must stay 64 bits wide so that >> 58 leaves its top 6 bits.
    r = MultiplyDeBruijnBitPosition[((uint64_t)((v ^ (v-1)) * 0x03F79D71B4CB0A89ULL)) >> 58];
    

    For reference, here is a 64-bit version of the negate and mask method:

    #include <stdint.h>

    uint64_t v;       // find the number of trailing zeros in 64-bit v 
    int r;            // result goes here
    static const int MultiplyDeBruijnBitPosition[64] = 
    {
      0, 1, 48, 2, 57, 49, 28, 3, 61, 58, 50, 42, 38, 29, 17, 4,
      62, 55, 59, 36, 53, 51, 43, 22, 45, 39, 33, 30, 24, 18, 12, 5,
      63, 47, 56, 27, 60, 41, 37, 16, 54, 35, 52, 21, 44, 32, 23, 11,
      46, 26, 40, 15, 34, 20, 31, 10, 25, 14, 19, 9, 13, 8, 7, 6
    };
    // The product must stay 64 bits wide so that >> 58 leaves its top 6 bits.
    r = MultiplyDeBruijnBitPosition[((uint64_t)((v & -v) * 0x03F79D71B4CB0A89ULL)) >> 58];
    
  • 2020-11-22 09:09

    If you have the resources, you can sacrifice memory in order to improve the speed:

    /* MAX_INT is a placeholder for one entry per possible 32-bit value (2^32 entries). */
    static const unsigned bitPositions[MAX_INT] = { 0, 0, 1, 0, 2, /* ... */ };
    
    unsigned GetLowestBitPos(unsigned value)
    {
        assert(value != 0); // handled separately
        return bitPositions[value];
    }
    

    Note: This table would consume at least 4 GB (16 GB if we leave the return type as unsigned). This is an example of trading one limited resource (RAM) for another (execution speed).

    If your function needs to remain portable and run as fast as possible at any cost, this would be the way to go. In most real-world applications, a 4 GB table is unrealistic.
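
A scaled-down variant of the same trade-off keeps a table for a single byte and scans the value one byte at a time. This swaps the full table for a 256-entry one, so it is a different point on the memory/speed curve rather than the method described above. The names and the explicit initialization step are assumptions made for this sketch.

```c
#include <stdint.h>

/* low_bit_table[b] is the 0-based position of the lowest set bit in byte b.
   Entry 0 is unused because a zero byte has no set bit. */
static uint8_t low_bit_table[256];

/* Must be called once before lowest_bit_pos_table(). */
void init_low_bit_table(void)
{
    low_bit_table[0] = 0;
    for (unsigned b = 1; b < 256; ++b) {
        unsigned pos = 0;
        while (((b >> pos) & 1u) == 0)
            ++pos;
        low_bit_table[b] = (uint8_t)pos;
    }
}

/* 0-based position of the lowest set bit; value must be nonzero. */
unsigned lowest_bit_pos_table(uint32_t value)
{
    unsigned shift = 0;

    /* Skip whole zero bytes, at most three times for a nonzero value. */
    while ((value & 0xFFu) == 0) {
        value >>= 8;
        shift += 8;
    }
    return shift + low_bit_table[value & 0xFFu];
}
```

The table costs 256 bytes instead of gigabytes, at the price of up to three extra shift-and-test steps per call.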
