Efficient Algorithm for Bit Reversal (from MSB->LSB to LSB->MSB) in C

情深已故 2020-11-22 06:08

What is the most efficient algorithm to achieve the following:

0010 0000 => 0000 0100

The conversion is from MSB->LSB to LSB->MSB. All bits must be reversed.

26 Answers
  • 2020-11-22 06:59

    I think this is one of the simplest ways to reverse the bits. Please let me know if there is any flaw in this logic. For each bit position, we check the value of the bit and, if it is 1, set the bit at the mirrored position.

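    #include <stdint.h>

    void bit_reverse(uint32_t *data)
    {
        uint32_t temp = 0;
        const unsigned bit_len = 31;   /* index of the most significant bit */

        for (unsigned i = 0; i <= bit_len; i++)
        {
            /* If bit i of the input is set, set the mirrored bit (bit_len - i).
               Unsigned constants avoid shifting a 1 into the sign bit of an int. */
            if (*data & (UINT32_C(1) << i))
                temp |= UINT32_C(1) << (bit_len - i);
        }
        *data = temp;
    }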
    
  • 2020-11-22 07:00

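    The native ARM instruction "rbit" (available on ARMv6T2 and later cores) reverses a 32-bit register in a single instruction, typically in one CPU cycle and with at most one extra register. That is hard to beat.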
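
    From C, one way to reach such an instruction is through a compiler builtin. The sketch below assumes Clang's __builtin_bitreverse32, which Clang can lower to "rbit" on ARM targets; the helper name reverse32 and the portable fallback loop are illustrative and not part of the answer above.

```c
#include <stdint.h>

/* Older compilers may not provide __has_builtin; treat that as "no builtin". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* reverse32 is a hypothetical helper name used for this sketch. */
uint32_t reverse32(uint32_t x)
{
#if __has_builtin(__builtin_bitreverse32)
    /* Assumption: the compiler offers this builtin (Clang does). */
    return __builtin_bitreverse32(x);
#else
    /* Portable fallback: move bits one at a time from x into r. */
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
#endif
}
```

    Whether the builtin actually becomes a single "rbit" depends on the target and compiler flags, so it is worth checking the generated assembly.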

  • 2020-11-22 07:00

    I know it isn't C, but here it is in 16-bit x86 assembly:

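    var1 dw 0f0f0h

         push ax
         push cx
         mov cx, 16       ; 16 bits to process
    loop1:
         shl var1, 1      ; top bit of var1 -> carry flag
         rcr ax, 1        ; carry flag -> top bit of ax
         loop loop1
         mov var1, ax     ; store the reversed word before ax is restored
         pop cx           ; restore in the reverse order of the pushes
         pop ax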
    

    This works through the carry flag, so you may want to save the flags as well (pushf/popf).
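
    For readers who prefer C, the carry-based idea can be sketched as follows: each iteration shifts the top bit out of the input (the role the carry flag plays) and shifts it into the top of the result. The function name reverse16 is illustrative, and this is a rough equivalent rather than a translation of the assembly.

```c
#include <stdint.h>

/* reverse16 is a hypothetical name; it mirrors the 16 bits of v. */
uint16_t reverse16(uint16_t v)
{
    uint16_t r = 0;
    for (int i = 0; i < 16; i++) {
        unsigned carry = (v >> 15) & 1u;            /* bit leaving the top of v */
        v = (uint16_t)(v << 1);                     /* shift the input left */
        r = (uint16_t)((r >> 1) | (carry << 15));   /* feed the bit into the top of r */
    }
    return r;
}
```

    After 16 iterations the first bit shifted out (the original MSB) has travelled down to bit 0 of the result.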

  • 2020-11-22 07:00

    The simplest method I know follows. MSB is the input and LSB is the 'reversed' output:

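    unsigned char rev(unsigned char MSB) {
        unsigned char LSB = 0;  // for output
        for (int i = 0; i < 8; i++) {
            LSB = LSB << 1;              // make room for the next bit
            if (MSB & 1) LSB = LSB | 1;  // copy the lowest input bit
            MSB = MSB >> 1;              // advance to the next input bit
        }
        return LSB;
    }
    
    //    It works by shifting the input and the output in opposite directions.
    //    Just repeat for each byte.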
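
    To make "repeat for each byte" concrete for a 32-bit word: reversing the bits inside each byte is only half the job, because the bytes themselves must also trade places. A minimal sketch, where rev8 and rev32 are illustrative names:

```c
#include <stdint.h>

/* Reverse the 8 bits of one byte: shift the input right, the output left. */
uint8_t rev8(uint8_t in)
{
    uint8_t out = 0;
    for (int i = 0; i < 8; i++) {
        out = (uint8_t)((out << 1) | (in & 1u));
        in >>= 1;
    }
    return out;
}

/* Reverse a 32-bit word: bit-reverse each byte and mirror the byte order. */
uint32_t rev32(uint32_t x)
{
    uint32_t r = 0;
    for (int b = 0; b < 4; b++) {
        uint8_t byte = (uint8_t)(x >> (8 * b));       /* byte b, counted from the low end */
        r |= (uint32_t)rev8(byte) << (8 * (3 - b));   /* it lands in byte 3 - b */
    }
    return r;
}
```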
    
  • 2020-11-22 07:00

    It seems that many other posts are concerned with speed (i.e. best = fastest). What about simplicity? Consider:

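    unsigned char ReverseBits(unsigned char character) {
        unsigned char reversed_character = 0;
        for (int i = 0; i < 8; i++) {
            unsigned char ith_bit = (character >> i) & 1;
            // bit i moves to position 7 - i (a byte is assumed to have 8 bits)
            reversed_character |= (unsigned char)(ith_bit << (7 - i));
        }
        return reversed_character;
    }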
    

    and hope that a clever compiler will optimise it for you.

    If you want to reverse a longer list of bits (containing 8 * n bits, stored in n bytes), you can build on this function:

    void ReverseNumber(unsigned char* number, int bit_count_in_number) {
        // assumes bit_count_in_number is a multiple of 8
        int bytes_occupied = bit_count_in_number / 8;
    
        // first reverse bytes
        for (int i = 0; i < bytes_occupied / 2; i++) {
            unsigned char tmp = number[i];
            number[i] = number[bytes_occupied - 1 - i];
            number[bytes_occupied - 1 - i] = tmp;
        }
    
        // then reverse bits of each individual byte
        for (int i = 0; i < bytes_occupied; i++) {
             number[i] = ReverseBits(number[i]);
        }
    }
    

    This would reverse [10000000, 10101010] into [01010101, 00000001].
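
    The example can be checked with a small self-contained sketch. The names reverse_byte and reverse_bit_string are illustrative, and the input is assumed to occupy a whole number of 8-bit bytes.

```c
#include <stddef.h>

/* Mirror the 8 bits of a single byte. */
unsigned char reverse_byte(unsigned char c)
{
    unsigned char r = 0;
    for (int i = 0; i < 8; i++) {
        r |= (unsigned char)(((c >> i) & 1u) << (7 - i));
    }
    return r;
}

/* Reverse a bit string stored in byte_count bytes, in place. */
void reverse_bit_string(unsigned char *bytes, size_t byte_count)
{
    /* Step 1: mirror the byte order. */
    for (size_t i = 0; i < byte_count / 2; i++) {
        unsigned char tmp = bytes[i];
        bytes[i] = bytes[byte_count - 1 - i];
        bytes[byte_count - 1 - i] = tmp;
    }
    /* Step 2: mirror the bits inside every byte. */
    for (size_t i = 0; i < byte_count; i++) {
        bytes[i] = reverse_byte(bytes[i]);
    }
}
```

    Calling reverse_bit_string on the two bytes {0x80, 0xAA} leaves {0x55, 0x01}, which matches the example above.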

  • 2020-11-22 07:03

    Anders Cedronius's answer provides a great solution for people who have an x86 CPU with AVX2 support. For x86 platforms without AVX2 support, or for non-x86 platforms, either of the following implementations should work well.

    The first code is a variant of the classic binary partitioning method, coded to maximize the use of the shift-plus-logic idiom useful on various ARM processors. In addition, it uses on-the-fly mask generation which could be beneficial for RISC processors that otherwise require multiple instructions to load each 32-bit mask value. Compilers for x86 platforms should use constant propagation to compute all masks at compile time rather than run time.

    /* Classic binary partitioning algorithm */
    inline uint32_t brev_classic (uint32_t a)
    {
        uint32_t m;
        a = (a >> 16) | (a << 16);                            // swap halfwords
        m = 0x00ff00ff; a = ((a >> 8) & m) | ((a << 8) & ~m); // swap bytes
        m = m^(m << 4); a = ((a >> 4) & m) | ((a << 4) & ~m); // swap nibbles
        m = m^(m << 2); a = ((a >> 2) & m) | ((a << 2) & ~m); // swap bit pairs
        m = m^(m << 1); a = ((a >> 1) & m) | ((a << 1) & ~m); // swap adjacent bits
        return a;
    }
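
    To see what the on-the-fly mask generation produces, the sketch below isolates the two building blocks: deriving the next mask and performing one swap step. The function names next_mask and swap_fields are illustrative and do not appear in the code above.

```c
#include <stdint.h>

/* Derive the next, finer-grained mask by XORing the current mask with
   itself shifted left by the new field width. */
uint32_t next_mask(uint32_t m, unsigned shift)
{
    return m ^ (m << shift);
}

/* One partitioning step: exchange adjacent fields of `shift` bits,
   where m selects the lower field of each pair. */
uint32_t swap_fields(uint32_t a, uint32_t m, unsigned shift)
{
    return ((a >> shift) & m) | ((a << shift) & ~m);
}
```

    Starting from 0x00ff00ff, next_mask with shifts of 4, 2 and 1 yields 0x0f0f0f0f, 0x33333333 and 0x55555555 in turn, which are the constants a table-driven version would otherwise have to load.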
    

    In volume 4A of "The Art of Computer Programming", D. Knuth shows clever ways of reversing bits that, somewhat surprisingly, require fewer operations than the classical binary partitioning algorithms. One such algorithm for 32-bit operands, which I cannot find in TAOCP, is shown in a document on the Hacker's Delight website (see the URL in the code comment below).

    /* Knuth's algorithm from http://www.hackersdelight.org/revisions.pdf. Retrieved 8/19/2015 */
    inline uint32_t brev_knuth (uint32_t a)
    {
        uint32_t t;
        a = (a << 15) | (a >> 17);
        t = (a ^ (a >> 10)) & 0x003f801f; 
        a = (t + (t << 10)) ^ a;
        t = (a ^ (a >>  4)) & 0x0e038421; 
        a = (t + (t <<  4)) ^ a;
        t = (a ^ (a >>  2)) & 0x22488842; 
        a = (t + (t <<  2)) ^ a;
        return a;
    }
    

    Using the Intel C/C++ compiler 13.1.3.198, both of the above functions auto-vectorize nicely, targeting XMM registers. They could also be vectorized manually without a lot of effort.

    On my Ivy Bridge Xeon E3 1270v2, using the auto-vectorized code, 100 million uint32_t words were bit-reversed in 0.070 seconds using brev_classic(), and in 0.068 seconds using brev_knuth(). I took care to ensure that my benchmark was not limited by system memory bandwidth.
