Efficient Algorithm for Bit Reversal (from MSB->LSB to LSB->MSB) in C

后端未结

关注

 26  1374

情深已故

What is the most efficient algorithm to achieve the following:

0010 0000 => 0000 0100

The conversion is from MSB->LSB to LSB->MSB. All bits

相关标签:

26条回答

名媛妹妹

2020-11-22 06:59
I thought this is one of the simplest way to reverse the bit. please let me know if there is any flaw in this logic. basically in this logic, we check the value of the bit in position. set the bit if value is 1 on reversed position.
```
void bit_reverse(ui32 *data)
{
  ui32 temp = 0;    
  ui32 i, bit_len;    
  {    
   for(i = 0, bit_len = 31; i <= bit_len; i++)   
   {    
    temp |= (*data & 1 << i)? (1 << bit_len-i) : 0;    
   }    
   *data = temp;    
  }    
  return;    
}    
```
0 讨论(0)
发布评论:

提交评论
- 加载中...
再見小時候

2020-11-22 07:00

Native ARM instruction "rbit" can do it with 1 cpu cycle and 1 extra cpu register, impossible to beat.

0 讨论(0)
发布评论:

提交评论
- 加载中...
自闭症患者

2020-11-22 07:00
I know it isn't C but asm:
```
var1 dw 0f0f0
clc
     push ax
     push cx
     mov cx 16
loop1:
     shl var1
     shr ax
loop loop1
     pop ax
     pop cx
```
This works with the carry bit, so you may save flags too
0 讨论(0)
发布评论:

提交评论
- 加载中...

感情败类

2020-11-22 07:00

I think the simplest method I know follows. MSB is input and LSB is 'reversed' output:

unsigned char rev(char MSB) {
    unsigned char LSB=0;  // for output
    _FOR(i,0,8) {
        LSB= LSB << 1;
        if(MSB&1) LSB = LSB | 1;
        MSB= MSB >> 1;
    }
    return LSB;
}

//    It works by rotating bytes in opposite directions. 
//    Just repeat for each byte.

0 讨论(0)

-上瘾入骨i

2020-11-22 07:00

It seems that many other posts are concerned about speed (i.e best = fastest). What about simplicity? Consider:

char ReverseBits(char character) {
    char reversed_character = 0;
    for (int i = 0; i < 8; i++) {
        char ith_bit = (c >> i) & 1;
        reversed_character |= (ith_bit << (sizeof(char) - 1 - i));
    }
    return reversed_character;
}

and hope that clever compiler will optimise for you.

If you want to reverse a longer list of bits (containing sizeof(char) * n bits), you can use this function to get:

void ReverseNumber(char* number, int bit_count_in_number) {
    int bytes_occupied = bit_count_in_number / sizeof(char);      

    // first reverse bytes
    for (int i = 0; i <= (bytes_occupied / 2); i++) {
        swap(long_number[i], long_number[n - i]);
    }

    // then reverse bits of each individual byte
    for (int i = 0; i < bytes_occupied; i++) {
         long_number[i] = ReverseBits(long_number[i]);
    }
}

This would reverse [10000000, 10101010] into [01010101, 00000001].

0 讨论(0)

隐瞒了意图╮

2020-11-22 07:03
Anders Cedronius's answer provides a great solution for people that have an x86 CPU with AVX2 support. For x86 platforms without AVX support or non-x86 platforms, either of the following implementations should work well.

The first code is a variant of the classic binary partitioning method, coded to maximize the use of the shift-plus-logic idiom useful on various ARM processors. In addition, it uses on-the-fly mask generation which could be beneficial for RISC processors that otherwise require multiple instructions to load each 32-bit mask value. Compilers for x86 platforms should use constant propagation to compute all masks at compile time rather than run time.
```
/* Classic binary partitioning algorithm */
inline uint32_t brev_classic (uint32_t a)
{
    uint32_t m;
    a = (a >> 16) | (a << 16);                            // swap halfwords
    m = 0x00ff00ff; a = ((a >> 8) & m) | ((a << 8) & ~m); // swap bytes
    m = m^(m << 4); a = ((a >> 4) & m) | ((a << 4) & ~m); // swap nibbles
    m = m^(m << 2); a = ((a >> 2) & m) | ((a << 2) & ~m);
    m = m^(m << 1); a = ((a >> 1) & m) | ((a << 1) & ~m);
    return a;
}
```
In volume 4A of "The Art of Computer Programming", D. Knuth shows clever ways of reversing bits that somewhat surprisingly require fewer operations than the classical binary partitioning algorithms. One such algorithm for 32-bit operands, that I cannot find in TAOCP, is shown in this document on the Hacker's Delight website.
```
/* Knuth's algorithm from http://www.hackersdelight.org/revisions.pdf. Retrieved 8/19/2015 */
inline uint32_t brev_knuth (uint32_t a)
{
    uint32_t t;
    a = (a << 15) | (a >> 17);
    t = (a ^ (a >> 10)) & 0x003f801f; 
    a = (t + (t << 10)) ^ a;
    t = (a ^ (a >>  4)) & 0x0e038421; 
    a = (t + (t <<  4)) ^ a;
    t = (a ^ (a >>  2)) & 0x22488842; 
    a = (t + (t <<  2)) ^ a;
    return a;
}
```
Using the Intel compiler C/C++ compiler 13.1.3.198, both of the above functions auto-vectorize nicely targetting XMM registers. They could also be vectorized manually without a lot of effort.

On my IvyBridge Xeon E3 1270v2, using the auto-vectorized code, 100 million uint32_t words were bit-reversed in 0.070 seconds using brev_classic(), and 0.068 seconds using brev_knuth(). I took care to ensure that my benchmark was not limited by system memory bandwidth.
0 讨论(0)
发布评论:

提交评论
- 加载中...