Fastest way to expand bits in a field to all (overlapping + adjacent) set bits in a mask?

盖世英雄少女心 2021-02-19 20:47

Say I have 2 binary inputs named IN and MASK. Actual field size could be 32 to 256 bits depending on what instruction set is used to accomplish the task. Both inputs change ever

2 Answers

梦如初夏 (original poster)

2021-02-19 21:15
The following approach needs only a single loop, and the number of iterations equals the number of 'groups' (runs of adjacent set bits) found in the mask. I don't know whether it will be more efficient than your approach; there are 6 arithmetic/bitwise operations in each iteration.

In pseudo code (C-like):
```
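OUT = 0;
a = MASK;                          // groups of MASK not yet processed
while (a)
{
    e = a & ~(a + (a & (-a)));     // isolate the rightmost group of set bits
    if (e & IN) OUT |= e;          // keep the group if IN hits any bit in it
    a ^= e;                        // remove the group from a
}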
```
Here's how it works, step by step, using 11010111 as an example mask:
```
OUT = 0

a = MASK        11010111
c = a & (-a)    00000001   keeps rightmost one only
d = a + c       11011000   clears rightmost group (and sets the bit to its immediate left)
e = a & ~d      00000111   keeps rightmost group only

if (e & IN) OUT |= e;      adds group to OUT

a = a ^ e       11010000   clears rightmost group, so we can proceed with the next group
c = a & (-a)    00010000
d = a + c       11100000
e = a & ~d      00010000

if (e & IN) OUT |= e;

a = a ^ e       11000000
c = a & (-a)    01000000
d = a + c       00000000   (ignoring carry when adding)
e = a & ~d      11000000

if (e & IN) OUT |= e;

a = a ^ e       00000000   done
```
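The steps traced above can be written as plain C. This is only a sketch under assumptions: the function name `expand_groups` and the fixed 32-bit width are chosen for illustration and are not part of the original answer. Unsigned arithmetic is used so that the carry out of the top bit is simply discarded, as in the last round of the walkthrough.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns every group (run of adjacent set bits)
 * of `mask` that overlaps at least one set bit of `in`. */
uint32_t expand_groups(uint32_t in, uint32_t mask)
{
    uint32_t out = 0;
    uint32_t a = mask;                      /* groups not yet processed */

    while (a) {
        uint32_t c = a & (uint32_t)(0u - a); /* rightmost set bit only */
        uint32_t d = a + c;                  /* carry clears the rightmost group; wraps on overflow */
        uint32_t e = a & ~d;                 /* rightmost group only */

        if (e & in)
            out |= e;                        /* group is hit by IN: add it to OUT */
        a ^= e;                              /* drop the group and continue */
    }
    return out;
}

/* Checks the example mask 11010111 with an input chosen for illustration. */
void check_worked_example(void)
{
    /* IN = 01000010 hits the groups 00000111 and 11000000, but not 00010000. */
    assert(expand_groups(0x42u, 0xD7u) == 0xC7u);
}
```

With MASK = 11010111 and IN = 01000010, the loop runs three times and returns 11000111: the middle group 00010000 is skipped because IN has no bit set inside it.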
As pointed out by @PeterCordes, some operations can be optimized using x86 BMI1 instructions:
- c = a & (-a): blsi
- e = a & ~d: andn
This approach is good for processor architectures that do not support bitwise reversal. On architectures that do have a dedicated instruction to reverse the order of bits in an integer, wim's answer is more efficient.
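As a rough illustration of the BMI1 mapping mentioned above, the group-isolation step could be written with intrinsics. This sketch assumes the Intel-style intrinsic names `_blsi_u32` and `_andn_u32` from `<immintrin.h>` are available when the compiler defines `__BMI__` (for example with `-mbmi`); the helper names `lowest_group_bmi1` and `expand_groups_bmi1` are hypothetical. Note that `andn` inverts its first operand, so `a & ~d` becomes `_andn_u32(d, a)`. When BMI1 is not enabled, the sketch falls back to the plain expressions.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__BMI__)
#include <immintrin.h>
#endif

/* Hypothetical helper: isolates the rightmost group of set bits in a. */
static inline uint32_t lowest_group_bmi1(uint32_t a)
{
#if defined(__BMI__)
    uint32_t c = _blsi_u32(a);            /* blsi: c = a & -a */
    return _andn_u32(a + c, a);           /* andn: ~(a + c) & a */
#else
    uint32_t c = a & (uint32_t)(0u - a);  /* portable fallback */
    return a & ~(a + c);
#endif
}

/* Same loop as the pseudocode, using the helper for the group isolation. */
uint32_t expand_groups_bmi1(uint32_t in, uint32_t mask)
{
    uint32_t out = 0;
    for (uint32_t a = mask; a != 0; ) {
        uint32_t e = lowest_group_bmi1(a);
        if (e & in)
            out |= e;
        a ^= e;
    }
    return out;
}

/* Checks the first round of the 11010111 walkthrough. */
void check_lowest_group(void)
{
    assert(lowest_group_bmi1(0xD7u) == 0x07u);
}
```

Compilers can often select `blsi` and `andn` on their own from the plain expressions when BMI1 is enabled, so the intrinsics mainly make the intent explicit.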