Fast bit permutation

Submitted by 青春壹個敷衍的年華 on 2021-02-19 06:51:38

Question


I need to store and apply permutations to 16-bit integers. The best solution I came up with is to store the permutation as a 64-bit integer in which the i-th group of 4 bits gives the new position of the i-th bit. Applying it would look like this:

#include <stdint.h>

uint16_t permute(uint16_t bits, uint64_t perm)
{
   uint16_t result = 0;
   for (int i = 0; i < 16; ++i)   // nibble i of perm = new position of bit i
      result |= (uint16_t)(((bits >> i) & 1u) << ((perm >> (i * 4)) & 0xF));
   return result;
}

Is there a faster way to do this? Thank you.


Answer 1:


There are alternatives.

Any permutation can be handled by a Beneš network and encoded as the masks that feed the multiplexers performing the shuffle. This can be done reasonably efficiently in software too (not great, but OK), since it is just a sequence of butterfly permutations. The masks are a bit tricky to compute, but applying them is probably faster than moving every bit on its own, though that depends on how many bits you are dealing with, and 16 is not a lot.
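
As a rough illustration of what "applying the masks" means, here is a minimal sketch in C. It assumes a 7-stage network with the stage distances 1, 2, 4, 8, 4, 2, 1 (other stage orders are possible), and it does not show how to compute the masks for an arbitrary permutation. The names delta_swap, benes_apply and REVERSE_MASKS are made up for this example. The sample masks encode bit reversal, which only needs the first four stages.

```c
#include <stdint.h>

/* One butterfly stage (a "delta swap"): for every bit i set in mask,
   exchange bit i of x with bit i + shift. */
static uint16_t delta_swap(uint16_t x, uint16_t mask, unsigned shift)
{
    uint16_t t = (uint16_t)(((x >> shift) ^ x) & mask);
    return (uint16_t)(x ^ t ^ (t << shift));
}

/* Apply a permutation encoded as seven stage masks.
   Assumption: stage distances are 1, 2, 4, 8, 4, 2, 1. */
static uint16_t benes_apply(uint16_t x, const uint16_t masks[7])
{
    static const unsigned shifts[7] = {1, 2, 4, 8, 4, 2, 1};
    for (int s = 0; s < 7; ++s)
        x = delta_swap(x, masks[s], shifts[s]);
    return x;
}

/* Example masks: reverse the order of all 16 bits.
   The last three stages are unused (all-zero masks). */
static const uint16_t REVERSE_MASKS[7] = {
    0x5555, 0x3333, 0x0F0F, 0x00FF, 0x0000, 0x0000, 0x0000
};
```

Each stage costs a handful of shifts, XORs and one AND, so the whole permutation is about seven such steps regardless of which permutation the masks encode, compared with sixteen iterations in the loop from the question.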

Some smaller categories of shuffles can be handled by simpler (faster) networks, which you can also find on that page.

Finally, in practice, on modern x86 hardware there is the highly versatile pshufb instruction, which can apply a permutation (one that may also include duplicates and zeroes) to 16 bytes in (typically) a single cycle. It is slightly awkward to distribute the bits over the bytes, but once you are there it takes only a pshufb to permute and a pmovmskb to compress the result back down to 16 bits.
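
One possible way to put this together with intrinsics is sketched below. It assumes an x86 target with SSSE3 and a GCC or Clang compiler (the target attribute is specific to those compilers). The function name permute_pshufb and the src_of table are made up for this example. Note that pshufb gathers rather than scatters: src_of[j] is the source bit that ends up at position j, which is the inverse of the encoding used in the question.

```c
#include <stdint.h>
#include <tmmintrin.h>   /* SSSE3: _mm_shuffle_epi8 */

/* Permute the 16 bits of `bits`; output bit j is input bit src_of[j].
   Assumption: every src_of[j] is in the range 0..15. */
__attribute__((target("ssse3")))
static uint16_t permute_pshufb(uint16_t bits, const uint8_t src_of[16])
{
    /* Copy the low byte into lanes 0..7 and the high byte into lanes 8..15. */
    const __m128i spread = _mm_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0,
                                         1, 1, 1, 1, 1, 1, 1, 1);
    /* One distinct bit per lane. */
    const __m128i select = _mm_setr_epi8(1, 2, 4, 8, 16, 32, 64, (char)0x80,
                                         1, 2, 4, 8, 16, 32, 64, (char)0x80);

    __m128i v = _mm_cvtsi32_si128(bits);
    v = _mm_shuffle_epi8(v, spread);
    /* Lane k becomes 0xFF if bit k of the input is set, else 0x00. */
    v = _mm_cmpeq_epi8(_mm_and_si128(v, select), select);
    /* The actual permutation: a single pshufb. */
    v = _mm_shuffle_epi8(v, _mm_loadu_si128((const __m128i *)src_of));
    /* pmovmskb: collect the top bit of each lane into a 16-bit result. */
    return (uint16_t)_mm_movemask_epi8(v);
}
```

If many values are permuted with the same table, the control vector can be loaded once outside the loop, which leaves the spread, compare, shuffle and movemask per value.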



Source: https://stackoverflow.com/questions/43575633/fast-bit-permutation
