What is the best way to construct a bit mask in C with m set bits preceded by k unset bits, and followed by n unset bits?
For those interested in a slightly more efficient solution on x86 systems with BMI2 support (Intel Haswell or newer, AMD Excavator or newer):
mask = _bzhi_u32(-1, m) << n;
The bzhi instruction zeros the high bits starting with the specified bit position. The _bzhi_u32 intrinsic compiles to this instruction. Test code:
#include <stdio.h>
#include <immintrin.h>
/* gcc -O3 -Wall -m64 -march=haswell bitmsk_mn.c */
unsigned int bitmsk(unsigned int m, unsigned int n)
{
    return _bzhi_u32(-1, m) << n;
}
int main(void)
{
    printf("k= %08X\n", bitmsk(7, 13));
    return 0;
}
Output:
$./a.out
k= 000FE000
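To make the semantics concrete, here is a minimal plain-C sketch of what bzhi does for 32-bit operands. The names bzhi_u32_portable and bitmsk_portable are hypothetical helpers for illustration, not part of any library. The sketch assumes the documented behaviour that only the low 8 bits of the index are used and that an index of 32 or more leaves the source unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: emulates bzhi for 32-bit operands in plain C.
   Only the low 8 bits of the index are used; an index of 32 or more
   returns the source unchanged. */
uint32_t bzhi_u32_portable(uint32_t src, uint32_t index)
{
    uint32_t idx = index & 0xFFu;
    if (idx >= 32u)
        return src;
    /* Keep the low idx bits, clear everything from bit idx upwards. */
    return src & ((UINT32_C(1) << idx) - 1u);
}

/* Hypothetical helper: m set bits followed by n unset bits.
   n must be below 32, because a wider shift is undefined behaviour in C. */
uint32_t bitmsk_portable(uint32_t m, uint32_t n)
{
    assert(n < 32u);
    return bzhi_u32_portable(UINT32_MAX, m) << n;
}
```

With these definitions, bitmsk_portable(7, 13) gives 0x000FE000, the value printed above. Note that m = 32 works without a special case at the call site, whereas the naive (1u << m) - 1 is undefined for m = 32.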
The code fragment _bzhi_u32(-1, m) << n compiles to three instructions:
movl $-1, %edx
bzhi %edi, %edx, %edi
shlx %esi, %edi, %eax
That is one instruction fewer than the code by @Jonathan Leffler and @Darius Bacon.
On Intel Haswell processors or newer, both bzhi and shlx have a latency of 1 cycle and a throughput of 2 per cycle. On AMD Ryzen these two instructions even have a throughput of 4 per cycle.
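Since the intrinsic is only usable when BMI2 is enabled at compile time, one possible way to keep a single source file working on other targets is to guard it with the __BMI2__ macro, which GCC and Clang define under -mbmi2 or -march=haswell. This is only a sketch: the function name bitmsk_any is hypothetical, and the fallback branch is ordinary shift arithmetic rather than anything BMI2-specific.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__BMI2__)
#include <immintrin.h>   /* declares _bzhi_u32 */
#endif

/* Hypothetical wrapper: m set bits followed by n unset bits.
   Assumes m <= 32 and n < 32 so that both branches agree and no
   shift count reaches the operand width. */
uint32_t bitmsk_any(uint32_t m, uint32_t n)
{
    assert(m <= 32u && n < 32u);
#if defined(__BMI2__)
    /* BMI2 path: bzhi clears all bits from position m upwards. */
    return _bzhi_u32(UINT32_MAX, m) << n;
#else
    /* Fallback path: m == 32 is handled separately to avoid a 32-bit shift. */
    uint32_t low = (m == 32u) ? UINT32_MAX : ((UINT32_C(1) << m) - 1u);
    return low << n;
#endif
}
```

Compiled with -march=haswell this should take the bzhi path; without it, the same call falls back to the generic version and returns the same value for in-range arguments.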