What is the best way to construct a bit mask in C with m set bits preceded by k unset bits, and followed by n unset bits?
So, you are asking for m set bits prefixed by k reset bits and followed by n reset bits? We can ignore k since it will largely be constrained by the choice of integer type.
mask = ((1 << m) - 1) << n;
~(~0 << m) << n
I like both solutions. Here is another way that comes to my mind (probably not better).
((~((unsigned int)0) << k) >> (k + n)) << n
EDIT:
There was a bug in my previous version (it lacked the unsigned int cast). The problem was that ~0 >> n shifts in 1s at the front rather than 0s.
And yes, this approach has one big downside: it assumes that you know the number of bits in the default integer type, or in other words that you really know k, whereas the other solutions are independent of k. This makes my version less portable, or at least harder to port. (It also uses three shifts, an addition and a bitwise negation, which is two more operations than the other solutions.)
So you would do better to use one of the other examples.
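The dependence on k described above can be reduced by deriving k from the width of the type rather than passing it in. The following is only a sketch under assumptions: the names UINT_BITS and mask_from_width are hypothetical, and it assumes unsigned int has no padding bits, so that sizeof times CHAR_BIT gives its width.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical constant: width of unsigned int, assuming no padding bits. */
#define UINT_BITS ((unsigned)(sizeof(unsigned int) * CHAR_BIT))

/* Hypothetical helper: m set bits followed by n clear bits.
   Requires 1 <= m and m + n <= UINT_BITS, so every shift count
   below stays strictly less than the width of the type. */
static unsigned int mask_from_width(unsigned m, unsigned n)
{
    assert(m >= 1 && m + n <= UINT_BITS);
    unsigned k = UINT_BITS - (m + n);      /* leading clear bits */
    return ((~0u << k) >> (k + n)) << n;   /* same formula as above */
}
```

With this, mask_from_width(7, 13) gives 0x000FE000, and because m is at least 1 the shift counts never reach the type width, so a full-width mask (m equal to UINT_BITS, n equal to 0) is also well-defined.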
Here is a little test app, done by Jonathan Leffler, to compare and verify the output of the different solutions:
#include <stdio.h>
#include <limits.h>

enum { ULONG_BITS = (sizeof(unsigned long) * CHAR_BIT) };

/* The shifts use unsigned long constants so that they are done at the
   width of the return type and never shift a negative signed value. */
static unsigned long set_mask_1(int k, int m, int n)
{
    (void)k; /* unused: this formula does not depend on k */
    return ~(~0UL << m) << n;
}

static unsigned long set_mask_2(int k, int m, int n)
{
    (void)k; /* unused: this formula does not depend on k */
    return ((1UL << m) - 1) << n;
}

static unsigned long set_mask_3(int k, int m, int n)
{
    return ((~((unsigned long)0) << k) >> (k + n)) << n;
}

/* Each test case is { m, n }; k is derived from the type width. */
static int test_cases[][2] =
{
    { 1, 0 },
    { 1, 1 },
    { 1, 2 },
    { 1, 3 },
    { 2, 1 },
    { 2, 2 },
    { 2, 3 },
    { 3, 4 },
    { 3, 5 },
};

int main(void)
{
    size_t i;
    for (i = 0; i < sizeof(test_cases) / sizeof(test_cases[0]); i++)
    {
        int m = test_cases[i][0];
        int n = test_cases[i][1];
        int k = ULONG_BITS - (m + n);
        printf("%d/%d/%d = 0x%08lX = 0x%08lX = 0x%08lX\n", k, m, n,
               set_mask_1(k, m, n),
               set_mask_2(k, m, n),
               set_mask_3(k, m, n));
    }
    return 0;
}
(Only) For those who are interested in a slightly more efficient solution on x86 systems with BMI2 support (Intel Haswell or newer, AMD Excavator or newer):
mask = _bzhi_u32(-1,m)<<n;
The bzhi instruction zeros the high bits starting with the specified bit position. The _bzhi_u32 intrinsic compiles to this instruction. Test code:
#include <stdio.h>
#include <x86intrin.h>
/* gcc -O3 -Wall -m64 -march=haswell bitmsk_mn.c */

unsigned int bitmsk(unsigned int m, unsigned int n)
{
    /* -1 converts to all ones; bzhi keeps only the low m bits */
    return _bzhi_u32(-1, m) << n;
}

int main(void)
{
    unsigned int k = bitmsk(7, 13); /* unsigned, to match %08X */
    printf("k= %08X\n", k);
    return 0;
}
Output:
$./a.out
k= 000FE000
The code fragment _bzhi_u32(-1,m)<<n compiles to three instructions:
movl $-1, %edx
bzhi %edi, %edx, %edi
shlx %esi, %edi, %eax
This is one instruction fewer than the code by @Jonathan Leffler and @Darius Bacon.
On Intel Haswell processors or newer, both bzhi and shlx have a latency of 1 cycle and a throughput of 2 per cycle. On AMD Ryzen these two instructions even have a throughput of 4 per cycle.
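Since the intrinsic above only exists when the compiler targets BMI2, one way to keep the code buildable elsewhere is a compile-time switch. This is a sketch, not a tuned implementation: the name bitmsk_portable is hypothetical, and it assumes GCC or Clang, which define the __BMI2__ macro when BMI2 code generation is enabled (for example with -mbmi2 or -march=haswell).

```c
#include <assert.h>
#include <stdint.h>
#if defined(__BMI2__)
#include <immintrin.h>
#endif

/* Hypothetical wrapper: m set bits followed by n clear bits.
   Requires m <= 32 and n < 32. */
static uint32_t bitmsk_portable(uint32_t m, uint32_t n)
{
    assert(m <= 32 && n < 32);
#if defined(__BMI2__)
    /* bzhi leaves the source unchanged when the index is 32 */
    return _bzhi_u32(UINT32_MAX, m) << n;
#else
    /* Plain C fallback: m == 32 is handled separately because
       shifting a 32-bit value by 32 is undefined behaviour. */
    uint32_t low = (m >= 32) ? UINT32_MAX
                             : (uint32_t)~(UINT32_MAX << m);
    return low << n;
#endif
}
```

Both paths return 0x000FE000 for bitmsk_portable(7, 13), matching the output shown above.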
Whilst the top answers are simple and effective, they don't set the MSB for the case when n=0 and m=31:
~(~0 << 31) << 0 = 0111 1111 1111 1111 1111 1111 1111 1111
((1 << 31)-1) << 0 = 0111 1111 1111 1111 1111 1111 1111 1111
My suggestion for a 32-bit unsigned word (which is ugly and has a branch) looks like this:
unsigned int create_mask(unsigned int n, unsigned int m) {
    /* n = start bit, m = end bit; requires 0 <= n <= m <= 31 */
    return (m - n == 31) ? 0xFFFFFFFFu : ((1u << (m - n + 1)) - 1u) << n;
}
This actually gets the bits in the range [n,m] (closed interval), so create_mask(0,0) will return a mask for the first bit (bit 0) and create_mask(4,6) returns a mask for bits 4 to 6, i.e. ... 0111 0000.
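If the branch is a concern, the same closed-interval mask can probably be built without one by shifting an all-ones word right instead of shifting 1 left. This is a minimal sketch assuming a 32-bit word; the name range_mask32 and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical branch-free variant: mask for bits lo..hi inclusive.
   Requires lo <= hi <= 31. */
static uint32_t range_mask32(unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi <= 31);
    /* hi - lo + 1 bits are wanted, so drop 31 - (hi - lo) bits from
       the top of an all-ones word; that count is always in 0..31. */
    return (UINT32_MAX >> (31 - (hi - lo))) << lo;
}
```

Here range_mask32(4, 6) gives 0x70 and range_mask32(0, 31) gives 0xFFFFFFFF, with no special case for the full-width mask.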