Bit mask in C

前端未结

关注

 5  1493

What is the best way to construct a bit mask in C with m set bits preceded by k unset bits, and followed by n unset bits:


                      
              相关标签:


      
      
        
          5条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  孤独总比滥情好        
                
              
                            
                2021-01-30 09:53
              
            
            
                                                                       
So, you are asking for m set bits prefixed by k reset bits and followed by n reset bits?  We can ignore k since it will largely be constrained by the choice of integer type.

mask = ((1 << m) - 1) << n;

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  萌比男神i        
                
              
                            
                2021-01-30 09:56
              
            
            
                                                                       
~(~0 << m) << n
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  广开言路        
                
              
                            
                2021-01-30 09:56
              
            
            
                                                                       
I like both solutions.  Here is another way that comes to my mind (probably not better).


((~((unsigned int)0) << k) >> (k + n)) << n


EDIT:
There was a bug in my previous version (it was without the unsigned int cast). The problem was that ~0 >> n adds 1s at the front and not 0s.

And yes this approach has one big downside; it assumes that you know the number of bits of the default integer type or in other words it assumes that you really know k, whereas the other solutions are independent of k. This makes my version less portable, or at least harder to port.  (It also uses 3 shifts, and addition and a bitwise negation operator, which is two extra operations.)

So you would do better to use one of the other examples.

Here is a little test app, done by Jonathan Leffler, to compare and verify the output of the different solutions:

#include <stdio.h>
#include <limits.h>

enum { ULONG_BITS = (sizeof(unsigned long) * CHAR_BIT) };

static unsigned long set_mask_1(int k, int m, int n)
{
    return ~(~0 << m) << n;
}

static unsigned long set_mask_2(int k, int m, int n)
{
    return ((1 << m) - 1) << n;
}

static unsigned long set_mask_3(int k, int m, int n)
{
    return ((~((unsigned long)0) << k) >> (k + n)) << n;
}

static int test_cases[][2] =
{
    { 1, 0 },
    { 1, 1 },
    { 1, 2 },
    { 1, 3 },
    { 2, 1 },
    { 2, 2 },
    { 2, 3 },
    { 3, 4 },
    { 3, 5 },
};

int main(void)
{
    size_t i;
    for (i = 0; i < 9; i++)
    {
        int m = test_cases[i][0];
        int n = test_cases[i][1];
        int k = ULONG_BITS - (m + n);
        printf("%d/%d/%d = 0x%08lX = 0x%08lX = 0x%08lX\n", k, m, n,
               set_mask_1(k, m, n),
               set_mask_2(k, m, n),
               set_mask_3(k, m, n));
    }
    return 0;
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  生来不讨喜        
                
              
                            
                2021-01-30 10:04
              
            
            
                                                                       
(Only) For those who are interested in a slightly more efficient solution on x86 systems with BMI2 support (Intel Haswell or newer, AMD Excavator or newer):

mask = _bzhi_u32(-1,m)<<n;


The bzhi instruction zeros the high bits starting with specified bit position.
The _bzhi_u32 intrinsic compiles to this instruction. Test code:

#include <stdio.h>
#include <x86intrin.h>
/*  gcc -O3 -Wall -m64 -march=haswell bitmsk_mn.c   */

unsigned int bitmsk(unsigned int m, unsigned int n)
{
    return _bzhi_u32(-1,m)<<n;
}

int main() {
    int k = bitmsk(7,13);
    printf("k= %08X\n",k);
    return 0;
}


Output:

$./a.out
k= 000FE000


The code fragment _bzhi_u32(-1,m)<<n compiles to three instructions

movl    $-1, %edx
bzhi    %edi, %edx, %edi
shlx    %esi, %edi, %eax


Which is one instruction less than the codes by @Jonathan Leffler
and @Darius Bacon.
On Intel Haswell processors or newer, both bzhi and shlx have a latency of 1 cycle and a 
throughput of 2 per cycle. On AMD Ryzen these two instructions even have a throughput of 4 per cycle.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  无人共我        
                
              
                            
                2021-01-30 10:07
              
            
            
                                                                       
Whilst the top answers are simple and effective they don't set the MSB for the case when n=0 and m=31:

~(~0 << 31) << 0 =  ‭0111 1111 1111 1111 1111 1111 1111 1111‬

((1 << 31)-1) << 0 = ‭0111 1111 1111 1111 1111 1111 1111 1111‬

My suggestion for a 32-bit unsigned word (which is ugly and has a branch) looks like this:

unsigned int create_mask(unsigned int n,unsigned int m) {
  // 0 <= start_bit, end_bit <= 31
  return (m - n == 31 ? 0xFFFFFFFF : ((1 << (m-n)+1)-1) << n);
}


This actually gets the bits in the range [m,n] (closed interval) so create_mask(0,0) will return a mask for the first bit (bit 0) and create_mask(4,6) returns a mask for bits 4 to 6 i.e ... 00111 0000.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复