calculate number of bits set in byte

前端未结

关注

 11  2054

I am interested, which is the optimal way of calculating the number of bits set in byte by this way

template< unsigned char byte > class BITS_SET
{
pub


                      
              相关标签:


      
      
        
          11条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  生来不讨喜        
                
              
                            
                2020-12-09 19:20
              
            
            
                                                                       
Why not do a left shift and mask off the rest?

int countBits(unsigned char byte){
    int count = 0;
    for(int i = 0; i < 8; i++)
        count += (byte >> i) & 0x01; // Shift bit[i] to the first position, and mask off the remaining bits.
    return count;
}


This can easily be adapted to handle ints of any size by simply calculating how many bits there is in the value being counted, then use that value in the counter loop. This is all very trivial to do.

int countBits(unsigned long long int a){
    int count = 0;
    for(int i = 0; i < sizeof(a)*8; i++)
        count += (a >> i) & 0x01;
    return count;
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一向        
                
              
                            
                2020-12-09 19:27
              
            
            
                                                                       
The usual answer for "fastest way to do bitcount" is "look up the byte in an array".  That kind of works for bytes, but you pay an actual memory access for it.
If you only do this once in awhile, it is likely the fastest, but then you don't need the fastest if you only do it once in awhile.

If you do it a lot, you are better off batching up bytes into words or doublewords, and doing fast bitcount operations on these.  These tend to be pure arithmetic, since you can't realistically lookup a 32 bit value in an array to get its bitcount.  Instead you combine values by shifting and masking in clever ways.

A great source of clever tricks for doing this is Bit Hacks.

Here is the scheme published there for counting bits in 32 bit words in C:

 unsigned int v; // count bits set in this (32-bit value)
 unsigned int c; // store the total here

 v = v - ((v >> 1) & 0x55555555);                    // reuse input as temporary
 v = (v & 0x33333333) + ((v >> 2) & 0x33333333);     // temp
 c = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; // count

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  再見小時候        
                
              
                            
                2020-12-09 19:29
              
            
            
                                                                       
For 8-bit values, just use a 256-element lookup table.

For larger sized inputs, it's slightly less trivial.  Sean Eron Anderson has several different functions for this on his Bit Twiddling Hacks page, all with different performance characteristics.  There is not one be-all-end-all-fastest version, since it depends on the nature of your processor (pipeline depth, branch predictor, cache size, etc.) and the data you're using.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  灰色年华        
                
              
                            
                2020-12-09 19:30
              
            
            
                                                                       
In gcc you can use __builtin_popcount(unsigned) function.

It should efficiently use the optimal solution for the target hardware platform.

With -march=core-avx2 (highest level compatible with my cpu) the popcntl x86_64 assembly instruction was used, doing it in the hardware.

With the default x86_64 instruction set a popcntl function was called that implements the optimal C (clever hacks) algorithm.

There's also __builtin_popcountl and __builtin_popcountll for unsigned long and unsigned long long.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  囚心锁ツ        
                
              
                            
                2020-12-09 19:31
              
            
            
                                                                       
Using C++17 you can precalculate the lookup table using a constexpr lambda. Easier to reason about the correctness of it rather than a ready copy-pasted table.
#include <array>

static constexpr auto bitsPerByteTable = [] {
  std::array<uint8_t, 256> table{};
  for (decltype(table)::size_type i = 0; i < table.size(); i++) {
    table.at(i) = table.at(i / 2) + (i & 1);
  }
  return table;
}();

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一生所求        
                
              
                            
                2020-12-09 19:33
              
            
            
                                                                       
For one byte of data, the optimal way considering both speed and memory consumption:

uint8_t count_ones (uint8_t byte)
{
  static const uint8_t NIBBLE_LOOKUP [16] =
  {
    0, 1, 1, 2, 1, 2, 2, 3, 
    1, 2, 2, 3, 2, 3, 3, 4
  };


  return NIBBLE_LOOKUP[byte & 0x0F] + NIBBLE_LOOKUP[byte >> 4];
}


Calling this function from a for loop should yield quite an efficient program on most systems. And it is very generic.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复