How to determine how many bytes an integer needs?

前端未结

关注

 22  1303

I\'m looking for the most efficient way to calculate the minimum number of bytes needed to store an integer without losing precision.

e.g.

int: 10 = 1 byte


                      
              相关标签:


      
      
        
          22条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  独厮守ぢ        
                
              
                            
                2020-12-13 02:54
              
            
            
                                                                       
You need just two simple ifs if you are interested on the common sizes only. Consider this (assuming that you actually have unsigned values):

if (val < 0x10000) {
    if (val < 0x100) // 8 bit
    else // 16 bit
} else {
    if (val < 0x100000000L) // 32 bit
    else // 64 bit
}


Should you need to test for other sizes, choosing a middle point and then doing nested tests will keep the number of tests very low in any case. However, in that case making the testing a recursive function might be a better option, to keep the code simple. A decent compiler will optimize away the recursive calls so that the resulting code is still just as fast.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野性不改        
                
              
                            
                2020-12-13 02:56
              
            
            
                                                                       
This code has 0 branches, which could be faster on some systems.  Also on some systems (GPGPU) its important for threads in the same warp to execute the same instructions.  This code is always the same number of instructions no matter what the input value.

inline int get_num_bytes(unsigned long long value) // where unsigned long long is the largest integer value on this platform
{
    int size = 1; // starts at 1 sot that 0 will return 1 byte

    size += !!(value & 0xFF00);
    size += !!(value & 0xFFFF0000);
    if (sizeof(unsigned long long) > 4) // every sane compiler will optimize this out
    {
        size += !!(value & 0xFFFFFFFF00000000ull);
        if (sizeof(unsigned long long) > 8)
        {
            size += !!(value & 0xFFFFFFFFFFFFFFFF0000000000000000ull);
        }
    }

    static const int size_table[] = { 1, 2, 4, 8, 16 };
    return size_table[size];
}


g++ -O3 produces the following (verifying that the ifs are optimized out):

xor    %edx,%edx
test   $0xff00,%edi
setne  %dl
xor    %eax,%eax
test   $0xffff0000,%edi
setne  %al
lea    0x1(%rdx,%rax,1),%eax
movabs $0xffffffff00000000,%rdx
test   %rdx,%rdi
setne  %dl
lea    (%rdx,%rax,1),%rax
and    $0xf,%eax
mov    _ZZ13get_num_bytesyE10size_table(,%rax,4),%eax
retq

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  萌比男神i        
                
              
                            
                2020-12-13 02:57
              
            
            
                                                                       
You need exactly the log function

nb_bytes = floor(log(x)/log(256))+1
if you use log2, log2(256) == 8 so

floor(log2(x)/8)+1
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  无人共我        
                
              
                            
                2020-12-13 02:58
              
            
            
                                                                       
Use this:

int n = 0;
while (x != 0) {
    x >>= 8;
    n ++;
}


This assumes that x contains your (positive) value.

Note that zero will be declared encodable as no byte at all. Also, most variable-size encodings need some length field or terminator to know where encoding stops in a file or stream (usually, when you encode an integer and mind about size, then there is more than one integer in your encoded object).
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一向        
                
              
                            
                2020-12-13 03:02
              
            
            
                                                                       
You may first get the highest bit set, which is the same as log2(N), and then get the bytes needed by ceil(log2(N) / 8).

Here are some bit hacks for getting the position of the highest bit set, which are copied from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious, and you can click the URL for details of how these algorithms work.

Find the integer log base 2 of an integer with an 64-bit IEEE float

int v; // 32-bit integer to find the log base 2 of
int r; // result of log_2(v) goes here
union { unsigned int u[2]; double d; } t; // temp

t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] = 0x43300000;
t.u[__FLOAT_WORD_ORDER!=LITTLE_ENDIAN] = v;
t.d -= 4503599627370496.0;
r = (t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] >> 20) - 0x3FF;


Find the log base 2 of an integer with a lookup table

static const char LogTable256[256] = 
{
#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
    -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
    LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
};

unsigned int v; // 32-bit word to find the log of
unsigned r;     // r will be lg(v)
register unsigned int t, tt; // temporaries

if (tt = v >> 16)
{
  r = (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
}
else 
{
  r = (t = v >> 8) ? 8 + LogTable256[t] : LogTable256[v];
}


Find the log base 2 of an N-bit integer in O(lg(N)) operations

unsigned int v;  // 32-bit value to find the log2 of 
const unsigned int b[] = {0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000};
const unsigned int S[] = {1, 2, 4, 8, 16};
int i;

register unsigned int r = 0; // result of log2(v) will go here
for (i = 4; i >= 0; i--) // unroll for speed...
{
  if (v & b[i])
  {
    v >>= S[i];
    r |= S[i];
  } 
}


// OR (IF YOUR CPU BRANCHES SLOWLY):

unsigned int v;          // 32-bit value to find the log2 of 
register unsigned int r; // result of log2(v) will go here
register unsigned int shift;

r =     (v > 0xFFFF) << 4; v >>= r;
shift = (v > 0xFF  ) << 3; v >>= shift; r |= shift;
shift = (v > 0xF   ) << 2; v >>= shift; r |= shift;
shift = (v > 0x3   ) << 1; v >>= shift; r |= shift;
                                        r |= (v >> 1);


// OR (IF YOU KNOW v IS A POWER OF 2):

unsigned int v;  // 32-bit value to find the log2 of 
static const unsigned int b[] = {0xAAAAAAAA, 0xCCCCCCCC, 0xF0F0F0F0, 
                                 0xFF00FF00, 0xFFFF0000};
register unsigned int r = (v & b[0]) != 0;
for (i = 4; i > 0; i--) // unroll for speed...
{
  r |= ((v & b[i]) != 0) << i;
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野趣味        
                
              
                            
                2020-12-13 03:03
              
            
            
                                                                       
For each of eight times, shift the int eight bits to the right and see if there are still 1-bits left. The number of times you shift before you stop is the number of bytes you need.

More succinctly, the minimum number of bytes you need is ceil(min_bits/8), where min_bits is the index (i+1) of the highest set bit.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
3
4
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复