C++ fast division/mod by 10^x

后端未结

关注

 10  518

In my program I use a lot of integer division by 10^x and integer mod function of power 10.

For example:

unsigned __int64 a = 12345;
a = a / 100;
...


                      
              相关标签:


      
      
        
          10条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  梦如初夏        
                
              
                            
                2020-12-03 05:20
              
            
            
                                                                       
You can also take a look at the libdivide project. It is designed to speed-up the integer division, in the general case.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  清酒与你        
                
              
                            
                2020-12-03 05:21
              
            
            
                                                                       
If the divisor is an explicit compile-time constant (i.e. if your x in 10^x is a compile-time constant), there's absolutely no point in using anything else than the language-provided / and % operators. If there a meaningful way to speed them up for explicit powers of 10, any self-respecting compiler will know how to do that and will do that for you.

The only situation when you might think about a "custom" implementation (aside from a dumb compiler) is the situation when x is a run-time value. In that case you'd need some kind of decimal-shift and decimal-and analogy. On a binary machine, a speedup is probably possible, but I doubt that you'll be able to achieve anything practically meaningful. (If the numbers were stored in binary-decimal format, then it would be easy, but in "normal" cases - no.)
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  心在旅途        
                
              
                            
                2020-12-03 05:21
              
            
            
                                                                       
In fact you don't need to do anything. The compiler is smart enough to optimize multiplications/divisions with constants. You can find many examples here


Why does GCC use multiplication by a strange number in implementing integer division?
Divide by 10 using bit shifts?
Fast Division on GCC/ARM


You can even do a fast divide by 5 then shift right by 1
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  时光取名叫无心        
                
              
                            
                2020-12-03 05:22
              
            
            
                                                                       
Not unless you're architecture supports Binary Coded Decimal, and even then only with lots of assembly messiness.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤独总比滥情好        
                
              
                            
                2020-12-03 05:25
              
            
            
                                                                       
If your runtime is genuinely dominated by 10^x-related operations, you could just use a base 10 integer representation in the first place.

In most situations, I'd expect the slowdown of all other integer operations (and reduced precision or potentially extra memory use) would count for more than the faster 10^x operations.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  再見小時候        
                
              
                            
                2020-12-03 05:28
              
            
            
                                                                       
On a different note instead, it might make more sense to just write a proper version of Div#n# in assembler. Compilers can't always predict the end result as efficiently (though, in most cases, they do it rather well). So if you're running in a low-level microchip environment, consider a hand written asm routine.

#define BitWise_Div10(result, n) {      \
    /*;n = (n >> 1) + (n >> 2);*/           \
    __asm   mov     ecx,eax                 \
    __asm   mov     ecx, dword ptr[n]       \
    __asm   sar     eax,1                   \
    __asm   sar     ecx,2                   \
    __asm   add     ecx,eax                 \
    /*;n += n < 0 ? 9 : 2;*/                \
    __asm   xor     eax,eax                 \
    __asm   setns   al                      \
    __asm   dec     eax                     \
    __asm   and     eax,7                   \
    __asm   add     eax,2                   \
    __asm   add     ecx,eax                 \
    /*;n = n + (n >> 4);*/                  \
    __asm   mov     eax,ecx                 \
    __asm   sar     eax,4                   \
    __asm   add     ecx,eax                 \
    /*;n = n + (n >> 8);*/                  \
    __asm   mov     eax,ecx                 \
    __asm   sar     eax,8                   \
    __asm   add     ecx,eax                 \
    /*;n = n + (n >> 16);*/                 \
    __asm   mov     eax,ecx                 \
    __asm   sar     eax,10h                 \
    __asm   add     eax,ecx                 \
    /*;return n >> 3;}*/                    \
    __asm   sar     eax,3                   \
    __asm   mov     dword ptr[result], eax  \
}


Usage:

int x = 12399;
int r;
BitWise_Div10(r, x); // r = x / 10
// r == 1239


Again, just a note. This is better used on chips that indeed have really bad division. On modern processors and modern compilers, divisions are often optimized out in very clever ways.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复