In my program I do a lot of integer division by 10^x and integer modulo by powers of 10.
For example:
unsigned __int64 a = 12345;
a = a / 100;
...
You can also take a look at the libdivide project. It is designed to speed up integer division in the general case.
If the divisor is an explicit compile-time constant (i.e. if your x in 10^x is a compile-time constant), there's absolutely no point in using anything other than the language-provided / and % operators. If there is a meaningful way to speed them up for explicit powers of 10, any self-respecting compiler will know how to do that and will do it for you.
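As a minimal sketch of what that looks like (the function names are illustrative and not from the question), the divisor is simply written as a literal so the compiler sees it at compile time. Optimizing compilers typically replace such a division with a multiply-and-shift sequence, though the exact code generated depends on the compiler and target:

```c
#include <stdint.h>

/* Hypothetical helpers: the divisor 100 is a compile-time constant,
   so the compiler is free to choose a cheaper instruction sequence. */
static uint64_t div_by_100(uint64_t a)
{
    return a / 100u;
}

static uint64_t mod_by_100(uint64_t a)
{
    return a % 100u;
}
```

With a = 12345, div_by_100 gives 123 and mod_by_100 gives 45.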
The only situation when you might think about a "custom" implementation (aside from a dumb compiler) is the situation when x is a run-time value. In that case you'd need some kind of decimal-shift and decimal-and analogy. On a binary machine, a speedup is probably possible, but I doubt that you'll be able to achieve anything practically meaningful. (If the numbers were stored in binary-coded decimal format, then it would be easy, but in "normal" cases, no.)
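For the run-time case, a plain baseline might look like the sketch below. The table and function names are assumptions made for illustration. Note that this still performs an ordinary hardware division; it only avoids recomputing 10^x on each call, which is consistent with the doubt above that much more can be gained:

```c
#include <stdint.h>

/* Hypothetical lookup table: POW10[i] == 10^i, for every power of 10
   that fits in an unsigned 64-bit integer (10^0 through 10^19). */
static const uint64_t POW10[20] = {
    1ULL,
    10ULL,
    100ULL,
    1000ULL,
    10000ULL,
    100000ULL,
    1000000ULL,
    10000000ULL,
    100000000ULL,
    1000000000ULL,
    10000000000ULL,
    100000000000ULL,
    1000000000000ULL,
    10000000000000ULL,
    100000000000000ULL,
    1000000000000000ULL,
    10000000000000000ULL,
    100000000000000000ULL,
    1000000000000000000ULL,
    10000000000000000000ULL
};

/* a / 10^x for a run-time x. If 10^x exceeds the 64-bit range,
   the quotient is necessarily 0. */
static uint64_t div_pow10(uint64_t a, unsigned x)
{
    return x < 20 ? a / POW10[x] : 0;
}

/* a % 10^x for a run-time x. If 10^x exceeds the 64-bit range,
   the remainder is a itself. */
static uint64_t mod_pow10(uint64_t a, unsigned x)
{
    return x < 20 ? a % POW10[x] : a;
}
```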
In fact you don't need to do anything. The compiler is smart enough to optimize multiplications/divisions with constants. You can find many examples here
You can even do a fast divide by 5 and then shift right by 1 (i.e. divide by 2), which together give a division by 10.
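One way to read this suggestion, sketched for 32-bit unsigned values only: divide by 5 with a multiply by a precomputed reciprocal, then shift right by 1. The function names are illustrative; the constant 0xCCCCCCCD is ceil(2^34 / 5), which is the usual choice for this width. A compiler will normally generate something equivalent on its own for n / 10.

```c
#include <stdint.h>

/* Divide a 32-bit unsigned value by 5 using a multiply and a shift.
   0xCCCCCCCD is ceil(2^34 / 5); the product is formed in 64 bits
   so that it cannot overflow. */
static uint32_t div5_u32(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 34);
}

/* n / 10 == (n / 5) / 2, and dividing by 2 is a right shift by 1. */
static uint32_t div10_u32(uint32_t n)
{
    return div5_u32(n) >> 1;
}
```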
Not unless your architecture supports Binary Coded Decimal, and even then only with lots of assembly messiness.
If your runtime is genuinely dominated by 10^x-related operations, you could just use a base 10 integer representation in the first place.
In most situations, I'd expect the slowdown of all other integer operations (and the reduced precision or potentially extra memory use) to count for more than the faster 10^x operations.
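To make the trade-off concrete, here is a minimal sketch of such a representation, assuming one decimal digit per byte with the least significant digit first. The type and all function names are hypothetical. Dividing by 10^x becomes dropping digits and the modulo becomes truncation, but every other arithmetic operation would have to work digit by digit:

```c
#include <stddef.h>
#include <string.h>

#define DEC_MAX_DIGITS 20  /* assumed capacity: enough for any 64-bit value */

/* Hypothetical decimal integer: digits[0] is the least significant digit. */
typedef struct {
    unsigned char digits[DEC_MAX_DIGITS];
    size_t len;            /* number of digits in use, always >= 1 */
} dec_int;

/* Convert from a binary integer (needed only to get values in). */
static dec_int dec_from_u64(unsigned long long v)
{
    dec_int d;
    d.len = 0;
    do {
        d.digits[d.len++] = (unsigned char)(v % 10u);
        v /= 10u;
    } while (v != 0);
    return d;
}

/* Convert back to a binary integer (needed only to get values out). */
static unsigned long long dec_to_u64(const dec_int *d)
{
    unsigned long long v = 0;
    for (size_t i = d->len; i-- > 0; )
        v = v * 10u + d->digits[i];
    return v;
}

/* Divide by 10^x: drop the x least significant digits. */
static void dec_div_pow10(dec_int *d, size_t x)
{
    if (x >= d->len) {     /* every digit is shifted out: result is 0 */
        d->digits[0] = 0;
        d->len = 1;
        return;
    }
    memmove(d->digits, d->digits + x, d->len - x);
    d->len -= x;
}

/* Reduce modulo 10^x: keep only the x least significant digits. */
static void dec_mod_pow10(dec_int *d, size_t x)
{
    if (x == 0) {          /* anything modulo 1 is 0 */
        d->digits[0] = 0;
        d->len = 1;
        return;
    }
    if (x < d->len)
        d->len = x;
    /* Strip leading zeros so the value stays normalized. */
    while (d->len > 1 && d->digits[d->len - 1] == 0)
        d->len--;
}
```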
On a different note, it might make more sense to just write a proper version of Div#n# in assembler. Compilers can't always predict the end result as efficiently (though, in most cases, they do it rather well). So if you're running in a low-level microchip environment, consider a handwritten asm routine.
#define BitWise_Div10(result, n) { \
/*;n = (n >> 1) + (n >> 2);*/ \
__asm mov eax, dword ptr[n] \
__asm mov ecx,eax \
__asm sar eax,1 \
__asm sar ecx,2 \
__asm add ecx,eax \
/*;n += n < 0 ? 9 : 2;*/ \
__asm xor eax,eax \
__asm test ecx,ecx /* xor cleared SF, so re-test the sum */ \
__asm setns al \
__asm dec eax \
__asm and eax,7 \
__asm add eax,2 \
__asm add ecx,eax \
/*;n = n + (n >> 4);*/ \
__asm mov eax,ecx \
__asm sar eax,4 \
__asm add ecx,eax \
/*;n = n + (n >> 8);*/ \
__asm mov eax,ecx \
__asm sar eax,8 \
__asm add ecx,eax \
/*;n = n + (n >> 16);*/ \
__asm mov eax,ecx \
__asm sar eax,10h \
__asm add eax,ecx \
/*;return n >> 3;}*/ \
__asm sar eax,3 \
__asm mov dword ptr[result], eax \
}
Usage:
int x = 12399;
int r;
BitWise_Div10(r, x); // r = x / 10
// r == 1239
Again, just a note: this is better suited to chips that really do have slow division. With modern processors and modern compilers, divisions are often optimized away in very clever ways.
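For readers without MSVC inline assembly, the same shift-and-add idea can be sketched in portable C. This is not a translation of the macro above: it is an unsigned variant following the well-known formulation from Hacker's Delight, with a final remainder correction added, because the shift-and-add sequence on its own only approximates the quotient (the sequence in the macro's comments, for instance, yields 1 for n = 9). The function name is illustrative.

```c
#include <stdint.h>

/* Unsigned divide by 10 using only shifts, adds and one small multiply. */
static uint32_t div10_shift_add(uint32_t n)
{
    uint32_t q, r;

    /* q approximates n * 0.8 by summing a geometric series of shifts:
       0.75 * (1 + 1/16) * (1 + 1/256) * (1 + 1/65536) is close to 0.8. */
    q = (n >> 1) + (n >> 2);
    q = q + (q >> 4);
    q = q + (q >> 8);
    q = q + (q >> 16);

    /* Dividing by 8 turns n * 0.8 into n / 10; truncation in the
       shifts means q may be slightly too small at this point. */
    q = q >> 3;

    /* Correction step: compute the remainder for the estimate and
       add 1 to the quotient if that remainder is 10 or more. */
    r = n - q * 10u;
    return q + ((r + 6u) >> 4);
}
```

On hardware with a usable multiplier or divider, plain n / 10 is still the better default; a routine like this is only worth measuring on targets where division is genuinely expensive.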