What are the fastest divisibility tests? Say, given a little-endian architecture and a 32-bit signed integer: how to calculate very fast that a number is divisible by 2,3,4,
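One method that helps with modulo reduction of any integer value uses bit-slicing and popcount. Bit k contributes 2^k mod m to the residue, so the bits are grouped by the binary digits of that weight and each group is counted: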
mod3 = pop(x & 0x55555555) + (pop(x & 0xaaaaaaaa) << 1); // <- one term is shared!
mod5 = pop(x & 0x99999999) + (pop(x & 0xaaaaaaaa) << 1) + (pop(x & 0x44444444) << 2);
mod7 = pop(x & 0x49249249) + (pop(x & 0x92492492) << 1) + (pop(x & 0x24924924) << 2);
modB = pop(x & 0x5d1745d1) + (pop(x & 0xba2e8ba2) << 1) +
       (pop(x & 0x294a5294) << 2) + (pop(x & 0x0681a068) << 3);
modD = pop(x & 0x91b91b91) + (pop(x & 0xb2cb2cb2) << 1) +
       (pop(x & 0x64a64a64) << 2) + (pop(x & 0xc85c85c8) << 3);
The maximum values for these variables are 48, 80, 73, 168 and 203, which all fit into 8-bit variables. The second round can be carried out in parallel (or some LUT method can be applied):
var   mod3 mod3 mod5 mod5 mod5 mod7 mod7 mod7 modB modB modB modB modD modD modD modD
mask  0x55 0xaa 0x99 0xaa 0x44 0x49 0x92 0x24 0xd1 0xa2 0x94 0x68 0x91 0xb2 0x64 0xc8
shift *1   *2   *1   *2   *4   *1   *2   *4   *1   *2   *4   *8   *1   *2   *4   *8
sum   <-------> <------------> <------------> <-----------------> <----------------->
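For what it's worth, here is a minimal C sketch of the idea for 3 and 7 only. The names pop, sum_mod3, sum_mod7, divisible_by_3 and divisible_by_7 are made up for this illustration, the bit-counting loop stands in for whatever popcount instruction or builtin your compiler offers, and the second round is done with a plain % on the small sum rather than in parallel or with a LUT.

```c
#include <stdint.h>

/* Portable population count (number of set bits). A hardware popcount
   or compiler builtin could replace this loop. */
static unsigned pop(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Weighted bit sum congruent to x modulo 3 (at most 48). */
unsigned sum_mod3(uint32_t x)
{
    return pop(x & 0x55555555u) + (pop(x & 0xaaaaaaaau) << 1);
}

/* Weighted bit sum congruent to x modulo 7 (at most 73). */
unsigned sum_mod7(uint32_t x)
{
    return pop(x & 0x49249249u)
         + (pop(x & 0x92492492u) << 1)
         + (pop(x & 0x24924924u) << 2);
}

/* Second round: reduce the small sum. A 256-entry lookup table
   would do the same job without a division. */
int divisible_by_3(uint32_t x) { return sum_mod3(x) % 3 == 0; }
int divisible_by_7(uint32_t x) { return sum_mod7(x) % 7 == 0; }
```

Note the parentheses around each shifted term: in C, + binds tighter than <<, so leaving them out would shift the whole running sum.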
This probably won't help you in code, but there's a neat trick which can help do this in your head in some cases:
For divide by 3: For a number represented in decimal, you can sum all the digits, and check if the sum is divisible by 3.
Example: 12345 => 1+2+3+4+5 = 15 => 1+5 = 6, which is divisible by 3 (3 x 4115 = 12345).
More interestingly, the same technique works for all factors of X-1, where X is the base in which the number is represented. So for decimal numbers, you can check divisibility by 3 or 9. For hex, you can check divisibility by 3, 5 or 15. And for octal numbers, you can check divisibility by 7.
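If you want to convince yourself of the rule, a rough C sketch follows. The function names digit_sum and divisible_via_digit_sum are invented here, and the code assumes d really is a factor of base - 1. It is only meant to illustrate the mental trick, not to be fast.

```c
#include <stdint.h>

/* Sum of the digits of n when written in the given base (base >= 2). */
uint32_t digit_sum(uint32_t n, uint32_t base)
{
    uint32_t sum = 0;
    while (n > 0) {
        sum += n % base;   /* lowest digit */
        n /= base;         /* drop it */
    }
    return sum;
}

/* Assumes d divides (base - 1). Replaces n by its digit sum until a
   single digit is left; that digit has the same remainder as n. */
int divisible_via_digit_sum(uint32_t n, uint32_t base, uint32_t d)
{
    while (n >= base)
        n = digit_sum(n, base);
    return n % d == 0;
}
```

For instance, divisible_via_digit_sum(12345, 10, 3) walks through 12345, 15 and 6, just like the example above, and the same call with base 16 handles 3, 5 and 15.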
In every case (including divisible by 2):
if (number % n == 0) do();
Anding with a mask of low order bits is just obfuscation, and with a modern compiler will not be any faster than writing the code in a readable fashion.
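As a small illustration of the two styles (the function names are hypothetical, and the mask version assumes a two's-complement int, which is what mainstream platforms use), both of these give the same answer for divisibility by 4:

```c
/* Readable form: says exactly what is being tested. */
int divisible_by_4(int number)
{
    return number % 4 == 0;
}

/* Mask form: checks that the two low-order bits are clear.
   Relies on two's-complement representation for negative values. */
int divisible_by_4_masked(int number)
{
    return (number & 3) == 0;
}
```

Since the results agree for every input, the compiler is free to pick whichever instruction sequence is cheaper for the readable version.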
If you have to test all of the cases, you might improve performance by putting some of the cases in the if for another: there's no point in testing for divisibility by 4 if divisibility by 2 has already failed, for example.
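A sketch of what that nesting might look like, using an invented helper power_of_two_divisors that reports its findings as bit flags (the flag layout is just an assumption for this example):

```c
/* Returns bit flags: bit 0 set if number is divisible by 2,
   bit 1 if divisible by 4, bit 2 if divisible by 8. */
unsigned power_of_two_divisors(int number)
{
    unsigned flags = 0;
    if (number % 2 == 0) {
        flags |= 1u;
        /* Only worth asking about 4 once 2 has succeeded. */
        if (number % 4 == 0) {
            flags |= 2u;
            /* Likewise for 8. */
            if (number % 8 == 0)
                flags |= 4u;
        }
    }
    return flags;
}
```

An odd number leaves after a single test instead of three.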
A bit of evil, obfuscated bit-twiddling can get you divisibility by 15.
For a 32-bit unsigned number:
unsigned int mod_15ish(unsigned int x) {
    // returns a number between 0 and 21 that is either x % 15
    // or 15 + (x % 15), and returns 0 only for x == 0
    // (16 == 1 mod 15, so adding up the hex digits preserves x mod 15)
    x = (x & 0xF0F0F0F) + ((x >> 4) & 0xF0F0F0F);   // nibble pairs
    x = (x & 0xFF00FF) + ((x >> 8) & 0xFF00FF);     // byte pairs
    x = (x & 0xFFFF) + ((x >> 16) & 0xFFFF);        // 16-bit halves
    // *1
    x = (x & 0xF) + ((x >> 4) & 0xF);               // fold once more
    return x;
}
int Divisible_by_15(unsigned int x) {
    return ((x == 0) || (mod_15ish(x) == 15));
}
You can build similar divisibility routines for 3 and 5 based on mod_15ish.
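One possible way to do that, as a sketch only: since 3 and 5 both divide 15, the small result of mod_15ish has the same remainder mod 3 and mod 5 as x, and because it never exceeds 21 the answer can be read out of a constant bit table. The names Divisible_by_3 and Divisible_by_5 and the bit-table approach are my own choices here, and mod_15ish is repeated so the block compiles on its own.

```c
/* Same routine as above: result is in 0..21 and congruent to x mod 15. */
unsigned int mod_15ish(unsigned int x) {
    x = (x & 0xF0F0F0F) + ((x >> 4) & 0xF0F0F0F);
    x = (x & 0xFF00FF) + ((x >> 8) & 0xFF00FF);
    x = (x & 0xFFFF) + ((x >> 16) & 0xFFFF);
    x = (x & 0xF) + ((x >> 4) & 0xF);
    return x;
}

/* 0x249249 has bits 0, 3, 6, 9, 12, 15, 18 and 21 set,
   i.e. one bit for every multiple of 3 in 0..21. */
int Divisible_by_3(unsigned int x) {
    return (0x249249u >> mod_15ish(x)) & 1u;
}

/* 0x108421 has bits 0, 5, 10, 15 and 20 set,
   i.e. one bit for every multiple of 5 in 0..21. */
int Divisible_by_5(unsigned int x) {
    return (0x108421u >> mod_15ish(x)) & 1u;
}
```

Unlike Divisible_by_15, these need no special case for x == 0, because bit 0 is set in both tables.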
If you have 64-bit unsigned ints to deal with, extend each constant above the *1 line in the obvious way, and add a line above the *1 line to do a right shift by 32 bits with a mask of 0xFFFFFFFF. (The last two lines can stay the same.) mod_15ish then obeys the same basic contract, but the return value is now between 0 and 31 (so what's maintained is that x % 15 == mod_15ish(x) % 15).
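Spelled out, the 64-bit variant might look like the sketch below. The names mod_15ish_64 and Divisible_by_15_64 are hypothetical, and uint64_t from stdint.h is assumed for the 64-bit type.

```c
#include <stdint.h>

/* Result is congruent to x mod 15 and is 0 only for x == 0. */
uint64_t mod_15ish_64(uint64_t x) {
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    x = (x & 0x00FF00FF00FF00FFULL) + ((x >> 8) & 0x00FF00FF00FF00FFULL);
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = (x & 0xFFFFFFFFULL) + ((x >> 32) & 0xFFFFFFFFULL);  /* the added line */
    /* *1 */
    x = (x & 0xF) + ((x >> 4) & 0xF);
    return x;
}

/* Before the last fold x is at most 16 * 15 = 240, so the result is
   at most 29 and 15 is the only nonzero multiple of 15 it can take. */
int Divisible_by_15_64(uint64_t x) {
    return (x == 0) || (mod_15ish_64(x) == 15);
}
```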