I have minimize cost of calculating modulus in C. say I have a number x and n is the number which will divide x
when n == 65536 (which happens to be 2^16):
m
idiv — Integer Division
The idiv instruction divides the contents of the 64 bit integer EDX:EAX (constructed by viewing EDX as the most significant four bytes and EAX as the least significant four bytes) by the specified operand value. The quotient result of the division is stored into EAX, while the remainder is placed in EDX.
source: http://www.cs.virginia.edu/~evans/cs216/guides/x86.html
As an approach when we deal with powers of 2, can be considered this one (mostly C flavored):
.
.
#define THE_DIVISOR 0x8U; /* The modulo value (POWER OF 2). */
.
.
uint8 CheckIfModulo(const sint32 TheDividend)
{
uint8 RetVal = 1; /* TheDividend is not modulus THE_DIVISOR. */
if (0 == (TheDividend & (THE_DIVISOR - 1)))
{
/* code if modulo is satisfied */
RetVal = 0; /* TheDividend IS modulus THE_DIVISOR. */
}
else
{
/* code if modulo is NOT satisfied */
}
return RetVal;
}
If x
is an increasing index, and the increment i
is known to be less than n
(e.g. when iterating over a circular array of length n), avoid the modulus completely.
A loop going
x += i; if (x >= n) x -= n;
is way faster than
x = (x + i) % n;
which you unfortunately find in many text books...
If you really need an expression (e.g. because you are using it in a for
statement), you can use the ugly but efficient
x = x + (x+i < n ? i : i-n)
x mod 65536 is only equivalent to x & 0xffff if x is unsigned - for signed x, it gives the wrong result for negative numbers. For unsigned x, gcc does indeed optimise x % 65536
to a bitwise and with 65535 (even on -O0, in my tests).
Because 65521 is not a power of 2, x mod 65521 can't be calculated so simply. gcc 4.3.2 on -O3 calculates it using x - (x / 65521) * 65521
; the integer division by a constant is done using integer multiplication by a related constant.
The bitwise operation only works well if the divisor is of the form 2^n
. In the general case, there is no such bit-wise operation.
If the constant with which you want to take the modulo is known at compile time and you have a decent compiler (e.g. gcc), tis usually best to let the compiler work its magic. Just declare the modulo const.
If you don't know the constant at compile time, but you are going to take - say - a billion modulos with the same number, then use this http://libdivide.com/