Question
How do I compute C's % using Python's %?
The difference between the two is in the way they handle the case of negative arguments.
In both languages, % is defined in such a way that this relationship (with // denoting integer division) holds:

a // b * b + a % b == a

but the rounding of a // b is different in C and in Python, leading to a different definition of a % b.
For example, in C (where integer division is just / with int operands) we have:
int a = 31;
int b = -3;
a / b; // -10
a % b; // 1
while in Python:
a = 31
b = -3
a // b # -11
a % b # -2
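Both quotient/remainder pairs reconstruct a; they only round the quotient differently. A quick check (added here for illustration, not part of the original question):

a, b = 31, -3
# C truncates the quotient toward zero: q = -10, r =  1
# Python floors the quotient:           q = -11, r = -2
# Either pair reconstructs a:
print(-10 * b + 1)     # 31
print(-11 * b + -2)    # 31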
I am aware of this question, which addresses the opposite (i.e. how to compute Python's % from C's %) and contains additional discussion.
I am also aware of the math.remainder() function introduced in Python 3.7, but its result is a float, not an int, so it does not enjoy arbitrary precision.
Answer 1:
Some ways would be:
def mod_c0(a, b):
    if b < 0:
        b = -b
    return -1 * (-a % b) if a < 0 else a % b

def mod_c1(a, b):
    return (-1 if a < 0 else 1) * ((a if a > 0 else -a) % (b if b > 0 else -b))

def mod_c2(a, b):
    return (-1 if a < 0 else 1) * (abs(a) % abs(b))

def mod_c3(a, b):
    r = a % b
    return (r - b) if (a < 0) != (b < 0) and r != 0 else r

def mod_c4(a, b):
    r = a % b
    return (r - b) if (a * b < 0) and r != 0 else r

def mod_c5(a, b):
    return a % (-b if a ^ b < 0 else b)

def mod_c6(a, b):
    a_xor_b = a ^ b
    n = a_xor_b.bit_length()
    x = a_xor_b >> n
    return a % (b * (x | 1))

def mod_c7(a, b):
    a_xor_b = a ^ b
    n = a_xor_b.bit_length()
    x = a_xor_b >> n
    return a % ((-b & x) | (b & ~x))

def mod_c8(a, b):
    q, r = divmod(a, b)
    if (a >= 0) != (b >= 0) and r:
        q += 1
    return a - q * b

def mod_c9(a, b):
    if a >= 0:
        if b >= 0:
            return a % b
        else:
            return a % -b
    else:
        if b >= 0:
            return -(-a % b)
        else:
            return a % b
which all work as expected, e.g.:
print(mod_c0(31, -3))
# 1
Essentially, mod_c0() implements an optimized version of mod_c1() and mod_c2(), which are identical except that in mod_c1() the (relatively expensive) call to abs() is replaced by a ternary conditional with the same semantics.
Instead, mod_c3() and mod_c4() directly fix up the a % b value in the cases where this is needed. The difference between the two is in how they detect opposite signs of the arguments: (a < 0) != (b < 0) versus a * b < 0.
The mod_c5() approach is inspired by @ArborealAnole's answer and essentially uses the bitwise XOR to handle the cases correctly, while mod_c6() and mod_c7() are the same as @ArborealAnole's answer but use an adaptive right shift based on int.bit_length().
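To see why the adaptive shift in mod_c6() / mod_c7() yields a usable sign mask, here is a minimal check (added for illustration, not part of the original answer): arithmetically shifting an int right by its own bit_length() discards all magnitude bits, leaving -1 for negative values and 0 otherwise.

for v in (37, -37, 0):
    print(v, '->', v >> v.bit_length())
# 37 -> 0
# -37 -> -1
# 0 -> 0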
The mod_c8() approach computes a C-style (truncated) quotient via divmod() and uses it to fix up the modulus value.
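A short walk-through of mod_c8() on the example from the question (added for illustration):

q, r = divmod(31, -3)    # q = -11, r = -2 (Python's floor division)
# the signs differ and r != 0, so bump q by one toward zero:
q += 1                   # q = -10, i.e. C's truncating quotient
print(31 - q * (-3))     # 1, i.e. C's 31 % -3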
The mod_c9() method is inspired by @NeverGoodEnough's answer and essentially goes fully conditional, handling each sign combination with an explicit branch.
Covering all sign cases:
import itertools

def mod(a, b):
    # Python's built-in modulo, shown for comparison
    return a % b

vals = (3, -3, 31, -31)
s = '{:<{n}}' * 4
n = 14
print(s.format('a', 'b', 'mod(a, b)', 'mod_c(a, b)', n=n))
print(s.format(*(('-' * (n - 1),) * 4), n=n))
for a, b in itertools.product(vals, repeat=2):
    print(s.format(a, b, mod(a, b), mod_c0(a, b), n=n))
a             b             mod(a, b)     mod_c(a, b)
------------- ------------- ------------- -------------
3             3             0             0
3             -3            0             0
3             31            3             3
3             -31           -28           3
-3            3             0             0
-3            -3            0             0
-3            31            28            -3
-3            -31           -3            -3
31            3             1             1
31            -3            -2            1
31            31            0             0
31            -31           0             0
-31           3             2             -1
-31           -3            -1            -1
-31           31            0             0
-31           -31           0             0
A few more tests and benchmarks:
n = 100
k = 1
l = [x for x in range(-n, n + k, k)]
ll = [(a, b) for a, b in itertools.product(l, repeat=2) if b]

funcs = (mod_c0, mod_c1, mod_c2, mod_c3, mod_c4,
         mod_c5, mod_c6, mod_c7, mod_c8, mod_c9)

for func in funcs:
    correct = all(func(a, b) == funcs[0](a, b) for a, b in ll)
    print(func.__name__, 'correct:', correct)
    %timeit [func(a, b) for a, b in ll]
    print()
mod_c0 correct: True
100 loops, best of 3: 6.6 ms per loop
mod_c1 correct: True
100 loops, best of 3: 7.86 ms per loop
mod_c2 correct: True
100 loops, best of 3: 8.49 ms per loop
mod_c3 correct: True
100 loops, best of 3: 7.56 ms per loop
mod_c4 correct: True
100 loops, best of 3: 7.5 ms per loop
mod_c5 correct: True
100 loops, best of 3: 7.94 ms per loop
mod_c6 correct: True
100 loops, best of 3: 13.4 ms per loop
mod_c7 correct: True
100 loops, best of 3: 16.8 ms per loop
mod_c8 correct: True
100 loops, best of 3: 12.4 ms per loop
mod_c9 correct: True
100 loops, best of 3: 6.48 ms per loop
Perhaps there are better (shorter? faster?) ways, given that the implementation of Python's % using C's % seems much simpler:

((a % b) + b) % b
To get a feeling for how the C-style % computation (the mod_c*() functions above) compares with the plain % or with the operations required to get Python-style % from C's %:
def mod_py(a, b):
    return a % b

def mod_c2py(a, b):
    return ((a % b) + b) % b

%timeit [mod_py(a, b) for a, b in ll]
# 100 loops, best of 3: 5.85 ms per loop

%timeit [mod_c2py(a, b) for a, b in ll]
# 100 loops, best of 3: 7.84 ms per loop
Note, of course, that mod_c2py() is only useful for getting a feeling of what performance we could expect from a mod_c() function.
(EDITED to fix some of the proposed methods and include some timings; EDITED-2 to add the mod_c5() solution; EDITED-3 to add the mod_c6() to mod_c9() solutions)
Answer 2:
I am following up on the very comprehensive answer of @norok2. I have tried the super-naive approach with explicit branches, and it appears to be slightly but consistently faster (~2-4%).
def mod_naive(x, y):
    if y < 0:
        if x < 0:
            return x % y
        else:
            return x % -y
    else:
        if x < 0:
            return -(-x % y)
        else:
            return x % y
or with a lambda (does not affect speed, only coolness):
mod_naive = lambda x, y: (x % y if x < 0 else x % -y) if y < 0 else (-(-x % y) if x < 0 else x % y)
Compared to @norok2's fastest solution (mod_c0):

mod_c0 correct: True
100 loops, best of 3: 6.86 ms per loop
mod_naive correct: True
100 loops, best of 3: 6.58 ms per loop
My (naive) guess as to the reason why is that, with explicit branches, branch prediction eventually results in fewer operations overall.
Answer 3:
For 64-bit integers, either of these should work:
def mod_c_AA0(a, b):
    x = (a ^ b) >> 63
    return a % (b * (x | 1))

def mod_c_AA1(a, b):
    x = (a ^ b) >> 63
    return a % ((-b & x) | (b & ~x))
using two's-complement binary. As norok2 suggests, substitute a_xor_b = a ^ b; x = a_xor_b >> a_xor_b.bit_length() for the first line so that the bit shift adapts to the magnitude of a and b instead of assuming 64-bit values.
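A quick sanity check of both functions (added for illustration; it assumes the arguments stay within the signed 64-bit range so that the >> 63 sign extraction works):

print(mod_c_AA0(31, -3), mod_c_AA1(31, -3))    # 1 1
print(mod_c_AA0(-31, 3), mod_c_AA1(-31, 3))    # -1 -1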
Source: https://stackoverflow.com/questions/61346630/compute-cs-using-pythons