问题
What is an efficient way for working with large prime numbers with Python? You search on here or on google, and you find many different methods for doing so... sieves, primality test algorithms... Which ways work for larger primes?
回答1:
For determining if a number is a prime, there a sieves and primality tests.
# for large numbers, xrange will throw an error.
# OverflowError: Python int too large to convert to C long
# to get over this:
def mrange(start, stop, step):
while start < stop:
yield start
start += step
# benchmarked on an old single-core system with 2GB RAM.
from math import sqrt
def is_prime(num):
if num == 2:
return True
if (num < 2) or (num % 2 == 0):
return False
return all(num % i for i in mrange(3, int(sqrt(num)) + 1, 2))
# benchmark is_prime(100**10-1) using mrange
# 10000 calls, 53191 per second.
# 60006 function calls in 0.190 seconds.
This seems to be the fastest. There is another version using not any
that you see,
def is_prime(num)
# ...
return not any(num % i == 0 for i in mrange(3, int(sqrt(num)) + 1, 2))
However, in the benchmarks I got 70006 function calls in 0.272 seconds.
over the use of all
60006 function calls in 0.190 seconds.
while testing if 100**10-1
was prime.
If you're needing to find the next highest prime, this method will not work for you. You need to go with a primality test, I have found the Miller-Rabin algorithm to be a good choice. It is a little slower the Fermat method, but more accurate against pseudoprimes. Using the above mentioned method takes +5 minutes on this system.
Miller-Rabin
algorithm:
from random import randrange
def is_prime(n, k=10):
if n == 2:
return True
if not n & 1:
return False
def check(a, s, d, n):
x = pow(a, d, n)
if x == 1:
return True
for i in xrange(s - 1):
if x == n - 1:
return True
x = pow(x, 2, n)
return x == n - 1
s = 0
d = n - 1
while d % 2 == 0:
d >>= 1
s += 1
for i in xrange(k):
a = randrange(2, n - 1)
if not check(a, s, d, n):
return False
return True
Fermat
algoithm:
def is_prime(num):
if num == 2:
return True
if not num & 1:
return False
return pow(2, num-1, num) == 1
To get the next highest prime:
def next_prime(num):
if (not num & 1) and (num != 2):
num += 1
if is_prime(num):
num += 2
while True:
if is_prime(num):
break
num += 2
return num
print next_prime(100**10-1) # returns `100000000000000000039`
# benchmark next_prime(100**10-1) using Miller-Rabin algorithm.
1000 calls, 337 per second.
258669 function calls in 2.971 seconds
Using the Fermat
test, we got a benchmark of 45006 function calls in 0.885 seconds.
, but you run a higher chance of pseudoprimes.
So, if just needing to check if a number is prime or not, the first method for is_prime
works just fine. It is the fastest, if you use the mrange
method with it.
Ideally, you would want to store the primes generated by next_prime
and just read from that.
For example, using next_prime
with the Miller-Rabin
algorithm:
print next_prime(10^301)
# prints in 2.9s on the old single-core system, opposed to fermat's 2.8s
1000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000531
You wouldn't be able to do this with return all(num % i for i in mrange(3, int(sqrt(num)) + 1, 2))
in a timely fashion. I can't even do it on this old system.
And to be sure that next_prime(10^301)
and Miller-Rabin
yields a correct value, this was also tested using the Fermat
and the Solovay-Strassen
algorithms.
See: fermat.py, miller_rabin.py, and solovay_strassen.py on gist.github.com
Edit: Fixed a bug in next_prime
回答2:
I have written two articles on this, along with benchmarks, to see what methods are faster.
Prime Numbers with Python was written before the knowledge of the Baillie-PSW
primality test.
Prime Numbers with Python v2 was written afterwards, benchmarking the Lucas pseudoprimes
and the Baillie-PSW
primality test.
回答3:
In response to the possible inaccuracy of math.sqrt
I have benchmarked two different methods for performing an isqrt(n)
call. isqrt_2(n)
is coming from this article and this C code
The most common method seen:
def isqrt_1(n):
x = n
while True:
y = (n // x + x) // 2
if x <= y:
return x
x = y
cProfile.run('isqrt_2(10**308)')
Benchmark results:
isqrt_1 at 10000 iterations: 12.25
Can perform 816 calls per second.
10006 function calls in 12.904 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 12.904 12.904 <string>:1(<module>)
1 0.690 0.690 12.904 12.904 math.py:10(func)
10000 12.213 0.001 12.213 0.001 math.py:24(isqrt_1)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {range}
2 0.000 0.000 0.000 0.000 {time.time}
This method is incredibly slow. So we try the next method:
def isqrt_2(n):
if n < 0:
raise ValueError('Square root is not defined for negative numbers.')
x = int(n)
if x == 0:
return 0
a, b = divmod(x.bit_length(), 2)
n = 2 ** (a + b)
while True:
y = (n + x // n) >> 1
if y >= n:
return n
n = y
cProfile.run('isqrt_2(10**308)')
Benchmark results:
isqrt_2 at 10000 iterations: 0.391000032425
Can perform 25575 calls per second.
30006 function calls in 1.059 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.059 1.059 <string>:1(<module>)
1 0.687 0.687 1.059 1.059 math.py:10(func)
10000 0.348 0.000 0.372 0.000 math.py:34(isqrt_2)
10000 0.013 0.000 0.013 0.000 {divmod}
10000 0.011 0.000 0.011 0.000 {method 'bit_length' of 'long' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {range}
2 0.000 0.000 0.000 0.000 {time.time}
As you can see, the difference in isqrt_1(n)
and isqrt_2(n)
is an amazing 11.858999967575 seconds
faster.
You can see this in action on Ideone.com or get the code
note: Ideone.com resulted in execution timeout for isqrt_1(n)
so the benchmark was reduced to 10**200
来源:https://stackoverflow.com/questions/17298130/working-with-large-primes-in-python