Question
I have a prime number generator. I was curious to see how small and how fast I could make it through optimizations and the like:
    from math import sqrt
    def p(n):
        if n < 2: return []
        s = [True]*(((n/2)-1+n%2)+1)
        for i in range(int(sqrt(n)) >> 1):
            if not s[i]: continue
            for j in range( (i**i+(3*i) << 1) + 3, ((n/2)-1+n%2), (i<<1)+3): s[j] = False
        q = [2]; q.extend([(i<<1) + 3 for i in range(((n/2)-1+n%2)) if s[i]]); return len(q), q
    print p(input())
The generator works great! It is super fast, feel free to try it out. However, if you input numbers greater than 10^9 or 10^10 (I think), it crashes with a memory error. I can't figure out how to expand the memory it uses so that it can take as much as it needs. Any advice would be greatly appreciated!
My question is very similar to this one, but this is Python, not C.
EDIT: This is one of the memory-related tracebacks I get when trying to run 10^9.
    python prime.py
    1000000000
    Traceback (most recent call last):
      File "prime.py", line 9, in <module>
        print p(input())
      File "prime.py", line 7, in p
        for j in range( (i**i+(3*i) << 1) + 3, ((n/2)-1+n%2), (i<<1)+3): s[j] = False
    MemoryError
Answer 1:
The problem is in line 7:
    for j in range( (i**i+(3*i) << 1) + 3, ((n/2)-1+n%2), (i<<1)+3): s[j] = False
especially this part: i**i
1000000000^1000000000 is a number about 9 * 10^9 digits long. Storing it takes multiple GB, if not TB (WolframAlpha could no longer calculate it). I know that i is at most the square root of n, but for numbers that large that does not make a big difference.
You have to split this calculation into smaller parts if possible and save them to a hard drive. This makes the process slow, but doable.
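The digit counts discussed above can be checked without ever building the huge number, since x**x has floor(x * log10(x)) + 1 decimal digits. The following is a minimal sketch, written for Python 3 (an assumption; the question uses Python 2). The helper name digits_of_self_power and the variable largest_i are made up for illustration. It prints the count for x = 10^9 and for the largest i that the question's loop bound int(sqrt(n)) >> 1 actually produces when n = 10^9, so the two cases can be compared.

```python
import math

def digits_of_self_power(x):
    """Number of decimal digits of x**x, computed without building x**x."""
    return math.floor(x * math.log10(x)) + 1

n = 10**9
# Last value produced by range(int(sqrt(n)) >> 1) in the question's outer loop.
largest_i = (math.isqrt(n) >> 1) - 1

print(digits_of_self_power(n))          # digit count of 10^9 ** 10^9
print(digits_of_self_power(largest_i))  # digit count of i**i for the largest i
```

Which of the two figures matters depends on the values i really takes inside the loop, which is why both are printed.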
Answer 2:
First of all, there is a problem since the generator says that numbers like 33, 35 and 45 are prime.
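One plausible cause, assuming i*i was intended where the question wrote i**i: index i of the sieve stands for the odd number p = 2*i + 3, and the index of p*p is (p*p - 3) / 2 = 2*i*i + 6*i + 3, which is ((i*i + 3*i) << 1) + 3. With i**i the start index is already off for i = 0, because 0**0 evaluates to 1 in Python. The sketch below is a Python 3 port of the question's function under that assumption; the name odd_sieve is hypothetical.

```python
from math import isqrt

def odd_sieve(n):
    """Return all primes <= n; index i of the sieve stands for the odd number 2*i + 3."""
    if n < 2:
        return []
    count = n // 2 - 1 + n % 2             # how many odd numbers lie in [3, n]
    is_prime = [True] * count
    for i in range(isqrt(n) >> 1):
        if not is_prime[i]:
            continue
        p = 2 * i + 3
        start = ((i * i + 3 * i) << 1) + 3  # index of p*p (assumed intent: i*i, not i**i)
        for j in range(start, count, p):    # an index step of p is a value step of 2*p
            is_prime[j] = False
    return [2] + [2 * i + 3 for i in range(count) if is_prime[i]]

found = odd_sieve(50)
print(found)
print([k for k in (33, 35, 45) if k in found])  # composites that must not appear
```

This only addresses correctness; the list still holds one slot per odd number, so the memory concerns remain.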
Other than that, there are several structures taking up memory here:
    s = [True]*(((n/2)-1+n%2)+1)
A list takes up several bytes per element. For n = 1 billion the s list consumes gigabytes.
In Python 2, range(...) creates a full list and then iterates over its elements. Use xrange(...) instead where possible.
Converting range() to xrange() has pitfalls; see, for example, this SO answer: OverflowError Python int too large to convert to C long
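To get a feel for the per-element cost, the sizes can be measured with sys.getsizeof. This is a rough sketch for Python 3 on CPython (both assumptions; the exact numbers vary by platform and interpreter build). The bytearray is a swapped-in alternative shown only for comparison and is not something the question or this answer uses. In Python 3, range is already lazy, which is what xrange provides in Python 2.

```python
import sys

n_items = 1_000_000

as_list = [True] * n_items                # one pointer-sized slot per element
as_bytes = bytearray(b"\x01") * n_items   # one byte per element

list_bytes_per_item = sys.getsizeof(as_list) / n_items
bytearray_bytes_per_item = sys.getsizeof(as_bytes) / n_items
lazy_range_bytes = sys.getsizeof(range(3, 10**9, 3))  # lazy: no elements stored

print("list:      %.2f bytes per element" % list_bytes_per_item)
print("bytearray: %.2f bytes per element" % bytearray_bytes_per_item)
print("range object: %d bytes in total" % lazy_range_bytes)
```

Multiplying the per-element figure by the roughly n/2 entries the sieve needs gives an estimate of the memory required for a given n.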
A better implementation of s is to use a Python integer as a bit-array, which has a density of 8 elements per byte. Here is a translation between using a list and an integer:
    list version                    integer version
    s = [True]*(((n/2)-1+n%2)+1)    t = (1 << (n/2)+1)-1
    s[i]                            (t & (1<<i))
    not s[i]                        not (t & (1<<i))
    s[j] = False                    m = 1<<j
                                    if (t & m): t ^= m
Update
Here's an unoptimized version which uses yield and xrange. For larger values of n, take care of the limitations of xrange as noted above.
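    from math import sqrt

    def primes(n):
        # Yields 2 and the odd primes below n; bit k of t stays set while k is a candidate.
        if n < 2: return
        yield 2
        end = int(sqrt(n))
        t = (1 << n) - 1
        # Sieve with every odd p up to and including end.
        for p in xrange(3, end + 1, 2):
            if not (t & (1 << p)): continue
            yield p
            for q in xrange(p*p, n, p):
                m = 1 << q
                if (t & m): t ^= m  # clear bit q
        # Every bit still set above end is prime; start at the first odd number past end.
        for p in xrange(end + 1 + (end % 2), n, 2):
            if not (t & (1 << p)): continue
            yield p

    def test(n):
        for p in primes(n): print p

    test(100000)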
Source: https://stackoverflow.com/questions/37617615/prime-number-generator-crashes-from-memory-error-if-there-are-too-many-numbers-i