Prime number generator crashes from memory error if there are too many numbers in array

£可爱£侵袭症+ submitted on 2020-01-07 02:29:24

Question


I have a prime number generator. I was curious to see how small and how fast I could make it through optimizations and such:

from math import sqrt
def p(n):  
  if n < 2: return []  
  s = [True]*(((n/2)-1+n%2)+1)  
  for i in range(int(sqrt(n)) >> 1):  
    if not s[i]: continue  
    for j in range( (i**i+(3*i) << 1) + 3, ((n/2)-1+n%2), (i<<1)+3): s[j] = False   
    q = [2]; q.extend([(i<<1) + 3 for i in range(((n/2)-1+n%2)) if s[i]]); return len(q), q  
print p(input())

The generator works great! It is super fast, feel free to try it out. However, if you input numbers greater than 10^9 or 10^10 (I think), it crashes with a memory error. I can't figure out how to expand the memory it uses so that it can take as much as it needs. Any advice would be greatly appreciated!

My question is very similar to this one, but this is Python, not C.

EDIT: This is one of the memory-related tracebacks I get when trying to run 10^9.

python prime.py
1000000000
Traceback (most recent call last):
  File "prime.py", line 9, in <module>
    print p(input())
  File "prime.py", line 7, in p
    for j in range( (i**i+(3*i) << 1) + 3, ((n/2)-1+n%2), (i<<1)+3): s[j] = False
MemoryError

Answer 1:


The problem is in line 7.

for j in range( (i**i+(3*i) << 1) + 3, ((n/2)-1+n%2), (i<<1)+3): s[j] = False

especially this part: i**i

1000000000^1000000000 is a number about 9 * 10^9 digits long. Storing it takes multiple GB, if not TB (WolframAlpha couldn't calculate it anymore). I know that i is at most the square root of n, but for numbers that large that doesn't make a big difference.
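The digit count quoted above can be checked without ever building the number, since x^x has floor(x * log10(x)) + 1 decimal digits. The following is a minimal Python 3 sketch; the helper name digits_of_self_power is made up for illustration. For comparison it also prints the count for the largest bound that the posted loop `range(int(sqrt(n)) >> 1)` places on i when n = 10^9.

```python
from math import floor, log10, sqrt

def digits_of_self_power(x):
    """Number of decimal digits of x**x for an integer x >= 1, via logarithms."""
    return floor(x * log10(x)) + 1

n = 10**9

# Size of n**n, the case discussed in the answer.
print("digits of n**n:", digits_of_self_power(n))

# Upper bound on i in the posted loop: range(int(sqrt(n)) >> 1).
i_max = int(sqrt(n)) >> 1
print("i_max:", i_max, "digits of i_max**i_max:", digits_of_self_power(i_max))
```

Comparing the two printed counts shows how much the bound on i matters for the size of i**i.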

You have to split this calculation into smaller parts, if possible, and save them on a hard drive. This makes the process slow, but doable.
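One way to read "split into smaller parts" for a sieve is a segmented sieve, which keeps only one fixed-size block of flags in memory at a time. This is a swapped-in technique, not something the answer spells out, and it does not write anything to disk. The sketch below is Python 3 (3.8+ for math.isqrt); the name segmented_primes and the default segment_size are assumptions chosen for illustration.

```python
from math import isqrt

def segmented_primes(n, segment_size=1 << 16):
    """Yield the primes <= n, holding one segment of flags in memory at a time."""
    if n < 2:
        return
    limit = isqrt(n)

    # Plain sieve for the base primes up to sqrt(n).
    base = bytearray([1]) * (limit + 1)
    base[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if base[p]:
            base[p * p::p] = bytes(len(range(p * p, limit + 1, p)))
    base_primes = [p for p in range(2, limit + 1) if base[p]]
    yield from base_primes

    # Sieve the rest of the range, (limit, n], one segment at a time.
    low = limit + 1
    while low <= n:
        high = min(low + segment_size - 1, n)
        flags = bytearray([1]) * (high - low + 1)
        for p in base_primes:
            if p * p > high:
                break
            # First multiple of p inside the segment, but never below p*p.
            start = max(p * p, (low + p - 1) // p * p)
            flags[start - low::p] = bytes(len(range(start, high + 1, p)))
        for offset, flag in enumerate(flags):
            if flag:
                yield low + offset
        low = high + 1

print(sum(1 for _ in segmented_primes(10**6)))
```

Memory use is then governed by segment_size and the list of base primes up to sqrt(n) rather than by n itself, at the cost of some extra bookkeeping per segment.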




Answer 2:


First of all, there is a problem since the generator says that numbers like 33, 35 and 45 are prime.
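For reference, here is a minimal Python 3 sketch of what the posted code appears to be aiming for: a sieve over odd numbers only, where index i stands for the number 2*i + 3. It assumes the intended starting index for crossing out multiples is that of (2*i + 3)^2, which works out to 2*i*i + 6*i + 3, so i*i rather than i**i. The function name odd_sieve and the use of a bytearray for the flags are choices made for this sketch, not part of the original post.

```python
def odd_sieve(n):
    """Return the primes <= n using one flag per odd number from 3 to n."""
    if n < 2:
        return []
    if n < 3:
        return [2]
    size = (n - 3) // 2 + 1          # how many odd numbers lie in [3, n]
    s = bytearray([1]) * size        # s[i] == 1 means 2*i + 3 is still a candidate
    for i in range(size):
        p = 2 * i + 3
        if p * p > n:
            break
        if not s[i]:
            continue
        start = (p * p - 3) // 2     # index of p*p, equal to 2*i*i + 6*i + 3
        # Odd multiples of p are p apart in index space.
        s[start::p] = bytes(len(range(start, size, p)))
    return [2] + [2 * i + 3 for i in range(size) if s[i]]

print(odd_sieve(50))
```

With this starting index, composites such as 33, 35 and 45 are crossed out by 3 and 5 as expected.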

Other than that, there are several structures taking up memory here:

  • s = [True]*(((n/2)-1+n%2)+1)

A list takes up several bytes per element. For n = 1 billion the s list consumes gigabytes.

  • In Python 2, range(...) creates a list and then iterates over the elements. Use xrange(...) instead where possible.

Converting range() to xrange() has pitfalls - e.g. see this SO answer:

OverflowError Python int too large to convert to C long
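The memory figures mentioned above can be checked roughly with sys.getsizeof. The sketch below is Python 3, where range is already lazy like Python 2's xrange; the bytearray is included only as a point of comparison and is not something the answer proposes. Exact numbers depend on the interpreter build, so none are quoted here.

```python
import sys

n_flags = 1_000_000

as_list = [True] * n_flags      # one pointer per element
as_bytes = bytearray(n_flags)   # one byte per element
lazy = range(n_flags)           # no per-element storage at all

list_bytes = sys.getsizeof(as_list)
bytearray_bytes = sys.getsizeof(as_bytes)
range_bytes = sys.getsizeof(lazy)

print("list:     ", list_bytes, "bytes")
print("bytearray:", bytearray_bytes, "bytes")
print("range:    ", range_bytes, "bytes")
```

Scaling the list figure from one million flags up to the roughly half a billion flags needed for n = 10^9 gives an idea of why the list version runs out of memory.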

A better implementation of s is to use a Python integer as a bit array, which has a density of 8 elements per byte. Here is a translation between using a list and an integer:

 Using a list                        Using an integer

 s = [True]*(((n/2)-1+n%2)+1)        t = (1 << (n/2)+1)-1

 s[i]                                (t & (1<<i))
 not s[i]                            not (t & (1<<i))

 s[j] = False                        m = 1<<j
                                     if (t & m): t ^= m

Update

Here's an unoptimized version which uses yield and xrange. For larger values of n take care of the limitations of xrange as noted above.

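from math import sqrt

def primes(n):
  # Yield every prime <= n. Bit q of the integer t stays 1 while q is still a candidate.
  if n < 2: return
  yield 2
  end = int( sqrt(n) )
  t = (1 << (n + 1)) - 1

  # Sieve with the odd numbers up to and including sqrt(n).
  for p in xrange(3, end + 1, 2):
    if not (t & (1 << p)): continue
    yield p
    for q in xrange(p*p, n + 1, p):
      m = 1 << q
      if (t & m): t ^= m

  # Every odd number above sqrt(n) that is still marked is prime.
  for p in xrange((end + 1) | 1, n + 1, 2):
    if not (t & (1 << p)): continue
    yield p


def test(n):
  for p in primes(n): print p

test(100000)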


Source: https://stackoverflow.com/questions/37617615/prime-number-generator-crashes-from-memory-error-if-there-are-too-many-numbers-i
