Never mind the benchmarks; the problem got me interested and I made some fast tweaks. This uses the lru_cache
decorator, which memoizes a function; so when we call is_prime(i-6)
we basically get that prime check for free. This change cuts the work roughly in half. Also, we can make the range()
calls step through just the odd numbers, cutting the work roughly in half again.
http://en.wikipedia.org/wiki/Memoization
http://docs.python.org/dev/library/functools.html
This requires Python 3.2 or newer to get lru_cache
, but could work with an older Python if you install a Python recipe that provides lru_cache
. If you are using Python 2.x you should really use xrange()
instead of range()
.
http://code.activestate.com/recipes/577479-simple-caching-decorator/
from functools import lru_cache
import time as time_
@lru_cache()
def is_prime(n):
return n%2 and all(n%i for i in range(3, n, 2))
def primes_below(x):
return [(i-6, i) for i in range(9, x+1, 2) if is_prime(i) and is_prime(i-6)]
correct100 = [(5, 11), (7, 13), (11, 17), (13, 19), (17, 23), (23, 29),
(31, 37), (37, 43), (41, 47), (47, 53), (53, 59), (61, 67), (67, 73),
(73, 79), (83, 89)]
assert(primes_below(100) == correct100)
a = time_.time()
print(primes_below(30*1000))
b = time_.time()
elapsed = b - a
print("{} msec".format(round(elapsed * 1000)))
The above took only a very short time to edit. I decided to take it one step further, and make the primes test only try prime divisors, and only up to the square root of the number being tested. The way I did it only works if you check numbers in order, so it can accumulate all the primes as it goes; but this problem was already checking the numbers in order so that was fine.
On my laptop (nothing special; processor is a 1.5 GHz AMD Turion II "K625") this version produced an answer for 100K in under 8 seconds.
from functools import lru_cache
import math
import time as time_
known_primes = set([2, 3, 5, 7])
@lru_cache(maxsize=128)
def is_prime(n):
last = math.ceil(math.sqrt(n))
flag = n%2 and all(n%x for x in known_primes if x <= last)
if flag:
known_primes.add(n)
return flag
def primes_below(x):
return [(i-6, i) for i in range(9, x+1, 2) if is_prime(i) and is_prime(i-6)]
correct100 = [(5, 11), (7, 13), (11, 17), (13, 19), (17, 23), (23, 29),
(31, 37), (37, 43), (41, 47), (47, 53), (53, 59), (61, 67), (67, 73),
(73, 79), (83, 89)]
assert(primes_below(100) == correct100)
a = time_.time()
print(primes_below(100*1000))
b = time_.time()
elapsed = b - a
print("{} msec".format(round(elapsed * 1000)))
The above code is pretty easy to write in Python, Ruby, etc. but would be more of a pain in C.
You can't compare the numbers on this version against the numbers from the other versions without rewriting the others to use similar tricks. I'm not trying to prove anything here; I just thought the problem was fun and I wanted to see what sort of easy performance improvements I could glean.