Here\'s my code:
def factorize(n):
sieve = [True] * (n + 1)
for x in range(2, int(len(sieve) ** 0.5) + 1):
if sieve[x]:
for i i
The code in the answer by Blender is very nice, but that algorithm is lacking in one very important respect: it tests way too much. E.g. trying to factorize n=392798360393
, which is a prime number, it will try to divide it by all the numbers below it (including itself). This will take a lot of time.
Is it really necessary? If n == a*b
and a < b
, having found a
we don't really need to test divide n
by b
. We know that it too divides n
, because n/a == b
implies n/b == a
. So we only need to test while the potential factor a
is smaller than (or equal to) the potential factor b
. That is to say, until the square root of the number is reached.
Also, instead of starting over from 2 for each reduced n
it's OK to start from the previous value of i:
def factors(n): # (cf. https://stackoverflow.com/a/15703327/849891)
j = 2
while n > 1:
for i in xrange(j, int(sqrt(n+0.05)) + 1):
if n % i == 0:
n /= i ; j = i
yield i
break
else:
if n > 1:
yield n; break
Indeed, factoring 9000009
by this code takes 0.08 seconds on Ideone, instead of 0.59 seconds.
This is guaranteed to produce only primes (because we divide out each factor found, and we try the candidates in non-decreasing order). The overhead of generating primes first and then testing only by the primes (a.o.t. testing by all numbers, as done here above) will be worth it if we are factoring several numbers at once; for factoring just one number it might not be worth it, depending on the speed of your prime generation.
But then, what really should be done when factoring several numbers at once, is to create the smallest factor sieve first where we mark each number in a given range by its smallest (prime) factor (from which it was generated by the sieve) instead of just by True
or False
as in the sieve of Eratosthenes. This smallest factor sieve is then used for efficient factorization of each of the numbers given, in the same range, by successive division by their factors, from the smallest up, which are efficiently found by a direct look-up in the sieve instead of testing anew:
def sfs_factorize(nums):
top = max(nums)
sv = smallest_factor_sieve(top)
for k in nums:
fs = [] ; n = k
while n > 1:
f = sv[n] # no testing
n /= f
fs.append(f)
print( k, list(fs))