I was working on a problem in Project Euler; and I found a question in SO. The question and accepted answer says;
n = 600851475143
i = 2
while i * i < n:
The fundamental theorem of arithmetic states that every integer greater than 1 can be represented as the product of prime numbers. E.g., the number 2100 can be represented like so:
2 x 2 x 3 x 5 x 5 x 7
I've arranged it so the largest prime factor is on the right, in this case 7. What this algorithm is doing is starting from 2 and dividing n
(i.e. "removing" that factor) until there are no more to remove (the modulo 0 step checks that it is divisible cleanly before dividing.)
So, following the code, we would have i
= 2 and n
= 2100, or
2 x 2 x 3 x 5 x 5 x 7
2100 is divisible by 2 (2100 % 2 == 0
) and also because we see a 2 in the factorization above. So divide it by 2 and we get 1050, or
2 x 3 x 5 x 5 x 7
Continue dividing by 2, once more, and you get a number that is no longer divisible by 2, that is 525, or
3 x 5 x 5 x 7
Then we increase i
to 3 and continue. See how by the end we will be left with the highest prime factor?
The reason for the first while loop's i * i < n
(which really should be i * i <= n
) is because
if one divisor or factor of a number (other than a perfect square) is greater than its square root, then the other factor will be less than its square root. Hence all multiples of primes greater than the square root of n need not be considered.
from: http://britton.disted.camosun.bc.ca/jberatosthenes.htm
So if i
is greater than the square root of n
, that means all remaining factors would have had a "pair" that we already found, below the square root of n
. The check used, i * i <= n
is equivalent but faster than doing a square root calculation.
The reason this is so quick and other brute force methods are so slow is because this is dividing the number down in each step, which exponentially cuts the number of steps that need to be done.
To see this, the prime factorization of 600851475143 is
71 x 839 x 1471 x 6857
and if you modify the code to read:
n = 600851475143
i = 2
while i * i <= n:
while n%i == 0:
print "Dividing by %d" % i
n = n / i
i = i + 1
if n > 1:
print n
You'll see:
>>>
Dividing by 71
Dividing by 839
Dividing by 1471
6857
which shows you that this is exactly how it's working.