The following iterative sequence is defined for the set of positive integers:
n ->n/2 (n is even) n ->3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence:
13 40 20 10 5 16 8 4 2 1 It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.
Which starting number, under one million, produces the longest chain?
NOTE: Once the chain starts the terms are allowed to go above one million.
I tried coding a solution to this in C using the bruteforce method. However, it seems that my program stalls when trying to calculate 113383. Please advise :)
#include <stdio.h>
#define LIMIT 1000000
int iteration(int value)
{
if(value%2==0)
return (value/2);
else
return (3*value+1);
}
int count_iterations(int value)
{
int count=1;
//printf("%d\n", value);
while(value!=1)
{
value=iteration(value);
//printf("%d\n", value);
count++;
}
return count;
}
int main()
{
int iteration_count=0, max=0;
int i,count;
for (i=1; i<LIMIT; i++)
{
printf("Current iteration : %d\n", i);
iteration_count=count_iterations(i);
if (iteration_count>max)
{
max=iteration_count;
count=i;
}
}
//iteration_count=count_iterations(113383);
printf("Count = %d\ni = %d\n",max,count);
}
The reason you're stalling is because you pass through a number greater than 2^31-1
(aka INT_MAX
); try using unsigned long long
instead of int
.
I recently blogged about this; note that in C the naive iterative method is more than fast enough. For dynamic languages you may need to optimize by memoizing in order to obey the one minute rule (but this is not the case here).
Oops I did it again (this time examining further possible optimizations using C++).
Notice that your brute force solution often computes the same subproblems over and over again. For example, if you start with 10
, you get 5 16 8 4 2 1
; but if you start with 20
, you get 20 10 5 16 8 4 2 1
. If you cache the value at 10
once it's computed, and then won't have to compute it all over again.
(This is known as dynamic programming.)
I solved the problem some time ago and luckily still have my code. Do not read the code if you don't want a spoiler:
#include <stdio.h>
int lookup[1000000] = { 0 };
unsigned int NextNumber(unsigned int value) {
if ((value % 2) == 0) value >>= 1;
else value = (value * 3) + 1;
return value;
}
int main() {
int i = 0;
int chainlength = 0;
int longest = 0;
int longestchain = 0;
unsigned int value = 0;
for (i = 1; i < 1000000; ++i) {
chainlength = 0;
value = i;
while (value != 1) {
++chainlength;
value = NextNumber(value);
if (value >= 1000000) continue;
if (lookup[value] != 0) {
chainlength += lookup[value];
break;
}
}
lookup[i] = chainlength;
if (longestchain < chainlength) {
longest = i;
longestchain = chainlength;
}
}
printf("\n%d: %d\n", longest, longestchain);
}
time ./a.out
[don't be lazy, run it yourself]: [same here]
real 0m0.106s
user 0m0.094s
sys 0m0.012s
Having just tested it in C#, it appears that 113383 is the first value where the 32-bit int
type becomes too small to store every step in the chain.
Try using an unsigned long
when handling those big numbers ;)
As has been said, the simplest way is to get some memoization to avoid recomputing things that haven't been computed. You might be interested in knowing that there is no cycle if you being from a number under one million (no cycle has been discovered yet, and people have explored much bigger numbers).
To translate it in code, you can think the python way:
MEMOIZER = dict()
def memo(x, func):
global MEMOIZER
if x in MEMOIZER: return MEMOIZER[x]
r = func(x)
MEMOIZER[x] = r
return r
Memoization is a very generic scheme.
For the Collatze conjecture, you might run in a bit of a pinch because numbers can really grow and therefore you might blow up the available memory.
This is traditionally handled using caching, you only cache the last n
results (tailored to occupy a given amount of memory) and when you already have n
items cached and wish to add a newer one, you discard the older one.
For this conjecture there might be another strategy available, though a bit harder to implement. The basic idea is that you have only ways to reach a given number x
:
- from
2*x
- from
(x-1)/3
Therefore if you cache the results of 2*x
and (x-1)/3
there is no point in caching x
any longer >> it'll never get called anymore (except if you wish to print the sequence at the end... but it's only once). I leave it to you to take advantage of this so that your cache does not grow too much :)
My effort in C#, run time < 1 second using LinqPad:
var cache = new Dictionary<long, long>();
long highestcount = 0;
long highestvalue = 0;
for (long a = 1; a < 1000000; a++)
{
long count = 0;
long i = a;
while (i != 1)
{
long cachedCount = 0;
if (cache.TryGetValue(i, out cachedCount)) //See if current value has already had number of steps counted & stored in cache
{
count += cachedCount; //Current value found, return cached count for this value plus number of steps counted in current loop
break;
}
if (i % 2 == 0)
i = i / 2;
else
i = (3 * i) + 1;
count++;
}
cache.Add(a, count); //Store number of steps counted for current value
if (count > highestcount)
{
highestvalue = a;
highestcount = count;
}
}
Console.WriteLine("Starting number:" + highestvalue.ToString() + ", terms:" + highestcount.ToString());
Fixing the unsigned int issue in the original question.
Added array for storing pre-computed values.
include <stdio.h>
#define LIMIT 1000000
unsigned int dp_array[LIMIT+1];
unsigned int iteration(unsigned int value)
{
if(value%2==0)
return (value/2);
else
return (3*value+1);
}
unsigned int count_iterations(unsigned int value)
{
int count=1;
while(value!=1)
{
if ((value<=LIMIT) && (dp_array[value]!=0)){
count+= (dp_array[value] -1);
break;
} else {
value=iteration(value);
count++;
}
}
return count;
}
int main()
{
int iteration_count=0, max=0;
int i,count;
for(i=0;i<=LIMIT;i++){
dp_array[i]=0;
}
for (i=1; i<LIMIT; i++)
{
// printf("Current iteration : %d \t", i);
iteration_count=count_iterations(i);
dp_array[i]=iteration_count;
// printf(" %d \t", iteration_count);
if (iteration_count>max)
{
max=iteration_count;
count=i;
}
// printf(" %d \n", max);
}
printf("Count = %d\ni = %d\n",max,count);
}
o/p: Count = 525 i = 837799
Haskell solution, 2 second run time.
thomashartman@yucca:~/collatz>ghc -O3 -fforce-recomp --make collatz.hs
[1 of 1] Compiling Main ( collatz.hs, collatz.o )
Linking collatz ...
thomashartman@yucca:~/collatz>time ./collatz
SPOILER REDACTED
real 0m2.881s
-- Maybe I could have gotten it a bit faster using a hash instead of a map.
import qualified Data.Map as M
import Control.Monad.State.Strict
import Data.List (maximumBy)
import Data.Function (on)
nextCollatz :: Integer -> Integer
nextCollatz n | even n = n `div` 2
| otherwise = 3 * n + 1
newtype CollatzLength = CollatzLength Integer
deriving (Read,Show,Eq,Ord)
main = print longestCollatzSequenceUnderAMill
longestCollatzSequenceUnderAMill = longestCollatzLength [1..1000000]
-- sanity checks
tCollatzLengthNaive = CollatzLength 10 == collatzLengthNaive 13
tCollatzLengthMemoized = (CollatzLength 10) == evalState (collatzLengthMemoized 13) M.empty
-- theoretically could be nonterminating. Since we're not in Agda, we'll not worry about it.
collatzLengthNaive :: Integer -> CollatzLength
collatzLengthNaive 1 = CollatzLength 1
collatzLengthNaive n = let CollatzLength nextLength = collatzLengthNaive (nextCollatz n)
in CollatzLength $ 1 + nextLength
-- maybe it would be better to use hash here?
type CollatzLengthDb = M.Map Integer CollatzLength
type CollatzLengthState = State CollatzLengthDb
-- handy for testing
cLM :: Integer -> CollatzLength
cLM n = flip evalState M.empty $ (collatzLengthMemoized n)
collatzLengthMemoized :: Integer -> CollatzLengthState CollatzLength
collatzLengthMemoized 1 = return $ CollatzLength 1
collatzLengthMemoized n = do
lengthsdb <- get
case M.lookup n lengthsdb of
Nothing -> do let n' = nextCollatz n
CollatzLength lengthN' <- collatzLengthMemoized n'
put $ M.insert n' (CollatzLength lengthN') lengthsdb
return $ CollatzLength $ lengthN' + 1
Just lengthN -> return lengthN
longestCollatzLength :: [Integer] -> (Integer,CollatzLength)
longestCollatzLength xs = flip evalState M.empty $ do
foldM f (1,CollatzLength 1) xs
where f maxSoFar@(maxN,lengthMaxN) nextN = do
lengthNextN <- collatzLengthMemoized nextN
let newMaxCandidate = (nextN,lengthNextN)
return $ maximumBy (compare `on` snd) [maxSoFar, newMaxCandidate]
================================================================================
And here is another haskell solution, using monad-memo package. Unfortunately, this one has a stack space error that does not affect the rolled-my-own memoizer above.
./collatzMemo +RTS -K83886080 -RTS # this produces the answer, but it would be bettter to eliminate the space leak
{-# Language GADTs, TypeOperators #-}
import Control.Monad.Memo
import Data.List (maximumBy)
import Data.Function (on)
nextCollatz :: Integer -> Integer
nextCollatz n | even n = n `div` 2
| otherwise = 3 * n + 1
newtype CollatzLength = CollatzLength Integer
deriving (Read,Show,Eq,Ord)
main = print longestCollatzSequenceUnderAMill
longestCollatzSequenceUnderAMill = longestCollatzLength [1..1000000]
collatzLengthMemoized :: Integer -> Memo Integer CollatzLength CollatzLength
collatzLengthMemoized 1 = return $ CollatzLength 1
collatzLengthMemoized n = do
CollatzLength nextLength <- memo collatzLengthMemoized (nextCollatz n)
return $ CollatzLength $ 1 + nextLength
{- Stack space error
./collatzMemo
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
Stack error does not effect rolled-my-own memoizer at
http://stackoverflow.com/questions/2643260/project-euler-question-14-collatz-problem
-}
longestCollatzLength :: [Integer] -> (Integer,CollatzLength)
longestCollatzLength xs = startEvalMemo $ do
foldM f (1,CollatzLength 1) xs
where f maxSoFar nextN = do
lengthNextN <- collatzLengthMemoized nextN
let newMaxCandidate = (nextN,lengthNextN)
return $ maximumBy (compare `on` snd) [maxSoFar, newMaxCandidate]
{-
-- sanity checks
tCollatzLengthNaive = CollatzLength 10 == collatzLengthNaive 13
tCollatzLengthMemoized = (CollatzLength 10) ==startEvalMemo (collatzLengthMemoized 13)
-- theoretically could be nonterminating. Since we're not in Agda, we'll not worry about it.
collatzLengthNaive :: Integer -> CollatzLength
collatzLengthNaive 1 = CollatzLength 1
collatzLengthNaive n = let CollatzLength nextLength = collatzLengthNaive (nextCollatz n)
in CollatzLength $ 1 + nextLength
-}
==================================================
another one, factored more nicely. doesn't run as fast but still well under a minute
import qualified Data.Map as M
import Control.Monad.State
import Data.List (maximumBy, nubBy)
import Data.Function (on)
nextCollatz :: Integer -> Integer
nextCollatz n | even n = n `div` 2
| otherwise = 3 * n + 1
newtype CollatzLength = CollatzLength Integer
deriving (Read,Show,Eq,Ord)
main = print longestCollatzSequenceUnderAMillStreamy -- AllAtOnce
collatzes = evalState collatzesM M.empty
longestCollatzSequenceUnderAMillAllAtOnce = winners . takeWhile ((<=1000000) .fst) $ collatzes
longestCollatzSequenceUnderAMillStreamy = takeWhile ((<=1000000) .fst) . winners $ collatzes
-- sanity checks
tCollatzLengthNaive = CollatzLength 10 == collatzLengthNaive 13
tCollatzLengthMemoized = (CollatzLength 10) == evalState (collatzLengthMemoized 13) M.empty
-- maybe it would be better to use hash here?
type CollatzLengthDb = M.Map Integer CollatzLength
type CollatzLengthState = State CollatzLengthDb
collatzLengthMemoized :: Integer -> CollatzLengthState CollatzLength
collatzLengthMemoized 1 = return $ CollatzLength 1
collatzLengthMemoized n = do
lengthsdb <- get
case M.lookup n lengthsdb of
Nothing -> do let n' = nextCollatz n
CollatzLength lengthN' <- collatzLengthMemoized n'
put $ M.insert n' (CollatzLength lengthN') lengthsdb
return $ CollatzLength $ lengthN' + 1
Just lengthN -> return lengthN
collatzesM :: CollatzLengthState [(Integer,CollatzLength)]
collatzesM = mapM (\x -> do (CollatzLength l) <- collatzLengthMemoized x
return (x,(CollatzLength l)) ) [1..]
winners :: Ord b => [(a, b)] -> [(a, b)]
winners xs = (nubBy ( (==) `on` snd )) $ scanl1 (maxBy snd) xs
maxBy :: Ord b => (a -> b) -> a -> a -> a
maxBy f x y = if f x > f y then x else y
来源:https://stackoverflow.com/questions/2643260/project-euler-question-14-collatz-problem