The following iterative sequence is defined for the set of positive integers:
n -> n/2 (n is even)
n -> 3n + 1 (n is odd)
Using the rule above and starting wit
Having just tested it in C#, it appears that 113383 is the first value where the 32-bit int type becomes too small to store every step in the chain. Try using an unsigned long when handling those big numbers ;)
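The overflow claim is easy to check in a language whose integers do not overflow. Here is a rough Python sketch; the helper name `first_start_exceeding` and its parameters are made up for this illustration. It stops following a chain as soon as the value drops below the starting number, because that smaller number has already been checked.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold


def first_start_exceeding(limit, max_start):
    """Return the smallest start <= max_start whose chain passes above limit, or None."""
    for start in range(1, max_start + 1):
        n = start
        # Once n falls below start, the rest of the chain belongs to a smaller
        # start that has already been checked, so we can stop early.
        while n >= start and n != 1:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            if n > limit:
                return start
    return None


print(first_start_exceeding(INT32_MAX, 1_000_000))
```

If the figure quoted above is right, this prints 113383.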
My effort in C#, run time < 1 second using LinqPad:
var cache = new Dictionary<long, long>();
long highestcount = 0;
long highestvalue = 0;
for (long a = 1; a < 1000000; a++)
{
    long count = 0;
    long i = a;
    while (i != 1)
    {
        long cachedCount = 0;
        // See if the current value has already had its number of steps counted and stored in the cache
        if (cache.TryGetValue(i, out cachedCount))
        {
            // Found: total is the cached count plus the steps counted in the current loop
            count += cachedCount;
            break;
        }
        if (i % 2 == 0)
            i = i / 2;
        else
            i = (3 * i) + 1;
        count++;
    }
    cache.Add(a, count); // Store the number of steps counted for the current starting value
    if (count > highestcount)
    {
        highestvalue = a;
        highestcount = count;
    }
}
// count is the number of steps; the number of terms in the chain is count + 1
Console.WriteLine("Starting number:" + highestvalue.ToString() + ", steps:" + highestcount.ToString());
I solved the problem some time ago and luckily still have my code. Do not read the code if you don't want a spoiler:
#include <stdio.h>

int lookup[1000000] = { 0 };

/* 64-bit type: for some starts below one million the chain climbs above 2^32 */
unsigned long long NextNumber(unsigned long long value) {
    if ((value % 2) == 0) value >>= 1;
    else value = (value * 3) + 1;
    return value;
}

int main() {
    int i = 0;
    int chainlength = 0;
    int longest = 0;
    int longestchain = 0;
    unsigned long long value = 0;
    for (i = 1; i < 1000000; ++i) {
        chainlength = 0;
        value = i;
        while (value != 1) {
            ++chainlength;
            value = NextNumber(value);
            /* values outside the table cannot be looked up, keep iterating */
            if (value >= 1000000) continue;
            if (lookup[value] != 0) {
                /* the rest of the chain is already known */
                chainlength += lookup[value];
                break;
            }
        }
        lookup[i] = chainlength;
        if (longestchain < chainlength) {
            longest = i;
            longestchain = chainlength;
        }
    }
    printf("\n%d: %d\n", longest, longestchain);
    return 0;
}
time ./a.out
[don't be lazy, run it yourself]: [same here]
real 0m0.106s
user 0m0.094s
sys 0m0.012s
The reason you're stalling is because you pass through a number greater than 2^31-1 (aka INT_MAX); try using unsigned long long instead of int.
I recently blogged about this; note that in C the naive iterative method is more than fast enough. For dynamic languages you may need to optimize by memoizing in order to obey the one minute rule (but this is not the case here).
Oops I did it again (this time examining further possible optimizations using C++).
As has been said, the simplest way is to add some memoization to avoid recomputing things that have already been computed. You might be interested to know that there is no cycle, other than the trivial 4, 2, 1 loop, if you begin from a number under one million (no other cycle has been discovered yet, and people have explored much bigger numbers).
To translate it into code, you can think the Python way:
MEMOIZER = dict()

def memo(x, func):
    global MEMOIZER
    # Return the stored result if func(x) was already computed
    if x in MEMOIZER: return MEMOIZER[x]
    r = func(x)
    MEMOIZER[x] = r
    return r
Memoization is a very generic scheme.
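To make that concrete, here is one way the `memo` helper could be applied to this problem. It is a sketch only: `collatz_steps` is a name I made up, and it counts steps rather than terms.

```python
MEMOIZER = {}


def memo(x, func):
    """Return func(x), computing it at most once per distinct x."""
    if x in MEMOIZER:
        return MEMOIZER[x]
    r = func(x)
    MEMOIZER[x] = r
    return r


def collatz_steps(n):
    """Number of steps needed to reach 1 from n (hypothetical helper)."""
    if n == 1:
        return 0
    nxt = n // 2 if n % 2 == 0 else 3 * n + 1
    # Recurse through memo so every intermediate value gets cached as well.
    return 1 + memo(nxt, collatz_steps)


print(memo(27, collatz_steps))
```

One caveat: each step costs two stack frames here, so very long chains can hit Python's default recursion limit. An iterative loop avoids that.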
For the Collatz conjecture, you might run into a bit of a pinch because numbers can really grow and therefore you might blow up the available memory.
This is traditionally handled using a bounded cache: you only cache the last n results (with n tailored to occupy a given amount of memory), and when you already have n items cached and wish to add a newer one, you discard the oldest one.
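A bounded cache of that kind could look roughly like the sketch below. The class name `BoundedCache` and its methods are invented for this example, and it evicts the least recently used entry, which is one common reading of "discard the oldest". The standard library's `functools.lru_cache` decorator offers the same policy ready-made.

```python
from collections import OrderedDict


class BoundedCache:
    """Keeps at most max_items entries, evicting the least recently used one."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._data = OrderedDict()

    def get(self, key):
        """Return the cached value, or None if the key is not present."""
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        """Store a value, dropping the oldest entry if the cache is full."""
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)  # remove the entry at the old end

    def __len__(self):
        return len(self._data)
```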
For this conjecture there might be another strategy available, though a bit harder to implement. The basic idea is that there are only two ways to reach a given number x:
2*x
(x-1)/3 (only when this is an odd integer)
Therefore, once you have cached the results of 2*x and (x-1)/3, there is no point in caching x any longer: it will never be looked up again (except if you wish to print the sequence at the end... but that's only once). I leave it to you to take advantage of this so that your cache does not grow too much :)
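To illustrate just the "two ways to reach x" part (not the cache pruning itself), here is a small sketch. The function name `predecessors` is my own choice.

```python
def predecessors(x):
    """Return the numbers whose next Collatz step is x."""
    preds = [2 * x]  # 2*x is even, so halving it always yields x
    # (x - 1) / 3 only counts if it is a whole, odd, positive number,
    # because the 3n + 1 rule applies to odd n only.
    if (x - 1) % 3 == 0:
        n = (x - 1) // 3
        if n >= 1 and n % 2 == 1:
            preds.append(n)
    return preds


print(predecessors(16))  # [32, 5]
print(predecessors(5))   # [10]
```

Once the chain lengths of every value returned here are known, nothing else can step onto x, which is the observation the pruning idea relies on.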
Notice that your brute force solution often computes the same subproblems over and over again. For example, if you start with 10, you get 10 5 16 8 4 2 1; but if you start with 20, you get 20 10 5 16 8 4 2 1. If you cache the value at 10 once it's computed, you won't have to compute it all over again.
(This is known as dynamic programming.)
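The 10 and 20 example can be sketched in Python as follows. This is an iterative version that also stores the length for every number passed along the way; `chain_length` and the `cache` dict are names chosen for illustration, and lengths are counted in terms, not steps.

```python
def chain_length(start, cache):
    """Number of terms in the chain from start down to 1, reusing cache."""
    path = []
    n = start
    # Walk forward until we hit a number whose length is already known.
    while n not in cache:
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = cache[n]
    # Fill in the length for every number we passed on the way.
    for m in reversed(path):
        length += 1
        cache[m] = length
    return length


cache = {1: 1}  # the chain starting at 1 has a single term
print(chain_length(10, cache))  # 7 terms: 10 5 16 8 4 2 1
print(chain_length(20, cache))  # 8 terms: only 20 itself is new work
```

Note that this caches every value it sees, including the large ones, so memory use grows accordingly.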