I\'m currently working on project euler problem 14.
I solved it using a poorly coded program, without memoization, that took 386 5 seconds to run (s
Well, the C program uses unsigned long
, but Integer
can store arbitrarily large integers (it's a bignum). If you import Data.Word, then you can use Word
, which is a machine-word-sized unsigned integer.
After replacing Integer
with Word
, and using ghc -O2
and gcc -O3
, the C program runs in 0.72 seconds, while the Haskell programs runs in 1.92 seconds. 2.6x isn't bad. However, ghc -O2
doesn't always help, and this is one of the programs on which it doesn't! Using just -O
, as you did, brings the runtime down to 1.90 seconds.
I tried replacing div
with quot
(which uses the same type of division as C; they only differ on negative inputs), but strangely it actually made the Haskell program run slightly slower for me.
You should be able to speed up the syr
function with the help of this previous Stack Overflow question I answered about the same Project Euler problem.
After having compiled it with optimisations, there are still several differences to the C programme
div
, while the C programme uses machine division (which truncates) [but any self-respecting C compiler transforms that into a shift, so that makes it yet faster], that would be quot
in Haskell; that reduced the run time by some 15% here.Integer
s. If you have 64-bit Int
s in your GHC (64-bit OS other than Windows), replace Integer
with Int
. That reduced the run time by a factor of about 3 here. If you're on a 32-bit system, you're out of luck, GHC doesn't use native 64-bit instructions there, these operations are implemented as C calls, that's still rather slow.For the memoisation, you can outsource it to one of the memoisation packages on hackage, the only one that I remember is data-memocombinators, but there are others. Or you can do it yourself, for example keeping a map of previously calculated values - that would work best in the State
monad,
import Control.Monad.State.Strict
import qualified Data.Map as Map
import Data.Map (Map, singleton)
type Memo = Map Integer Int
syr :: Integer -> State Memo Int
syr n = do
mb <- gets (Map.lookup n)
case mb of
Just l -> return l
Nothing -> do
let m = if even n then n `quot` 2 else 3*n+1
l <- syr m
let l' = l+1
modify (Map.insert n l')
return l'
solve :: Integer -> Int -> Integer -> State Memo (Integer,Int)
solve maxi len start
| len > 1000000 = return (maxi,len)
| otherwise = do
l <- syr start
if len < l
then solve start l (start+1)
else solve maxi len (start+1)
p14 :: (Integer,Int)
p14 = evalState (solve 0 0 500000) (singleton 1 1)
but that will probably not gain too much (not even when you've added the necessary strictness). The trouble is that a lookup in a Map
is not too cheap and an insertion is relatively expensive.
Another method is to keep a mutable array for the lookup. The code becomes more complicated, since you have to choose a reasonable upper bound for the values to cache (should be not much larger than the bound for the starting values) and deal with the parts of the sequences falling outside the memoised range. But an array lookup and write are fast. If you have 64-bit Int
s, the below code runs pretty fast, here it takes 0.03s for a limit of 1 million, and 0.33s for a limit of 10 million, the corresponding (as closely as I reasonably could) C code runs in 0.018 resp. 0.2s.
module Main (main) where
import System.Environment (getArgs)
import Data.Array.ST
import Data.Array.Base
import Control.Monad.ST
import Data.Bits
import Data.Int
main :: IO ()
main = do
args <- getArgs
let bd = case args of
a:_ -> read a
_ -> 100000
print $ collMax bd
next :: Int -> Int
next n
| n .&. 1 == 0 = n `unsafeShiftR` 1
| otherwise = 3*n + 1
collMax :: Int -> (Int,Int16)
collMax upper = runST $ do
arr <- newArray (0,upper) 0 :: ST s (STUArray s Int Int16)
let go l m
| upper < m = go (l+1) $ next m
| otherwise = do
l' <- unsafeRead arr m
case l' of
0 -> do
l'' <- go 1 $ next m
unsafeWrite arr m (l'' + 1)
return (l+l'')
_ -> return (l+l'-1)
collect mi ml i
| upper < i = return (mi, ml)
| otherwise = do
l <- go 1 i
if l > ml
then collect i l (i+1)
else collect mi ml (i+1)
unsafeWrite arr 1 1
collect 1 1 2
On my current system (32-bit Core2Duo) your Haskell code, including all the optimizations given in the answers, takes 0.8s
to compile and 1.2s
to run.
You could transfer the run-time to compile-time, and reduce the run-time to nearly zero.
module Euler14 where
import Data.Word
import Language.Haskell.TH
terms :: Word -> Word
terms n = countTerms n 0
where
countTerms 1 acc = acc + 1
countTerms n acc | even n = countTerms (n `div` 2) (acc + 1)
| otherwise = countTerms (3 * n + 1) (acc + 1)
longestT :: Word -> Word -> (Word, Word)
longestT mi mx = find mi mx (0, 0)
where
find mi mx (ct,cn) | mi == mx = if ct > terms mi then (ct,cn) else (terms mi, mi)
| otherwise = find (mi + 1) mx
(if ct > terms mi then (ct,cn) else (terms mi, mi))
longest :: Word -> Word -> ExpQ
longest mi mx = return $ TupE [LitE (IntegerL (fromIntegral a)),
LitE (IntegerL (fromIntegral b))]
where
(a,b) = longestT mi mx
and
{-# LANGUAGE TemplateHaskell #-}
import Euler14
main = print $(longest 500000 999999)
On my system it takes 2.3s
to compile this but the run-time goes down to 0.003s
. Compile Time Function Execution (CTFE) is something you can't do in C/C++. The only other programming language that I know of that supports CTFE is the D programming language. And just to be complete, the C code takes 0.1s
to compile and 0.7s
to run.