GHC Optimization: Collatz conjecture

夕颜 2021-01-11 13:32

I've written code for Project Euler Problem 14 in both Haskell and C++ (ideone links). Both programs cache previously computed results in an array.

2 answers
  • 2021-01-11 14:04

    Some problems with your (mutable array) code:

    • You use a fold to find the maximal chain length. For that, the array has to be converted to an association list, which costs time and allocation that the C++ version doesn't need.
    • You use even and div to test for evenness and to divide by 2. These are slow. g++ optimises both operations to faster bit operations (on platforms where that is supposedly faster, at least), but GHC doesn't do these low-level optimisations (yet), so for the time being they have to be done by hand.
    • You use readArray and writeArray. These do bounds checking that the C++ code doesn't, and that also takes time. Once the other problems are dealt with, it amounts to a significant portion of the running time (ca. 25% on my box), since the algorithm does a lot of reads and writes.
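
    As a small illustration of the second point, here is a minimal sketch of the hand-written bit operations next to the functions they replace. The names isEven, halve and agrees are mine, chosen for illustration; only even, div, .&. and shiftR come from the discussion above.

    ```haskell
    import Data.Bits (shiftR, (.&.))

    -- Same result as `even`, written as a test of the lowest bit.
    isEven :: Int -> Bool
    isEven n = n .&. 1 == 0

    -- Same result as (`div` 2): on Int, shiftR is an arithmetic shift,
    -- so it rounds toward negative infinity just like div does.
    halve :: Int -> Int
    halve n = n `shiftR` 1

    -- Compare both replacements with the originals on a range of inputs
    -- (the range is an arbitrary choice for this sketch).
    agrees :: Bool
    agrees = and [ isEven n == even n && halve n == n `div` 2 | n <- [-1000 .. 1000] ]

    main :: IO ()
    main = print agrees
    ```

    This prints True, so the substitution is safe for Int. Whether it is actually faster depends on the GHC version and platform.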

    Incorporating that into the implementation, I get

    import Data.Array.ST
    import Data.Array.Base
    import Control.Monad.ST
    import Data.Bits
    
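    -- Fill a table in which entry i holds the chain length of i.
    -- An odd step (3*i+1 followed by the halving) is counted as one step.
    collatz_array :: ST s (STUArray s Int Int)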
    collatz_array = do
        let upper = 10000000
        arr <- newArray (0,upper) 0
        unsafeWrite arr 2 1
        let check i
                | upper < i = return arr
                | i .&. 1 == 0 = do
                    l <- unsafeRead arr (i `shiftR` 1)
                    unsafeWrite arr i (l+1)
                    check (i+1)
                | otherwise = do
                    let j = (3*i+1) `shiftR` 1
                        find k l
                            | upper < k = find (next k) $! l+1
                            | k < i     = do
                                m <- unsafeRead arr k
                                return (m+l)
                            | otherwise = do
                                m <- unsafeRead arr k
                                if m == 0
                                  then do
                                      n <- find (next k) 1
                                      unsafeWrite arr k n
                                      return (n+l)
                                  else return (m+l)
                              where
                                next h
                                    | h .&. 1 == 0 = h `shiftR` 1
                                    | otherwise = (3*h+1) `shiftR` 1
                    l <- find j 1
                    unsafeWrite arr i l
                    check (i+1)
        check 3
    
    collatz_max :: ST s (Int,Int)
    collatz_max = do
        car <- collatz_array
        (_,upper) <- getBounds car
        let find w m i
                | upper < i = return (w,m)
                | otherwise = do
                    l <- unsafeRead car i
                    if m < l
                      then find i l (i+1)
                      else find w m (i+1)
        find 1 0 2
    
    main :: IO ()
    main = print (runST collatz_max)
    

    And the timings (both for 10 million):

    $ time ./cccoll
    8400511 429
    
    real    0m0.210s
    user    0m0.200s
    sys     0m0.009s
    $ time ./stcoll
    (8400511,429)
    
    real    0m0.341s
    user    0m0.307s
    sys     0m0.033s
    

    which doesn't look too bad.

    Important note: That code only works with a 64-bit GHC, since intermediate chain elements exceed the 32-bit range. (On Windows in particular, you need ghc-7.6.1 or later; earlier GHCs were 32-bit even on 64-bit Windows.) On 32-bit systems, one would have to use Integer or a 64-bit integer type (Int64 or Word64) for following the chains, at a drastic performance cost: in 32-bit GHCs, the primitive 64-bit operations (arithmetic and shifts) are implemented as foreign calls to C functions (fast foreign calls, but still much slower than direct machine ops).
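
    To see the range problem concretely, the following sketch follows a chain in Int64 and checks whether any element exceeds the signed 32-bit maximum. It uses the plain, unoptimised step rather than the table code above. The names step, peak and overflows32 are mine, and the starting value 113383 is my choice of example, not taken from the code above.

    ```haskell
    import Data.Int (Int32, Int64)

    -- One Collatz step, computed in 64 bits.
    step :: Int64 -> Int64
    step n
        | even n    = n `div` 2
        | otherwise = 3 * n + 1

    -- Largest value on the chain starting at n, up to (not including) 1.
    -- The start value is consed on so that `peak 1` is defined.
    peak :: Int64 -> Int64
    peak n = maximum (n : takeWhile (/= 1) (iterate step n))

    -- True if some chain element does not fit into a signed 32-bit integer.
    overflows32 :: Int64 -> Bool
    overflows32 n = peak n > fromIntegral (maxBound :: Int32)

    main :: IO ()
    main = print (overflows32 113383)
    ```

    Running the same check with Int32 in place of Int64 would silently wrap around instead of reporting the overflow, which is why the chain has to be followed in a wider type.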

  • 2021-01-11 14:15

    The ideone site uses GHC 6.8.2, which is getting pretty old. With GHC 7.4.1, the difference is much smaller.

    With ghc:

    $ ghc -O2 euler14.hs && time ./euler14
    (837799,329)
    ./euler14  0.63s user 0.04s system 98% cpu 0.685 total
    

    With g++ 4.7.0:

    $ g++ --std=c++0x -O3 euler14.cpp && time ./a.out
    8400511 429
    ./a.out  0.24s user 0.01s system 99% cpu 0.252 total
    

    For me, the GHC version is only 2.7 times slower than the C++ version. Also, the two programs don't give the same result (not a good sign, especially for benchmarking).
