Haskell Space Overflow

情深已故 2020-12-10 05:51

I've compiled this program and am trying to run it.

import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo

collatzLength :: Int ->


        
4 Answers
  • 2020-12-10 06:13

    As the author of the code in question, I am now slightly embarrassed because it has not one but two possible stack overflow bugs.

    1. It uses Int. On a 32-bit system this will overflow, as the Collatz sequence can go quite a bit higher than the starting number. This overflow can cause infinite recursion as the function jumps back and forth between negative and positive values.

      In the case of numbers between one and a million, the worst starting point is 704511, which goes as high as 56,991,483,520 before coming back down towards 1. This is well outside the 32-bit range.

    2. It uses maximumBy. This function is not strict, so it will cause a stack overflow when used on long lists. One million elements is more than enough for this to happen with the default stack size. It still works with optimizations enabled, though, due to the strictness analysis performed by GHC.

      The solution is to use a strict version. Since the standard libraries do not provide one, we can build it ourselves from the strict left fold, foldl1'.

    Here is an updated version which should (hopefully) be stack overflow-free, even without optimizations.

    import Data.List
    import Data.Ord
    import qualified Data.MemoCombinators as Memo
    
    collatzLength :: Integer -> Integer
    collatzLength = Memo.arrayRange (1,1000000) collatzLength'
      where
        collatzLength' 1 = 1
        collatzLength' n | odd n  = 1 + collatzLength (3 * n + 1)
                         | even n = 1 + collatzLength (n `quot` 2)
    
    main = print $ foldl1' max $ [(collatzLength n, n) | n <- [1..1000000]]
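
The first bug in the answer above can be checked directly. The following is a minimal sketch and not part of the original answer: it walks the Collatz sequence from 704511 using Integer, which cannot overflow, and compares the largest value reached against the 32-bit limit. The helper name collatzPeak is made up for this illustration.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Int (Int32)

-- Largest value reached by the Collatz sequence starting at n (n >= 1).
-- collatzPeak is a hypothetical helper, not part of the original program.
-- Integer is arbitrary precision, so this computation cannot wrap around.
collatzPeak :: Integer -> Integer
collatzPeak start = go start start
  where
    -- The bang keeps the running peak evaluated at every step.
    go !peak 1 = peak
    go !peak n = go (max peak next) next
      where
        next | even n    = n `quot` 2
             | otherwise = 3 * n + 1

main :: IO ()
main = do
  let peak  = collatzPeak 704511
      limit = toInteger (maxBound :: Int32)
  putStrLn ("peak reached from 704511: " ++ show peak)
  putStrLn ("largest 32-bit value:     " ++ show limit)
  putStrLn ("fits in 32 bits:          " ++ show (peak <= limit))
```

If the peak quoted in the answer is right, the last line reports False, which is why Int is unsafe for this program on a 32-bit system.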
    
  • 2020-12-10 06:28

    Use the optimizer (via the -O2 flag) any time you are concerned about performance. GHC's optimizations are hugely important not just to run time but to stack use. I've tested this with GHC 7.2 and optimization takes care of your issue.

    EDIT: In addition, if you're on a 32-bit machine, be sure to use Int64 or Word64. Otherwise you'll overflow the range of a 32-bit Int and cause non-termination (thanks to Henning for this; upvote his answer).

  • 2020-12-10 06:32

    I think it's most likely that you're hitting integer overflow with some of the Collatz sequences, and then ending up in an "artificial" cycle that contains overflows but never hits 1. That would produce an infinite recursion.

    Remember that some Collatz sequences get very much larger than their starting number before they finally (?) end up at 1.

    Try using Integer instead of Int and see whether that fixes your problem.
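
The wraparound described in this answer is easy to reproduce on any machine by forcing a 32-bit type. This is a small sketch, not taken from the answer; the function name step and the starting value 1000000001 are chosen purely for illustration. It applies a single Collatz step at Int32 and again at Int64.

```haskell
import Data.Int (Int32, Int64)

-- One Collatz step, generic over the integer type so that the same
-- code can be run at different widths. `step` is an illustrative name.
step :: Integral a => a -> a
step n
  | even n    = n `quot` 2
  | otherwise = 3 * n + 1

main :: IO ()
main = do
  -- 1000000001 is odd and fits in 32 bits, but 3 * n + 1 does not,
  -- so the Int32 result wraps around to a negative number.
  print (step (1000000001 :: Int32))
  -- The same step at 64 bits gives the true value, 3000000004.
  print (step (1000000001 :: Int64))
```

Once a value has wrapped, the computation is no longer following the real Collatz sequence and may never reach 1, which is consistent with the infinite recursion suspected above.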

  • 2020-12-10 06:34

    Here's a shorter program that fails in the same way:

    main = print (maximum [0..1000000])
    

    Yep.

    $ ghc --make harmless.hs && ./harmless
    [1 of 1] Compiling Main             ( harmless.hs, harmless.o )
    Linking harmless ...
    Stack space overflow: current size 8388608 bytes.
    Use `+RTS -Ksize -RTS' to increase it.
    

    With -O2 it works. What do I make of it? I don't know :( These space mysteries are a serious gotcha.

    Edit:

    Thx to hammar for pointing out the culprit.

    Changing your program to use

    maximum' = foldl1' max
    

    makes it work without -O2. The implementation of Prelude's maximum is lazy, so it doesn't quite work for long lists without compiler magic dust.
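
Why the strict fold helps can be sketched as follows. A lazy left fold builds a chain of unevaluated max applications as long as the list and only starts evaluating it at the end, which is what uses up the stack; a strict fold forces the running maximum at every step. The hand-written fold below illustrates that idea (the name strictFoldl1 is hypothetical) next to the library's foldl1' from Data.List. It is not a copy of how GHC implements maximum.

```haskell
import Data.List (foldl1')

-- Hand-rolled strict left fold over a non-empty list, for illustration.
-- `seq` forces the new accumulator before recursing, so no chain of
-- pending `max` calls builds up.
strictFoldl1 :: (a -> a -> a) -> [a] -> a
strictFoldl1 _ []     = error "strictFoldl1: empty list"
strictFoldl1 f (x:xs) = go x xs
  where
    go acc []     = acc
    go acc (y:ys) = let acc' = f acc y in acc' `seq` go acc' ys

-- The definition from the answer, with a type signature added.
maximum' :: Ord a => [a] -> a
maximum' = foldl1' max

main :: IO ()
main = do
  print (maximum' [0 .. 1000000 :: Int])
  print (strictFoldl1 max [0 .. 1000000 :: Int])
```

Both lines print 1000000. Whether the lazy version actually overflows depends on the GHC version and the stack limit in effect, so the failure shown above may not reproduce with a newer compiler.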
