Funky haskell lazy list implicit recursion

前端 未结 3 2114
花落未央
花落未央 2021-02-08 05:58

In Haskell, you can build infinite lists due to laziness:

Prelude> let g = 4 : g
Prelude> g !! 0
4
Prelude> take 10 g
[4,4,4,4,4,4,4,4,4,4]
相关标签:
3条回答
  • 2021-02-08 06:32

    "Doesn't exist yet" is not quite right. We try not to think about when values are exist — denotational programming is about eternal unchanging values and equations.

    More concretely, this code is fine:

    Prelude> let x = [(x !! 1) + 1, 3] in x
    [4,3]
    

    You might expect that x !! 1 doesn't exist yet. But here's how an implementation like GHC works.

    When building the list f, it constructs an in-memory object (a "thunk") to represent the expression (x !! 1) + 1. No evaluation of x has occurred yet. It wraps a pointer to this thunk, plus a pointer to 3, into a linked list and hands it off to GHCi's implicit show.

    Now, the show instance for lists has to show the elements one by one. show will demand ("force") the evaluation of the thunk for (x !! 1) + 1, which causes that code to be "entered". By the definition of (+), forcing (x !! 1) + 1 forces x !! 1, which in turn forces two things:

    • the spine of the list (its cons cells) through the second cell, and
    • the value of the second element (the "car" of the second cell, in Lisp terms), because (+) wants to operate on that value.

    And the second value is present by now — it's 3. If it were another thunk we'd force that, and so on. See this blog post for an interesting view on self-referential containers.

    Now, how does the compiled GHC code detect an infinite loop in your other example? When we enter a thunk, we need to remember to come back later and overwrite it with the final value. This is what's meant specifically by "lazy evaluation" as opposed to "non-strict semantics", and it prevents us from duplicating work.

    Anyway, as an optimization when entering a thunk, GHC's runtime will first overwrite it with another object called a "blackhole". We'll come back later and overwrite the blackhole with the final value. But what happens if we enter a blackhole before doing so? That means that evaluating x requires first evaluating x, an unresolvable loop. So entering a blackhole throws an exception.

    0 讨论(0)
  • 2021-02-08 06:39

    In this case, a picture may tell a thousand words.

    First, remember how cons (the (:) list constructor) works. It's a pair of two things: an element, and a reference to a list tail (which is either another cons, or []).

    As you should know, when you say [1, 2, 3], it's just a shortcut for (1:(2:(3:[]))) or 1:2:3:[]. If you visualize each cons pair as a box with two slots, this expression look like:

    ┌───┬──┐  ┌───┬──┐  ┌───┬──┐  ┌────┐
    │ 1 │ ─┼─>│ 2 │ ─┼─>│ 3 │ ─┼─>│ [] │
    └───┴──┘  └───┴──┘  └───┴──┘  └────┘
    

    One loop

    When you say g = 4 : g, you're not really building an "infinite" list, you're building a circular list: g is defined as a cons whose tail reference simply points back to g itself:

    ┌──────────┐
    │ ┌───┬──┐ │
    └>│ 4 │ ─┼─┘
      └───┴──┘
    

    This actually has nothing to do with laziness, and everything with self-reference: for example, you can do the same thing in (eager) Common Lisp using syntax like '#1=(4 . #1#) (where the #1 is like g).

    Whether you say g !! 0, or g !! 1000000000000, g never grows: (!!) simply runs around the loop in-place, for as many times as you told it to, until it exhausts itself and returns the element, 4.

    Two loops

    When you say f = (f !! 10) : f, the same thing happens—except now, the element slot contains a different expression than 4:

    ┌──────────┐
    │ ┌───┬──┐ │
    └>│ ╷ │ ─┼─┘
      └─┼─┴──┘
        │
        │ ┌───────────┐
        └>│ (f !! 10) │
          └───────────┘
    

    Crucially, this sub-expression also refers back to f, just like the tail does:

     ┌──────────┐
     │ ┌───┬──┐ │
    ┌┴>│ ╷ │ ─┼─┘
    │  └─┼─┴──┘
    │    │
    │    │ ┌───────────┐
    │    └>│ (f !! 10) │
    │      └──┼────────┘
    └─────────┘
    

    So, when you ask for f !! n, (!!) will first run around the top loop n times, then return the element, as it did for g. However, instead of escaping the loop, (f !! 10) just re-enters it, and the process repeats itself: looping 10 times around the top, then once around the bottom, and back.

    0 讨论(0)
  • 2021-02-08 06:46

    You are correct as to why it hangs--you've created a circular dependency that it can't resolve. Computing the current element requires a later element that can't be computed until the current one is computed, blah blah, around in circles we go.

    As for why it doesn't produce an exception, try compiling it instead of running it in GHCi:

    $ ghc --make Loop.hs
    $ ./Loop.exe
    Loop.exe: <<loop>>
    

    I'm assuming this is a NonTermination exception. Pff, Halting Problem? Ha.

    Not everything works the way you'd like or expect it to when done in GHCi. If something seems odd, try compiling a small example and see if it makes more sense that way. GHCi using different rules for type defaulting trips me up sometimes, for instance.

    0 讨论(0)
提交回复
热议问题