What's the reason for 'let rec' in the impure functional language OCaml?

梦谈多话 2021-02-02 06:54

In the book Real World OCaml, the authors explain why OCaml uses let rec for defining recursive functions.

OCaml distinguishes between nonrecurs

6 Answers
  • 2021-02-02 07:34

    I am not an expert, but I'll make a guess until the truly knowledgeable guys show up. In OCaml there can be side effects that happen during the definition of a function:

    let rec f =
        let () = Printf.printf "hello\n" in
        fun x -> if x <= 0 then 12 else 1 + f (x - 1)
    

    This means that the order of function definitions must be preserved in some sense. Now imagine that two distinct sets of mutually recursive functions are interleaved. It doesn't seem at all easy for the compiler to preserve the order while processing them as two separate mutually recursive sets of definitions.

    The use of `let rec ... and` means that distinct sets of mutually recursive function definitions can't be interleaved in OCaml as they can in Haskell. Haskell doesn't have side effects (in some sense), so definitions can be freely reordered.
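
    To make the "side effect during definition" point observable, here is a minimal sketch of the same function with the print replaced by a counter. The name definition_effects is made up for this illustration; the point is only that the effect runs once, when f is defined, and not on each call.

    ```ocaml
    (* Hypothetical counter, used only to observe when the effect runs. *)
    let definition_effects = ref 0

    let rec f =
      (* Runs exactly once, while the binding for f is being evaluated. *)
      let () = incr definition_effects in
      fun x -> if x <= 0 then 12 else 1 + f (x - 1)

    let () =
      let a = f 3 in
      let b = f 0 in
      Printf.printf "f 3 = %d, f 0 = %d, definition effects = %d\n"
        a b !definition_effects
    ```

    Because the effect is tied to the position of the definition, moving the definition also moves the effect, which is why reordering is not harmless here.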

  • 2021-02-02 07:40

    I'd say that in OCaml they are trying to make the REPL and source files work the same way. It's perfectly reasonable to redefine a function in the REPL; therefore, they have to allow it in source files as well. Now, if you use the (redefined) function inside its own definition, OCaml needs some way of knowing which definition to use: the previous one or the new one.

    In Haskell they just gave up and accepted that the REPL works differently from source files.

  • It's not a question of purity, it's a question of specifying what environment the typechecker should check an expression in. It actually gives you more power than you would have otherwise. For example (I'm going to write Standard ML here because I know it better than OCaml, but I believe the typechecking process is pretty much the same for the two languages), it lets you distinguish between these cases:

    val foo : int = 5
    val foo = fn (x) => if x = foo then 0 else 1
    

    Now as of the second redefinition, foo has the type int -> int. On the other hand,

    val foo : int = 5
    val rec foo = fn (x) => if x = foo then 0 else 1
    

    does not typecheck, because the rec means that the typechecker has already decided that foo has been rebound to the type 'a -> int, where 'a is the type of x. When it tries to figure out what that 'a needs to be, there is a unification failure: x = foo forces x and foo to have the same type, so 'a would have to equal 'a -> int, which no type satisfies (and function types do not admit equality in any case).

    It can certainly "look" more imperative, because the case without rec allows you to do things like this:

    val foo : int = 5
    val foo = foo + 1
    val foo = foo + 1
    

    and now foo has the value 7. That's not because it's been mutated, however --- the name foo has been rebound 3 times, and it just so happens that each of those bindings shadowed a previous binding of a variable named foo. It's the same as this:

    val foo : int = 5
    val foo' = foo + 1
    val foo'' = foo' + 1
    

    except that the earlier values (foo and foo' above) are no longer reachable in the environment once the identifier foo has been rebound. The following is also legal:

    val foo : int = 5
    val foo : real = 5.0
    

    which makes it clearer that what's happening is shadowing of the original definition, rather than a side effect.

    Whether or not it's stylistically a good idea to rebind identifiers is questionable -- it can get confusing. It can be useful in some situations (e.g. rebinding a function name to a version of itself that prints debugging output).
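
    The debugging use case can be sketched in OCaml, the language the question is about. This is only an illustration: square and the seen log are hypothetical names, and a real wrapper would probably print instead of recording arguments in a ref.

    ```ocaml
    let square x = x * x

    (* Hypothetical log of arguments, kept in a ref so it can be inspected. *)
    let seen = ref []

    (* No rec: the square on the right-hand side is the original definition,
       so this wraps it instead of calling itself forever. *)
    let square x =
      seen := x :: !seen;
      square x
    ```

    With let rec in the second definition, the inner square would refer to the wrapper itself and the call would never return.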

  • 2021-02-02 07:51

    I think this has nothing to do with being purely functional, it is just a design decision that in Haskell you are not allowed to do

    let a = 0;;
    let a = a + 1;;
    

    whereas you can do it in Caml.

    In Haskell this code won't work because let a = a + 1 is interpreted as a recursive definition and will not terminate. In Haskell you don't have to specify that a definition is recursive simply because you can't create a non-recursive one (so the keyword rec is everywhere but is not written).

  • 2021-02-02 07:53

    What are the technical reasons that enforces let rec while pure functional languages not?

    Recursiveness is a strange beast. It has a relation to purity, but it's a little more oblique than this. To be clear, you could write "alterna-Haskell" which retains its purity, its laziness but does not have recursively bound lets by default and demands some kind of rec marker just as OCaml does. Some would even prefer this.


    In essence, there are just many different kinds of "let"s possible. If we compare let and let rec in OCaml we'll see a small difference. In static formal semantics, we might write

    Γ ⊢ E : A    Γ, x : A ⊢ F : B
    -----------------------------
       Γ ⊢ let x = E in F : B
    

    which says that if we can prove in a variable environment Γ that E has type A and if we can prove in the same variable environment Γ augmented with x : A that F : B then we can prove that in the variable environment Γ let x = E in F has type B.

    The thing to watch is the Γ argument. This is just a list of ("variable name", "type") pairs like [(x, int); (y, string)], and augmenting the list like Γ, x : A just means consing (x, A) onto it (sorry that the syntax is flipped).

    In particular, let's write the same formalism for let rec

    Γ, x : A ⊢ E : A    Γ, x : A ⊢ F : B
    -------------------------------------
           Γ ⊢ let rec x = E in F : B
    

    The only difference is that neither of our premises works in the plain Γ environment; both are allowed to assume the existence of the x variable.

    In this sense, let and let rec are simply different beasts.
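
    The two rules can be turned into a toy type checker. The sketch below is an assumption-laden illustration, not how OCaml's checker works: the term language, the constructor names and the requirement that let rec carries a type annotation are all invented here to keep it short. The only part that mirrors the rules is which environment E is checked in.

    ```ocaml
    type ty = Int | Arr of ty * ty

    type term =
      | Lit of int
      | Var of string
      | Add of term * term
      | Lam of string * ty * term
      | App of term * term
      | Let of string * term * term            (* let x = E in F *)
      | LetRec of string * ty * term * term    (* let rec (x : A) = E in F *)

    exception Type_error of string

    (* The environment is an association list of (name, type) pairs. *)
    let rec check env = function
      | Lit _ -> Int
      | Var x ->
        (match List.assoc_opt x env with
         | Some t -> t
         | None -> raise (Type_error ("unbound variable " ^ x)))
      | Add (a, b) ->
        if check env a = Int && check env b = Int then Int
        else raise (Type_error "+ expects ints")
      | Lam (x, a, body) -> Arr (a, check ((x, a) :: env) body)
      | App (f, arg) ->
        (match check env f with
         | Arr (a, b) when check env arg = a -> b
         | _ -> raise (Type_error "bad application"))
      | Let (x, e, f) ->
        let a = check env e in          (* E is checked in the plain env *)
        check ((x, a) :: env) f
      | LetRec (x, a, e, f) ->
        let env' = (x, a) :: env in     (* E may already mention x *)
        if check env' e = a then check env' f
        else raise (Type_error "annotation mismatch")

    let typechecks env t =
      match check env t with
      | _ -> true
      | exception Type_error _ -> false

    (* A body that mentions f: fine under LetRec, unbound under Let. *)
    let body = Lam ("n", Int, App (Var "f", Var "n"))
    let with_let = Let ("f", body, App (Var "f", Lit 0))
    let with_let_rec = LetRec ("f", Arr (Int, Int), body, App (Var "f", Lit 0))
    ```

    Here typechecks [] with_let is false because f is not in scope while its own body is checked, whereas typechecks [] with_let_rec is true.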


    So what does it mean to be pure? Under the strictest definition, which even Haskell does not meet, we must eliminate all effects, including non-termination. The only way to achieve this is to take away our ability to write unrestricted recursion and reintroduce it only in carefully controlled forms.

    There exist plenty of languages without recursion. Perhaps the most important one is the Simply Typed Lambda Calculus. In its basic form it is the regular lambda calculus augmented with a typing discipline where types look a bit like

    type ty =
      | Base
      | Arr of ty * ty
    

    It turns out that STLC cannot represent recursion: the Y combinator, and all of its fixed-point combinator cousins, cannot be typed. Thus, STLC is not Turing Complete.

    It is however uncompromisingly pure. It achieves that purity with the bluntest of instruments, however, by completely outlawing recursion. What we'd really like is some kind of balanced, careful recursion which doesn't lead to non-termination---we'll still be Turing Incomplete, but not so crippled.

    Some languages try this game. There are clever ways of adding typed recursion back along a division between data and codata which ensures that you cannot write non-terminating functions. If you're interested, I suggest learning a bit of Coq.


    But OCaml's goal (and Haskell's as well) is not to be delicate here. Both languages are uncompromisingly Turing Complete (and therefore "practical"). So let's discuss some more blunt ways of augmenting the STLC with recursion.

    The bluntest of the bunch is to add a single built-in function called fix

    val fix : ('a -> 'a) -> 'a
    

    or, in more genuine OCaml-y notation which requires eta-expansion

    val fix : (('a -> 'b) -> ('a -> 'b)) -> ('a -> 'b)
    

    Now, remember that we're only considering a primitive STLC with fix added. We can indeed write fix (the latter one at least) in OCaml, but that's cheating at the moment. What does fix buy the STLC as a primitive?

    It turns out that the answer is: "everything". STLC + Fix (basically a language called PCF) is impure and Turing Complete. It's also simply tremendously difficult to use.
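
    For a feel of what programming through fix is like, here is a minimal sketch in ordinary OCaml. It defines fix with let rec, which is exactly the "cheating" mentioned above, so it illustrates usage rather than a true primitive. The name fact and the parameter name self are just examples.

    ```ocaml
    (* The eta-expanded fixed-point operator, borrowed from let rec. *)
    let rec fix f x = f (fix f) x

    (* Factorial with no let rec of its own: recursion goes through fix,
       and self stands for "the function being defined". *)
    let fact = fix (fun self n -> if n <= 1 then 1 else n * self (n - 1))

    let () = Printf.printf "fact 5 = %d\n" (fact 5)
    ```

    Every recursive function has to be written as a function of its own "self" argument, which gives some idea of why this style gets tedious quickly.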


    So this is the final hurdle to jump: how do we make fix easier to work with? By adding recursive bindings!

    Already, STLC has a let construction. You can think of it as just syntax sugar:

    let x = E in F   ---->   (fun x -> F) (E)
    

    but once we've added fix we also have the power to introduce let rec bindings

    let rec x a = E in F ----> (fun x -> F) (fix (fun x a -> E))
    

    At this point it should again be clear: let and let rec are very different beasts. They embody different levels of linguistic power and let rec is a window to allow fundamental impurity through Turing Completeness and its partner-effect non-termination.
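
    Both desugarings can be checked side by side in OCaml. This is a sketch under the assumption that fix is defined via let rec (OCaml has no built-in fix); the names with_let, desugared_let, with_let_rec and desugared_let_rec are made up for the comparison.

    ```ocaml
    (* Stand-in for the primitive fix, defined with let rec. *)
    let rec fix f x = f (fix f) x

    (* let x = E in F   ---->   (fun x -> F) (E) *)
    let with_let = let x = 2 + 3 in x * x
    let desugared_let = (fun x -> x * x) (2 + 3)

    (* let rec x a = E in F   ---->   (fun x -> F) (fix (fun x a -> E)) *)
    let with_let_rec =
      let rec sum n = if n = 0 then 0 else n + sum (n - 1) in
      sum 10
    let desugared_let_rec =
      (fun sum -> sum 10)
        (fix (fun sum n -> if n = 0 then 0 else n + sum (n - 1)))
    ```

    Each pair evaluates to the same value; only the second pair needs fix, which is where the extra power comes in.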


    So, at the end of the day, it's a little amusing that Haskell, the purer of the two languages, made the interesting choice of abolishing plain let bindings. That's really the only difference: there is no syntax for representing a non-recursive binding in Haskell.

    At this point it's essentially just a style decision. The authors of Haskell determined that recursive bindings were so useful that one might as well assume that every binding is recursive (and mutually so, a can of worms ignored in this answer so far).

    On the other hand, OCaml gives you the ability to be totally explicit about the kind of binding you choose, let or let rec!

  • 2021-02-02 07:56

    When you define the semantics of function definition, as a language designer, you have a choice: either make the name of the function visible in the scope of its own body, or not. Both choices are perfectly legitimate. For example, C-family languages, though far from functional, do make the names of definitions visible in their own scope (this extends to all definitions in C, making int x = x + 1 legal). OCaml decides to give us the extra flexibility of making the choice ourselves, and that's really great. Its designers decided to make the name invisible by default, a fairly decent solution, since most of the functions that we write are non-recursive.

    As for the quote, it doesn't really correspond to function definitions, the most common use of the rec keyword. It is mostly about "why doesn't the scope of a function definition extend to the whole body of the module". This is a completely different question. After some research I found a very similar question with an answer that might satisfy you. Here is a quote from it:

    So, given that the type checker needs to know about which sets of definitions are mutually recursive, what can it do? One possibility is to simply do a dependency analysis on all the definitions in a scope, and reorder them into the smallest possible groups. Haskell actually does this, but in languages like F# (and OCaml and SML) which have unrestricted side-effects, this is a bad idea because it might reorder the side-effects too. So instead it asks the user to explicitly mark which definitions are mutually recursive, and thus by extension where generalization should occur.

    Even without any reordering, arbitrary impure expressions can occur in a function definition (a side effect of definition, not of evaluation), so it is impossible to build the dependency graph. Consider unmarshaling a function from a file and executing it.

    To summarize, we have two uses of the let rec construct. One is to create a self-recursive function, like

    let rec seq acc = function
      | 0 -> acc
      | n -> seq (acc+1) (n-1)
    

    Another is to define mutually recursive functions:

    let rec odd n =
      if n = 0 then false
      else if n = 1 then true else even (n - 1)
    and even n =
      if n = 0 then true
      else if n = 1 then false else odd (n - 1)
    

    In the first case, there is no technical reason to prefer one solution over the other. It is just a matter of taste.

    The second case is harder. When inferring types you need to split all function definitions into clusters of mutually dependent definitions, in order to narrow the typing environment. In OCaml this is harder to do, since you need to take side effects into account. (Or you can continue without splitting the definitions into strongly connected components, but this leads to another issue: your type system will be more restrictive, i.e., it will reject more valid programs.)
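
    The "more restrictive" remark can be seen with a small sketch. The names id and use are invented for the illustration; the rejected variant is shown only in a comment because it does not compile.

    ```ocaml
    (* Defined on its own, id is generalized to 'a -> 'a before use is checked,
       so it can be applied at two different types. *)
    let id x = x
    let use () = (id 1, id "one")

    (* Put into one recursive group, the same code is rejected, because id
       stays monomorphic while the whole group is being checked:

         let rec id x = x
         and use () = (id 1, id "one")
    *)
    ```

    This is why it matters where the boundaries of a recursive group are drawn: generalization happens only after the whole group has been typed.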

    But, revisiting the original question and the quote from RWO, I'm still pretty sure that there is no technical reason for adding the rec flag. Consider SML, which has the same problems but still has rec enabled by default. There is a technical reason for the let ... and ... syntax for defining a set of mutually recursive functions. In SML this syntax doesn't require us to add the rec flag; in OCaml it does, which gives us more flexibility, like the ability to swap two values with the expression let x = y and y = x.
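
    The swap mentioned above looks like this in practice. It is a minimal sketch with made-up initial values; it relies on the fact that, without rec, every right-hand side of a let ... and ... group is evaluated in the old environment.

    ```ocaml
    let x = 1 and y = 2

    (* No rec: both right-hand sides see the previous x and y. *)
    let x = y and y = x

    let () = Printf.printf "x = %d, y = %d\n" x y
    ```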
