Converting a function to use tail recursion — a formal study

问题

Has anyone written a formal paper describing a method to (automatically) convert functions to be tail recursive? I am looking for a university-level formal treatment including the limitations (types of functions that can be converted), procedures for conversion, and, if possible, proofs of correctness? Examples in Haskell would be a bonus.

回答1:

a method to (automatically) convert functions to be tail recursive?

So there are two parts to this:

recognizing that a given recursive function can be converted to a tail recursive form and making that transformation
implementing tail calls in an efficient way, so the transformation is worth while.

Transforming recursion into tail recursion

It appears relatively straight forward to recognize tail recursive definitions syntactically. After all, 'tail recursive' just means that the call appears syntactically in the tail of the expression.

E.g. the Scheme folks describe:

merely compiling appropriate self-calls as jumps is not suficient to achieve full tail-recursion. Instead, we syntactically divide all sub-expression positions in the source language into two classes: tail (or reduction) position and subproblem position. In the simple expression (if predicate consequent alternative), the predicate is a subproblem position, while both consequent and alternative are in reduction positions. This syntactic notion can be easily extended to arbitrarily nested sub-expressions.

Compiling Higher-Order Languages into Fully Tail-Recursive Portable C

Transforming functions into tail calls

The tricky part of your question is optimizations for identifying and transforming candidate recursive computations into tail recursive ones.

One reference is in GHC, which uses inlining and a wide array of simplification rules to collapse away recursive calls, until their underlying tail recursive structure remains.

Secrets of the GHC inliner

Tail Call Elimination

Once you have your functions in a tail-recursive form, you'd like to have that implemented effiicently. If you can generate a loop, that's a good start. If your target machine doesn't, then the tail call elimination" will need a few tricks. To quote Odersky and Schinz, cited below,

Various techniques have been proposed over the years to eliminate general (and not only recursive) tail calls, mostly for compilers targeting C.

... put the whole program in a big function and to simulate function calls using direct jumps or switch statements inside this function.

... A popular technique is to use a trampoline. A trampoline is an outer function which repeatedly calls an inner function. Each time the inner function wishes to tail call another function, it does not call it directly but simply returns its identity (e.g. as a closure) to the trampoline, which then does the call itself. Unlimited stack growth is thus avoided, but some performance is inevitably lost, mostly because all the calls made by the trampoline are calls to statically unknown functions.

Another technique is Henry Baker’s “Cheney on the M.T.A.” With his technique, the program first has to be converted to continuation passing style (CPS), therefore turning all calls into tail calls, and can then be compiled as usual. At run-time, when the stack is about to overflow, the current continuation is built and returned (or longjmped) to the trampoline “waiting” at the bottom of the call stack.

Tail call elimination on the Java Virtual Machine, Michel Schinz Martin Odersky, 2001
Henry G. Baker, Jr. CONS should not CONS its arguments, part II: Cheney on the M. T. A. Draft Version, January 1994.

回答2:

Mercury contains a couple of optimizations for automatically making things tail-recursive. (Mercury is an enforce-purity logic programming language, so it talks about predicates rather than functions, but many of the same ideas apply to Mercury programs as to Haskell ones. A much bigger difference than it being logical rather than functional is that it is strict rather than lazy)

"Accumulator introduction" generates specialised versions of predicates with an extra accumulator parameter in order to allow associative operations to be moved before the recursive call. Apparently this optimisation doesn't necessarily result in tail-recursive predicates on its own, but often results in a form which can be optimised by the second optimisation:

"Last call modulo constructors" essentially allows a recursive call that is followed only by constructor applications to be rewritten such that the value is constructed first containing a "hole" and then the recursive call returns its output directly into the memory address of the "hole" rather than using the normal return-value-passing convention. I believe Haskell would get this optimisation for free simply due to laziness, however.

Both of these optimisations are described in the paper Making Mercury programs tail recursive.

来源：https://stackoverflow.com/questions/10700807/converting-a-function-to-use-tail-recursion-a-formal-study

标签

compiler-construction

functional-programming

tail-recursion