I’m trying to implement a tail-recursive list-sorting function in OCaml, and I’ve come up with the following code:
let tailrec_merge_sort l =
let split l =
The expression
merge (sort left) (sort right)
is not tail-recursive; it calls both (sort left) and (sort right) recursively while there is remaining work in the call (merge).
I think you can fix it with an extra continuation parameter:
let rec sort l k =
match l with
| [] -> k []
| [a] -> k [a]
| list -> let left, right = split list in sort left (fun leftR -> sort right (fun rightR -> k (merge leftR rightR)))
in sort l (fun x -> x)
Merge sort is inherently not tail-recursive. A sort requires two recursive calls, and in any execution of any function, at most one dynamic call can be in tail position. (split
is also called from non-tail position, but since it should use constant stack space that should be OK).
Be sure you are using a recent version of OCaml. In versions 3.08 and older, List.rev
could blow the stack. This problem is fixed in version 3.10. Using version 3.10.2, I can sort a list of ten million elements using your code. It takes a couple of minutes, but I don't blow the stack. So I'm hoping your problem is simply that you have an old version of OCaml.
If not, a good next step would be to set the environment variable
OCAMLRUNPARAM=b=1
which will give a stack trace when you blow the stack.
I'd also like to know the length of the arrays you are sorting; although merge sort cannot be tail-recursive, its non-tail nature should cost you logarithmic stack space.
If you need to sort more than 10 million elements, maybe you should be looking at a different algorithm? Or if you want, you could CPS-convert mergesort by hand, which will yield a tail-recursive version at the cost of allocating contiuations on the heap. But my guess is that it won't be necessary.