Attempting to use continuation passing style to avoid stack overflow with minimax algorithm

问题

Summary of my objective: Figure out how to use continuation-passing style to avoid a stack overflow when using an algorithm I believe cannot be made tail-recursive. Alternatively, find a way to make the function tail-recursive.

Details: I am new to F# (and functional programming in general) and I am attempting to implement the minimax algorithm with alpha-beta pruning. This is an algorithm used to determine the best possible move for a two-player game. The pseudocode for the algorithm can be found here: https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning

This is a resource I've found useful for understanding the flow of the algorithm: http://inst.eecs.berkeley.edu/~cs61b/fa14/ta-materials/apps/ab_tree_practice/

One difference in my implementation is that the players for the game I'm working on do not always alternate. For this reason, I've removed one of the parameters from the function. My implementation is below:

let rec minimax node depth alpha beta =
    if depth = 0 || nodeIsTerminal node then
        heuristicValue node
    else
        match node.PlayerToMakeNextChoice with
        | PlayerOneMakesNextChoice ->
            takeMax (getChildren node) depth alpha beta    
        | PlayerTwoMakesNextChoice ->
            takeMin (getChildren node) depth alpha beta
and takeMax children depth alpha beta =      
    match children with
    | [] -> alpha
    | firstChild :: remainingChildren ->
        let newAlpha = [alpha; minimax firstChild (depth - 1) alpha beta] |> List.max

        if beta < newAlpha then newAlpha
        else takeMax remainingChildren depth newAlpha beta
and takeMin children depth alpha beta =      
    match children with
    | [] -> beta
    | firstChild :: remainingChildren ->
        let newBeta = [beta; minimax firstChild (depth - 1) alpha beta] |> List.min

        if newBeta < alpha then newBeta
        else takeMax remainingChildren depth alpha newBeta

The problem I'm having is that while takeMax and takeMin are tail-recursive, these methods call minimax when assigning newAlpha and newBeta so it is still likely to produce a stack overflow when I call minimax with a large depth. I have done some research and found that using continuation-passing style is a potential way to use the heap rather than the stack when the function cannot be made tail-recursive (and I believe after many hours of trying that this cannot). While I can understand some very basic examples, I'm having trouble understanding how I would apply the concept to this situation; I'd be very grateful if someone could help walk me through it.

Edit 1: My best understanding of the solution

let minimax node depth alpha beta =
    let rec recurse node depth alpha beta k =
        if depth = 0 || nodeIsTerminal node then
            k (heuristicValue node)
        else
            match node.PlayerToMakeNextChoice with
            | PlayerOneMakesNextChoice ->
                takeMax (getChildren node) depth alpha beta k
            | PlayerTwoMakesNextChoice ->
                takeMin (getChildren node) depth alpha beta k
    and takeMax children depth alpha beta k =      
        match children with
        | [] -> k alpha
        | firstChild :: remainingChildren ->
            let continuation = fun minimaxResult ->
                let newAlpha = [alpha; minimaxResult] |> List.max

                if beta < newAlpha then k newAlpha
                else takeMax remainingChildren depth newAlpha beta k

            recurse firstChild (depth - 1) alpha beta continuation
    and takeMin children depth alpha beta k =      
        match children with
        | [] -> k beta
        | firstChild :: remainingChildren ->
            let continuation = fun minimaxResult ->
                let newBeta = [beta; minimaxResult] |> List.min

                if newBeta < alpha then k newBeta
                else takeMax remainingChildren depth alpha newBeta k

            recurse firstChild (depth - 1) alpha beta continuation
    recurse node depth alpha beta id

回答1:

As you have undoubtedly seen in the "basic examples", the general idea is to take one extra parameter (the "continuation", usually denoted k), which is a function, and every time you would return a value, pass that value to the continuation instead. So, for example, to modify minimax this way, we'd get:

let rec minimax node depth alpha beta k =
    if depth = 0 || nodeIsTerminal node then
        k (heuristicValue node)
    else
        match node.PlayerToMakeNextChoice with
        | PlayerOneMakesNextChoice ->
            k (takeMax (getChildren node) depth alpha beta)
        | PlayerTwoMakesNextChoice ->
            k (takeMin (getChildren node) depth alpha beta)

And then the call site needs to be "turned inside out", so to speak, so instead of something like this:

let a = minimax ...
let b = f a
let c = g b
c

we would write something like this:

minimax ... (fun a ->
   let b = f a
   let c = g b
   c
)

See? a used to be a return value of minimax, but now a is the parameter of the continuation that is passed to minimax. The runtime mechanics is that, once minimax has finished running, it would call this continuation, and its result value would show up there as parameter a.

So, to apply this to your real code, we'd get this:

| firstChild :: remainingChildren ->
    minimax firstChild (depth - 1) alpha beta (fun minimaxResult ->
        let newAlpha = [alpha; minimaxResult] |> List.max

        if beta < newAlpha then newAlpha
        else takeMax remainingChildren depth newAlpha beta
    )

Ok, this is all well and good, but this is only half the job: we have rewritten minimax in CPS, but takeMin and takeMax are still recursive. Not good.

So let's do takeMax first. Same idea: add an extra parameter k and every time we would "return" a value, pass it to k instead:

and takeMax children depth alpha beta k =      
    match children with
    | [] -> k alpha
    | firstChild :: remainingChildren ->
        minimax firstChild (depth - 1) alpha beta (fun minimaxResult ->
            let newAlpha = [alpha; minimaxResult] |> List.max

            if beta < newAlpha then k newAlpha
            else takeMax remainingChildren depth newAlpha beta k
        )

And now, of course, I have to modify the call site correspondingly:

let minimax ... k =
    ...
    match node.PlayerToMakeNextChoice with
    | PlayerOneMakesNextChoice ->
        takeMax (getChildren node) depth alpha beta k

Wait, what just happened? I just said that every time I return a value I should pass it to k instead, but here I'm not doing it. Instead, I'm passing k itself to takeMax. Huh?

Well, the rule that "instead of returning pass to k" is only the first part of the approach. The second part is - "on every recursive call, pass k down the chain". This way, the original top-level k would travel down the whole recursive chain of calls, and be ultimately called by whatever function decides to stop the recursion.

Keep in mind that, although CPS helps with stack overflow, it does not free you from memory limits in general. All those intermediate values don't go on the stack anymore, but they have to go somewhere. In this case, every time we construct that lambda fun minimaxResult -> ..., that's a heap allocation. So all your intermediate values go on the heap.

There is a nice symmetry though: if the algorithm was truly tail-recursive, you'd be able to pass the original top-level continuation down the call chain without allocating any intermediate lambdas, and so you wouldn't require any heap memory.

来源：https://stackoverflow.com/questions/49565361/attempting-to-use-continuation-passing-style-to-avoid-stack-overflow-with-minima

标签

functional-programming

stack-overflow

minimax

continuation-passing