Finding the best move using MinMax with Alpha-Beta pruning

若如初见. Submitted on 2019-12-03 07:47:12

After some research and a lot of time spent on this problem, I came up with the following solution, which seems to work.

private class MoveValue {

    public double returnValue;
    public Move returnMove;

    public MoveValue() {
        returnValue = 0;
    }

    public MoveValue(double returnValue) {
        this.returnValue = returnValue;
    }

    public MoveValue(double returnValue, Move returnMove) {
        this.returnValue = returnValue;
        this.returnMove = returnMove;
    }

}


protected MoveValue minMax(double alpha, double beta, int maxDepth, MarbleType player) {       
    if (!canContinue()) {
        return new MoveValue();
    }        
    // Check for a leaf first so that no moves are generated or sorted needlessly.
    if (maxDepth == 0 || board.isGameOver()) {
        return new MoveValue(evaluateBoard());
    }
    ArrayList<Move> moves = sortMoves(generateLegalMoves(player));
    Iterator<Move> movesIterator = moves.iterator();
    boolean isMaximizer = (player.equals(playerType));
    MoveValue returnMove;
    MoveValue bestMove = null;
    if (isMaximizer) {           
        while (movesIterator.hasNext()) {
            Move currentMove = movesIterator.next();
            board.applyMove(currentMove);
            returnMove = minMax(alpha, beta, maxDepth - 1, player.opponent());
            board.undoLastMove();
            if ((bestMove == null) || (bestMove.returnValue < returnMove.returnValue)) {
                bestMove = returnMove;
                bestMove.returnMove = currentMove;
            }
            // Raise the lower bound; whenever this holds, bestMove was already updated above.
            if (returnMove.returnValue > alpha) {
                alpha = returnMove.returnValue;
            }
            if (beta <= alpha) {
                bestMove.returnValue = beta;
                bestMove.returnMove = null;
                return bestMove; // pruning
            }
        }
        return bestMove;
    } else {
        while (movesIterator.hasNext()) {
            Move currentMove = movesIterator.next();
            board.applyMove(currentMove);
            returnMove = minMax(alpha, beta, maxDepth - 1, player.opponent());
            board.undoLastMove();
            if ((bestMove == null) || (bestMove.returnValue > returnMove.returnValue)) {
                bestMove = returnMove;
                bestMove.returnMove = currentMove;
            }
            // Lower the upper bound; whenever this holds, bestMove was already updated above.
            if (returnMove.returnValue < beta) {
                beta = returnMove.returnValue;
            }
            if (beta <= alpha) {
                bestMove.returnValue = alpha;
                bestMove.returnMove = null;
                return bestMove; // pruning
            }
        }
        return bestMove;
    }   
}
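
The method above depends on a board, a move generator and an evaluation function that are not shown, so here is a condensed, self-contained sketch of the same idea that can be run as-is. Everything in it is an assumption made for illustration: the class name AlphaBetaDemo, the Result type (playing the role of MoveValue) and the toy game (players alternately take 1 to 3 stones, and whoever takes the last stone wins). The point is mainly the root call, which starts with the widest possible window and reads the move off the returned object.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy game: players alternately take 1-3 stones; taking the last stone wins. */
class AlphaBetaDemo {

    /** Score/move pair, playing the role of MoveValue. */
    static final class Result {
        final double value;
        final int move; // 0 means "no move" (leaf or game over)
        Result(double value, int move) { this.value = value; this.move = move; }
    }

    private int stones;

    AlphaBetaDemo(int stones) { this.stones = stones; }

    private List<Integer> legalMoves() {
        List<Integer> moves = new ArrayList<>();
        for (int take = 1; take <= 3 && take <= stones; take++) moves.add(take);
        return moves;
    }

    Result search(double alpha, double beta, int depth, boolean maximizing) {
        if (stones == 0) {
            // The previous player took the last stone, so the side to move has lost.
            return new Result(maximizing ? -1 : 1, 0);
        }
        if (depth == 0) return new Result(0, 0); // depth limit: neutral heuristic value
        double best = maximizing ? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
        int bestMove = 0;
        for (int take : legalMoves()) {
            stones -= take; // apply the move
            double v = search(alpha, beta, depth - 1, !maximizing).value;
            stones += take; // undo the move
            if (maximizing ? v > best : v < best) {
                best = v;
                bestMove = take;
            }
            if (maximizing) alpha = Math.max(alpha, best);
            else beta = Math.min(beta, best);
            if (beta <= alpha) break; // pruning
        }
        return new Result(best, bestMove);
    }

    /** Root call: full window, depth large enough to reach the end of the game. */
    static Result bestFor(int stones) {
        return new AlphaBetaDemo(stones)
                .search(Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY, stones, true);
    }

    public static void main(String[] args) {
        Result r = bestFor(10);
        System.out.println("take " + r.move + ", value " + r.value); // prints "take 2, value 1.0"
    }
}
```

With the method from this answer, the equivalent root call would presumably look like minMax(Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY, depth, playerType).returnMove, where depth is whatever search depth you choose.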

This is a bit difficult, as the given code is not an actual Java implementation; to achieve what you want, there must be concrete types to represent a move and a position in the game tree. Usually the game tree is not encoded explicitly but navigated in a sparse representation: the implementation performs the move in question, evaluates the resulting smaller problem recursively, and then undoes the move. This amounts to a depth-first search in which the call stack represents the current path.

To obtain the actual best move, simply return from your method the instance that maximizes (or, for the minimizing player, minimizes) the subsequent evaluation. It might be helpful to first implement the Minimax algorithm without alpha-beta pruning, and to add pruning in a subsequent step once the basic structure works.
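
As a rough illustration of that first step, here is a minimal sketch of plain Minimax (no pruning) that applies a move, recurses, undoes the move and returns the best move together with its score. The game is a made-up example, not anything from the question: coins lie in a row, players alternately take one from either end, and the score is the maximizer's total minus the minimizer's total. The names CoinRowMinimax, Move and Scored are hypothetical.

```java
/** Hypothetical toy game: players alternately take a coin from either end of a row. */
class CoinRowMinimax {

    enum Move { LEFT, RIGHT }

    /** Best score found together with the move that achieves it (null at a leaf). */
    static final class Scored {
        final int score;
        final Move move;
        Scored(int score, Move move) { this.score = score; this.move = move; }
    }

    private final int[] coins;
    private int lo, hi;      // the remaining coins are coins[lo..hi]
    private int balance = 0; // maximizer's total minus minimizer's total

    CoinRowMinimax(int... coins) {
        this.coins = coins.clone();
        this.lo = 0;
        this.hi = coins.length - 1;
    }

    private void apply(Move m, boolean maximizing) {
        int coin = (m == Move.LEFT) ? coins[lo++] : coins[hi--];
        balance += maximizing ? coin : -coin;
    }

    private void undo(Move m, boolean maximizing) {
        int coin = (m == Move.LEFT) ? coins[--lo] : coins[++hi];
        balance -= maximizing ? coin : -coin;
    }

    Scored minimax(boolean maximizing) {
        if (lo > hi) return new Scored(balance, null); // no coins left: final score
        Scored best = null;
        for (Move m : Move.values()) {
            apply(m, maximizing);                   // perform the move
            int score = minimax(!maximizing).score; // evaluate the smaller problem
            undo(m, maximizing);                    // restore the position
            boolean better = best == null
                    || (maximizing ? score > best.score : score < best.score);
            if (better) best = new Scored(score, m); // remember the move, not just the score
        }
        return best;
    }

    public static void main(String[] args) {
        Scored s = new CoinRowMinimax(3, 9, 1, 2).minimax(true);
        System.out.println(s.move + " with score " + s.score); // prints "RIGHT with score 7"
    }
}
```

Once this works, alpha and beta can be added as two extra parameters without changing what is returned.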

The implementation from the link in the question (Section 1.5) actually returns the best move, as indicated in the following comment taken from there.

/** Recursive minimax at level of depth for either
    maximizing or minimizing player.
    Return int[3] of {score, row, col}  */

Here no user-defined type is used to represent the move. Instead, the method returns three values: the best score found, and the row and column to which the player would move to achieve it. The coordinates are the representation of the actual move, which the implementation has already tried in order to obtain the score.
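
To make the {score, row, col} convention concrete, here is a small tic-tac-toe sketch written in that style. It is not the code from the linked page: the class name TicTacToeMinimax, the scoring (+1 if X wins, -1 if O wins, 0 for a draw) and the omission of a depth limit are my own assumptions, chosen because the full tic-tac-toe tree is small enough to search completely.

```java
/** Minimal tic-tac-toe minimax that returns {score, row, col} as an int[3]. */
class TicTacToeMinimax {
    static final char EMPTY = ' ', MAX = 'X', MIN = 'O';
    final char[][] board = new char[3][3];

    /** Each row is a 3-character string made of 'X', 'O' and ' '. */
    TicTacToeMinimax(String... rows) {
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 3; c++)
                board[r][c] = rows[r].charAt(c);
    }

    /** Returns true if the given player has three in a row. */
    boolean wins(char p) {
        for (int i = 0; i < 3; i++) {
            if (board[i][0] == p && board[i][1] == p && board[i][2] == p) return true;
            if (board[0][i] == p && board[1][i] == p && board[2][i] == p) return true;
        }
        return (board[0][0] == p && board[1][1] == p && board[2][2] == p)
            || (board[0][2] == p && board[1][1] == p && board[2][0] == p);
    }

    /** Returns {score, row, col}; row and col are -1 when no move is available. */
    int[] minimax(char player) {
        if (wins(MAX)) return new int[] {1, -1, -1};
        if (wins(MIN)) return new int[] {-1, -1, -1};
        int[] best = null;
        for (int r = 0; r < 3; r++) {
            for (int c = 0; c < 3; c++) {
                if (board[r][c] != EMPTY) continue;
                board[r][c] = player; // try the move
                int score = minimax(player == MAX ? MIN : MAX)[0];
                board[r][c] = EMPTY;  // undo it
                boolean better = best == null
                        || (player == MAX ? score > best[0] : score < best[0]);
                if (better) best = new int[] {score, r, c}; // keep the coordinates with the score
            }
        }
        return best != null ? best : new int[] {0, -1, -1}; // board full: draw
    }

    public static void main(String[] args) {
        int[] result = new TicTacToeMinimax("XX ", "OO ", "   ").minimax(MAX);
        // prints "score 1 at row 0, col 2"
        System.out.println("score " + result[0] + " at row " + result[1] + ", col " + result[2]);
    }
}
```

The caller only needs result[1] and result[2] to play the move; a small user-defined type such as MoveValue carries the same information with named fields.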
