Finding the best move using MinMax with Alpha-Beta pruning

深忆病人 2021-02-09 19:24

I'm working on an AI for a game and I want to use the MinMax algorithm with Alpha-Beta pruning.

I have a rough idea of how it works but I'm still

2 answers
  • 2021-02-09 19:50

    After some research and a lot of time wasted solving this problem, I came up with this solution that seems to work.

    private class MoveValue {
    
        public double returnValue;
        public Move returnMove;
    
        public MoveValue() {
            returnValue = 0;
        }
    
        public MoveValue(double returnValue) {
            this.returnValue = returnValue;
        }
    
        public MoveValue(double returnValue, Move returnMove) {
            this.returnValue = returnValue;
            this.returnMove = returnMove;
        }
    
    }
    
    
    // Assumes java.util.ArrayList is imported and that board, playerType, canContinue(),
    // generateLegalMoves(), sortMoves() and evaluateBoard() belong to the enclosing class.
    protected MoveValue minMax(double alpha, double beta, int maxDepth, MarbleType player) {
        if (!canContinue()) {
            return new MoveValue(); // search may not continue: zero-valued result, no move
        }
        // Leaf: evaluate before generating moves, so no move list is built needlessly.
        if (maxDepth == 0 || board.isGameOver()) {
            return new MoveValue(evaluateBoard());
        }
        ArrayList<Move> moves = sortMoves(generateLegalMoves(player));
        boolean isMaximizer = player.equals(playerType);
        // Note: stays null (and is returned as null) if there are no legal moves.
        MoveValue bestMove = null;
        if (isMaximizer) {
            for (Move currentMove : moves) {
                board.applyMove(currentMove);
                MoveValue result = minMax(alpha, beta, maxDepth - 1, player.opponent());
                board.undoLastMove();
                if (bestMove == null || bestMove.returnValue < result.returnValue) {
                    bestMove = result;
                    bestMove.returnMove = currentMove; // the move made at this level
                }
                alpha = Math.max(alpha, result.returnValue);
                if (beta <= alpha) {
                    bestMove.returnValue = beta;
                    bestMove.returnMove = null;
                    return bestMove; // pruning
                }
            }
        } else {
            for (Move currentMove : moves) {
                board.applyMove(currentMove);
                MoveValue result = minMax(alpha, beta, maxDepth - 1, player.opponent());
                board.undoLastMove();
                if (bestMove == null || bestMove.returnValue > result.returnValue) {
                    bestMove = result;
                    bestMove.returnMove = currentMove; // the move made at this level
                }
                beta = Math.min(beta, result.returnValue);
                if (beta <= alpha) {
                    bestMove.returnValue = alpha;
                    bestMove.returnMove = null;
                    return bestMove; // pruning
                }
            }
        }
        return bestMove;
    }
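
As a rough illustration of how a search like the minMax method above is usually started and why the cutoff saves work, here is a minimal self-contained sketch. It is not the answer's code: the class AlphaBetaDemo, its methods and the fixed two-ply tree are made up for this example. The root starts with the widest window (alpha at negative infinity, beta at positive infinity) and records which child raised alpha.

```java
/** Alpha-beta on a fixed two-ply tree: the root maximizes, each child minimizes over its leaves. */
final class AlphaBetaDemo {
    static int leavesEvaluated = 0; // counts leaf visits, only to make the pruning visible

    /** Minimizing node: returns the smallest leaf seen, stopping once it cannot beat alpha. */
    static double minNode(double[] leaves, double alpha, double beta) {
        double best = Double.POSITIVE_INFINITY;
        for (double leaf : leaves) {
            leavesEvaluated++;
            best = Math.min(best, leaf);
            beta = Math.min(beta, best);
            if (beta <= alpha) {
                break; // the maximizer already has a better option elsewhere
            }
        }
        return best;
    }

    /** Root (maximizing) node: returns the index of the best child. */
    static int bestChild(double[][] tree) {
        double alpha = Double.NEGATIVE_INFINITY; // widest possible window at the root
        double beta = Double.POSITIVE_INFINITY;
        int bestIndex = -1;
        for (int i = 0; i < tree.length; i++) {
            double value = minNode(tree[i], alpha, beta);
            if (value > alpha) { // strictly better: remember the move made at the root
                alpha = value;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        double[][] tree = { {3, 12, 8}, {2, 4, 6}, {14, 5, 2} };
        int best = bestChild(tree);
        System.out.println("best child: " + best + ", leaves evaluated: " + leavesEvaluated);
    }
}
```

On this tree the second child is abandoned after its first leaf, because that leaf (2) is already worse for the maximizer than the 3 guaranteed by the first child.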
    
  • 2021-02-09 19:55

    This is a bit difficult, as the given code is not an actual Java implementation; to achieve what you want, there must be concrete types that represent a move and a position in the game tree. Usually the game tree is not encoded explicitly but navigated in a sparse representation: the implementation performs the move in question, evaluates the resulting smaller problem recursively, and then undoes the move. This is a depth-first search that uses the call stack to represent the current path.

    To obtain the actual best move, simply return from your method the instance that maximizes the subsequent evaluation. It might be helpful to first implement the Minimax algorithm without alpha-beta pruning and add the pruning in a subsequent step, once the basic structure works.
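
A minimal sketch of that first step, plain minimax without pruning that returns the move together with its score, might look like the following. The game is a made-up Nim variant (remove 1 to 3 stones, taking the last stone wins), and the names NimMinimax and Result are hypothetical, chosen only for illustration.

```java
/** Plain minimax (no pruning) for a tiny Nim variant: take 1-3 stones, taking the last one wins. */
final class NimMinimax {

    /** Pairs a score with the move that achieves it. */
    static final class Result {
        final int score; // +1: the maximizer wins, -1: the maximizer loses
        final int take;  // stones to remove; 0 means no move was made (leaf)
        Result(int score, int take) {
            this.score = score;
            this.take = take;
        }
    }

    static Result minimax(int stones, boolean maximizing) {
        if (stones == 0) {
            // The previous player took the last stone and won.
            return new Result(maximizing ? -1 : +1, 0);
        }
        Result best = null;
        for (int take = 1; take <= Math.min(3, stones); take++) {
            // The move is "applied" by passing the reduced pile; an int needs no undo.
            int score = minimax(stones - take, !maximizing).score;
            boolean better = best == null
                    || (maximizing ? score > best.score : score < best.score);
            if (better) {
                best = new Result(score, take); // remember the move made at this level
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Result r = minimax(5, true);
        System.out.println("take " + r.take + ", score " + r.score);
    }
}
```

The key point is that each level stores its own move next to the child's score, rather than passing up the move chosen deeper in the tree.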

    The implementation from the link in the question (Section 1.5) actually returns the best move, as indicated in the following comment taken from there.

    /** Recursive minimax at level of depth for either
        maximizing or minimizing player.
        Return int[3] of {score, row, col}  */
    

    Here no user-defined type represents the move. Instead, the method returns three values: the evaluated best score and the row and column coordinates of the best move. These coordinates are the representation of the move itself; the implementation has already performed that move internally to obtain the score.
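
The same return convention can be sketched for tic-tac-toe as follows. This is not the linked implementation: the class TicTacToeMinimax, the board encoding (a 3x3 char array with ' ' for empty cells) and the scoring (+1 for an X win, -1 for an O win, 0 for a draw) are assumptions made for this example, and the search has no depth limit.

```java
/** Minimax for tic-tac-toe that returns {score, row, col}. */
final class TicTacToeMinimax {
    static final char EMPTY = ' ';

    /** Returns true if player p owns a full row, column or diagonal. */
    static boolean wins(char[][] b, char p) {
        for (int i = 0; i < 3; i++) {
            if (b[i][0] == p && b[i][1] == p && b[i][2] == p) return true;
            if (b[0][i] == p && b[1][i] == p && b[2][i] == p) return true;
        }
        return (b[0][0] == p && b[1][1] == p && b[2][2] == p)
            || (b[0][2] == p && b[1][1] == p && b[2][0] == p);
    }

    /** X maximizes, O minimizes. row and col are -1 when no move is made at this node. */
    static int[] minimax(char[][] b, char player) {
        if (wins(b, 'X')) return new int[] {+1, -1, -1};
        if (wins(b, 'O')) return new int[] {-1, -1, -1};
        int[] best = null;
        for (int r = 0; r < 3; r++) {
            for (int c = 0; c < 3; c++) {
                if (b[r][c] != EMPTY) continue;
                b[r][c] = player;                                  // perform the move
                int score = minimax(b, player == 'X' ? 'O' : 'X')[0];
                b[r][c] = EMPTY;                                   // undo it
                boolean better = best == null
                        || (player == 'X' ? score > best[0] : score < best[0]);
                if (better) best = new int[] {score, r, c};
            }
        }
        return best != null ? best : new int[] {0, -1, -1};        // full board: draw
    }

    public static void main(String[] args) {
        char[][] board = {
            {'X', 'X', ' '},
            {'O', 'O', ' '},
            {' ', ' ', ' '},
        };
        int[] result = minimax(board, 'X');
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

With X to move on the board in main, the first empty cell already completes the top row, so the returned coordinates are row 0, column 2 with a score of +1.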
