How exactly does minimax recursion work?

Posted by China☆狼群 on 2019-11-29 15:19:58

Question


I was looking up minimax for a tic-tac-toe game, but I couldn't understand how the recursion works. Here are my questions:

  1. How does minimax know whose turn it is? What's the best way to indicate which player's moves are being generated?
  2. How do you generate possible moves?
  3. How do you know when you are at a terminal node, and how do you generate the terminal nodes?

For example, in this pseudocode:

function integer minimax(node, depth)
if node is a terminal node or depth <= 0:
    return the heuristic value of node
α = -∞
for child in node: # evaluation is identical for both players
    α = max(α, -minimax(child, depth-1))
return α

A node is a board, correct? And is the depth the number of plies the code has to go down in the recursion? Also, what is the max function, and where are the nodes being generated from?

Now, so far I have this code for creating a board:

class Board{
    public:
        Board();
        ~Board(){};
    public: // The board
        // In the board, 1 is x, 2 is o, 0 is empty square.
        int board[3][3];
};

But how would I know whose turn it is? And how do I generate the child nodes for the board?


Answer 1:


We'll use your tic-tac-toe as an example first.

  • A minimax algorithm works best for games where players alternate turns, but can be adapted to games where players may make multiple moves per turn. We'll assume the former, for simplicity. In that case, you need not store 'X to move' or 'O to move' with each node, because that can be determined from the parity of the node depth (whether the node is an even or an odd number of steps from the root).
  • Generating possible moves from each position requires that you know whose move it is (which can be determined as before), and the rules for legal moves from a particular position. For a simple game like tic-tac-toe, given a position, it suffices to enumerate all the states that consist of a copy of the current position plus a new piece, belonging to the current player, placed at each empty square in turn. For games like Othello, you must also check each placement to ensure that it follows the rules, and update the final position according to the consequences of the rule (for Othello, flipping the colors of a bunch of pieces). In general, from each valid position you're tracking, you enumerate all the possible placings of a new piece and check to see which ones are allowed by the ruleset.
  • In general, you NEVER generate the entire tree, since game tree sizes can easily exceed the storage capacity of Earth. You always set a maximum search depth. A terminal node, then, is simply a node at the maximum depth, or a node where the game is over: a player has won, or no legal moves remain (for tic-tac-toe, a board with every square filled). You don't generate the terminal nodes beforehand; they get generated naturally during game tree construction. Tic-tac-toe is simple enough that you can generate the entire game tree, but don't try to reuse that exhaustive approach for, e.g., Othello.
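The move generation described above for tic-tac-toe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the board is flattened into a tuple of 9 cells using the question's encoding (0 empty, 1 X, 2 O), and the name `generate_children` is made up for this example.

```python
# Board: tuple of 9 cells, row by row; 0 = empty, 1 = X, 2 = O
# (the encoding from the question's Board class, flattened; an assumption).
EMPTY, X, O = 0, 1, 2

def generate_children(board, player):
    """Return every position reachable by `player` placing one mark."""
    children = []
    for i, cell in enumerate(board):
        if cell == EMPTY:
            # Copy the position with the player's mark in square i.
            children.append(board[:i] + (player,) + board[i + 1:])
    return children

start = (EMPTY,) * 9
first_moves = generate_children(start, X)
print(len(first_moves))                           # 9
print(len(generate_children(first_moves[0], O)))  # 8
```

For a game like Othello, the loop body would additionally have to reject illegal placements and apply the flips, as noted above.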

Looking at your pseudocode:

  • max(a, b) is any function that returns the larger of a and b. This is usually provided by a math library or similar.
  • The depth is the maximum depth to which you will search.
  • The heuristic value you're computing is some numerical value that describes the value of the board. For a game like tic-tac-toe, which is simple enough that you CAN enumerate the entire game tree, you can designate 1 for a board position that wins for the player doing the analysis, -1 for a board position that wins for the other player, and 0 for any inconclusive position. If you use the pseudocode exactly as written, which negates the returned value at every level (the negamax form of minimax), score each node from the point of view of the player to move at that node instead. In general, you'll have to cook up a heuristic yourself, or use a well-accepted one.
  • You generate the nodes on the fly during your analysis based on their parent nodes. Your root node is always the position from which you're doing analysis.
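To see the recursion itself without any board logic, here is a minimal runnable version of the question's pseudocode applied to a toy tree. The tree representation is an assumption made for this sketch: a node is either a number (a leaf, scored from the point of view of the player to move at that leaf) or a non-empty list of child nodes, and an unresolved node at the depth limit is simply scored 0.

```python
import math

def negamax(node, depth):
    """Negamax over a toy tree of nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node        # terminal node: its value is given directly
    if depth <= 0:
        return 0           # assumed heuristic for an unresolved position
    best = -math.inf
    for child in node:
        # The child's value is from the opponent's point of view, so negate it.
        best = max(best, -negamax(child, depth - 1))
    return best

# Two possible moves at the root, each followed by two possible replies.
tree = [[3, -2], [1, 4]]
print(negamax(tree, 2))  # 1
print(negamax(tree, 1))  # 0
```

With depth 2, the first move lets the opponent reach the leaf worth -2 for us, while the second move guarantees at least 1, so the root is worth 1. With depth 1 the search stops before reaching any leaf and falls back on the placeholder heuristic.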

If you haven't worked with graphs or trees yet, I suggest you do so first; the tree primitive, in particular, is essential to this problem.


As an answer to a comment in this thread asking for an example of determining whose turn it is for a given node, I offer this pseudo-Python:

who_started_first = None

class TreeNode:
    def __init__(self, board_position = EMPTY_BOARD, depth = 0):
        self.board_position = board_position
        self.children = []
        self.depth = depth
    def construct_children(self, max_depth):
        # call this only ONCE per node!
        # even better, modify this so it can only ever be called once per node
        if max_depth > 0:

            ### Here's the code you're actually interested in.
            if who_started_first == COMPUTER:
                to_move = (COMPUTER if self.depth % 2 == 0 else HUMAN)
            elif who_started_first == HUMAN:
                to_move = (HUMAN if self.depth % 2 == 0 else COMPUTER)
            else:
                raise ValueError('who_started_first invalid!')

            for position in self.board_position.generate_all(to_move):
                # That just meant that we generated all the valid moves from the
                # currently stored position. Now we go through them, and...
                new_node = TreeNode(position, self.depth + 1)
                self.children.append(new_node)
                new_node.construct_children(max_depth - 1)

Each node is capable of keeping track of its absolute depth from the 'root' node. When we try to determine how we should generate board positions for the next move, we check to see whose move it is based on the parity of our depth (the result of self.depth % 2) and our record of who moved first.
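The parity check can also be pulled out into a small standalone function, which makes it easy to try by hand. This is a sketch only: the string values for `COMPUTER` and `HUMAN` and the function name `player_to_move` are placeholders chosen for this example.

```python
COMPUTER, HUMAN = "computer", "human"  # placeholder labels (assumption)

def player_to_move(depth, who_started_first):
    """Return the player to move at a node `depth` plies below the root."""
    if who_started_first not in (COMPUTER, HUMAN):
        raise ValueError("who_started_first invalid!")
    other = HUMAN if who_started_first == COMPUTER else COMPUTER
    # Even depth: same player as at the root; odd depth: the other one.
    return who_started_first if depth % 2 == 0 else other

for depth in range(4):
    print(depth, player_to_move(depth, HUMAN))
# 0 human
# 1 computer
# 2 human
# 3 computer
```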




Answer 2:


1) How does minimax know whose turn it is? What's the best way to indicate which player's moves are being generated?

You have that depth argument. If the depth is even, it's one player's turn; if it's odd, it's the other player's turn.

2) How do you generate possible moves?

Using the rules of the game. In tic tac toe, a possible move means placing one's mark into a free cell.

3) How do you know when you are at a terminal node, and how do you generate the terminal nodes?

A terminal node is a node where the game is over: someone has won, or the board is full (a draw). You generate them by recursion. Each recursive call should be given the current state of the board; I guess that's what the node and child parameters in your pseudocode are. If the game is over in that state, the node is terminal; otherwise you try all legal moves and recurse.




Answer 3:


I can provide a bit of an idea as to what you are looking for, since I wrote a minimax algorithm for tic-tac-toe.

To answer your questions directly:

  1. My minimax algorithm didn't determine that. It accepted an argument that specified which player was to move.

  2. Knowing the player to move, loop through all blank squares on the board, and for each one, generate a node with the current player's token in that square. Recursively proceed from there.

  3. I used a function that returned a value that indicated whether the game was over, and whether it was a draw or a win.

My basic algorithm did this:

  • Input: the player to move, and the state of the board.
  • For each blank space left on the board:
    • Generate a new board with the player's move in that space.
    • If the game is over, generate a node with the result of the game.
    • Otherwise, run the algorithm, passing in the other player and the new board, and generate a node with the result of the opponent's ideal move.
  • Determine which node (move) leads to the best possible worst case.
  • Output: The best move, and information about the game's result from it.
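The steps above might look roughly like this in Python. It is a sketch rather than the code this answer refers to: the flat 9-cell tuple board, the 0/1/2 encoding borrowed from the question, and the names `winner` and `best_move` are all assumptions made for the example.

```python
EMPTY, X, O = 0, 1, 2
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return X or O if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None

def best_move(board, player):
    """Return (square, result) for `player`: +1 win, 0 draw, -1 loss.

    Assumes the game is not over and at least one square is empty.
    """
    other = O if player == X else X
    best = (None, -2)  # -2 is below every real result
    for i in range(9):
        if board[i] != EMPTY:
            continue
        child = board[:i] + (player,) + board[i + 1:]
        if winner(child) == player:
            result = 1
        elif EMPTY not in child:
            result = 0
        else:
            # The opponent's ideal reply, seen from our side of the board.
            result = -best_move(child, other)[1]
        if result > best[1]:
            best = (i, result)
    return best

# X X .
# O O .
# . . .
print(best_move((X, X, EMPTY, O, O, EMPTY, EMPTY, EMPTY, EMPTY), X))  # (2, 1)
```

Passing the player explicitly, as this answer suggests, avoids having to track depth parity at all; the price is one extra argument on every recursive call.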


Source: https://stackoverflow.com/questions/11703846/how-exactly-does-minimax-recursion-work