Using minimax search for card games with imperfect information

后端未结

关注

 2  2109

I want to use minimax search (with alpha-beta pruning), or rather negamax search, to make a computer program play a card game.

The card game actually consists of 4 playe

相关标签:

2条回答

爱一瞬间的悲伤

2021-02-15 09:59
Minimax search as you've implemented it is the wrong approach for games where there is so much uncertainty. Since you don't know the card distribution among the other players, your search will spend an exponential amount of time exploring games that could not happen given the actual distribution of the cards.

I think a better approach would be to start with good rules for play when you have little or no information about the other players' hands. Things like:
1. If you play first in a round, play your lowest card since you have little chance of winning the round.
2. If you play last in a round, play your lowest card that will win the round. If you can't win the round, then play your lowest card.
Have your program initially not bother with search and just play by these rules and have it assume that all the other players will use these heuristics as well. As the program observes what cards the first and last players of each round play it can build up a table of information about the cards each player likely holds. E.g. a 9 would have won this round, but player 3 didn't play it so he must not have any cards 9 or higher. As information is gathered about each player's hand the search space will eventually be constrained to the point where a minimax search of possible games could produce useful information about the next card to play.
0 讨论(0)
发布评论:

提交评论
- 加载中...
一个人的身影

2021-02-15 10:14

I want to clarify details that the accepted answer doesn't really go into.

In many card games you can sample the unknown cards that your opponent could have instead of generating all of them. You can take into account information like short suits and the probability of holding certain cards given play so far when doing this sampling to weight the likelihood of each possible hand (each hand is a possible world that we'll solve independently). Then, you solve each hand using perfect information search. The best move over all of these worlds is often the best move overall - with some caveat.

In games like Poker this won't work very well -- the game is all about the hidden information. You have to precisely balance your actions to keep the information about your hand hidden.

But, in games like trick-based card games, this works pretty well - particularly since new information is being revealed all the time. Really good players have a good idea what everyone holds anyway. So, reasonably strong Skat and Bridge programs have been based on these ideas.

If you can completely solve the underlying world, that is best, but if you can't, you can use minimax or UCT to choose the best move in each world. There are also hybrid algorithms (ISMCTS) that try to mix this process together. Be careful about the claims here. Simple sampling approaches are easier to code -- you should try the simpler approach before a more complex one.

Here are some research papers that will give some more information on when the sampling approach to imperfect information has worked well:

Understanding the Success of Perfect Information Monte Carlo Sampling in Game Tree Search (This paper analyzes when the sampling approach is likely to work.)

Improving State Evaluation, Inference, and Search in Trick-Based Card Games (This paper describes the use of sampling in Skat)

Imperfect information in a computationally challenging game (This paper describes sampling in Bridge)

Information Set Monte Carlo Tree Search (This paper merges sampling and UCT/Monte Carlo Tree Search to avoid the issues in the first reference.)

The problem with rule-based approaches in the accepted answer is that they can't take advantage of computational resources beyond that required to create the initial rules. Furthermore, rule-based approaches will be limited by the power of the rules that you can write. Search-based approaches can use the power of combinatorial search to produce much stronger play than the author of the program.

0 讨论(0)
发布评论:

提交评论
- 加载中...