Programmer Puzzle: Encoding a chess board state throughout a game

前端 未结 30 1529
闹比i
闹比i 2021-01-29 17:16

Not strictly a question, more of a puzzle...

Over the years, I\'ve been involved in a few technical interviews of new employees. Other than asking the standard \"do you

相关标签:
30条回答
  • 2021-01-29 17:49

    The position on a board can be defined in 7 bits (0-63 , and 1 value specifying it is not on the board anymore). So for every piece on the board specify where it is located.

    32 pieces * 7 bits = 224 bits

    EDIT: as Cadrian pointed out... we also have the 'promoted pawn to queen' case. I suggest we add extra bits at the end to indicate which pawn have been promoted.

    So for every pawn which has been promoted we follow the 224 bits with 5 bits which indicate the index of the pawn which has been promoted, and 11111 if it is the end of the list.

    So minimal case (no promotions) is 224 bits + 5 (no promotions). For every promoted pawn add 5 bits.

    EDIT: As shaggy frog points out, we need yet another bit at the end to indicate whose turn it is ;^)

    0 讨论(0)
  • 2021-01-29 17:50

    Attacking a subproblem of encoding the steps after an initial position has been encoded. The approach is to create a "linked list" of steps.

    Each step in the game is encoded as the "old position->new position" pair. You know the initial position in the beginning of the chess game; by traversing the linked list of steps, you can get to the state after X moves.

    For encoding each step, you need 64 values to encode the starting position (6 bits for 64 squares on the board - 8x8 squares), and 6 bits for the end position. 16 bits for 1 move of each side.

    Amount of space that encoding a given game would take is then proportionate to the number of moves:

    10 x (number of white moves + number of black moves) bits.

    UPDATE: potential complication with promoted pawns. Need to be able to state what the pawn is promoted to - may need special bits (would use gray code for this to save space, as pawn promotion is extremely rare).

    UPDATE 2: You don't have to encode the end position's full coordinates. In most cases, the piece that's being moved can move to no more than X places. For example, a pawn can have a maximum of 3 move options at any given point. By realizing that maximum number of moves for each piece type, we can save bits on the encoding of the "destination".

    Pawn: 
       - 2 options for movement (e2e3 or e2e4) + 2 options for taking = 4 options to encode
       - 12 options for promotions - 4 promotions (knight, biship, rook, queen) times 3 squares (because you can take a piece on the last row and promote the pawn at the same time)
       - Total of 16 options, 4 bits
    Knight: 8 options, 3 bits
    Bishop: 4 bits
    Rook: 4 bits
    King: 3 bits
    Queen: 5 bits
    

    So the spatial complexity per move of black or white becomes

    6 bits for the initial position + (variable number of bits based upon the type of the thing that's moved).

    0 讨论(0)
  • 2021-01-29 17:52

    It's best just to store chess games in a human-readable, standard format.

    The Portable Game Notation assumes a standard starting position (although it doesn't have to) and just lists the moves, turn by turn. A compact, human-readable, standard format.

    E.g.

    [Event "F/S Return Match"]
    [Site "Belgrade, Serbia Yugoslavia|JUG"]
    [Date "1992.11.04"]
    [Round "29"]
    [White "Fischer, Robert J."]
    [Black "Spassky, Boris V."]
    [Result "1/2-1/2"]
    
    1. e4 e5 2. Nf3 Nc6 3. Bb5 {This opening is called the Ruy Lopez.} 3... a6
    4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 7. Bb3 d6 8. c3 O-O 9. h3 Nb8  10. d4 Nbd7
    11. c4 c6 12. cxb5 axb5 13. Nc3 Bb7 14. Bg5 b4 15. Nb1 h6 16. Bh4 c5 17. dxe5
    Nxe4 18. Bxe7 Qxe7 19. exd6 Qf6 20. Nbd2 Nxd6 21. Nc4 Nxc4 22. Bxc4 Nb6
    23. Ne5 Rae8 24. Bxf7+ Rxf7 25. Nxf7 Rxe1+ 26. Qxe1 Kxf7 27. Qe3 Qg5 28. Qxg5
    hxg5 29. b3 Ke6 30. a3 Kd6 31. axb4 cxb4 32. Ra5 Nd5 33. f3 Bc8 34. Kf2 Bf5
    35. Ra7 g6 36. Ra6+ Kc5 37. Ke1 Nf4 38. g3 Nxh3 39. Kd2 Kb5 40. Rd6 Kc5 41. Ra6
    Nf2 42. g4 Bd3 43. Re6 1/2-1/2
    

    If you want to make it smaller, then just zip it. Job done!

    0 讨论(0)
  • 2021-01-29 17:52

    [edited after reading the question properly] If you assume every legal position can be reached from the initial position (which is a possible definition of "legal"), then any position can be expressed as the sequence of moves from the start. A snippet of play starting from a non-standard position can be expressed as the sequence of moves needed to reach the start, a switch to turn the camera on, followed by subsequent moves.

    So let's call the initial board state the single bit "0".

    Moves from any position can be enumerated by numbering the squares and ordering the moves by (start, end), with the conventional 2 square jump indicating castling. There is no need to encode illegal moves, because the board position and the rules are always already known. The flag to turn the camera on could either be expressed as a special in-band move, or more sensibly as an out-of-band move number.

    There are 24 opening moves for either side, which can fit in 5 bits each. Subsequent moves might require more or less bits, but the legal moves are always enumerable, so the width of each move can happily grow or expand. I have not calculated, but I imagine 7 bit positions would be rare.

    Using this system, a 100 half-move game could be encoded in approximately 500 bits. However, it might be wise to use an opening book. Suppose it contains a million sequences. Let then, an initial 0 indicate a start from the standard board, and a 1 followed by a 20 bit number indicate a start from that opening sequence. Games with somewhat conventional openings might be shortened by say 20 half-moves, or 100 bits.

    This is not the greatest possible compression, but (without the opening book) it would be very easy to implement if you already have a chess model, which the question assumes.

    To further compress, you'd want to order the moves according to likelihood rather than in an arbitrary order, and encode the likely sequences in fewer bits (using e.g. Huffman tokens as people have mentioned).

    0 讨论(0)
  • 2021-01-29 17:52

    cletus' answer is good, but he forgot to also encode whose turn it is. It is part of the current state, and is necessary if you're using that state to drive a search algorithm (like an alpha-beta derivative).

    I'm not a chess player, but I believe there's also one more corner case: how many moves have been repeated. Once each player makes the same move three times, the game is a draw, no? If so, then you'd need to save that information in the state because after the third repeat, the state is now terminal.

    0 讨论(0)
  • 2021-01-29 17:52

    Algorithm should deterministically enumerate all possible destinations at each move. Number of destinations:

    • 2 bishops, 13 destinations each = 26
    • 2 rooks, 14 destinations each = 28
    • 2 knights, 8 destinations each = 16
    • queen, 27 destinations
    • king , 8 destinations

    8 paws could all become queens in worst (enumeration-wise) case, thus making largest number of possible destinations 9*27+26+28+16+8=321. Thus, all destinations for any move can be enumerated by a 9 bit number.

    Maximum number of moves of both parties is 100 (if i'm not wrong, not a chess player). Thus any game could be recorded in 900 bits. Plus initial position each piece can be recorded using 6 bit numbers, which totals to 32*6 =192 bits. Plus one bit for "who moves first" record. Thus, any game can be recorded using 900+192+1=1093 bits.

    0 讨论(0)
提交回复
热议问题