Not strictly a question, more of a puzzle...
Over the years, I\'ve been involved in a few technical interviews of new employees. Other than asking the standard \"do you
There are 64 possible board positions, so you need 6 bits per position. There are 32 initial pieces, so we have 192 bits total so far, where every 6 bits indicates the position of the given piece. We can pre-determine the order the pieces appear in, so we don't have to say which is which.
What if a piece is off the board? Well, we can place a piece on the same spot as another piece to indicate that it is off the board, since that would be illegal otherwise. But we also don't know whether the first piece will be on the board or not. So we add 5 bits indicating which piece is the first one (32 possibilities = 5 bits to represent the first piece). Then we can use that spot for subsequent pieces that are off the board. That brings us to 197 bits total. There has to be at least one piece on the board, so that will work.
Then we need one bit for whose turn it is - brings us to 198 bits.
What about pawn promotion? We can do it a bad way by adding 3 bits per pawn, adding on 42 bits. But then we can notice that most of the time, pawns aren't promoted.
So, for every pawn that is on the board, the bit '0' indicates it is not promoted. If a pawn is not on the board then we don't need a bit at all. Then we can use variable length bit strings for which promotion he has. The most often it will be a queen, so "10" can mean QUEEN. Then "110" means rook, "1110" means bishop, and "1111" means knight.
The initial state will take 198 + 16 = 214 bits, since all 16 pawns are on the board and unpromoted. An end-game with two promoted pawn-queens might take something like 198 + 4 + 4, meaning 4 pawns alive and not promoted and 2 queen pawns, for 206 bits total. Seems pretty robust!
===
Huffman encoding, as others have pointed out, would be the next step. If you observe a few million games, you'll notice each piece is much more likely to be on certain squares. For example, most of the time, the pawns stay in a straight line, or one to the left / one to the right. The king will usually stick around the home base.
Therefore, devise a Huffman encoding scheme for each separate position. Pawns will likely only take on average 3-4 bits instead of 6. The king should take few bits as well.
Also in this scheme, include "taken" as a possible position. This can very robustly handle castling as well - each rook and king will have an extra "original position, moved" state. You can also encode en passant in the pawns this way - "original position, can en passant".
With enough data this approach should yield really good results.