Most people have been encoding the board state, but what about the moves themselves? Here's a bit-encoding description.
Bits per piece:
- Piece-ID: Max 4 bits to identify the 16 pieces per side. White/black can be inferred, since the sides alternate moves. Have an ordering defined on the pieces. As the number of remaining pieces drops to a power of two or below (8, 4, 2), use fewer bits to identify them.
- Pawn: 3 possibilities on the first move, so +2 bits (forward by one or two squares, en passant). Subsequent moves do not allow moving forward by two, so +1 bit is sufficient. Promotion can be inferred in the decoding process by noting when the pawn has reached the last rank. When that happens, the decoder expects another 2 bits indicating which of the 4 pieces (queen, rook, bishop, knight) the pawn has been promoted to.
- Bishop: +1 bit for the diagonal used, up to +4 bits for distance along the diagonal (16 possibilities). The decoder can infer the max possible distance that the piece can move along that diagonal, so if it's a shorter diagonal, use fewer bits.
- Knight: 8 possible moves, +3 bits
- Rook: +1 bit for horizontal / vertical, +4 bits for distance along the line.
- King: 8 possible moves, +3 bits. Indicate castling with an 'impossible' move: since castling is only possible while the king is on its own back rank, encode it as an instruction to move the king 'backwards', i.e. off the board. Three of the 8 directions point off the board from there, which is enough to tell kingside from queenside castling.
- Queen: 8 possible directions, +3 bits. Up to +4 more bits for distance along the line / diagonal (fewer if the diagonal is shorter, as in the bishop's case).
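As a rough illustration of the variable-width piece ID combined with the 3-bit knight move, here is a minimal Python sketch. The ordering in `KNIGHT_OFFSETS`, the function names, and the use of bit strings are all assumptions made for the example, not part of the scheme itself.

```python
# Hypothetical ordering of the 8 knight jumps as (file delta, rank delta).
KNIGHT_OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2),
                  (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def bits_needed(n_options):
    """Bits required to distinguish n_options alternatives (0 if only one)."""
    return max(n_options - 1, 0).bit_length()


def encode_knight_move(piece_index, pieces_left, offset):
    """Return a bit string: variable-width piece ID, then a 3-bit move code."""
    id_bits = bits_needed(pieces_left)
    id_field = format(piece_index, "b").zfill(id_bits) if id_bits else ""
    move_field = format(KNIGHT_OFFSETS.index(offset), "03b")
    return id_field + move_field


def decode_knight_move(bits, pieces_left):
    """Inverse of encode_knight_move; the decoder knows pieces_left itself."""
    id_bits = bits_needed(pieces_left)
    piece_index = int(bits[:id_bits], 2) if id_bits else 0
    offset = KNIGHT_OFFSETS[int(bits[id_bits:id_bits + 3], 2)]
    return piece_index, offset
```

With all 16 pieces left, a knight move takes 4 + 3 = 7 bits; once the side is down to 8 pieces, the same kind of move takes 3 + 3 = 6 bits, because the decoder can work out the ID width from the number of pieces remaining.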
Assuming all pieces are on the board, these are the bits per move. Pawn: 6 bits on the first move, 5 subsequently, 7 when promoting. Bishop: 9 (max). Knight: 7. Rook: 9. King: 7. Queen: 11 (max).
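The totals can be checked by adding the 4-bit piece ID to each move field. The sketch below does only that arithmetic; the dictionary keys and the function name are made up for the example, and the widths are the worst-case figures from the list.

```python
PIECE_ID_BITS = 4  # all 16 pieces of the moving side are still on the board

# Worst-case move-field widths, taken from the per-piece list.
MOVE_BITS = {
    "pawn (first move)": 2,
    "pawn (later move)": 1,
    "pawn (promoting)": 1 + 2,  # move bit + 2 bits for the promotion piece
    "bishop": 1 + 4,            # diagonal + distance
    "knight": 3,
    "rook": 1 + 4,              # horizontal/vertical + distance
    "king": 3,
    "queen": 3 + 4,             # direction + distance
}


def worst_case_bits(piece):
    """Piece ID plus move field, in bits."""
    return PIECE_ID_BITS + MOVE_BITS[piece]


for piece in MOVE_BITS:
    print(f"{piece}: {worst_case_bits(piece)} bits")
```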