My task is to check (more than a trillion checks) whether two ints contain any of the predefined pairs of nibbles (first pair 0x2 0x7; second 0xd 0x8). For example:
bit offset:
A table-based approach could be:
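#include <stdint.h>

/* Assumed declaration: bit table of 65536 bits (8 KiB). Bit v is set when
   the 16-bit value v contains a zero nibble. Fill it before first use. */
static uint8_t ztmap[65536 / 8];

static inline int has_zeros(uint32_t X)
{
    int H = (X >> 16);    /* high 16 bits */
    int L = X & 0xFFFF;   /* low 16 bits */
    return (ztmap[H >> 3] & (1 << (H & 7))) ||
           (ztmap[L >> 3] & (1 << (L & 7)));
}

/* GCC rejects an attribute placed after the declarator of a function
   definition, so it goes before the return type here. */
static inline __attribute__((always_inline))
int nibble_check(uint32_t A, uint32_t B)
{
    return has_zeros((A ^ 0xDDDDDDDDU) | (B ^ 0x88888888U)) ||
           has_zeros((A ^ 0x22222222U) | (B ^ 0x77777777U));
}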
One idea is to precompute a map of 65536 entries that tells whether a 16-bit number contains the nibble 0000. I used a bit table in my example, but maybe a byte table could be faster, even if it is bigger and less cache-friendly.
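The code above never shows how the table is filled. A minimal sketch of one way to do it, assuming the same bit layout that has_zeros reads (byte index v >> 3, bit index v & 7); the helper names has_zero_nibble16, build_ztmap, ztmap_lookup and check_ztmap are made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Bit table: bit v is set when the 16-bit value v has a zero nibble. */
static uint8_t ztmap[65536 / 8];

/* Slow reference test, used only while building the table. */
static int has_zero_nibble16(uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        if (((v >> (4 * i)) & 0xF) == 0)
            return 1;
    }
    return 0;
}

/* Fill the table once at startup. */
static void build_ztmap(void)
{
    for (uint32_t v = 0; v < 65536; v++) {
        if (has_zero_nibble16(v))
            ztmap[v >> 3] |= (uint8_t)(1u << (v & 7));
    }
}

/* Table lookup for one 16-bit value; returns 0 or 1. */
static int ztmap_lookup(uint32_t v)
{
    return (ztmap[v >> 3] >> (v & 7)) & 1;
}

/* Spot checks; call after build_ztmap(). */
static void check_ztmap(void)
{
    assert(ztmap_lookup(0x1204));   /* one nibble is 0 */
    assert(!ztmap_lookup(0x1234));  /* no zero nibble */
}
```

A byte table would replace the shift and mask in ztmap_lookup with a plain ztmap[v] read, at the cost of 64 KiB instead of 8 KiB. Which one wins is something to measure on the target machine.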
Once you have the table check, you can XOR the first 32-bit integer with the first nibble repeated eight times, and the second integer with the second nibble repeated eight times. Wherever the first integer holds the first nibble, the XOR result has a zero nibble, and the same happens in the second integer for the second nibble. After OR-ing the two results, a zero nibble can only appear at a position where both XOR results are zero, i.e. only if the pair being searched for is present at the same position in both integers.
The search is then completed by repeating it for the other pair of nibble values.
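To make the XOR/OR step concrete, here is a small sketch that uses a plain loop as a stand-in for the table lookup, so it can be used to cross-check the fast version. The names has_zero_nibble32, pair_present and nibble_check_ref are hypothetical. It assumes, as the code above does, that the two nibbles of a pair must sit at the same nibble position in both integers.

```c
#include <stdint.h>

/* Stand-in for the table lookup: 1 if any of the 8 nibbles of x is zero. */
static int has_zero_nibble32(uint32_t x)
{
    for (int i = 0; i < 8; i++) {
        if (((x >> (4 * i)) & 0xF) == 0)
            return 1;
    }
    return 0;
}

/* 1 if, at some nibble position, a holds n1 and b holds n2. */
static int pair_present(uint32_t a, uint32_t b, uint32_t n1, uint32_t n2)
{
    uint32_t m1 = n1 * 0x11111111U;  /* n1 repeated in every nibble */
    uint32_t m2 = n2 * 0x11111111U;  /* n2 repeated in every nibble */
    return has_zero_nibble32((a ^ m1) | (b ^ m2));
}

/* Both predefined pairs from the question: (0xD, 0x8) and (0x2, 0x7). */
static int nibble_check_ref(uint32_t a, uint32_t b)
{
    return pair_present(a, b, 0xD, 0x8) || pair_present(a, b, 0x2, 0x7);
}
```

For example, with A = 0x00D00000 and B = 0x00800000, A ^ 0xDDDDDDDD is 0xDD0DDDDD and B ^ 0x88888888 is 0x88088888. Their OR is 0xDD0DDDDD, which has a zero nibble, so the pair is reported. If B is 0x08000000 instead, the 0x8 sits one nibble higher than the 0xD and no zero nibble survives the OR.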
Note, however, that for a king-king attack in a regular chess game (i.e. where only two kings are present), in my opinion a check using coordinates could be a lot faster than this.