I want to multiply a 8x8 binary matrix represented as a unsigned 64 bit integer by a 8 bit vector represented by a unsigned char. However, due to some other issues
With this matrix and vector representation, it helps to do matrix multiplication this way:
(col1 ... col8) * (v1 ... v8)T = col1 * v1 + ... + col8 * v8
where matrix A = (col1 ... col8)
and column vector v = (v1 ... v8)T
Thinking this further, you can do all multiplications at once if you inflate the 8-bit vector to a 64-bit vector by repeating every bit 8 times and then calculating P = A & v_inflated
. The only thing left then, is the addition (i.e. XOR) of the products.
A simple approach for XORing the products is.
uint64_t P = calculated products from text above;
uint64_t sum = 0;
for( int i = 8; i; --i )
{
sum ^= P & 0xFF;
P >> 8;
}
You ONLY HAVE 256 vectors! Use lookup tables to generate the right bitmasks, then your logic will be something like
output_bit_n = bool (matrix [n] & lookup [vector])
In other words, your lookup table can transpose an 8-bit value into the 64-bit world.
You can efficiently pack this into the result with rotate-with-carry instructions if the compiler isn't smart enough to optimise (value<<=1)|=result
.