Question
I am looking at solutions to the set bit count problem (given a binary number, how to efficiently count how many bits are set).
Here, http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive, I have found some methods.
What about the lookup table method? I don't understand what properties of the binary representation of a number make it work.
static const unsigned char BitsSetTable256[256] =
{
#   define B2(n) n,     n+1,     n+1,     n+2
#   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
    B6(0), B6(1), B6(1), B6(2)
};

unsigned int v; // count the number of bits set in 32-bit value v
unsigned int c; // c is the total bits set in v

// Option 1:
c = BitsSetTable256[v & 0xff] +
    BitsSetTable256[(v >> 8) & 0xff] +
    BitsSetTable256[(v >> 16) & 0xff] +
    BitsSetTable256[v >> 24];

// Option 2:
unsigned char * p = (unsigned char *) &v;
c = BitsSetTable256[p[0]] +
    BitsSetTable256[p[1]] +
    BitsSetTable256[p[2]] +
    BitsSetTable256[p[3]];

// To initially generate the table algorithmically:
BitsSetTable256[0] = 0;
for (int i = 0; i < 256; i++)
{
    BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}
In particular, I don't understand the BitsSetTable256 definition at first glance. Why define these quantities B2, B4, ...? It seems to me that they are not used afterwards.
Could you point me to further documentation on binary representation?
Thanks!
Answer 1:
The definitions build the table by a pattern. They are recursive macros: B6 uses B4, and B4 uses B2. B6(0) will get expanded into:
B4(0), B4(1), B4(1), B4(2)
B4(0) will get broken into:
0, 1, 1, 2
The first few numbers of the sequence will be:
// 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3
As you can see, these are the number of bits set for each index in the table.
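If you want to convince yourself of that, here is a small self-contained check (my own sketch, not part of the original answer) that compares every entry of the macro-built table against a naive shift-and-test count:

#include <stdio.h>

static const unsigned char BitsSetTable256[256] =
{
#   define B2(n) n,     n+1,     n+1,     n+2
#   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
    B6(0), B6(1), B6(1), B6(2)
};

int main(void)
{
    for (int i = 0; i < 256; i++)
    {
        // Count the bits of i the naive way: test each bit in turn.
        int naive = 0;
        for (int b = i; b != 0; b >>= 1)
            naive += b & 1;

        if (naive != BitsSetTable256[i])
            printf("mismatch at index %d\n", i);
    }
    printf("done\n");
    return 0;
}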
The rest of the algorithm breaks the 32-bit number into four 8-bit chunks and sums the number of bits set in each chunk. That's what these lines do (you use either Option 1 or Option 2, whichever you prefer, not both):
// Option 1:
c = BitsSetTable256[v & 0xff] +
    BitsSetTable256[(v >> 8) & 0xff] +
    BitsSetTable256[(v >> 16) & 0xff] +
    BitsSetTable256[v >> 24];

// Option 2:
unsigned char * p = (unsigned char *) &v;
c = BitsSetTable256[p[0]] +
    BitsSetTable256[p[1]] +
    BitsSetTable256[p[2]] +
    BitsSetTable256[p[3]];
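As a concrete example (my own addition, not from the original answer), take v = 0xF0F01234 and trace the four lookups of Option 1:

// v = 0xF0F01234
//  v         & 0xff = 0x34 -> BitsSetTable256[0x34] = 3   (0011 0100)
//  (v >>  8) & 0xff = 0x12 -> BitsSetTable256[0x12] = 2   (0001 0010)
//  (v >> 16) & 0xff = 0xF0 -> BitsSetTable256[0xF0] = 4   (1111 0000)
//   v >> 24         = 0xF0 -> BitsSetTable256[0xF0] = 4   (1111 0000)
// c = 3 + 2 + 4 + 4 = 13, i.e. 0xF0F01234 has 13 bits set.

Option 2 reads the same four bytes through a pointer instead of shifting; the order in which the bytes are visited doesn't matter because the counts are simply added up.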
The code at the bottom:

// To initially generate the table algorithmically:
BitsSetTable256[0] = 0;
for (int i = 0; i < 256; i++)
{
    BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}

is a different way of generating BitsSetTable256: it builds the table at runtime instead of at compile time (which is what the macro definitions do).
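To see why that recurrence is correct, note that i / 2 is just i with its lowest bit shifted off, so bits(i) = (i & 1) + bits(i / 2), and bits(i / 2) has already been filled in by the time the loop reaches i. A worked trace (my own addition) for i = 13:

// i = 13 (binary 1101)
// BitsSetTable256[13] = (13 & 1) + BitsSetTable256[6]   // 1 + bits(110)
// BitsSetTable256[6]  = ( 6 & 1) + BitsSetTable256[3]   // 0 + bits(11)
// BitsSetTable256[3]  = ( 3 & 1) + BitsSetTable256[1]   // 1 + bits(1)
// BitsSetTable256[1]  = ( 1 & 1) + BitsSetTable256[0]   // 1 + 0
// => BitsSetTable256[13] = 3, and 1101 indeed has three set bits.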
P.S. If you're targeting a recent enough x86 (SSE4.2), you can use the POPCNT instruction.
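With GCC or Clang that could look like the sketch below (the wrapper name count_bits_hw is mine; build with -msse4.2 or -mpopcnt so the intrinsic maps to the instruction):

#include <nmmintrin.h>  // SSE4.2 intrinsics, including _mm_popcnt_u32

unsigned int count_bits_hw(unsigned int v)
{
    // Compiles to a single POPCNT instruction on CPUs that support it.
    return (unsigned int)_mm_popcnt_u32(v);
}

GCC and Clang also offer __builtin_popcount(v), which falls back to a software routine when POPCNT is unavailable.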
Source: https://stackoverflow.com/questions/12684805/hint-for-lookup-table-set-bit-count-algorithm