I need C code to return the number of 1\'s in an unsigned char in C. I need an explanation as to why it works if it\'s not obvious. I\'ve found a lot of code for a 32-bit nu
base on Ephemient's post, we have the no branched 8 bits version. It is in hexadecimal expression.
typedef unsigned char UINT8;
typedef unsigned short UINT16;
typedef unsigned long long UINT64;
int hammingWeight8( const UINT8& c)
{
return ( c* 0x8040201ULL & 0x11111111)%0xF;
}
Apply it twice, we have a 16bits version, which needs 9 operations.
int hammingWeight16( const UINT16& c)
{
return ((c & 0xFF)* 0x8040201ULL & 0x11111111)%0xF +
((c >> 8)* 0x8040201ULL & 0x11111111)%0xF;
}
Here I write a variant 16bits version which needs 64bits registers and 11 operations. It seems not better than the previous one, but it just uses 1 modulo operation.
int hammingWeight16( const UINT16& c)
{
UINT64 w;
w= (((( c* 0x8000400020001ULL)>> 3) & 0x1111111111111111)+14)%0xF;
return (c!=0)*(w+1+(c==0xFFFF)*15);
}
HACKMEM has this algorithm in 3 operations (roughly translated to C):
bits = (c * 01001001001ULL & 042104210421ULL) % 017;
(ULL
is to force 64-bit arithmetic. It's needed, just barely... this calculation requires 33-bit integers.)
Actually, you can replace the second constant with 042104210021ULL
, since you're only counting 8 bits, but it doesn't look as nicely symmetrical.
How does this work? Think of c
bit-wise, and remember that (a + b) % c = (a % c + b % c) % c
, and (a | b) == a + b
iff (a & b) == 0
.
(c * 01001001001ULL & 042104210421ULL) % 017
01 01001001001 01 1
02 02002002002 02000000000 1
04 04004004004 04000000 1
010 010010010010 010000 1
020 020020020020 020 1
040 040040040040 040000000000 1 # 040000000000 == 2 ** 32
0100 0100100100100 0100000000 1
0200 0200200200200 0200000 1
If you don't have 64-bit arithmetic available, you can split c
up into nibbles and do each half, taking 9 operations. This only requires 13 bits, so using 16- or 32-bit arithmetic will work.
bits = ((c & 017) * 0421 & 0111) % 7 + ((c >> 4) * 0421 & 0111) % 7;
(c * 0421 & 01111) % 7
1 0421 01 1
2 01042 01000 1
4 02104 0100 1
8 04210 010 1
For example, if c == 105 == 0b11001001
,
c == 0100
| 040
| 010
| 01 == 0151
* 01001001001001ULL == 0100100100100
| 040040040040
| 010010010010
| 01001001001 == 0151151151151
& 0421042104210421ULL == 0100000000
| 04000000000
| 010000
| 01 == 04100010001
% 017 == 4
c & 017 == 8 | 1 == 011
011 * 0421 == 8 * 0421 | 1 * 0421 == 04210 | 0421 == 04631
04631 & 0111 == 04210 & 0111 | 0421 & 0111 == 010 | 01 == 011
011 % 7 == 2
c >> 4 == 4 | 2 == 06
06 * 0421 == 4 * 0421 | 2 * 0421 == 02104 | 01042 == 03146
03146 & 0111 == 02104 & 0111 | 01042 & 0111 == 0100 | 01000 == 01100
01100 % 7 == 2
2 + 2 == 4
an unsigned char is a "number" in just the same way that a 32-bit float or integer is a "number", what the compiler deems them to represent is what changes.
if you picture a char as its bits:
01010011 (8 bits);
you can count the set bits by doing the following:
take the value, lets say x, and take x % 2, the remainder will be either 1 or 0. that is, depending on the endianness of the char, the left or right most bit. accumulate the remainder in a separate variable (this will be the resulting number of set bits).
then >> (right shift) 1 bit.
repeat until 8 bits have been shifted.
the c code should be pretty simple to implement from my pseudocode, but basically
public static int CountSetBits(char c)
{
int x = 0;
int setBits = 0;
while (x < 7)
{
setBits = setBits + c % 2;
c = c >> 1;
x = x + 1;
}
}
As already answered, the standard ways of counting bits also work on unsigned chars.
Example:
unsigned char value = 91;
int bitCount = 0;
while(value > 0)
{
if ( value & 1 == 1 )
bitCount++;
value >>= 1;
}
For a integer as small as an unsigned char you get best performance using a small lookup-table.
I know what population-count algorithms you're mentioning. They work by doing arithmetic of multiple words smaller than an integer stored in a register.
This technique is called SWAR (http://en.wikipedia.org/wiki/SWAR).
For more information I suggest you check out the hackers delight website: www.hackersdelight.org. He has example code and written a book that explains these tricks in detail.
The same code will work for an unsigned char. Loop over all bits testing them. See this.