I'm working on a Huffman coding/decoding project in C and have a good understanding of how the algorithm should store information about the Huffman tree, re-build the tree during decoding, and decode the data. What I'm less sure about is how to actually write the encoded bits out to a file, since a file can only be written a byte at a time.
Here's some pseudo-code to give you the general idea:
#include <stdio.h>

static unsigned char BitBuffer = 0;
static int BitsInBuffer = 0;
static FILE *stream; /* output file, assumed to be opened elsewhere */

/* Buffer one binary digit ('1' or '0'). A full byte is written out
   lazily: when the 9th bit arrives, the pending byte is flushed first. */
static void WriteBitCharToOutput(char bitChar)
{
    if (BitsInBuffer > 7)
    {
        fputc(BitBuffer, stream);
        BitsInBuffer = 0;
        BitBuffer = 0; /* just to be tidy */
    }
    BitBuffer = (unsigned char)((BitBuffer << 1) | (bitChar == '1' ? 1 : 0));
    BitsInBuffer++;
}

/* Call after the last character has been encoded to flush out the
   remaining bits. Padding with zeroes continues until the lazy write
   above has actually emitted the final byte. */
static void FlushBitBuffer(void)
{
    if (BitsInBuffer > 0)
        do
        {
            WriteBitCharToOutput('0'); /* pad with zeroes */
        } while (BitsInBuffer != 1);
}
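For completeness, here's how an encoding loop might drive those two helpers. This is a hypothetical sketch, not part of the code above: codes is an assumed table mapping each input byte to its Huffman code stored as a string of '0'/'1' characters.

/* Hypothetical driver: codes[c] is assumed to hold the Huffman code for
   byte c as a '\0'-terminated string of '0' and '1' characters. */
static void EncodeBuffer(const unsigned char *data, size_t len,
                         const char *const codes[256])
{
    for (size_t i = 0; i < len; i++)
        for (const char *p = codes[data[i]]; *p != '\0'; p++)
            WriteBitCharToOutput(*p);
    FlushBitBuffer(); /* emit the final, zero-padded byte */
}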
As an alternative to the other answer, if you want to write several bits at once to your buffer, you can. It could look something like this (this is meant as pseudocode, though it looks fairly real):
uint32_t buffer = 0; // bit accumulator; the most recently added bits sit at the low end
int bufbits = 0;     // number of valid bits currently in the accumulator

for (int i = 0; i < symbolCount; i++)
{
    int s = symbols[i];        // next symbol to encode
    buffer <<= lengths[s];     // make room for the bits
    bufbits += lengths[s];     // buffer got longer
    buffer |= values[s];       // put in the bits corresponding to the symbol

    while (bufbits >= 8)       // as long as there is at least a byte in the buffer
    {
        bufbits -= 8;                          // forget it's there
        writeByte((buffer >> bufbits) & 0xFF); // and save it
    }
}
Not shown above: you still have to save anything left over in the buffer when you're done writing to it; a sketch of that final flush follows.
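A minimal sketch of that flush, assuming the same buffer/bufbits pair and writeByte as above. The leftover bits are left-aligned in one final byte and padded with zeroes, so the decoder needs a symbol count or an end-of-stream code to know where the real data ends.

if (bufbits > 0) // fewer than 8 bits remain; pad them out to a full byte
{
    writeByte((buffer << (8 - bufbits)) & 0xFF); // left-align, zero-pad the low bits
    bufbits = 0;
}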
This assumes that the maximum code length is 25 or less: up to 7 bits can be left over in the buffer before a code is appended, and 7 + 25 = 32 is the longest thing that fits in a 32-bit integer. That is not a bad limitation; usually the code length is limited to 15 or 16 anyway, which allows the simplest form of table-based decoding without needing a huge table.
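To illustrate that last point, here is a sketch of that simplest table-based decoder. Everything in it is an assumption for illustration rather than part of the code above: MAXBITS is the agreed code-length limit, the table is prefilled so that every MAXBITS-bit index whose leading bits match a code maps to that code's symbol and length, and peekBits/dropBits are hypothetical helpers for inspecting and consuming input bits. With MAXBITS = 15 the table has 2^15 = 32768 entries; allowing 25-bit codes would blow that up to 2^25.

#include <stdint.h>

#define MAXBITS 15 // agreed upper bound on code length

typedef struct
{
    uint8_t len;    // length in bits of the code this entry matches
    uint8_t symbol; // the decoded symbol
} Entry;

static Entry table[1 << MAXBITS]; // one entry per possible MAXBITS-bit prefix

extern unsigned peekBits(int n); // hypothetical: return next n input bits without consuming
extern void dropBits(int n);     // hypothetical: consume n input bits

// The table is filled once, after the code lengths are known: for each
// symbol s, every index whose top lengths[s] bits equal values[s] gets
// the entry { lengths[s], s }.
static int decodeSymbol(void)
{
    Entry e = table[peekBits(MAXBITS)]; // next MAXBITS bits index the table
    dropBits(e.len);                    // consume only the bits of the matched code
    return e.symbol;
}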