Bitstream of variable-length Huffman codes - How to write to file?

小鲜肉 2021-01-22 13:02

I'm working on a Huffman coding/decoding project in C and have a good understanding of how the algorithm should store information about the Huffman tree, re-build the tree duri

2 answers
  • 2021-01-22 13:36

    Here's some pseudo-code to give you the general idea:

    #include <stdio.h>

    static FILE *stream;                   // assumed: output file opened elsewhere with fopen(path, "wb")
    static unsigned char BitBuffer = 0;    // bits collected so far, most significant first
    static unsigned char BitsInBuffer = 0; // number of bits currently held in BitBuffer

    static void WriteBitCharToOutput(char bitChar)
    // buffer one binary digit ('1' or '0'); write a byte as soon as 8 bits are collected
    {
      BitBuffer = (unsigned char)((BitBuffer << 1) | (bitChar == '1' ? 1 : 0));
      BitsInBuffer++;

      if (BitsInBuffer == 8)
      {
        fputc(BitBuffer, stream);
        BitsInBuffer = 0;
        BitBuffer = 0; // just to be tidy
      }
    }

    static void FlushBitBuffer(void)
    // call after last character has been encoded
    // to flush out remaining bits
    {
      while (BitsInBuffer > 0)
      {
        WriteBitCharToOutput('0'); // pad with zeroes
      }
    }
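
    As a usage sketch of the same bit-at-a-time idea, the helper below appends a whole code, given as a string of '0'/'1' characters, to an in-memory byte array instead of a stream. The names BitSink, sink_put_code and sink_flush are hypothetical and not part of the answer; the caller is assumed to supply a large enough array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory sink; the answer above writes to a stream instead. */
typedef struct {
    unsigned char *out;   /* destination array, provided by the caller */
    size_t cap;           /* capacity of out in bytes */
    size_t len;           /* bytes written so far */
    unsigned char acc;    /* bits collected so far, most significant first */
    int nbits;            /* number of valid bits in acc (0..7 between calls) */
} BitSink;

/* Append one Huffman code given as a string such as "1011". */
static void sink_put_code(BitSink *s, const char *code)
{
    for (; *code != '\0'; code++) {
        assert(*code == '0' || *code == '1');
        s->acc = (unsigned char)((s->acc << 1) | (*code == '1'));
        if (++s->nbits == 8) {          /* a full byte is ready */
            assert(s->len < s->cap);
            s->out[s->len++] = s->acc;
            s->acc = 0;
            s->nbits = 0;
        }
    }
}

/* Pad the last partial byte with zero bits; returns how many were added. */
static int sink_flush(BitSink *s)
{
    int pad = 0;
    while (s->nbits != 0) {
        sink_put_code(s, "0");
        pad++;
    }
    return pad;
}
```

    Feeding the codes "101", "0" and "1100" produces the single byte 0xAC. Returning the padding count is a design choice of this sketch: a decoder needs some way to tell padding from data, and storing either that count or the number of encoded symbols in a header is one option.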
    
  • 2021-01-22 13:53

    As an alternative to the other answer, if you want to write several bits at once to your buffer, you can. It could look something like this: (this is meant to be pseudocode, though it looks fairly real)

    uint32_t buffer = 0;
    int bufbits = 0;
    for (int i = 0; i < symbolCount; i++)
    {
        int s = symbols[i];
        buffer <<= lengths[s];  // make room for the bits
        bufbits += lengths[s];  // buffer got longer
        buffer |= values[s];    // put in the bits corresponding to the symbol
    
        while (bufbits >= 8)    // as long as there is at least a byte in the buffer
        {
            bufbits -= 8;       // forget it's there
            writeByte((buffer >> bufbits) & 0xFF); // and save it
        }
    }
    

    Not shown: when you're done encoding, you still have to write out whatever is left in the buffer (up to 7 bits), padded to a full byte.
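
    The leftover bits can be handled with a final flush. The following is a minimal sketch, assuming output goes to a caller-supplied byte array rather than through writeByte; BitWriter, bw_put and bw_flush are hypothetical names, and zero-padding the last byte is one possible convention, not something the answer prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer state; out/cap/len stand in for writeByte(). */
typedef struct {
    uint8_t *out;      /* destination array, provided by the caller */
    size_t cap;        /* capacity of out in bytes */
    size_t len;        /* bytes written so far */
    uint32_t buffer;   /* pending bits live in the low bufbits bits */
    int bufbits;       /* 0..7 between calls */
} BitWriter;

static void bw_write_byte(BitWriter *w, uint8_t b)
{
    assert(w->len < w->cap);
    w->out[w->len++] = b;
}

/* Append the low `length` bits of `value` (1 <= length <= 25). */
static void bw_put(BitWriter *w, uint32_t value, int length)
{
    assert(length >= 1 && length <= 25);
    assert((value >> length) == 0);       /* no stray bits above the code */
    w->buffer = (w->buffer << length) | value;
    w->bufbits += length;
    while (w->bufbits >= 8) {             /* drain every complete byte */
        w->bufbits -= 8;
        bw_write_byte(w, (uint8_t)(w->buffer >> w->bufbits));
    }
}

/* Write the last partial byte, left-aligned and padded with zero bits. */
static void bw_flush(BitWriter *w)
{
    if (w->bufbits > 0) {
        bw_write_byte(w, (uint8_t)(w->buffer << (8 - w->bufbits)));
        w->bufbits = 0;
    }
    w->buffer = 0;
}
```

    Calling bw_put for each symbol and bw_flush once at the end mirrors the loop above. Left-aligning the final bits keeps them in the same most-significant-first order as every other byte, so a decoder can read the stream uniformly.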

    This assumes that the maximum code length is 25 or less: at most 7 bits can be left in the buffer, and 7 + 25 = 32 is the most that fits in a 32-bit integer. This is not a bad limitation; code lengths are usually limited to 15 or 16 anyway, to allow the simplest form of table-based decoding without needing a huge table.
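
    If you want to enforce that limit rather than just assume it, a check over the code-length table before encoding is enough. This is only a sketch: ACC_BITS, MAX_PENDING, MAX_CODE_LEN and lengths_fit are hypothetical names, and the table is assumed to hold one length per symbol (0 for unused symbols).

```c
#include <assert.h>
#include <stdint.h>

enum {
    ACC_BITS = 32,                          /* width of the bit buffer */
    MAX_PENDING = 7,                        /* most bits left after draining bytes */
    MAX_CODE_LEN = ACC_BITS - MAX_PENDING   /* 32 - 7 = 25 */
};

/* Returns 1 if every code length fits the 32-bit buffer scheme, else 0. */
static int lengths_fit(const uint8_t *lengths, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (lengths[i] > MAX_CODE_LEN)
            return 0;                       /* this code could overflow the buffer */
    }
    return 1;
}
```

    Running the check once after building the code table means the encoding loop itself needs no per-symbol bounds test.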
