Reading and writing bit by bit in C++ for Huffman Encoding

Submitted by 一曲冷凌霜 on 2019-12-25 03:44:47


I'm trying to implement Huffman encoding and decoding in C++. I can read and write the files, but the decompressed output is scrambled, so either my encoding or my decoding is wrong. I suspect the problem is in how I write and read the bits. This is what I use to write the encoded file. First I concatenate all the bit codes from my unordered map, uMap, into one string, then I pack that string into bytes:

int i = 0, j = 0;
string fullStr = "";
for (i = 0; i < buffsize; i++) // put all codes in one string of 1's and 0's
    fullStr += uMap[buffer[i]];
unsigned char byte = 0;
i = 0;
for (j = 0; j < fullStr.length(); j++)
{
    if (i != 8)
    {
        byte |= (fullStr[j] == '1') << i; // make up one byte
        i++;
    }
    else
    {
        outf.put(byte); // write one byte at a time
        byte = 0;
        i = 0;
    }
}
if (i != 0 && i < 8)
{
    while (i < 8)
    {
        byte |= 0 << i; // finish up last byte if not finished
        i++;
    }
    outf.put(byte);
}

Then on the decompress side:

int i = 0;
unsigned char byte = 0;
bitset<8> setByte;
ofstream outf(filename, ofstream::binary);
string concat = "";
string bitStr = "";
for (i = 0; i < buffLength; i++)
{
    setByte = buffer[i];
    bitStr = setByte.to_string();
    for (int j = 0; j < 8; j++)
    {
        concat += bitStr[j];
        if (uMap[concat])
        {
            //cout << "found code " << concat << " " << uMap[concat] << endl;
            concat = "";
        }
    }
}


The bits are being unpacked in the reverse order to the packing, perhaps because you use a different method for each. The first bit packed goes into bit 0 of the byte (bitvalue << 0). The first bit unpacked comes from bit 7, because setByte.to_string() creates a string with bit 7 at index 0. The question was tagged as Huffman coding, but the problem has nothing to do with Huffman coding; it is about bitstream manipulation.
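The mismatch can be reproduced in isolation. The following is a minimal sketch, not code from the question: the function names packLsbFirst, unpackMsbFirst, and unpackLsbFirst are hypothetical, and the bits are assumed to arrive as a string of '0' and '1' characters, as they do in fullStr.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Pack up to 8 characters of '0'/'1' into one byte, first character into
// bit 0 (the same order as `byte |= bit << i` in the question).
unsigned char packLsbFirst(const std::string& bits) {
    assert(bits.size() <= 8);
    unsigned char byte = 0;
    for (std::size_t i = 0; i < bits.size(); ++i)
        if (bits[i] == '1')
            byte |= static_cast<unsigned char>(1u << i);
    return byte;
}

// Unpack the way the question's decoder does: std::bitset::to_string()
// returns the most significant bit (bit 7) at index 0.
std::string unpackMsbFirst(unsigned char byte) {
    return std::bitset<8>(byte).to_string();
}

// Unpack in the same order as packLsbFirst: read bit 0 first.
std::string unpackLsbFirst(unsigned char byte) {
    std::string bits;
    for (int i = 0; i < 8; ++i)
        bits += ((byte >> i) & 1) ? '1' : '0';
    return bits;
}
```

Packing "11010000" with packLsbFirst gives the byte 0x0B. unpackMsbFirst turns that byte into "00001011", the mirror image of the input, while unpackLsbFirst returns the original "11010000". Either side can be changed, as long as both use the same bit order.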


Thank you, Weather Vane. You were correct that the bits were being reversed on read, so I took care of that, but it turns out the way I was writing them was wrong too.

New compress:

int i = 0, j = 0;
string fullStr = "";
for (i = 0; i < buffsize; i++) // put all codes in one string
    fullStr += uMap[buffer[i]];
for (i = 0; i < fullStr.length(); i += 8)
{
    unsigned char byte = 0;
    // substr takes (position, count) and clamps at the end of the string,
    // so the last chunk may be shorter than 8 characters
    string str8 = fullStr.substr(i, 8);
    for (unsigned b = 0; b != 8; ++b)
    {
        if (b < str8.length())
            byte |= (str8[b] & 1) << b; // this line was wrong before
        else
            byte |= 1 << b; // pad the unused bits of the last byte with 1s
    }
    outf.put(byte); // write one byte per 8 bits
}
int filelen = outf.tellp();
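The packing step can also be written without the intermediate 8-character substrings. This is only a sketch under assumptions: packBits is a hypothetical helper, it returns the bytes in a vector instead of writing to outf, and it pads the last byte with 0 bits rather than the 1 bits used above. The padding value does not matter as long as the decoder stops after the original number of symbols.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pack a string of '0'/'1' characters into bytes, least significant bit
// first: bit n of the stream lands in byte n / 8 at bit position n % 8.
// Unused bits in the last byte stay 0.
std::vector<unsigned char> packBits(const std::string& bits) {
    std::vector<unsigned char> out((bits.size() + 7) / 8, 0);
    for (std::size_t n = 0; n < bits.size(); ++n) {
        assert(bits[n] == '0' || bits[n] == '1');
        if (bits[n] == '1')
            out[n / 8] |= static_cast<unsigned char>(1u << (n % 8));
    }
    return out;
}
```

For example, the nine bits "110100001" pack into two bytes, 0x0B and 0x01. The resulting vector could then be written with a single call such as outf.write(reinterpret_cast<const char*>(bytes.data()), bytes.size()).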

New decompress:

int i = 0, j = 0, k = 0;
unsigned char byte = 0;
bitset<8> setByte, reverseByte;
ofstream outf(filename, ofstream::binary);
string concat = "";
string bitStr = "";
string reverse = "";
int charCount = 0;
for (i = 0; i < buffLength; i++)
{
    setByte = buffer[i];
    bitStr = setByte.to_string();
    reverse = "";
    for (k = 7; k >= 0; k--) // to_string() puts bit 7 first, so reverse it
        reverse += bitStr[k];
    for (j = 0; j < 8; j++)
    {
        concat += reverse[j];
        // note: operator[] on a map inserts a default entry for every
        // prefix that is not a code
        if (uMap[concat])
        {
            outf << uMap[concat];
            charCount++;
            concat = "";
            if (charCount == origLength) // if we have written original amount stop
                return 1;
        }
    }
}
return 1;
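The decoding loop can also read the bits directly with shifts, which avoids the bitset, the string reversal, and the map entries that operator[] creates for every unmatched prefix. This is a sketch under assumptions, not the code above: decodeBits is a hypothetical name, the code table is assumed to be a std::unordered_map from bit string to symbol, the bytes are assumed to be packed least significant bit first, and the result is returned as a string instead of being written to outf.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Decode bytes that were packed least significant bit first.
// Stops as soon as origLength symbols have been produced, so padding bits
// in the last byte are never interpreted as codes.
std::string decodeBits(const std::vector<unsigned char>& bytes,
                       const std::unordered_map<std::string, char>& codeToSymbol,
                       std::size_t origLength) {
    std::string out;
    std::string code; // bits collected since the last complete code
    for (unsigned char byte : bytes) {
        for (int bit = 0; bit < 8; ++bit) {
            if (out.size() == origLength)
                return out;
            code += ((byte >> bit) & 1) ? '1' : '0';
            // find() does not insert, and it also works when the decoded
            // symbol is the byte value 0
            auto it = codeToSymbol.find(code);
            if (it != codeToSymbol.end()) {
                out += it->second;
                code.clear();
            }
        }
    }
    return out;
}
```

With the table {"0": 'a', "10": 'b', "11": 'c'}, the text "abca" encodes to the bits 010110, which pack into the single byte 0x1A. Calling decodeBits on that byte with origLength 4 returns "abca" and ignores the two padding bits.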

