Question
I'm trying to encode and decode files with Huffman coding in C++. I can read and write the files, but when I decompress a file the output is scrambled, so I'm either encoding or decoding incorrectly. I think the problem is in how I write and read the file. This is the code that writes the encoded file. First I store all the bit codes from my unordered map, uMap, in one string, then pack that string into bytes:
    int i = 0, j = 0;
    string fullStr = "";
    for (i = 0; i < buffsize; i++) //put all codes in one string of 1's and 0's
        fullStr += uMap[buffer[i]];
    unsigned char byte = 0;
    i = 0;
    for (j = 0; j < fullStr.length(); j++)
    {
        if (i != 8)
        {
            byte |= (fullStr[j] == '1') << i; // make up one byte
            i++;
        }
        else
        {
            outf.put(byte); // write one byte at a time
            byte = 0;
            i = 0;
        }
    }
    if (i != 0 && i < 8)
    {
        while (i<8)
        {
            byte |= 0 << i; // finish up last byte if not finished
            i++;
        }
        outf.put(byte);
    }
Then on the decompress side:
    int i = 0;
    unsigned char byte = 0;
    bitset<8> setByte;
    ofstream outf(filename, ofstream::binary);
    string concat = "";
    string bitStr = "";
    for (i = 0; i < buffLength; i++)
    {
        setByte = buffer[i];
        bitStr = setByte.to_string();
        for (int j = 0; j < 8; j++)
        {
            concat += bitStr[j];
            if (uMap[concat])
            {
                //cout << "found code " << concat << " " << uMap[concat] << endl;
                outf.put(uMap[concat]);
                concat = "";
            }
        }
    }
    outf.close();
Answer 1:
The bits are unpacked in the reverse order to the packing, perhaps because you use a different method for each. The first bit packed goes into bit 0 of the byte (bitvalue << 0). The first bit unpacked comes from bit 7, because setByte.to_string() creates a string with bit 7 at index 0. The question was tagged as Huffman coding, but it has nothing to do with that; it is about bitstream manipulation.
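The mismatch can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the helper names packLsbFirst, unpackMsbFirst and unpackLsbFirst are made up for illustration. It packs bits the way the question's writer does and then reads them back in both orders.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: pack up to 8 '0'/'1' characters into one byte,
// first character into bit 0, as the question's writer does with (bit << i).
unsigned char packLsbFirst(const std::string& bits)
{
    assert(bits.size() <= 8);
    unsigned char byte = 0;
    for (std::size_t i = 0; i < bits.size(); ++i)
        byte |= static_cast<unsigned char>((bits[i] == '1') << i);
    return byte;
}

// Hypothetical helper: what the question's reader sees.
// std::bitset::to_string() lists the highest bit (bit 7) first.
std::string unpackMsbFirst(unsigned char byte)
{
    return std::bitset<8>(byte).to_string();
}

// Hypothetical helper: read the bits in the same order they were packed,
// bit 0 first.
std::string unpackLsbFirst(unsigned char byte)
{
    std::string bits;
    for (int i = 0; i < 8; ++i)
        bits += ((byte >> i) & 1) ? '1' : '0';
    return bits;
}
```

For example, packLsbFirst("11010000") yields the byte 0x0B. unpackMsbFirst(0x0B) returns "00001011", which is the input reversed, while unpackLsbFirst(0x0B) returns the original "11010000". Either side can be changed, as long as both agree on the order.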
Answer 2:
Thank you, Weather Vane. You were correct that the bits were being reversed on the read, so I took care of that, but it turns out the way I was writing them was wrong too.
New compress:
    int i = 0, j = 0;
    string fullStr = "";
    for (i = 0; i < buffsize; i++) // put all codes in one string
        fullStr += uMap[buffer[i]];
    for (i = 0; i < fullStr.length(); i += 8)
    {
        unsigned char byte = 0;
        // substr takes (position, count) and clamps the count at the end of the string
        string str8 = fullStr.substr(i, 8);
        for (unsigned b = 0; b != 8; ++b)
        {
            if (b < str8.length())
                byte |= (str8[b] & 1) << b; // this line was wrong before
            else
                byte |= 1 << b; // pad the unused bits of the last byte with 1s
        }
        outf.put(byte);
    }
    int filelen = outf.tellp();
    outf.close();
New decompress:
    int i = 0, j = 0, k = 0;
    unsigned char byte = 0;
    bitset<8> setByte, reverseByte;
    ofstream outf(filename, ofstream::binary);
    string concat = "";
    string bitStr = "";
    string reverse = "";
    int charCount = 0;
    for (i = 0; i < buffLength; i++)
    {
        setByte = buffer[i];
        bitStr = setByte.to_string();
        reverse = "";
        for (k = 7; k >= 0; k--) // to_string() puts bit 7 first, so reverse it
            reverse += bitStr[k];
        for (j = 0; j < 8; j++)
        {
            concat += reverse[j];
            if (uMap.count(concat)) // count() does not insert an entry for every unmatched prefix
            {
                outf << uMap[concat];
                charCount++;
                concat = "";
                if (charCount == origLength) // if we have written original amount stop
                {
                    outf.close();
                    return 1;
                }
            }
        }
    }
    outf.close();
    return 1;
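The two ideas above (the same bit order on both sides, and stopping once the original number of symbols has been written) can also be sketched without the intermediate bitset and reversed string. This is only an illustration under assumptions: packBits, decodeBits, codeToChar and symbolCount are hypothetical names, the code table is assumed to map code strings to characters, and padding bits are left as 0 here instead of 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: pack a string of '0'/'1' characters into bytes,
// first bit into bit 0 of the first byte. Unused bits of the last byte stay 0.
std::vector<unsigned char> packBits(const std::string& bits)
{
    std::vector<unsigned char> out((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i)
    {
        assert(bits[i] == '0' || bits[i] == '1');
        if (bits[i] == '1')
            out[i / 8] |= static_cast<unsigned char>(1u << (i % 8));
    }
    return out;
}

// Hypothetical helper: decode packed bytes back into symbols, reading bit 0
// of each byte first. Decoding stops after symbolCount symbols so that the
// padding bits in the last byte are never interpreted as codes.
std::string decodeBits(const std::vector<unsigned char>& bytes,
                       const std::unordered_map<std::string, char>& codeToChar,
                       std::size_t symbolCount)
{
    std::string out, current;
    for (std::size_t i = 0; i < bytes.size() * 8 && out.size() < symbolCount; ++i)
    {
        current += ((bytes[i / 8] >> (i % 8)) & 1) ? '1' : '0';
        auto it = codeToChar.find(current); // find() does not insert missing keys
        if (it != codeToChar.end())
        {
            out += it->second;
            current.clear();
        }
    }
    return out;
}
```

With the table {"0" -> 'a', "10" -> 'b', "11" -> 'c'}, the text "abca" encodes to the bit string "010110", which packBits turns into the single byte 26. decodeBits on that byte with symbolCount 4 returns "abca". Without the symbol limit, the two padding bits would decode as two extra 'a' characters, which is why the original length has to be stored alongside the compressed data.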
Source: https://stackoverflow.com/questions/26442093/reading-and-writing-bit-by-bit-in-c-for-huffman-encoding