Bit Array to String and back to Bit Array

Submitted by 牧云@^-^@ on 2019-12-12 18:18:50

Question


Possible Duplicate: Converting byte array to string and back again in C#

I am using Huffman Coding for compression and decompression of some text from here

The code there builds a Huffman tree and uses it for encoding and decoding. Everything works fine when I use the code directly.

In my situation, I need to get the compressed content, store it, and decompress it whenever needed.

The output of the encoder and the input to the decoder are both BitArray.

When I tried to convert this BitArray to a String and back to a BitArray, and then decode it using the following code, I got a weird result.

// Read the text first so the tree can be built from it
string input = Console.ReadLine();

Tree huffmanTree = new Tree();
huffmanTree.Build(input);

BitArray encoded = huffmanTree.Encode(input);

// Print the bits
Console.Write("Encoded Bits: ");
foreach (bool bit in encoded)
{
    Console.Write((bit ? 1 : 0) + "");
}
Console.WriteLine();

// Convert the bit array to bytes
Byte[] e = new Byte[(encoded.Length / 8 + (encoded.Length % 8 == 0 ? 0 : 1))];
encoded.CopyTo(e, 0);

// Convert the bytes to string
string output = Encoding.UTF8.GetString(e);

// Convert string back to bytes
e = Encoding.UTF8.GetBytes(output);

// Convert bytes back to bit array
BitArray todecode = new BitArray(e);

string decoded = huffmanTree.Decode(todecode);

Console.WriteLine("Decoded: " + decoded);

Console.ReadLine();

The output of the original code from the tutorial is:

The output of my code is:

Where am I going wrong? Thanks in advance.


Answer 1:


You cannot stuff arbitrary bytes into a string. That concept is just undefined. Conversions happen using Encoding.

string output = Encoding.UTF8.GetString(e);

At this point e is just binary garbage, not a UTF-8 string, so calling UTF-8 methods on it does not make sense.
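To make the failure concrete, here is a minimal sketch (the class name `Utf8RoundTripDemo` and the sample bytes are made up for illustration). It decodes three bytes as UTF-8 and encodes the resulting string again. With the default `Encoding.UTF8`, a byte that is not valid UTF-8 is replaced by U+FFFD during decoding, so the bytes that come back differ from the ones that went in.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
    public static byte[] RoundTrip(byte[] raw)
    {
        string text = Encoding.UTF8.GetString(raw);
        return Encoding.UTF8.GetBytes(text);
    }

    public static void Main()
    {
        // 0xFF never occurs in valid UTF-8, so it cannot survive decoding.
        byte[] raw = { 0x48, 0xFF, 0x69 };
        byte[] back = RoundTrip(raw);

        Console.WriteLine("Before: " + BitConverter.ToString(raw));
        Console.WriteLine("After:  " + BitConverter.ToString(back));
        Console.WriteLine("Equal:  " + raw.SequenceEqual(back));
    }
}
```

Compressed Huffman output will usually contain such invalid sequences, which would explain the weird decoded result.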

Solution: Don't convert and back-convert to/from string. This does not round-trip. Why are you doing that in the first place? If you need a string use a round-trippable format like base-64 or base-85.
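If a string really is required, a Base64 round trip could look roughly like the sketch below. The class `BitArrayText`, its two methods, and the `count:base64` text layout are assumptions made for this example, not part of the tutorial's code. The bit count is stored alongside the data because `new BitArray(byte[])` always yields a multiple of 8 bits, and a Huffman decoder may well read the padding bits as extra symbols.

```csharp
using System;
using System.Collections;

public static class BitArrayText
{
    // Serializes the bits as "<bitCount>:<Base64 of the packed bytes>".
    public static string ToText(BitArray bits)
    {
        byte[] packed = new byte[(bits.Length + 7) / 8];
        bits.CopyTo(packed, 0);
        return bits.Length + ":" + Convert.ToBase64String(packed);
    }

    // Reverses ToText and trims the padding bits added by byte packing.
    public static BitArray FromText(string text)
    {
        int sep = text.IndexOf(':'); // ':' is not in the Base64 alphabet
        int bitCount = int.Parse(text.Substring(0, sep));
        byte[] packed = Convert.FromBase64String(text.Substring(sep + 1));

        BitArray bits = new BitArray(packed);
        bits.Length = bitCount; // drop the padding bits
        return bits;
    }

    public static void Main()
    {
        // 11 bits: deliberately not a multiple of 8.
        var original = new BitArray(new[]
        {
            true, false, true, true, false, false,
            true, false, true, true, true
        });

        string text = ToText(original);
        BitArray restored = FromText(text);

        Console.WriteLine(text);
        Console.WriteLine(restored.Length == original.Length);
    }
}
```

In the question's code, `ToText(encoded)` would replace the `Encoding.UTF8.GetString` step and `FromText(output)` would replace the `GetBytes` and `new BitArray(e)` steps. If the data only needs to be stored and not displayed, writing the packed bytes and the bit count to a file directly avoids the string entirely.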




Answer 2:


I'm pretty sure a text Encoding such as UTF-8 doesn't round-trip arbitrary bytes. That is, you can't decode an arbitrary sequence of bytes to a string, then use the same Encoding to get bytes back, and always expect them to be the same.

If you want to be able to round-trip from your raw bytes to a string and back to the same raw bytes, you'd need to use Base64 encoding, e.g.

http://blogs.microsoft.co.il/blogs/mneiter/archive/2009/03/22/how-to-encoding-and-decoding-base64-strings-in-c.aspx



Source: https://stackoverflow.com/questions/14670645/bit-array-to-string-and-back-to-bit-array
