How to convert UTF-8 byte[] to string?


迷失自我

I have a byte[] array that is loaded from a file that I happen to know contains UTF-8.

In some debugging code, I need to convert it to a string. Is


15 Answers

余生分开走

2020-11-22 03:44

Try this console app:

```
using System;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        string _mainString = "Héllo World";
        Console.WriteLine("Main String: " + _mainString);

        // Convert a string to UTF-8 bytes.
        byte[] _utf8Bytes = Encoding.UTF8.GetBytes(_mainString);

        // Convert UTF-8 bytes back to a string.
        string _stringUnicode = Encoding.UTF8.GetString(_utf8Bytes);
        Console.WriteLine("String Unicode: " + _stringUnicode);
    }
}
```


青春惊慌失措

2020-11-22 03:49
In addition to the selected answer, if you're using .NET 3.5 or .NET 3.5 CE, you have to specify the index of the first byte to decode and the number of bytes to decode:
```
string result = System.Text.Encoding.UTF8.GetString(byteArray,0,byteArray.Length);
```
青春惊慌失措

2020-11-22 03:50
The answers in this post already cover the basics, since C# offers several ways to solve this problem. One thing still worth considering is the difference between plain UTF-8 and UTF-8 with a BOM (byte order mark).

Last week at my job I had to build a feature that outputs some CSV files with a BOM and others as plain UTF-8 (without a BOM). Each kind of file is consumed by a different non-standardized API: one API reads UTF-8 with a BOM, the other reads UTF-8 without one. To build my approach I read up on the concept, starting with the Stack Overflow discussion "What's the difference between UTF-8 and UTF-8 without BOM?" and the Wikipedia article "Byte order mark".

In the end, my C# code for both UTF-8 variants (with a BOM and plain) looked similar to the example below:
```
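// UTF-8 with BOM: same call as in the answer shared by Zanoni (at top).
// Note: GetString does not strip a leading BOM; it is decoded as U+FEFF.
string resultWithBom = System.Text.Encoding.UTF8.GetString(byteArray);

// Plain UTF-8 (without BOM). The constructor flag only controls whether a BOM
// is emitted when writing (GetPreamble); decoding behaves the same as above.
string resultPlain = (new System.Text.UTF8Encoding(false)).GetString(byteArray);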
```
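Neither `GetString` call removes a BOM from the decoded text, so when the input may start with one it can be skipped explicitly. A minimal sketch, assuming a hypothetical helper named `DecodeUtf8SkippingBom` (not part of .NET):

```csharp
using System;
using System.Text;

static class Utf8Decoding
{
    // Hypothetical helper: decodes UTF-8 bytes and skips a leading BOM (EF BB BF) if present.
    public static string DecodeUtf8SkippingBom(byte[] bytes)
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // the three bytes EF BB BF
        bool hasBom = bytes.Length >= bom.Length
            && bytes[0] == bom[0]
            && bytes[1] == bom[1]
            && bytes[2] == bom[2];

        int offset = hasBom ? bom.Length : 0;
        return Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
    }
}
```

With this, input that has a BOM and input that does not both decode to the same string, which may be convenient when the source of the bytes is not under your control.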

自闭症患者

2020-11-22 03:55

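A more general way to convert a byte array to a string when you don't know the encoding up front. StreamReader detects the encoding from a byte order mark if one is present and otherwise falls back to UTF-8:

```
// Requires: using System.IO;
static string BytesToStringConverted(byte[] bytes)
{
    using (var stream = new MemoryStream(bytes))
    {
        using (var streamReader = new StreamReader(stream))
        {
            return streamReader.ReadToEnd();
        }
    }
}
```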
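To see which encoding StreamReader actually settled on, the reader's `CurrentEncoding` property can be inspected after the first read. A minimal sketch, assuming a hypothetical helper named `DecodeWithBomDetection`:

```csharp
using System.IO;
using System.Text;

static class BomDetection
{
    // Hypothetical helper: decodes bytes, letting StreamReader pick the encoding
    // from a byte order mark and falling back to UTF-8 when there is none.
    public static string DecodeWithBomDetection(byte[] bytes, out Encoding detected)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, detectEncodingFromByteOrderMarks: true))
        {
            string text = reader.ReadToEnd();
            // CurrentEncoding is only meaningful after the first read.
            detected = reader.CurrentEncoding;
            return text;
        }
    }
}
```

Bytes without a BOM are treated as UTF-8 here, so this does not identify legacy code pages; those still have to be specified explicitly.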


萌比男神i

2020-11-22 03:56
There are at least four different ways to do this conversion.
1. Encoding's GetString
  You won't be able to get the original bytes back if they are not valid text in that encoding (for example, arbitrary binary data).
2. BitConverter.ToString
  The output is a "-" delimited hex string, but there is no built-in .NET method that converts this dash-delimited string back to a byte array.
3. Convert.ToBase64String
  You can easily convert the output string back to a byte array by using Convert.FromBase64String.
  Note: The output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to explicitly encode it.
4. HttpServerUtility.UrlTokenEncode
  You can easily convert the output string back to a byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is that it needs the System.Web assembly if your project is not a web project.
A full example:
```
byte[] bytes = { 130, 200, 234, 23 }; // Arbitrary binary data that is not valid UTF-8

string s1 = Encoding.UTF8.GetString(bytes); // ��� (invalid bytes become U+FFFD)
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1);  // decBytes1.Length == 10 !!
// decBytes1 is not the same as bytes
// Using another Encoding object can give similarly lossy results

string s2 = BitConverter.ToString(bytes);   // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
    decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes

string s3 = Convert.ToBase64String(bytes);  // gsjqFw==
byte[] decBytes3 = Convert.FromBase64String(s3);
// decBytes3 same as bytes

string s4 = HttpServerUtility.UrlTokenEncode(bytes);    // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
```
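When silent replacement with U+FFFD is not acceptable, a UTF8Encoding created with `throwOnInvalidBytes: true` rejects invalid input instead. A minimal sketch, assuming a hypothetical helper named `TryDecode`:

```csharp
using System;
using System.Text;

static class StrictUtf8
{
    // Hypothetical helper: returns true and the decoded text only when the bytes are valid UTF-8.
    public static bool TryDecode(byte[] bytes, out string text)
    {
        // throwOnInvalidBytes: true makes GetString throw instead of substituting U+FFFD.
        var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
        try
        {
            text = strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = null;
            return false;
        }
    }
}
```

A caller could use the result to decide between decoding as text and falling back to a lossless representation such as Base64.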
攒了一身酷

2020-11-22 03:56
Alternatively, if you only need a printable representation of the bytes rather than decoded text:
```
var byteStr = Convert.ToBase64String(bytes);
```