How to GetBytes() in C# with UTF8 encoding with BOM?

余生分开走 2020-11-27 16:06

I'm having a problem with UTF-8 encoding in my ASP.NET MVC 2 application in C#. I'm trying to let users download a simple text file generated from a string. I am trying to get bytes arra

4 answers
  • 2020-11-27 16:25

    Remember that .NET strings are all Unicode while they stay in memory, so if your csvString looks correct in the debugger, the problem is in how the file is written.

    In my opinion, you should return a FileResult that uses the same encoding as the file contents. Try setting the encoding on the returned file.

  • 2020-11-27 16:40

    UTF-8 does not require a BOM, because it is a sequence of 1-byte code units and therefore has no byte-order ambiguity: UTF-8 = UTF-8BE = UTF-8LE.

    In contrast, UTF-16 uses a BOM at the beginning of the stream to identify whether the remainder of the stream is UTF-16BE or UTF-16LE, because UTF-16 is a sequence of 2-byte code units and the BOM identifies whether the bytes in each unit are big-endian or little-endian.

    The problem does not lie with Encoding.UTF8. It lies with whatever program you are using to view the files.
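
    To make the byte-order point concrete, here is a small sketch that prints the preamble (BOM) and the encoded bytes of "a" for three built-in encodings. The class name BomDemo and the Hex helper are made up for this illustration; only the System.Text calls come from the framework.

    ```csharp
    using System;
    using System.Text;

    public static class BomDemo
    {
        // Formats a byte array as space-separated hex, e.g. "EF BB BF".
        public static string Hex(byte[] bytes)
        {
            return BitConverter.ToString(bytes).Replace("-", " ");
        }

        public static void Main()
        {
            // Preambles: UTF-8 has a fixed signature, UTF-16 has one per byte order.
            Console.WriteLine(Hex(Encoding.UTF8.GetPreamble()));             // EF BB BF
            Console.WriteLine(Hex(Encoding.Unicode.GetPreamble()));          // FF FE
            Console.WriteLine(Hex(Encoding.BigEndianUnicode.GetPreamble())); // FE FF

            // Encoded text: only UTF-16 changes with byte order.
            Console.WriteLine(Hex(Encoding.UTF8.GetBytes("a")));             // 61
            Console.WriteLine(Hex(Encoding.Unicode.GetBytes("a")));          // 61 00
            Console.WriteLine(Hex(Encoding.BigEndianUnicode.GetBytes("a"))); // 00 61
        }
    }
    ```

    In UTF-8 the three leading bytes act only as a signature, which is why some viewers rely on them to detect the encoding even though byte order is never in question.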

  • 2020-11-27 16:47

    Try like this:

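    // Requires: using System.Linq; using System.Text; using System.Web.Mvc;
    public ActionResult Download()
    {
        // GetBytes encodes the text only; it never emits a BOM.
        var data = Encoding.UTF8.GetBytes("some data");
        // Prepend the UTF-8 preamble (EF BB BF) explicitly.
        var result = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
        return File(result, "application/csv", "foo.csv");
    }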
    

    The reason is that the UTF8Encoding constructor that takes a boolean parameter doesn't do what you would expect:

    byte[] bytes = new UTF8Encoding(true).GetBytes("a");
    

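    The resulting array would contain a single byte with the value of 97. There's no BOM because GetBytes never emits one: the boolean only controls what GetPreamble() returns, and it is up to the caller (or a writer such as StreamWriter) to put the preamble in front of the data.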
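
    A quick way to check this behaviour is to compare GetBytes and GetPreamble for both values of the constructor argument. This is only a sketch; the class name PreambleFlagDemo is invented for the example.

    ```csharp
    using System;
    using System.Text;

    public static class PreambleFlagDemo
    {
        public static void Main()
        {
            var withBom = new UTF8Encoding(true);     // encoderShouldEmitUTF8Identifier: true
            var withoutBom = new UTF8Encoding(false); // encoderShouldEmitUTF8Identifier: false

            // GetBytes ignores the flag: "a" is always the single byte 0x61 (97).
            Console.WriteLine(withBom.GetBytes("a").Length);    // 1
            Console.WriteLine(withoutBom.GetBytes("a").Length); // 1

            // Only GetPreamble reflects the flag.
            Console.WriteLine(withBom.GetPreamble().Length);    // 3
            Console.WriteLine(withoutBom.GetPreamble().Length); // 0
        }
    }
    ```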

  • 2020-11-27 16:48

    I created a simple extension that converts a string to the byte array it would produce when written to a file or stream in a given encoding:

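    // Requires: using System.IO; using System.Text;
    public static class StreamExtensions
    {
        public static byte[] ToBytes(this string value, Encoding encoding)
        {
            using (var stream = new MemoryStream())
            using (var sw = new StreamWriter(stream, encoding))
            {
                // StreamWriter writes the encoding's preamble (if any) before the text.
                sw.Write(value);
                // Flush so the buffered bytes reach the MemoryStream before copying.
                sw.Flush();
                return stream.ToArray();
            }
        }
    }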
    

    Usage:

    stringValue.ToBytes(Encoding.UTF8)
    

    This also works for other encodings such as UTF-16, where a BOM is normally written.
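
    If you would rather not go through a StreamWriter, a similar result can probably be had by concatenating the preamble and the encoded text directly. The sketch below is an alternative, not the extension above; the names EncodingExtensions and ToBytesWithPreamble are hypothetical.

    ```csharp
    using System;
    using System.Text;

    public static class EncodingExtensions
    {
        // Hypothetical helper: returns encoding.GetPreamble() followed by the encoded text.
        public static byte[] ToBytesWithPreamble(this string value, Encoding encoding)
        {
            byte[] preamble = encoding.GetPreamble(); // empty array if the encoding has no BOM
            byte[] body = encoding.GetBytes(value);

            // Copy preamble and body into one contiguous array.
            var result = new byte[preamble.Length + body.Length];
            Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
            Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
            return result;
        }
    }
    ```

    With this helper, "a".ToBytesWithPreamble(Encoding.UTF8) yields EF BB BF 61, "a".ToBytesWithPreamble(Encoding.Unicode) yields FF FE 61 00, and passing new UTF8Encoding(false) yields just 61, since that encoding has an empty preamble.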
