StreamWriter and UTF-8 Byte Order Marks

走了就别回头了 2020-11-27 19:02

I'm having an issue with StreamWriter and Byte Order Marks. The documentation seems to state that the Encoding.UTF8 encoding has byte order marks enabled but when files are

8 answers
  • 2020-11-27 20:00

    The issue is due to the fact that you are using the static UTF8 property on the Encoding class.

    When the GetPreamble method is called on the Encoding instance returned by the UTF8 property, it returns the byte order mark (a three-byte array: EF BB BF). StreamWriter writes those bytes to the stream before any other content (assuming a new stream).

    You can avoid this by creating the instance of the UTF8Encoding class yourself, like so:

    // As before.
    this.Writer = new StreamWriter(this.Stream,
        // Create the encoding yourself: the parameterless constructor (like
        // passing false to UTF8Encoding(bool)) prevents the BOM from being written.
        new System.Text.UTF8Encoding());
    

    As per the documentation for the default parameterless constructor (emphasis mine):

    This constructor creates an instance that does not provide a Unicode byte order mark and does not throw an exception when an invalid encoding is detected.

    This means that the call to GetPreamble will return an empty array, and therefore no BOM will be written to the underlying stream.
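
    To check this, here is a minimal sketch that compares the two encodings side by side. It assumes C# 9 or later (top-level statements), and WriteA is a hypothetical helper introduced only for this illustration, not part of the original code:

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    // Hypothetical helper for this sketch: writes one character through a
    // StreamWriter using the given encoding and returns the raw bytes produced.
    static byte[] WriteA(Encoding encoding)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, encoding))
        {
            writer.Write('a');
        }
        // MemoryStream.ToArray still works after the stream has been closed.
        return stream.ToArray();
    }

    var withBom = Encoding.UTF8;          // static property: preamble enabled
    var withoutBom = new UTF8Encoding();  // parameterless constructor: no preamble

    Console.WriteLine(withBom.GetPreamble().Length);               // 3
    Console.WriteLine(withoutBom.GetPreamble().Length);            // 0
    Console.WriteLine(BitConverter.ToString(WriteA(withBom)));     // EF-BB-BF-61
    Console.WriteLine(BitConverter.ToString(WriteA(withoutBom)));  // 61
    ```

    Only the encoding instance differs between the two calls, so the three extra bytes in the first dump come entirely from the preamble.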

  • 2020-11-27 20:07

    Could you please show a situation where it doesn't produce the BOM? The only case I can find where the preamble isn't present is when nothing is ever written to the writer. (Jim Mischel seems to have found another case, which is logical and more likely to be your problem; see his answer.)

    My test code:

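    using System;
    using System.IO;
    using System.Linq;

    var stream = new MemoryStream();
    using (var writer = new StreamWriter(stream, System.Text.Encoding.UTF8))
    {
        writer.Write('a');
    }
    // Dump the written bytes as hex; the output is "EF BB BF 61" (BOM, then 'a').
    Console.WriteLine(stream.ToArray()
        .Select(b => b.ToString("X2"))
        .Aggregate((i, a) => i + " " + a)
        );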
    