How do I ignore the UTF-8 Byte Order Marker in String comparisons?

陌清茗 2020-11-28 13:52

I'm having a problem comparing strings in a Unit Test in C# 4.0 using Visual Studio 2010. This same test case works properly in Visual Studio 2008 (with C# 3.5).

He

3 Answers
  • 2020-11-28 14:11

    There is a slightly more efficient way to do it than creating a StreamReader and a MemoryStream:

    1) If you know that there is always a BOM:

    string viaEncoding = Encoding.UTF8.GetString(withBom, 3, withBom.Length - 3);
    

    2) If you don't know, check:

    string viaEncoding;
    if (withBom.Length >= 3 && withBom[0] == 0xEF && withBom[1] == 0xBB && withBom[2] == 0xBF)
        viaEncoding = Encoding.UTF8.GetString(withBom, 3, withBom.Length - 3);
    else
        viaEncoding = Encoding.UTF8.GetString(withBom);
    
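    The two cases above can be folded into one helper. This is a minimal sketch, not part of the original answer: the `BomHelper.DecodeUtf8` name is hypothetical, and it compares the leading bytes against `Encoding.UTF8.GetPreamble()` instead of hard-coding them.

    ```csharp
    using System.Text;

    static class BomHelper
    {
        // Hypothetical helper: decodes UTF-8 bytes, skipping a leading BOM if present.
        public static string DecodeUtf8(byte[] bytes)
        {
            byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF for Encoding.UTF8
            int offset = 0;
            if (bytes.Length >= bom.Length)
            {
                offset = bom.Length;
                for (int i = 0; i < bom.Length; i++)
                {
                    // Any mismatch means there is no BOM, so decode from the start.
                    if (bytes[i] != bom[i]) { offset = 0; break; }
                }
            }
            return Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
        }
    }
    ```

    In a test this could be called as `BomHelper.DecodeUtf8(withBom)` before comparing the result with the expected string.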
  • 2020-11-28 14:15

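    I believe the extra character is removed if you Trim() the decoded string. That holds on .NET Framework 3.5 SP1 and earlier; starting with .NET Framework 4, the parameterless Trim() no longer removes U+FEFF, so pass it explicitly, for example Trim('\uFEFF').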

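    A minimal sketch of trimming the BOM character explicitly from an already decoded string. Passing '\uFEFF' to TrimStart does not depend on which characters the parameterless Trim() treats as white space. The `BomTrim.StripBom` name is hypothetical.

    ```csharp
    static class BomTrim
    {
        // Hypothetical helper: removes any leading U+FEFF characters from a decoded string.
        public static string StripBom(string decoded)
        {
            return decoded.TrimStart('\uFEFF');
        }
    }
    ```

    A test could then compare `BomTrim.StripBom(actual)` with the expected value using the usual ordinal `==`.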
  • 2020-11-28 14:21

    Well, I assume it's because the raw binary data includes the BOM. You could always remove the BOM yourself after decoding if you don't want it, but you should consider whether the byte array should contain the BOM to start with.

    EDIT: Alternatively, you could use a StreamReader to perform the decoding. Here's an example, showing the same byte array being converted into two characters using Encoding.GetString or one character via a StreamReader:

    using System;
    using System.IO;
    using System.Text;
    
    class Test
    {
        static void Main()
        {
            byte[] withBom = { 0xef, 0xbb, 0xbf, 0x41 };

            // Encoding.GetString keeps the BOM as U+FEFF, so this prints 2.
            string viaEncoding = Encoding.UTF8.GetString(withBom);
            Console.WriteLine(viaEncoding.Length);
    
            // StreamReader consumes the BOM while decoding, so this prints 1.
            string viaStreamReader;
            using (StreamReader reader = new StreamReader
                   (new MemoryStream(withBom), Encoding.UTF8))
            {
                viaStreamReader = reader.ReadToEnd();
            }
            Console.WriteLine(viaStreamReader.Length);
        }
    }
    
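    For use inside a unit test, the StreamReader approach can be wrapped in a small method. This is only a sketch; the `BomReader.Decode` name is hypothetical. It passes `detectEncodingFromByteOrderMarks` explicitly as `true`, which also means input that starts with a UTF-16 BOM would be decoded as UTF-16 rather than UTF-8.

    ```csharp
    using System.IO;
    using System.Text;

    static class BomReader
    {
        // Hypothetical helper: decodes bytes through a StreamReader, which consumes a leading BOM.
        public static string Decode(byte[] bytes)
        {
            using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8, true))
            {
                return reader.ReadToEnd();
            }
        }
    }
    ```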