Six digit unicode escaped value comparison

礼貌的吻别 2021-01-13 19:48

I have a six-digit Unicode character, for example U+100000, which I want to compare with another char in my C# code.

My reading

2 Answers
  • 2021-01-13 20:34

    To construct a string with the Unicode code point U+10FFFF using a string literal, you need to work out the surrogate pair involved.

    In this case, you need:

    string bigCharacter = "\uDBFF\uDFFF";
    

    Or you can use char.ConvertFromUtf32:

    string bigCharacter = char.ConvertFromUtf32(0x10FFFF);
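
    If it helps to see where \uDBFF\uDFFF comes from, the surrogate pair can be worked out by hand with the standard UTF-16 arithmetic. The following is only a sketch; ToSurrogates is a hypothetical helper written for illustration, not a .NET API.

    ```csharp
    using System;

    public static class SurrogateDemo
    {
        // Hypothetical helper: splits a code point above U+FFFF into its
        // UTF-16 high and low surrogates.
        public static (char High, char Low) ToSurrogates(int codePoint)
        {
            if (codePoint < 0x10000 || codePoint > 0x10FFFF)
                throw new ArgumentOutOfRangeException(nameof(codePoint));

            int offset = codePoint - 0x10000;              // 20-bit value
            char high = (char)(0xD800 + (offset >> 10));   // top 10 bits
            char low = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
            return (high, low);
        }

        public static void Main()
        {
            var (high, low) = ToSurrogates(0x10FFFF);
            Console.WriteLine($"{(int)high:X4} {(int)low:X4}"); // DBFF DFFF

            // The hand-built pair matches what char.ConvertFromUtf32 produces.
            string manual = new string(new[] { high, low });
            Console.WriteLine(manual == char.ConvertFromUtf32(0x10FFFF)); // True
        }
    }
    ```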
    

    It's not clear what you want your method to achieve, but if you need it to work with characters not in the BMP, you'll need to make it accept int instead of char, or a string.

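    As per the documentation for string, if you want to iterate over a string in units larger than a single UTF-16 char, use TextElementEnumerator or StringInfo. These work on text elements, so a surrogate pair is returned as one unit, but a text element can also include combining characters and so may hold more than one code point.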
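
    As a rough illustration of that iteration, the sketch below walks a string with StringInfo.GetTextElementEnumerator and reads the first code point of each text element. The class and method names are made up for this example, and the sample string is an assumption chosen so that every text element is a single code point.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Globalization;

    public static class TextElementDemo
    {
        // Hypothetical helper: returns the first code point of each text element.
        public static List<int> FirstCodePoints(string text)
        {
            var result = new List<int>();
            TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(text);
            while (elements.MoveNext())
            {
                string element = elements.GetTextElement();
                result.Add(char.ConvertToUtf32(element, 0));
            }
            return result;
        }

        public static void Main()
        {
            string text = "a\U0001F300b";
            Console.WriteLine(text.Length); // 4, counted in UTF-16 code units

            foreach (int codePoint in FirstCodePoints(text))
                Console.WriteLine($"U+{codePoint:X4}");
        }
    }
    ```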

    Note that you do need to do this explicitly. An ordinal comparison compares UTF-16 code units, not Unicode code points. For example:

    string text = "\uF000";
    string upperBound = "\uDBFF\uDFFF";
    Console.WriteLine(string.Compare(text, upperBound, StringComparison.Ordinal));
    

    This prints a value greater than zero, suggesting that text is greater than upperBound, because the code unit 0xF000 is compared against the high surrogate 0xDBFF. Instead, compare the code points returned by char.ConvertToUtf32:

    string text = "\uF000";
    string upperBound = "\uDBFF\uDFFF";
    int textUtf32 = char.ConvertToUtf32(text, 0);
    int upperBoundUtf32 = char.ConvertToUtf32(upperBound, 0);
    Console.WriteLine(textUtf32 < upperBoundUtf32); // True
    

    So that's probably what you need to do in your method. You might want to use StringInfo.LengthInTextElements to check that the strings really are single UTF-32 code points first.
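
    Putting those pieces together, a comparison method might look like the sketch below. CompareSingleCodePoints and ToCodePoint are hypothetical names. To validate the input, this sketch checks the UTF-16 length directly rather than using StringInfo.LengthInTextElements, on the assumption that a base character followed by a combining mark (one text element, two code points) should be rejected.

    ```csharp
    using System;

    public static class CodePointComparer
    {
        // Hypothetical helper: compares two strings that each hold exactly one
        // Unicode code point. Returns a negative, zero, or positive value.
        public static int CompareSingleCodePoints(string left, string right)
        {
            return ToCodePoint(left).CompareTo(ToCodePoint(right));
        }

        private static int ToCodePoint(string s)
        {
            if (string.IsNullOrEmpty(s))
                throw new ArgumentException("Expected exactly one code point.", nameof(s));

            // One code point is either one char or a surrogate pair (two chars).
            int expectedLength = char.IsHighSurrogate(s[0]) ? 2 : 1;
            if (s.Length != expectedLength)
                throw new ArgumentException("Expected exactly one code point.", nameof(s));

            // Throws ArgumentException if the surrogates are malformed.
            return char.ConvertToUtf32(s, 0);
        }

        public static void Main()
        {
            // U+F000 is below U+10FFFF, so the result is negative.
            Console.WriteLine(CompareSingleCodePoints("\uF000", "\uDBFF\uDFFF") < 0); // True
        }
    }
    ```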

  • 2021-01-13 20:48

    According to https://msdn.microsoft.com/library/aa664669.aspx, a code point above U+FFFF is written in a literal with \U followed by exactly eight hex digits. For example:

    string str1 = "\U0001F300";
    string str2 = "\uD83C\uDF00";
    bool eq = str1 == str2;
    

    Both literals encode the :cyclone: emoji (U+1F300), so eq is true.
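
    A quick way to confirm this is to inspect the resulting string. The snippet below is a small sketch (the class and variable names are illustrative) showing that the \U form is stored as two UTF-16 chars and decodes back to the same code point.

    ```csharp
    using System;

    public static class EscapeDemo
    {
        public static void Main()
        {
            string viaLongEscape = "\U0001F300";
            string viaSurrogates = "\uD83C\uDF00";

            Console.WriteLine(viaLongEscape.Length);           // 2
            Console.WriteLine(viaLongEscape == viaSurrogates); // True
            Console.WriteLine(char.ConvertToUtf32(viaLongEscape, 0).ToString("X")); // 1F300
        }
    }
    ```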
