Unicode strings in .Net with Hebrew letters and numbers

梦如初夏 2021-02-04 15:38

There is a strange behavior when creating a string that contains a Hebrew letter followed by digits: the digits are always displayed to the left of the letter. For example, concatenating "\u05E9" and "23" shows 23 to the left of the letter.
4 Answers
  • 2021-02-04 15:58

    The Unicode characters "RTL mark" (U+200F) and "LTR mark" (U+200E) were created precisely for this purpose.

    In your example, simply place an LTR mark after the Hebrew character, and the numbers will then be displayed to the right of the Hebrew character, as you wish.

    So your code would be adjusted as follows:

    string A = "\u05E9"; //A Hebrew letter
    string LTRMark = "\u200E"; 
    string B = "23";
    string AB = A + LTRMark + B;
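
    One thing to keep in mind with this approach: the LTR mark is invisible, but it is still a real character in the string. The sketch below shows that it adds to the length, and how it could be stripped again before comparing or parsing. The class LtrMarkDemo and the helper StripLtrMarks are hypothetical names used only for illustration.

    ```csharp
    using System;

    public static class LtrMarkDemo
    {
        public const string LtrMark = "\u200E"; // LEFT-TO-RIGHT MARK, zero width

        // Hypothetical helper: remove LTR marks, e.g. before comparing or parsing.
        public static string StripLtrMarks(string s)
        {
            return s.Replace(LtrMark, string.Empty);
        }

        public static void Main()
        {
            string a = "\u05E9"; // A Hebrew letter
            string b = "23";
            string ab = a + LtrMark + b;

            Console.WriteLine(ab.Length);                  // 4: the mark counts as a char
            Console.WriteLine(StripLtrMarks(ab) == a + b); // True
        }
    }
    ```

    If the string is later compared with user input or stored, stripping the mark first avoids surprises from the extra character.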
    
  • 2021-02-04 15:59

    That strange behavior has an explanation. Digits that follow a right-to-left character are laid out as part of the same right-to-left run, and since Hebrew is read right to left, this scenario gives the following:

    string A = "\u05E9"; //A Hebrew letter
    string B = "23";
    string AB = A + B;
    

    On screen, B comes first (leftmost), followed by A.

    Second scenario:

    string A = "\u20AA"; // New shekel sign, not a right-to-left letter
    string B = "23";
    string AB = A + B;
    

    Here A is a Unicode character that is not part of a right-to-left script, so the output is A followed by B.

    Now consider my own scenario:

    string A = "\u05E9";
    string B = "\u05EA";
    string AB = A + B;
    

    Both A and B belong to a right-to-left script, so AB is displayed as B followed by A, not A followed by B. The string itself still stores A first; only the rendering order changes.

    Edit, to answer the comment:

    Taking this scenario into account:

    string A = "\u05E9"; //A Hebrew letter
    string B = "23";
    string AB = A + B;
    

    One way to get the letter followed by the digits is to swap the operands: string AB = B + A; although how this renders still depends on the text direction of the control that displays it.

    This probably won't work in general, so I guess you have to implement some checks and build the string according to your requirements.
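
    A minimal sketch of what such a check could look like, borrowing the LTR mark (U+200E) from the first answer instead of swapping operands. The class BidiFixup and the method MarkDigitsAfterHebrew are hypothetical names, and the sketch assumes that only the basic Hebrew letters (U+05D0 to U+05EA) and ASCII digits need handling.

    ```csharp
    using System;
    using System.Text;

    public static class BidiFixup
    {
        private const char LtrMark = '\u200E'; // LEFT-TO-RIGHT MARK

        // Assumption: only the basic Hebrew letters alef..tav are treated as Hebrew.
        private static bool IsHebrewLetter(char c)
        {
            return c >= '\u05D0' && c <= '\u05EA';
        }

        private static bool IsAsciiDigit(char c)
        {
            return c >= '0' && c <= '9';
        }

        // Hypothetical helper: insert an LTR mark between a Hebrew letter
        // and an ASCII digit that immediately follows it.
        public static string MarkDigitsAfterHebrew(string s)
        {
            var sb = new StringBuilder(s.Length + 4);
            for (int i = 0; i < s.Length; i++)
            {
                if (i > 0 && IsAsciiDigit(s[i]) && IsHebrewLetter(s[i - 1]))
                {
                    sb.Append(LtrMark);
                }
                sb.Append(s[i]);
            }
            return sb.ToString();
        }

        public static void Main()
        {
            Console.WriteLine(MarkDigitsAfterHebrew("\u05E9" + "23") == "\u05E9\u200E23"); // True
            Console.WriteLine(MarkDigitsAfterHebrew("\u20AA23") == "\u20AA23");            // True: unchanged
        }
    }
    ```

    Whether inserting the mark everywhere is desirable depends on the text: in a genuinely right-to-left sentence the default ordering may be the one you want.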

  • 2021-02-04 15:59
    string A = "\u05E9"; //A Hebrew letter
    string B = "23";
    string AB = B + A; // !
    textBlock1.Text = AB;
    textBlock1.FlowDirection = FlowDirection.RightToLeft;
    // Output OK: A is displayed to the left of B, as intended.
    
  • 2021-02-04 16:06

    This is because of the Unicode Bidirectional Algorithm. If I understand this correctly, each Unicode character has a directional class (an "identifier") that determines where it is placed relative to the text next to it.

    In this case \u05E9 is a right-to-left character, so the digits that follow it end up on its left. Even if you do:

    var ab = string.Format("{0}{1}", a, b);

    You will still get the digits on the left. However, if you take a character that is not right-to-left, such as \u20AA, the digits will be placed to its right, because that character does not have a right-to-left class.

    This is how the script is laid out: when the text is rendered, the layout engine orders it according to the direction of the script.
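
    To see that the reordering happens only at display time, it can help to dump the characters in the order the string actually stores them. The sketch below does that; LogicalOrderDemo and Dump are hypothetical names used only for illustration.

    ```csharp
    using System;
    using System.Linq;

    public static class LogicalOrderDemo
    {
        // Hypothetical helper: list the UTF-16 code units of a string in storage order.
        public static string Dump(string s)
        {
            return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
        }

        public static void Main()
        {
            string a = "\u05E9"; // A Hebrew letter
            string b = "23";

            Console.WriteLine(Dump(a + b));                         // U+05E9 U+0032 U+0033
            Console.WriteLine(Dump(string.Format("{0}{1}", a, b))); // U+05E9 U+0032 U+0033
        }
    }
    ```

    Both ways of building the string store the letter first and the digits after it; it is the renderer that places the digits to the left.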
