Where can I find a list of escaped characters in MSIL string constants?

一生所求 2021-02-18 21:31

I've written a program (in C#) that reads and manipulates MSIL programs that have been generated from C# programs. I had mistakenly assumed that the syntax rules for MSIL strin

1 Answer
  • 2021-02-18 21:51

    Update

    Based on experimentation with the C# compiler and ildasm.exe: perhaps there is no list of escaped characters because there are so few of them, precisely 6.

    Going from the IL generated by ildasm, from C# programs compiled by Visual Studio 2010:

    • The IL text emitted by ildasm is strictly ASCII.
    • Three traditional whitespace characters are escaped:
      • \t : 0x09 : (tab)
      • \n : 0x0A : (newline)
      • \r : 0x0D : (carriage return)
    • Three punctuation characters are escaped:
      • \" : 0x22 : (double quote)
      • \? : 0x3F : (question mark)
      • \\ : 0x5C : (backslash)
    • Only characters in the range 0x20 to 0x7E are included intact in literal strings (not including the three escaped punctuation characters above).
    • All other characters, including the remaining ASCII control characters below 0x20 and everything from 0x7F up, are converted to byte arrays. More precisely, any string containing any character other than the 92 literal and 6 escaped characters above is converted to a byte array, where the bytes are the little-endian bytes of the UTF-16 string.
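The rules listed above can be sketched as a small Python function. This is only an illustration of the observed behaviour, not ildasm's actual implementation; the names `ESCAPES`, `is_literal` and `ldstr_operand` are hypothetical, and the exact `bytearray (.. )` spacing is copied from the examples in this answer.

```python
# Sketch of the observed ildasm string rules (not ildasm's real code).
# All names here are hypothetical and exist only for illustration.

# The six characters that were observed to be backslash-escaped.
ESCAPES = {
    "\t": "\\t",   # 0x09 tab
    "\n": "\\n",   # 0x0A newline
    "\r": "\\r",   # 0x0D carriage return
    '"': '\\"',    # 0x22 double quote
    "?": "\\?",    # 0x3F question mark
    "\\": "\\\\",  # 0x5C backslash
}


def is_literal(ch):
    """True for the 92 characters in 0x20..0x7E that are emitted unchanged."""
    return 0x20 <= ord(ch) <= 0x7E and ch not in ESCAPES


def ldstr_operand(text):
    """Return the operand text that would follow `ldstr` for this string."""
    if all(ch in ESCAPES or is_literal(ch) for ch in text):
        # Every character is either literal or one of the six escapes:
        # emit a quoted string.
        return '"' + "".join(ESCAPES.get(ch, ch) for ch in text) + '"'
    # Otherwise the whole string becomes little-endian UTF-16 bytes.
    raw = text.encode("utf-16-le")
    return "bytearray (" + "".join("%02X " % b for b in raw) + ")"
```

For instance, `ldstr_operand("é")` yields the same operand text as Example 1 in this answer, and a string such as `a"b` comes back as a quoted literal with the quote escaped.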

    Example 1: Non-ASCII character above 0x7E: A simple accented é (U+00E9)

    C#: Either "é" or "\u00E9" becomes (E9 byte comes first)

    ldstr      bytearray (E9 00 )
    

    Example 2: UTF-16: Summation symbol ∑ (U+2211)

    C#: Either "∑" or "\u2211" becomes (11 byte comes first)

    ldstr      bytearray (11 22 )
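Going the other way, a program that reads IL text has to turn such a `bytearray (...)` operand back into a string. A minimal sketch in Python, assuming the operand looks exactly like the examples here (hex byte pairs separated by spaces inside one pair of parentheses); the function name `decode_bytearray` is hypothetical:

```python
def decode_bytearray(operand):
    """Decode an ildasm-style 'bytearray (XX XX ...)' operand to a string.

    Assumes the bytes are little-endian UTF-16, as observed in this answer.
    """
    # Take the text between the first '(' and the last ')'.
    inner = operand[operand.index("(") + 1 : operand.rindex(")")]
    # Drop all whitespace between the hex pairs, then decode.
    raw = bytes.fromhex("".join(inner.split()))
    return raw.decode("utf-16-le")
```

Applied to the operands in Examples 1 and 2, this returns é and ∑ respectively.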
    

    Example 3: UTF-32: Double-struck mathematical
