Maximum Hex value in regex

南旧 2021-02-12 13:27

Without the u flag, the hex range that can be used is [\x{00}-\x{ff}], but with the u flag it goes up to a 4-byte value, \x{7fffffff}.

5 Answers
  •  说谎 (original poster)
    2021-02-12 14:08

    Unicode and UTF-8, UTF-16, UTF-32 encoding

    Unicode is a character set, which specifies a mapping from characters to code points, and the character encodings (UTF-8, UTF-16, UTF-32) specify how to store the Unicode code points.

    In Unicode, a character maps to a single code point, but that code point can have different byte representations depending on how it is encoded.
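
    To make the distinction concrete, here is a minimal sketch in Python (not the language of the question; it is used here only for illustration). The euro sign is an arbitrary choice of character, and the big-endian codec variants are picked so that no byte order mark is added to the output.

```python
# One character, one code point, three different byte sequences.
ch = "€"  # U+20AC EURO SIGN, an arbitrary character chosen for illustration

# The code point is a property of the character, independent of any encoding.
code_point = ord(ch)

# Encode the same code point three ways; the "-be" codecs emit no byte order mark.
encodings = {
    name: ch.encode(name).hex(" ")
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

print(f"U+{code_point:04X}")
for name, hex_bytes in encodings.items():
    print(f"{name:10} {hex_bytes}")
```

    The code point stays U+20AC throughout, while the stored bytes are e2 82 ac in UTF-8, 20 ac in UTF-16, and 00 00 20 ac in UTF-32.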

    I don't want to rehash this discussion all over again, so if you are still not clear about this, please read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

    Using the example in the question,
