JavaScript + Unicode regexes

星月不相逢 2020-11-21 05:11

How can I use Unicode-aware regular expressions in JavaScript?

For example, there should be something akin to \w that can match any code-point in Lette

11 answers
  •  梦如初夏
    2020-11-21 05:47

    In JavaScript, \w and \d are ASCII, while \s is Unicode. Don't ask me why. JavaScript does support \p with Unicode categories, which you can use to emulate a Unicode-aware \w and \d.

    For \d use \p{N} (numbers)

    For \w use [\p{L}\p{N}\p{Pc}\p{M}] (letters, numbers, underscores, marks)

    Update: Unfortunately, I was wrong about this. JavaScript does not officially support \p either, though some implementations may still support it. The only Unicode support in JavaScript regexes is matching specific code points with \uXXXX escapes (for example \uFFFF). You can use those in ranges in character classes.
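
    As a rough sketch of both ideas in this answer: the first pattern uses explicit \uXXXX ranges in a character class, which works without any flag. The second part assumes an engine that implements ES2018 Unicode property escapes, where \p{...} is accepted only when the regex carries the u flag; on older engines those patterns will not behave this way. The names latinOrCyrillic, unicodeDigit, unicodeWord and findWords are made up for illustration, and the ranges chosen are examples, not a complete definition of "letter".

```javascript
// Sketch only: all names below are hypothetical, not from the original answer.

// 1. Explicit code point ranges with \uXXXX escapes (no flag needed).
//    Illustrative ranges: ASCII letters, Latin-1 letters, and the Cyrillic block.
const latinOrCyrillic =
  /^[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0400-\u04FF]+$/;

// 2. Unicode property escapes. Assumption: the engine implements ES2018.
//    The u flag is required; without it, \p is not a property escape.
const unicodeDigit = /\p{N}/u;                      // emulates a Unicode-aware \d
const unicodeWord = /[\p{L}\p{N}\p{Pc}\p{M}]+/gu;   // emulates a Unicode-aware \w+

// Return every run of "word" characters found in the text.
function findWords(text) {
  return text.match(unicodeWord) || [];
}

// Example inputs are written with escapes so the file encoding does not matter.
console.log(latinOrCyrillic.test("\u041F\u0440\u0438\u0432\u0435\u0442")); // Cyrillic word
console.log(unicodeDigit.test("\u0663"));  // ARABIC-INDIC DIGIT THREE
console.log(/\d/.test("\u0663"));          // plain \d is ASCII-only
console.log(findWords("h\u00E9llo w\u00F6rld_1, \u65E5\u672C\u8A9E"));
```

    The explicit ranges are portable but have to be maintained by hand; the property escapes are shorter but depend on engine support.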
