Question
In JavaScript we can match individual Unicode codepoints or codepoint ranges by using the Unicode escape sequences, e.g.:
"A".match(/\u0041/) // => ["A"]
"B".match(/[\u0041-\u007A]/) // => ["B"]
But how could we write a JavaScript regular expression that matches a proper name, which may contain any Unicode "letter"? Is there a range of letters? A special regex sequence or character class in JavaScript?
Say my website must validate names that could be in Latin-based languages as well as Hebrew, Cyrillic, or Japanese (Katakana, Hiragana, etc.). Is this feasible in JavaScript, or is the only sane choice to delegate to a backend language with better Unicode support?
Answer 1:
Here's a JS plugin, XRegExp, that adds Unicode support to regular expressions:
http://xregexp.com/plugins/
Answer 2:
To look up the Unicode code points of symbols, I use this site: http://www.fileformat.info
Unicode Blocks (Basic Latin, .+, Cyrillic, .+, Arabic and other): http://www.fileformat.info/info/unicode/block/index.htm
Unicode Character Categories (category escapes such as \p{L} did not work in JS regular expressions before ES2018, and still require the u flag): http://www.fileformat.info/info/unicode/category/index.htm
Letters (A-я): http://www.fileformat.info/info/unicode/char/a.htm
Fonts (which chars are supported in each font): http://www.fileformat.info/info/unicode/font/index.htm
Index for all of the above: http://www.fileformat.info/info/unicode/index.htm
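For what it's worth, engines that implement ES2018 Unicode property escapes (current browsers and recent Node.js) can use the categories directly, so hand-built block ranges are not strictly needed there. Below is a minimal sketch under that assumption. The helper names isValidName and isCyrillicWord, and the choice of allowed separators (space, apostrophe, period, hyphen), are hypothetical and only for illustration; real name validation rules will likely need to be looser.

```javascript
// Assumes an engine with ES2018 Unicode property escapes and the `u` flag.
// \p{L} matches any Unicode letter; \p{M} matches combining marks, which are
// needed for decomposed accents and for vowel points in scripts such as Hebrew.
// The separator set [ '.-] is an illustrative assumption, not a standard.
const NAME_PATTERN = /^[\p{L}\p{M}]+(?:[ '.-][\p{L}\p{M}]+)*$/u;

// Hypothetical helper: true if the input looks like a name made of letters
// from any script, with single separators between the parts.
function isValidName(input) {
  // Normalize to NFC so precomposed and decomposed spellings are treated alike.
  return NAME_PATTERN.test(input.normalize("NFC").trim());
}

// Hypothetical helper: restricts a single word to one script via \p{Script=...}.
function isCyrillicWord(input) {
  return /^\p{Script=Cyrillic}+$/u.test(input);
}

console.log(isValidName("José"));          // true
console.log(isValidName("Jean-Luc"));      // true
console.log(isValidName("Владимир"));      // true
console.log(isValidName("משה"));           // true
console.log(isValidName("カタカナ"));       // true
console.log(isValidName("R2D2"));          // false (digits are not letters)
console.log(isCyrillicWord("Владимир"));   // true
console.log(isCyrillicWord("Vladimir"));   // false
```

On older engines without property escapes, the XRegExp plugin or explicit ranges taken from the block tables above remain the fallback.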
Source: https://stackoverflow.com/questions/5571096/matching-a-unicode-name-with-a-javascript-regular-expression