Matching a Unicode “name” with a JavaScript Regular Expression

Submitted by 折月煮酒 on 2019-11-27 02:59:01

Question


In JavaScript we can match individual Unicode code points or code point ranges by using Unicode escape sequences, e.g.:

"A".match(/\u0041/) // => ["A"]
"B".match(/[\u0041-\u007A]/) // => ["B"]

But how can we write a JavaScript regular expression that matches a proper name, where the name may contain any Unicode "letter"? Is there a range that covers all letters, or a special regex sequence or character class for this in JavaScript?

Say my website must validate names that could be written in Latin-based languages as well as in Hebrew, Cyrillic, or Japanese (Katakana, Hiragana, etc.). Is this feasible in JavaScript, or is the only sane choice to delegate to a backend language with better Unicode support?


Answer 1:


Here's a JavaScript plugin that adds Unicode support to regular expressions:

http://xregexp.com/plugins/
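
For comparison, here is a minimal sketch that needs no plugin, assuming the engine supports ES2018 Unicode property escapes (`\p{...}` with the `u` flag). The helper name `isValidName` and the set of extra characters allowed (apostrophes, spaces, hyphens) are assumptions made for illustration, not something the question specifies.

```javascript
// Assumes an ES2018+ engine: \p{L} (any letter) and \p{M} (combining marks)
// are only recognized when the regex carries the `u` flag.
// First character must be a letter; the rest may also be marks,
// apostrophes, spaces, or hyphens (an illustrative choice).
const NAME_PATTERN = /^\p{L}[\p{L}\p{M}'’ \-]*$/u;

// Hypothetical helper, not part of any library.
function isValidName(input) {
  // Trim surrounding whitespace and normalize to NFC so that composed and
  // decomposed spellings of the same name are handled consistently.
  return NAME_PATTERN.test(input.trim().normalize("NFC"));
}

console.log(isValidName("José"));     // true
console.log(isValidName("Наталья"));  // true
console.log(isValidName("さくら"));    // true
console.log(isValidName("שרה"));      // true
console.log(isValidName("R2-D2"));    // false (digits are not letters)
```

Real-world names are messier than any pattern, so a check like this is probably best used as a light sanity filter rather than a strict gate.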




Answer 2:


I look up the Unicode code points of symbols on this site: http://www.fileformat.info

Unicode Blocks (Basic Latin, .+, Cyrillic, .+, Arabic and other): http://www.fileformat.info/info/unicode/block/index.htm

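Unicode Character Categories (JavaScript regexes could not match on these before ES2018; engines that support property escapes accept e.g. \p{L} with the u flag): http://www.fileformat.info/info/unicode/category/index.htm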

Letters (A-я): http://www.fileformat.info/info/unicode/char/a.htm

Fonts (which chars are supported in each font): http://www.fileformat.info/info/unicode/font/index.htm

Index for all above http://www.fileformat.info/info/unicode/index.htm
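
The block tables above can be turned into a character class by hand. The following is a rough sketch of that idea. The choice of blocks and the names `BLOCKS`, `toEscape`, and `matchesBlocks` are assumptions for illustration. Whole blocks also contain some non-letters (punctuation, signs, unassigned code points), so this is looser than a true "letter" test.

```javascript
// Code point ranges copied from the Unicode block charts. The selection is
// an illustrative assumption and far from exhaustive.
const BLOCKS = [
  [0x0041, 0x005a], // Basic Latin, A-Z
  [0x0061, 0x007a], // Basic Latin, a-z
  [0x00c0, 0x00d6], // Latin-1 Supplement letters (stops before U+00D7, the multiplication sign)
  [0x00d8, 0x00f6], // Latin-1 Supplement letters (stops before U+00F7, the division sign)
  [0x00f8, 0x00ff], // Latin-1 Supplement letters
  [0x0100, 0x017f], // Latin Extended-A
  [0x0400, 0x04ff], // Cyrillic
  [0x0590, 0x05ff], // Hebrew
  [0x3040, 0x309f], // Hiragana
  [0x30a0, 0x30ff], // Katakana
];

// Render a code point as a \uXXXX escape. Four hex digits are enough here
// because every range above lies in the Basic Multilingual Plane.
const toEscape = (cp) => "\\u" + cp.toString(16).padStart(4, "0");

// Build "[\u0041-\u005a\u0061-\u007a...]" from the table.
const letterClass =
  "[" + BLOCKS.map(([lo, hi]) => toEscape(lo) + "-" + toEscape(hi)).join("") + "]";

const BLOCK_PATTERN = new RegExp("^" + letterClass + "+$");

// Hypothetical helper: true when every character falls in a listed block.
function matchesBlocks(input) {
  return BLOCK_PATTERN.test(input);
}

console.log(matchesBlocks("José"));    // true
console.log(matchesBlocks("Наталья")); // true
console.log(matchesBlocks("さくら"));   // true
console.log(matchesBlocks("A1"));      // false
```

Splitting Basic Latin into A-Z and a-z matters: a single range from \u0041 to \u007A would also accept the punctuation that sits between Z and a, such as [ and _.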



Source: https://stackoverflow.com/questions/5571096/matching-a-unicode-name-with-a-javascript-regular-expression
