How can I use Unicode-aware regular expressions in JavaScript?
For example, there should be something akin to \\w
that can match any code-point in Lette
[^\u0000-\u007F]+
for any characters which is not included ASCII characters.
For example:
function isNonLatinCharacters(s) {
return /[^\u0000-\u007F]/.test(s);
}
console.log(isNonLatinCharacters("身分"));// Japanese
console.log(isNonLatinCharacters("测试"));// Chinese
console.log(isNonLatinCharacters("حمید"));// Persian
console.log(isNonLatinCharacters("테스트"));// Korean
console.log(isNonLatinCharacters("परीक्षण"));// Hindi
console.log(isNonLatinCharacters("מִבְחָן"));// Hebrew
Here are some perfect references:
Unicode range RegExp generator
Unicode Regular Expressions
Unicode 10.0 Character Code Charts
Match Unicode Block Range