Question
I need to repeat a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ
several times in my regex, but I find this very ugly. So I tried \p{L},
but it does not work in JavaScript.
Any ideas?
My current regex: [a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ][a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ' ,"-]*[a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ'",]+
I would like something like this: [\p{L}][\p{L}' ,"-]*[\p{L}'",]+
(or at least something shorter than the current expression)
Answer 1:
The characters you need are only a subset of what you asked for, so first define exactly which set of characters you need. \pL
means every letter from every language.
The explicit list is somewhat ugly, but it does not affect performance and is probably the best way to work around this kind of problem in JS. ECMAScript 2018 adds support for \p{L} (it requires the u flag),
but at the time of writing it is still far from being implemented by all major browsers.
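For engines that do implement ES2018 Unicode property escapes, the pattern from the question can be written almost as asked. The following is a minimal sketch, not a definitive solution: the name nameRe and the ^ and $ anchors are assumptions added for illustration and are not part of the original regex.

```javascript
// Sketch assuming an ES2018-capable engine (Unicode property escapes).
// The `u` flag is required for \p{L} to be treated as a property escape.
// `nameRe` and the ^...$ anchors are illustrative assumptions.
const nameRe = /^[\p{L}][\p{L}' ,"-]*[\p{L}'",]+$/u;

// expected: true when the input is in precomposed (NFC) form
console.log(nameRe.test("Éloïse d'Aubigné"));

// false: digits are not letters
console.log(nameRe.test("123"));
```

Note that \p{L} matches letters from every script, which is broader than the explicit Latin list in the question.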
If it is mostly a matter of taste, you can reduce the ugliness a bit by building the pattern from a string:
var characterSet = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';
var re = new RegExp('[' + characterSet + ']' + '[' + characterSet + '\' ,"-]*' + '[' + characterSet + '\'",]+');
Update (credit to @Francesco), using a template literal:
var pCL = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';
var re = new RegExp(`[${pCL}][${pCL}' ,"-]*[${pCL}'",]+`);
console.log(re.source);
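The two approaches in this answer can also be combined: use \p{L} where the engine accepts it and fall back to the explicit set otherwise. This is only a sketch under stated assumptions: supportsUnicodeProperties and buildNameRegex are hypothetical helper names, and the ^ and $ anchors are added for illustration.

```javascript
// Explicit fallback set, copied from the snippets above.
const fallbackLetters =
  'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';

// Hypothetical helper: true when the engine accepts \p{L} with the `u` flag.
// Engines without support throw a SyntaxError while compiling the pattern.
function supportsUnicodeProperties() {
  try {
    new RegExp('\\p{L}', 'u');
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: builds the anchored pattern, native or fallback.
function buildNameRegex(useNative = supportsUnicodeProperties()) {
  if (useNative) {
    return new RegExp(`^[\\p{L}][\\p{L}' ,"-]*[\\p{L}'",]+$`, 'u');
  }
  const s = fallbackLetters;
  return new RegExp(`^[${s}][${s}' ,"-]*[${s}'",]+$`);
}

const re = buildNameRegex();
console.log(re.source);
```

The two branches are not equivalent: the fallback only accepts the letters listed explicitly, while the native branch accepts any Unicode letter.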
Answer 2:
You can use the XRegExp library, which supports matching Unicode letters:
var unicodeWord = XRegExp("^\\pL+$"); // L: Letter
More examples of matching Unicode in JavaScript are available here:
http://xregexp.com/plugins/
Source: https://stackoverflow.com/questions/50178498/no-pl-for-javascript-regex-use-unicode-in-js-regex