I've looked on Stack Overflow (replacing characters.. eh, how JavaScript doesn't follow the Unicode standard concerning RegExp, etc.) and haven't really found a concrete
How about this?
/^[a-zA-ZÀ-ÖØ-öø-ÿ]+$/
/^[\pL\pM\p{Zs}.-]+$/u
Explanation:
\pL - matches any kind of letter from any language
\pM - matches a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)
\p{Zs} - matches a whitespace character that is invisible, but does take up space
u - pattern and subject strings are treated as UTF-8
Unlike other proposed regexes (such as [A-Za-zÀ-ÖØ-öø-ÿ]), this will work with all language-specific characters, e.g. Šš is matched by this rule, but not by the others on this page.
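For what it's worth, here is a minimal sketch of the same character class written with the braced property escapes (\p{L}, \p{M}, \p{Zs}), which native JavaScript regular expressions accept under the u flag on ES2018+ engines. The names namePattern and isPlausibleName are made up for illustration, and the pattern is built from a string only so that it does not depend on how a given compiler checks regex literals.

```typescript
// Sketch only: assumes an ES2018+ runtime with Unicode property escapes.
// `namePattern` and `isPlausibleName` are hypothetical names, not from the answer above.
const namePattern = new RegExp("^[\\p{L}\\p{M}\\p{Zs}.-]+$", "u");

function isPlausibleName(input: string): boolean {
  // Accepts letters, combining marks, space separators, dots, and hyphens.
  return namePattern.test(input);
}

console.log(isPlausibleName("Šárka Nováková")); // true: letters plus a space separator
console.log(isPlausibleName("Zoe\u0308"));      // true: \p{M} covers the combining diaeresis
console.log(isPlausibleName("R2-D2"));          // false: digits are not in the class
```

Note that the short forms \pL and \pM are not valid in native JavaScript patterns; the braces are required there.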
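Unfortunately, JavaScript does not natively support the short forms \pL and \pM used above. Since ES2018 the braced forms (\p{L}, \p{M}, \p{Zs}) work in native regular expressions when the u flag is set. On older engines, or to keep the short forms, you can use xregexp, e.g.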
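const XRegExp = require('xregexp');
// Expects exactly two words separated by one space,
// each made of letters, combining marks, and hyphens.
const isInputRealHumanName = (input: string): boolean => {
  return XRegExp('^[\\pL\\pM-]+ [\\pL\\pM-]+$', 'u').test(input);
};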
An easier way to accept the accented letters of the Latin-1 range is this:
[A-zÀ-ú] // uppercase and lowercase letters, accented up to ú (also admits [ \ ] ^ _ ` × ÷)
[A-zÀ-ÿ] // as above, extended through ÿ, so it adds û ü ý þ ÿ (still admits [ \ ] ^ _ ` × ÷)
[A-Za-zÀ-ÿ] // as above but not including [ \ ] ^ _ `
[A-Za-zÀ-ÖØ-öø-ÿ] // as above but not including × ÷
See https://unicode-table.com/en/ for characters listed in numeric order.
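To make the difference between these ranges concrete, here is a small sketch that runs the loosest and the strictest class against a few single characters. The variable names and the sample list are my own, chosen for illustration.

```typescript
// Sketch only: `loose`, `strict`, and `samples` are illustrative names.
const loose = /^[A-zÀ-ÿ]+$/;            // A-z spans U+0041..U+007A, À-ÿ spans U+00C0..U+00FF
const strict = /^[A-Za-zÀ-ÖØ-öø-ÿ]+$/;  // same letters, with the non-letter gaps cut out

const samples = ["_", "×", "é", "Š"];

const report = samples.map((sample) => ({
  sample,
  loose: loose.test(sample),
  strict: strict.test(sample),
}));

console.log(report);
// "_" and "×": accepted by loose, rejected by strict
// "é":         accepted by both
// "Š":         rejected by both, since U+0160 lies beyond ÿ (U+00FF)
```

The last case is why none of the range-based classes cover names such as Šárka: anything outside Latin-1 needs either more ranges or the Unicode property classes.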