Concrete Javascript Regex for Accented Characters (Diacritics)

庸人自扰 2020-11-22 17:22

I've looked on Stack Overflow (replacing characters.. eh, how JavaScript doesn't follow the Unicode standard concerning RegExp, etc.) and haven't really found a concrete

9 Answers
  •  囚心锁ツ
    2020-11-22 17:39

    Which of these three approaches is most suited for the task?

    Depends on the task :-) To match exactly all Latin characters and their accented versions, the Unicode ranges probably provide the best solution. They might be extended to all non-whitespace characters, which could be done using the \S character class.
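    As a rough sketch of the Unicode-range approach: the ranges below cover ASCII letters, the letters of the Latin-1 Supplement block (skipping × at U+00D7 and ÷ at U+00F7), and Latin Extended-A and -B up to U+024F. The names `latinLetters` and `isLatinWord` are made up for illustration, and the NFC normalization step is an added assumption so that decomposed input (a base letter followed by a combining mark) is matched as well.

```javascript
// Hypothetical helper: ASCII letters plus precomposed accented Latin letters.
// U+00C0-U+00D6, U+00D8-U+00F6: Latin-1 Supplement letters (× and ÷ excluded)
// U+00F8-U+024F: rest of Latin-1 Supplement, Latin Extended-A and Extended-B
const latinLetters = /^[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F]+$/;

function isLatinWord(text) {
  // Assumption: compose "e" + U+0301 into "é" first, so that
  // decomposed input falls inside the ranges above.
  return latinLetters.test(text.normalize("NFC"));
}

console.log(isLatinWord("Müller"));
console.log(isLatinWord("Jose\u0301"));
console.log(isLatinWord("abc1"));
```

    Swapping the character class for `\S` would accept every non-whitespace character instead, including digits and punctuation.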

    I'm forcing a field in a UI to match the format: last_name, first_name (last [comma space] first)

    The most basic problem I see here is not diacritics but whitespace. Some names consist of multiple words, e.g. with titles. So you should go with the most generic pattern, that is, one allowing everything but the comma that separates the last name from the first:

    /[^,]+,\s[^,]+/
    

    But your second solution with the . character class is just as fine; you might only need to take care of multiple commas then.
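    For validating a whole field, the pattern would probably need anchors, since an unanchored `test` succeeds on any string that merely contains a match. A minimal sketch under that assumption follows; `lastFirst`, `parseName`, and `dotVariant` are hypothetical names, the capture groups are an addition, and `dotVariant` is only a guess at what the dot-based solution looks like.

```javascript
// Anchored variant of the comma-excluding pattern, with capture groups
// added (an assumption) so both name parts can be read back.
const lastFirst = /^([^,]+),\s([^,]+)$/;

// Assumed shape of a dot-based pattern, shown only for comparison.
const dotVariant = /^.+,\s.+$/;

function parseName(input) {
  const match = lastFirst.exec(input);
  if (match === null) {
    return null; // not in "last, first" form
  }
  return { last: match[1], first: match[2] };
}

console.log(parseName("de la Cruz, María José"));
console.log(parseName("Smith,John"));        // no whitespace after the comma
console.log(parseName("a, b, c"));           // second comma is rejected
console.log(dotVariant.test("a, b, c"));     // the dot also matches commas
```

    Multi-word and accented names pass because `[^,]` accepts any character except the comma, while the dot-based form lets additional commas through.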
