Accent in Regular Expression in Java

天涯浪人 2021-02-01 06:45

I'd like to use Hibernate Validator to validate some columns. The problem, as I understand it, is that the \w character class in Java doesn't match accented letters.

2 Answers
  • 2021-02-01 07:14

    I had more luck with:

    \p{InCombiningDiacriticalMarks}+
    

    In Java I use the following method to strip accents before matching:

    import java.text.Normalizer;
    import java.text.Normalizer.Form;
    
    public static String removeAccents(String text) {
        return text == null ? null :
            Normalizer.normalize(text, Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }
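
    As a rough sketch of how the method above behaves, here it is wrapped in a small runnable class. The class name RemoveAccentsDemo and the sample strings are made up for illustration. NFD normalization splits an accented letter into its base letter plus a combining mark, and the replaceAll call then drops the marks, so the result can be matched by a plain \w+ pattern.

    ```java
    import java.text.Normalizer;
    import java.text.Normalizer.Form;

    // Hypothetical demo class; only removeAccents comes from the answer above.
    class RemoveAccentsDemo {

        public static String removeAccents(String text) {
            return text == null ? null :
                Normalizer.normalize(text, Form.NFD)
                    .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        }

        public static void main(String[] args) {
            // Unicode escapes keep the source independent of file encoding.
            String name = "Jos\u00e9 \u00c5ngstr\u00f6m caf\u00e9";

            // The default \w only covers [a-zA-Z_0-9], so the accented form fails.
            System.out.println("Jos\u00e9".matches("\\w+"));                // false

            // After stripping accents the same pattern matches.
            System.out.println(removeAccents("Jos\u00e9").matches("\\w+")); // true
            System.out.println(removeAccents(name));                        // Jose Angstrom cafe
        }
    }
    ```

    Note that this changes the text rather than the pattern, and letters with no canonical decomposition (for example ø) pass through unchanged, so it may not suit every validation rule.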
    
  • 2021-02-01 07:31

    The Java regex documentation has a section on Unicode categories (search for "Classes for Unicode blocks and categories"). If you're just looking for letters, I think \p{L} is the category you want.
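
    A minimal sketch of the difference, using plain java.util.regex (the class name UnicodeLetterDemo and the test strings are only illustrative). It also shows Pattern.UNICODE_CHARACTER_CLASS, a standard flag available since Java 7 that makes \w itself Unicode-aware. Presumably the same pattern string could be passed to a Bean Validation @Pattern(regexp = ...) constraint, but that part is an assumption and is not exercised here.

    ```java
    import java.util.regex.Pattern;

    // Hypothetical demo class comparing three ways to match "word" text.
    class UnicodeLetterDemo {

        // Default \w: ASCII letters, digits and underscore only.
        static final Pattern ASCII_WORD = Pattern.compile("\\w+");

        // \p{L}: any Unicode letter (no digits, no underscore).
        static final Pattern LETTERS = Pattern.compile("\\p{L}+");

        // \w with the Unicode flag: Unicode letters, digits, marks, connectors.
        static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

        public static boolean matches(Pattern p, String s) {
            return p.matcher(s).matches();
        }

        public static void main(String[] args) {
            String name = "Jos\u00e9"; // precomposed e-acute (U+00E9)

            System.out.println(matches(ASCII_WORD, name));   // false
            System.out.println(matches(LETTERS, name));      // true
            System.out.println(matches(UNICODE_WORD, name)); // true

            // \p{L} is letters only, so digits are rejected.
            System.out.println(matches(LETTERS, "abc123"));  // false
        }
    }
    ```

    One caveat: \p{L} does not cover standalone combining marks, so text stored in decomposed form (base letter followed by U+0301, for instance) would need something like [\p{L}\p{M}]+ or prior NFC normalization.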
