Removing accent marks (diacritics) from Latin characters for comparison [duplicate]

余生长醉 submitted on 2019-12-03 14:22:54
BalusC

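You can use java.text.Normalizer and a small regex to strip the diacritical marks. Normalizing to NFD decomposes each accented character into its base letter followed by combining marks, and the regex then removes those marks.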

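import java.text.Normalizer;
import java.text.Normalizer.Form;

// NFD splits e.g. "š" into "s" + U+030C; the regex then drops the combining marks.
public static String removeDiacriticalMarks(String string) {
    return Normalizer.normalize(string, Form.NFD)
        .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}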

Usage example:

String text = "mšk žilina";
String normalized = removeDiacriticalMarks(text);
System.out.println(normalized); // msk zilina
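
Since the question is about comparison, here is a minimal sketch of how the method might be used to compare two strings while ignoring accents. The class name AccentInsensitiveCompare and the helper equalsIgnoringAccents are hypothetical names chosen for illustration, not part of the original answer. The sketch also shows a limitation: letters such as "Ł" have no canonical decomposition, so NFD leaves them untouched.

```java
import java.text.Normalizer;

public class AccentInsensitiveCompare {

    // Same approach as the answer: decompose to NFD, then remove the
    // combining marks in the Combining Diacritical Marks block.
    public static String removeDiacriticalMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Hypothetical helper: two strings are equal if they match once the
    // marks are stripped. Case is still significant.
    public static boolean equalsIgnoringAccents(String a, String b) {
        return removeDiacriticalMarks(a).equals(removeDiacriticalMarks(b));
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoringAccents("mšk žilina", "msk zilina")); // true
        System.out.println(equalsIgnoringAccents("Žilina", "zilina"));         // false, case differs
        // "Ł" does not decompose under NFD, so it survives unchanged.
        System.out.println(removeDiacriticalMarks("Łódź"));                    // Łodz
    }
}
```

If the comparison should also ignore case, one option is to lower-case both results before calling equals. Characters that do not decompose would need an explicit mapping of your own.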