How do I remove diacritics (accents) from a string in .NET?

前端 未结 20 2844
南方客
南方客 2020-11-21 05:44

I\'m trying to convert some strings that are in French Canadian and basically, I\'d like to be able to take out the French accent marks in the letters while keeping the lett

20条回答
  •  渐次进展
    2020-11-21 06:07

    This works fine in java.

    It basically converts all accented characters into their deAccented counterparts followed by their combining diacritics. Now you can use a regex to strip off the diacritics.

    import java.text.Normalizer;
    import java.util.regex.Pattern;
    
    public String deAccent(String str) {
        String nfdNormalizedString = Normalizer.normalize(str, Normalizer.Form.NFD); 
        Pattern pattern = Pattern.compile("\\p{InCombiningDiacriticalMarks}+");
        return pattern.matcher(nfdNormalizedString).replaceAll("");
    }
    

提交回复
热议问题