Replacing characters in C# (ascii)

挽巷 2020-12-05 08:26

I have a file containing characters like these: à, è, ì, ò, ù, À. What I need to do is replace those characters with plain characters, e.g. à = a, è = e, and so on. This is my c

7 answers
  • 2020-12-05 09:31

    I often use an extension method based on the version Dana supplied. A quick explanation:

    • Normalizing to form D splits characters like è into an e and a nonspacing ` (the combining grave accent, U+0300)
    • The nonspacing characters are then removed
    • The result is normalized back to form C (I'm not sure if this is necessary)

    Code:

    using System.Linq;
    using System.Text;
    using System.Globalization;
    
    // namespace here
    public static class Utility
    {
        public static string RemoveDiacritics(this string str)
        {
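            if (str == null) return null;

            // Form D decomposes each accented letter into a base letter plus
            // combining marks; the query keeps everything except those marks.
            var chars =
                from c in str.Normalize(NormalizationForm.FormD)
                let uc = CharUnicodeInfo.GetUnicodeCategory(c)
                where uc != UnicodeCategory.NonSpacingMark
                select c;

            // Recompose whatever is left into form C.
            var cleanStr = new string(chars.ToArray()).Normalize(NormalizationForm.FormC);

            return cleanStr;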
        }
    }
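
    To see what each step actually does, here is a small sketch that prints the intermediate state. The class name NormalizationDemo and the helper Describe are made up for illustration and are not part of the code above. It also shows one caveat: letters such as ø have no canonical decomposition, so this approach leaves them untouched.

    ```csharp
    using System;
    using System.Globalization;
    using System.Linq;
    using System.Text;

    public static class NormalizationDemo
    {
        // Hypothetical helper: lists each UTF-16 unit of s as "U+XXXX Category".
        public static string[] Describe(string s) =>
            s.Select(c => $"U+{(int)c:X4} {CharUnicodeInfo.GetUnicodeCategory(c)}").ToArray();

        public static void Main()
        {
            // "\u00E8" is è stored as a single precomposed character.
            string decomposed = "\u00E8".Normalize(NormalizationForm.FormD);

            // Prints "U+0065 LowercaseLetter" and then "U+0300 NonSpacingMark".
            foreach (string line in Describe(decomposed))
                Console.WriteLine(line);

            // "\u00F8" is ø; it does not decompose, so form D leaves it as one character.
            Console.WriteLine("\u00F8".Normalize(NormalizationForm.FormD).Length); // 1
        }
    }
    ```

    If you also need ø, ß, æ and similar letters mapped to ASCII, you would have to handle them separately, for example with an explicit lookup table.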
    