How do I replace accents (German) in .NET

有刺的猬 2020-12-03 17:36

I need to replace accented characters in a string with their English equivalents.

For example:

ä = ae

ö = oe

Ö = Oe

ü = ue

I know to strip

5 Answers
  • 2020-12-03 18:13

    If you need to use this on larger strings, multiple calls to Replace() can get inefficient pretty quickly. You may be better off rebuilding your string character-by-character:

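    // Requires: using System.Collections.Generic; using System.Linq; using System.Text;
    var map = new Dictionary<char, string>() {
      { 'ä', "ae" },
      { 'ö', "oe" },
      { 'ü', "ue" },
      { 'Ä', "Ae" },
      { 'Ö', "Oe" },
      { 'Ü', "Ue" },
      { 'ß', "ss" }
    };
    
    // Single pass: append the replacement if the character is mapped, else the character itself.
    var res = germanText.Aggregate(
                  new StringBuilder(),
                  (sb, c) => map.TryGetValue(c, out var r) ? sb.Append(r) : sb.Append(c)
                  ).ToString();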
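
    The lookup above only matches precomposed characters, so an "ä" that arrives as "a" followed by a combining diaeresis (U+0308) would slip through. A minimal sketch of the same single-pass idea wrapped in a reusable extension method, with the input normalized to Form C first, might look like this. The names `GermanTransliterator` and `ReplaceUmlauts` are hypothetical and not part of the answer above.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;

    // Hypothetical helper; the class and method names are illustrative only.
    public static class GermanTransliterator
    {
        private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
        {
            { 'ä', "ae" }, { 'ö', "oe" }, { 'ü', "ue" },
            { 'Ä', "Ae" }, { 'Ö', "Oe" }, { 'Ü', "Ue" },
            { 'ß', "ss" }
        };

        public static string ReplaceUmlauts(this string text)
        {
            if (string.IsNullOrEmpty(text)) return text;

            // Compose "a" + U+0308 into a single 'ä' so the dictionary lookup can match it.
            text = text.Normalize(NormalizationForm.FormC);

            var sb = new StringBuilder(text.Length + 8);
            foreach (char c in text)
            {
                // Append the mapped replacement if there is one, otherwise the character itself.
                if (Map.TryGetValue(c, out string replacement)) sb.Append(replacement);
                else sb.Append(c);
            }
            return sb.ToString();
        }
    }
    ```

    With this in scope, `"Mötörhead".ReplaceUmlauts()` can be called directly on any string; characters outside the map are passed through unchanged.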
    
  • I can't think of any automatic way to do this, so I believe you'd have to do it manually.

    i.e.

    string GermanString = "äö";
    GermanString = GermanString.Replace("ä", "ae");
    GermanString = GermanString.Replace("ö", "oe");
    

    How many characters are there? German has only three umlauted vowels (ä, ö, ü), in upper and lower case, plus ß, so seven. Shouldn't be too much of a job.

  • 2020-12-03 18:33

    How about using string.Replace:

    string germanText = "Mötörhead";
    string replaced = germanText.Replace("ö", "oe");
    

    (okay, not a real German word, but I couldn't resist)

    You can chain calls to Replace like this:

    someText.Replace("ö", "oe").Replace("ä", "ae").Replace("ü", "ue")...
    
  • 2020-12-03 18:35

    Do you just want a mapping of German umlauts to the two-letter (non-umlaut) variant? Here you go; untested, but it handles all German umlauts.

    String replaceGermanUmlauts( String s ) {
        String t = s;
        t = t.Replace( "ä", "ae" );
        t = t.Replace( "ö", "oe" );
        t = t.Replace( "ü", "ue" );
        t = t.Replace( "Ä", "Ae" );
        t = t.Replace( "Ö", "Oe" );
        t = t.Replace( "Ü", "Ue" );
        t = t.Replace( "ß", "ss" );
        return t;
    }
    
  • 2020-12-03 18:37

    This class removes diacritic characters (é, ì, è, etc.) and replaces umlauts and the German "ß" with their equivalents "ae (ä)", "oe (ö)", "ue (ü)" and "ss (ß)".

    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Text;

    public sealed class UmlautConverter
    {
        private Dictionary<char, string> converter = new Dictionary<char, string>()
        {
            {  'ä', "ae" },
            {  'Ä', "AE" },
            {  'ö', "oe" },
            {  'Ö', "OE" },
            {  'ü', "ue" },
            {  'Ü', "UE" },
            {  'ß', "ss" }
        };
    
        string value = null;
        public UmlautConverter(string value)
        {
            if (!string.IsNullOrWhiteSpace(value))
            {
                this.value = value;
            }
        }
        public string RemoveDiacritics()
        {
            if (string.IsNullOrWhiteSpace(value))
            {
                return null;
            }
    
            string normalizedString = this.value.Normalize();
    
            foreach (KeyValuePair<char, string> item in this.converter)
            {
                string temp = normalizedString;
                normalizedString = temp.Replace(item.Key.ToString(), item.Value);
            }
    
            StringBuilder stringBuilder = new StringBuilder();
    
            // Decompose once, before the loop, so accents become separate combining marks.
            normalizedString = normalizedString.Normalize(NormalizationForm.FormD);

            for (int i = 0; i < normalizedString.Length; i++)
            {
                char c = normalizedString[i];
                // Skip combining marks; keep every other character.
                if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                {
                    stringBuilder.Append(c);
                }
            }
            return stringBuilder.ToString();
        }
    
        public bool HasUmlaut()
        {
            if (string.IsNullOrWhiteSpace(value))
            {
                return false;
            }
    
            foreach (KeyValuePair<char, string> item in this.converter)
            {
                if (this.value.Contains(item.Key.ToString()))
                {
                    return true;
                }
            }
    
            return false;
        }
    }
    

    Usage:

    Console.WriteLine(new UmlautConverter("Nürnberger Straße").RemoveDiacritics()); // Nuernberger Strasse
    Console.WriteLine(new UmlautConverter("Größenwahn").RemoveDiacritics()); // Groessenwahn
    Console.WriteLine(new UmlautConverter("Übermut").RemoveDiacritics()); // UEbermut
    Console.WriteLine(new UmlautConverter("Università").RemoveDiacritics()); // Universita
    Console.WriteLine(new UmlautConverter("Perché").RemoveDiacritics()); // Perche
    Console.WriteLine(new UmlautConverter("être").RemoveDiacritics()); // etre
    

    There is a minor bug in the "Übermut" case: "Ü" is replaced with "UE" instead of "Ue". But this can be easily fixed. Enjoy :)
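
    One possible way to handle the capital-umlaut casing, sketched here as an assumption rather than as the author's intended fix, is to look at the neighbouring letter: use "UE" only when the surrounding text is upper case, and "Ue" otherwise. The class `UmlautCasing` and its methods are hypothetical names, and this sketch only covers the umlaut mapping, not the removal of other diacritics.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical helper; names are illustrative and not part of the class above.
    public static class UmlautCasing
    {
        public static string ReplaceUmlautsCaseAware(string text)
        {
            if (string.IsNullOrEmpty(text)) return text;

            var sb = new StringBuilder(text.Length + 8);
            for (int i = 0; i < text.Length; i++)
            {
                switch (text[i])
                {
                    case 'ä': sb.Append("ae"); break;
                    case 'ö': sb.Append("oe"); break;
                    case 'ü': sb.Append("ue"); break;
                    case 'ß': sb.Append("ss"); break;
                    // Capital umlauts: pick "AE" or "Ae" depending on the neighbouring letter.
                    case 'Ä': sb.Append(NeighbourIsUpper(text, i) ? "AE" : "Ae"); break;
                    case 'Ö': sb.Append(NeighbourIsUpper(text, i) ? "OE" : "Oe"); break;
                    case 'Ü': sb.Append(NeighbourIsUpper(text, i) ? "UE" : "Ue"); break;
                    default: sb.Append(text[i]); break;
                }
            }
            return sb.ToString();
        }

        // Uses the next character if it is a letter; otherwise falls back to the previous one.
        private static bool NeighbourIsUpper(string text, int i)
        {
            if (i + 1 < text.Length && char.IsLetter(text[i + 1]))
                return char.IsUpper(text[i + 1]);
            return i > 0 && char.IsUpper(text[i - 1]);
        }
    }
    ```

    Under this heuristic, `UmlautCasing.ReplaceUmlautsCaseAware("Übermut")` yields "Uebermut", while an all-caps input such as "ÜBERMUT" keeps its casing as "UEBERMUT".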
