How to transliterate Cyrillic to Latin text

情话喂你 2020-12-05 05:01

I have a method that turns any Latin-script text (e.g. English, French, German, Polish) into its slug form,

e.g. Alpha Bravo Charlie => alpha-bravo-char

10 answers
  • 2020-12-05 05:48

    You can use the open-source .NET library UnidecodeSharpFork to transliterate Cyrillic and many other scripts to Latin.

    Example usage:

    Assert.AreEqual("Rabota s kirillitsey", "Работа с кириллицей".Unidecode());
    Assert.AreEqual("CZSczs", "ČŽŠčžš".Unidecode());
    Assert.AreEqual("Hello, World!", "Hello, World!".Unidecode());
    

    Testing Cyrillic:

    /// <summary>
    /// According to http://en.wikipedia.org/wiki/Romanization_of_Russian BGN/PCGN.
    /// http://en.wikipedia.org/wiki/BGN/PCGN_romanization_of_Russian
    /// With converting "ё" to "yo".
    /// </summary>
    [TestMethod]
    public void RussianAlphabetTest()
    {
        string russianAlphabetLowercase = "а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я";
        string russianAlphabetUppercase = "А Б В Г Д Е Ё Ж З И Й К Л М Н О П Р С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я";
    
        string expectedLowercase = "a b v g d e yo zh z i y k l m n o p r s t u f kh ts ch sh shch \" y ' e yu ya";
        string expectedUppercase = "A B V G D E Yo Zh Z I Y K L M N O P R S T U F Kh Ts Ch Sh Shch \" Y ' E Yu Ya";
    
        Assert.AreEqual(expectedLowercase, russianAlphabetLowercase.Unidecode());
        Assert.AreEqual(expectedUppercase, russianAlphabetUppercase.Unidecode());
    }
    

    Simple, fast and powerful, and the transliteration table is easy to extend or modify if you need to.

  • 2020-12-05 05:52

    This is an optimized version of Sarvar Nishonboev's answer. It seems like the simplest solution, and it avoids re-creating the string on every iteration:

    using System.Collections.Generic;
    using System.Text;
    
    public static class Converter
    {
        private static readonly Dictionary<char, string> ConvertedLetters = new Dictionary<char, string>
        {
            {'а', "a"},
            {'б', "b"},
            {'в', "v"},
            {'г', "g"},
            {'д', "d"},
            {'е', "e"},
            {'ё', "yo"},
            {'ж', "zh"},
            {'з', "z"},
            {'и', "i"},
            {'й', "j"},
            {'к', "k"},
            {'л', "l"},
            {'м', "m"},
            {'н', "n"},
            {'о', "o"},
            {'п', "p"},
            {'р', "r"},
            {'с', "s"},
            {'т', "t"},
            {'у', "u"},
            {'ф', "f"},
            {'х', "h"},
            {'ц', "c"},
            {'ч', "ch"},
            {'ш', "sh"},
            {'щ', "sch"},
            {'ъ', "j"},
            {'ы', "i"},
            {'ь', "j"},
            {'э', "e"},
            {'ю', "yu"},
            {'я', "ya"},
            {'А', "A"},
            {'Б', "B"},
            {'В', "V"},
            {'Г', "G"},
            {'Д', "D"},
            {'Е', "E"},
            {'Ё', "Yo"},
            {'Ж', "Zh"},
            {'З', "Z"},
            {'И', "I"},
            {'Й', "J"},
            {'К', "K"},
            {'Л', "L"},
            {'М', "M"},
            {'Н', "N"},
            {'О', "O"},
            {'П', "P"},
            {'Р', "R"},
            {'С', "S"},
            {'Т', "T"},
            {'У', "U"},
            {'Ф', "F"},
            {'Х', "H"},
            {'Ц', "C"},
            {'Ч', "Ch"},
            {'Ш', "Sh"},
            {'Щ', "Sch"},
            {'Ъ', "J"},
            {'Ы', "I"},
            {'Ь', "J"},
            {'Э', "E"},
            {'Ю', "Yu"},
            {'Я', "Ya"}
        };
    
        public static string ConvertToLatin(string source)
        {
            var result = new StringBuilder();
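            foreach (var letter in source)
            {
                // Characters missing from the table (spaces, digits, punctuation,
                // Latin letters) are copied unchanged instead of throwing KeyNotFoundException.
                if (ConvertedLetters.TryGetValue(letter, out var latin))
                    result.Append(latin);
                else
                    result.Append(letter);
            }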
            return result.ToString();
        }
    }
    

    Use it like this:

    Converter.ConvertToLatin("Проверочный текст");
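
    Since the question is ultimately about slugs, here is a minimal sketch of the step that could follow transliteration. The class name SlugSketch and the method Slugify are hypothetical (they are not part of any library mentioned in this thread), and the sketch assumes the input has already been transliterated to Latin text:

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of any library named in this thread.
public static class SlugSketch
{
    // Lowercases the input, keeps ASCII letters and digits, and collapses
    // every run of other characters into a single hyphen.
    public static string Slugify(string latin)
    {
        var slug = new StringBuilder();
        foreach (var c in latin.ToLowerInvariant())
        {
            if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            {
                slug.Append(c);
            }
            else if (slug.Length > 0 && slug[slug.Length - 1] != '-')
            {
                // Separator: emit one hyphen, never at the start and never doubled.
                slug.Append('-');
            }
        }
        // Drop a trailing hyphen left by trailing punctuation or whitespace.
        return slug.ToString().TrimEnd('-');
    }
}
```

    For example, SlugSketch.Slugify("Proverochnij tekst") returns "proverochnij-tekst", and SlugSketch.Slugify("Hello, World!") returns "hello-world". Characters outside a-z and 0-9 are treated as separators, so any letter the transliteration step leaves unconverted is dropped from the slug.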
    
  • 2020-12-05 05:53

    You can use my library for transliteration: https://github.com/nick-buhro/Translit
    It is also available on NuGet.

    Example:

    var latin = Transliteration.CyrillicToLatin(
        "Предками данная мудрость народная!", 
        Language.Russian);
    
    Console.WriteLine(latin);   
    // Output: Predkami dannaya mudrost` narodnaya!
    
  • 2020-12-05 06:00

    For future readers

    Windows 7 and later can do this with Extended Linguistic Services. (You'll need the Windows API Code Pack to use it from .NET.)
