transliteration

Slugify and Character Transliteration in C#

余生颓废 posted on 2019-11-27 01:24:10
Question: I'm trying to translate the following slugify method from PHP to C#: http://snipplr.com/view/22741/slugify-a-string-in-php/
Edit: For the sake of convenience, here is the code from above:
/**
 * Modifies a string to remove all non-ASCII characters and spaces.
 */
static public function slugify($text) {
    // replace non letter or digits by -
    $text = preg_replace('~[^\\pL\d]+~u', '-', $text);
    // trim
    $text = trim($text, '-');
    // transliterate
    if (function_exists('iconv')) {
        $text = iconv('utf-8', 'us
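The steps named in the quoted PHP (replace runs of non-alphanumeric characters with a hyphen, trim, transliterate to ASCII) could be sketched roughly as below. This is a minimal Python illustration, not the C# port the question asks for and not the remainder of the truncated PHP. The function name `slugify`, the NFKD-based transliteration, and the final lowercasing are assumptions made for the example.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Hypothetical sketch of the slugify idea; not the original PHP logic."""
    # Decompose accented letters (e.g. "é" becomes "e" + a combining accent),
    # then drop every character that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of characters that are not ASCII letters or digits with "-".
    slug = re.sub(r"[^A-Za-z0-9]+", "-", ascii_text)
    # Trim leading and trailing hyphens and lowercase (lowercasing is an assumption).
    return slug.strip("-").lower()
```

Scripts that have no ASCII decomposition, such as Cyrillic, are dropped entirely by this approach, so they would need a separate transliteration step first.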

Transliteration in ruby

假装没事ソ posted on 2019-11-26 22:24:53
Question: What is the simplest way to transliterate non-English characters in Ruby? That is, a conversion such as: translit "Gévry" #=> "Gevry"
Answer 1: Ruby has an Iconv library in its stdlib which converts encodings in a very similar way to the usual iconv command
Answer 2: Use the UnicodeUtils gem. This works in 1.9 and 2.0. Iconv has been deprecated in these releases.
gem install unicode_utils
Then try this in IRB:
2.0.0p0 :001 > require 'unicode_utils' #=> true
2.0.0p0 :002 > r = "Résumé" #=> "Résumé" 2

Transliteration from Cyrillic to Latin ICU4j java [duplicate]

我怕爱的太早我们不能终老 posted on 2019-11-26 21:48:08
Question: This question already has answers here: icu4j cyrillic to latin (3 answers). Closed 3 years ago.
I need to do something rather simple, but without hard-coding a hash map. I have a String s in Cyrillic, and I need some sort of example of how to turn it into Latin characters using a custom filter of some sort (to give a purely Latin example so as not to confuse anyone: if String s = sniff; I want it to look up s-n-i-f-f and change them into something else (there might also be combinations). I

How to transliterate Cyrillic to Latin text

馋奶兔 posted on 2019-11-26 19:52:40
Question: I have a method which turns any Latin text (e.g. English, French, German, Polish) into its slug form, e.g. Alpha Bravo Charlie => alpha-bravo-charlie. But it doesn't work for Cyrillic text (e.g. Russian), so what I want to do is transliterate the Cyrillic text to Latin characters, then slugify that. Does anyone have a way to do such transliteration? Whether by actual source or a library. I'm coding in C#, so a .NET library will work. Alternatively, if you have non-C# code, I'm sure I could
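Since the question accepts non-C# code, here is a small table-driven sketch in Python of the "transliterate first, then slugify" idea. The table `_CYRILLIC_TO_LATIN`, the function name `transliterate_cyrillic`, and the romanization choices (for example "щ" as "shch") are illustrative assumptions; they follow no particular standard and cover only the Russian alphabet.

```python
# Illustrative Russian-to-Latin table. The romanization choices here are
# assumptions for the example, not any official transliteration standard.
_CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate_cyrillic(text: str) -> str:
    """Map Russian letters to Latin ones; leave every other character as-is."""
    out = []
    for ch in text:
        latin = _CYRILLIC_TO_LATIN.get(ch.lower())
        if latin is None:
            # Not in the table (Latin letters, digits, spaces, ...): keep unchanged.
            out.append(ch)
        elif ch.isupper():
            # Preserve capitalization: "Щ" becomes "Shch".
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)
```

The output is plain Latin text, so it can be passed to an existing Latin-only slug routine afterwards.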

Python and character normalization

﹥>﹥吖頭↗ posted on 2019-11-26 17:42:24
Question: Hello, I retrieve text-based UTF-8 data from a foreign source which contains special characters such as u"ıöüç", and I want to normalize them to English, e.g. "ıöüç" -> "iouc". What would be the best way to achieve this?
Answer 1: I recommend using the Unidecode module:
>>> from unidecode import unidecode
>>> unidecode(u'ıöüç')
'iouc'
Note how you feed it a unicode string and it outputs a byte string. The output is guaranteed to be ASCII.
Answer 2: It all depends on how far you want to go in transliterating
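As a standard-library alternative to the third-party Unidecode module, one possible sketch uses `unicodedata` to decompose characters and drop the combining marks. Note that "ı" (dotless i) has no Unicode decomposition, so it needs an explicit entry. The helper name `strip_accents` and the small `_EXTRA` table are assumptions made for this example, and the table is far from complete.

```python
import unicodedata

# Characters with no Unicode decomposition need explicit entries. This tiny
# table is an illustrative assumption, not a complete list.
_EXTRA = str.maketrans({"ı": "i", "ß": "ss", "ø": "o", "Ø": "O", "ł": "l", "Ł": "L"})


def strip_accents(text: str) -> str:
    """Approximate accented Latin text with unaccented letters."""
    # Handle the special cases first, since NFKD leaves them untouched.
    text = text.translate(_EXTRA)
    # Split each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Unlike Unidecode, this does not guarantee ASCII output: characters outside the table that have no decomposition pass through unchanged.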

PHP Transliteration

给你一囗甜甜゛ posted on 2019-11-26 11:52:14
Question: Are there any solutions that will convert all foreign characters to A-z equivalents? I have searched extensively on Google and could not find a solution or even a list of characters and equivalents. The reason is I want to display A-z only URLs, plus plenty of other trip-ups when dealing with these characters.
Answer: You can use iconv, which has a special transliteration encoding. When the string "//TRANSLIT" is appended to tocode, transliteration is activated. This means that when a character cannot be represented in the target character set, it can be approximated through one or several characters

How do you map-replace characters in Javascript similar to the 'tr' function in Perl?

白昼怎懂夜的黑 posted on 2019-11-26 10:35:11
Question: I've been trying to figure out how to map a set of characters in a string to another set, similar to the tr function in Perl. I found this site that shows equivalent functions in JS and Perl, but sadly no tr equivalent. The tr (transliteration) function in Perl maps characters one to one, so data =~ tr|\-_|+/|; would map - => + and _ => /. How can this be done efficiently in JavaScript?
Answer 1: There isn't a built-in equivalent, but you can get close to one with replace: data = data.replace(/[\-
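For comparison only, the one-to-one mapping that Perl's tr performs can be shown in Python, where `str.maketrans` and `str.translate` provide it directly. This is not the JavaScript answer the question asks for, and the wrapper name `tr` is a hypothetical chosen for the illustration.

```python
# Build the table once: "-" maps to "+", and "_" maps to "/".
# The two strings must have equal length; characters pair up by position.
_TR_TABLE = str.maketrans("-_", "+/")


def tr(data: str) -> str:
    """Apply the one-to-one character mapping; other characters are unchanged."""
    return data.translate(_TR_TABLE)
```

The same idea carries over to other languages: build a lookup from source character to target character once, then substitute each character through that lookup.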

PHP Transliteration

こ雲淡風輕ζ posted on 2019-11-26 02:37:50
Question: Are there any solutions that will convert all foreign characters to A-z equivalents? I have searched extensively on Google and could not find a solution or even a list of characters and equivalents. The reason is I want to display A-z only URLs, plus plenty of other trip-ups when dealing with these characters.
Answer 1: You can use iconv, which has a special transliteration encoding. When the string "//TRANSLIT" is appended to tocode, transliteration is activated. This means that when a character