transliteration

Transliterating Cyrillic to Latin with a JavaScript function

主宰稳场 posted on 2019-12-18 11:58:54
Question: I made this function: function transliterate(word){ var answer = ""; A = new Array(); A["Ё"]="YO";A["Й"]="I";A["Ц"]="TS";A["У"]="U";A["К"]="K";A["Е"]="E";A["Н"]="N";A["Г"]="G";A["Ш"]="SH";A["Щ"]="SCH";A["З"]="Z";A["Х"]="H";A["Ъ"]="'"; A["ё"]="yo";A["й"]="i";A["ц"]="ts";A["у"]="u";A["к"]="k";A["е"]="e";A["н"]="n";A["г"]="g";A["ш"]="sh";A["щ"]="sch";A["з"]="z";A["х"]="h";A["ъ"]="'"; A["Ф"]="F";A["Ы"]="I";A["В"]="V";A["А"]="A";A["П"]="P";A["Р"]="R";A["О"]="O";A["Л"]="L";A["Д"]="D";A["Ж"]="ZH";A[
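The table-lookup approach in the question can be sketched in Python as follows. This is a minimal illustration, not the asker's code: the table holds only the pairs visible in the truncated excerpt, the names `LOWER` and `TABLE` are hypothetical, and deriving the uppercase entries with `upper()` is an assumption that happens to match the visible pairs.

```python
# Partial table: only the lowercase pairs visible in the truncated question.
LOWER = {
    "ё": "yo", "й": "i", "ц": "ts", "у": "u", "к": "k", "е": "e",
    "н": "n", "г": "g", "ш": "sh", "щ": "sch", "з": "z", "х": "h",
    "ъ": "'", "ф": "f", "ы": "i", "в": "v", "а": "a", "п": "p",
    "р": "r", "о": "o", "л": "l", "д": "d", "ж": "zh",
}

# Assumption: uppercase entries mirror the lowercase ones (e.g. "Ж" -> "ZH").
TABLE = dict(LOWER)
TABLE.update({cyr.upper(): lat.upper() for cyr, lat in LOWER.items()})


def transliterate(word):
    """Replace each mapped Cyrillic character; leave everything else as-is."""
    return "".join(TABLE.get(ch, ch) for ch in word)


print(transliterate("вода"))
```

Characters missing from the table pass through unchanged, so a real implementation would need the full alphabet and a decision about which romanization standard to follow.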

Transliterate any convertible utf8 char into ascii equivalent

早过忘川 posted on 2019-12-17 23:37:40
Question: Is there a good solution out there that does this transliteration properly? I've tried using iconv(), but it is very annoying and does not behave as one might expect. Using //TRANSLIT will try to replace what it can, leaving everything nonconvertible as "?". Using //IGNORE will not leave "?" in the text, but it will also not transliterate, and it raises E_NOTICE when a nonconvertible char is found, so you have to use iconv with the @ error suppressor. Using //IGNORE//TRANSLIT (as some people

How to convert (transliterate) a string from utf8 to ASCII (single byte) in c#?

百般思念 posted on 2019-12-17 10:44:29
Question: I have a string object "with multiple characters and even special characters". I am trying to use the UTF8Encoding utf8 = new UTF8Encoding(); ASCIIEncoding ascii = new ASCIIEncoding(); objects in order to convert that string to ASCII. Could someone shed some light on this simple task, which is haunting my afternoon? EDIT 1: What we are trying to accomplish is getting rid of special characters like some of the special Windows apostrophes. The code that I posted below as an answer will not

Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars

末鹿安然 posted on 2019-12-16 20:02:02
Question: I am looking for an algorithm that can map between characters with diacritics (tilde, circumflex, caret, umlaut, caron) and their "simple" character. For example: ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ --> n, á --> a, ä --> a, ấ --> a, ṏ --> o, etc. I want to do this in Java, although I suspect it should be something Unicode-y and should be doable reasonably easily in any language. Purpose: to make it easy to search for words with diacritical marks. For example, if I have a database of tennis players, and
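One common way to approach this kind of mapping is Unicode canonical decomposition followed by dropping the combining marks. The sketch below shows the idea in Python rather than Java; the function name `strip_diacritics` is hypothetical, and this is only one possible technique, not necessarily the one the answers to this question settled on.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose to NFD, then drop characters with a nonzero combining class."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("ńǹňñṅņṇṋṉ"))
print(strip_diacritics("áäấṏ"))
```

Letters such as ɲ, which are distinct code points without a canonical decomposition, are left untouched by this approach and would need an explicit lookup table.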

Solr, Special Chars, and Latin to Cyrillic char conversion

独自空忆成欢 posted on 2019-12-14 03:48:49
Question: I am trying to set up a search engine using Solr (or Lucene) which could have text in both Latin with special chars (special chars would include Ö or Ç, for example) and Cyrillic chars (examples include Б or б and Ж ж). I am trying to find a solution that lets users search for words containing these characters even if they do not have the keys on their keyboard... An example would be (making up words here, hopefully won't offend anyone): "BÖÖK" would be found when searching for

How to use Google transliteration API in my java web application?

馋奶兔 posted on 2019-12-12 10:48:01
Question: How can I use the Google Transliteration API in my Java application? If I give a string (in either English or Arabic) as input to the Google Transliteration API, it should convert it into the corresponding other language and give the transliterated string back to me. I also want to know whether it is better to use Google Translator or the transliterator. How can I do this? Any suggestions, please. I need to use this in my Java program. Answer 1: There's a Java API. See the docs here for how to use it. An example of how

Transliteration and fuzzy search, like Google suggestions

拜拜、爱过 posted on 2019-12-11 18:58:05
Question: I need to do a fuzzy search with transliteration of the characters. For example: I have an ASP.NET application with a database that has a table containing a list of Spanish words (200,000 entries), and I also have a page with an input field. The point is that I do not know Spanish and do not know how to spell a search word in Spanish, but I know how it sounds. Therefore, in the text box I enter the search word, such as "beautiful", but spell it incorrectly by ear as "prekieso", and I need to get from the

Google Transliteration Suggestion CSS not proper

天涯浪子 posted on 2019-12-11 17:39:20
Question: I followed the steps to solve the problem of the Transliteration API not being served over HTTPS: Javascript google transliterate API not served over https. I extracted Google JSAPI and Transliteration.I.js to my own files and added https. But after that, the suggestions pop up in a div at the bottom of the page instead of in the usual dropdown. I would appreciate some help. Answer 1: Here is a step-by-step process: First, there's a link to the API: <script type="text/javascript" src="https://www.google.com

Converting accents to ASCII in R

不羁岁月 posted on 2019-12-11 08:04:33
Question: I'm trying to convert special characters to ASCII in R. I tried using Hadley's advice in this question: stringi::stri_trans_general('Jos\xe9', 'latin-ascii') But I get "Jos�". I'm using stringi v1.1.1 on a Mac. My friends who are running Windows machines seem to get the desired result of "Jose". Any idea what is going on? Answer 1: The default encoding on Windows is different from the typical default encoding on other operating systems (UTF-8). x ='Jos\xe9' means something in Latin1, but
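The encoding mismatch the answer points to can be reproduced outside R. The Python sketch below is only an illustration of the underlying byte-level issue, not of stringi itself: the single byte 0xE9 is "é" in Latin-1 but is not a valid sequence on its own in UTF-8.

```python
# The raw bytes behind the literal 'Jos\xe9'.
raw = b"Jos\xe9"

# Interpreted as Latin-1, byte 0xE9 is the letter "é".
as_latin1 = raw.decode("latin-1")

# Interpreted as UTF-8, 0xE9 starts a multi-byte sequence that never
# completes, so it is replaced with U+FFFD (the replacement character).
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_latin1)
print(as_utf8)

# Re-encoding the correctly decoded text gives valid UTF-8 bytes.
print(as_latin1.encode("utf-8"))
```

The practical consequence is that the input's encoding has to be declared or converted before any transliteration step is applied.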

Converting Unicode characters into the equivalent ASCII ones

落爺英雄遲暮 posted on 2019-12-10 10:13:49
Question: I need to "flatten out" a number of Unicode strings for the purposes of indexing and searching. For example, I need to convert GötheФ€ into ASCII. The last two characters have no close representations in ASCII, so it's OK to discard them completely. So what I expect from echo iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", "GötheФ€"); is Gothe, but instead it outputs Gothe?EUR . In addition to letters, I'd also like all the variety of Unicode numerals and punctuation marks, such as periods, commas,
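For comparison, the "flatten and discard" behaviour described in this question can be sketched in Python with compatibility decomposition followed by a lossy ASCII encode. This is a different technique from iconv, the function name `to_ascii` is hypothetical, and results for punctuation depend on which characters have Unicode compatibility decompositions.

```python
import unicodedata


def to_ascii(text):
    """NFKD-decompose, then drop every character that is not plain ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("GötheФ€"))
print(to_ascii("１２３"))  # fullwidth digits
```

Because NFKD maps compatibility variants such as fullwidth digits onto their ASCII counterparts, some numerals survive the conversion, while characters with no decomposition (Ф, €) are silently dropped.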