diacritics

Optimize regular expression for filtering thousands of HTML select options

孤街浪徒, posted on 2019-12-23 22:35:01
Question: Background: I developed a jQuery-based shuttle widget for HTML select elements because I could not find one that was minimally codified and offered a regular expression filter that compensates for diacritics. Problem: When a few thousand entries are added to the select, the regular expression filter slows to a crawl. To reproduce the problem: browse to http://jsfiddle.net/U8Xre/2/, click the input field in the result panel, and type any regular expression (e.g., ^a.*ai). Code: I believe
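The excerpt above is cut off before any fix is shown, so what follows is only a sketch of one common approach, written in Python rather than the widget's jQuery: fold the diacritics out of every option once, up front, and compile the typed pattern once per filter run instead of once per option. The names `fold`, `build_index`, and `filter_options` are hypothetical and do not come from the original code.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks and case so 'Ágai' compares like 'agai'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def build_index(options):
    """Precompute the folded form of every option once (hypothetical helper)."""
    return [(fold(option), option) for option in options]


def filter_options(index, pattern: str):
    """Compile the pattern once, then test it against the precomputed keys."""
    try:
        regex = re.compile(pattern, re.IGNORECASE)
    except re.error:
        return []  # assumption: an incomplete pattern simply matches nothing
    return [original for folded, original in index if regex.search(folded)]
```

With this layout the per-keystroke cost is one regex compilation plus one search per option; the Unicode normalization is paid only when the option list changes.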

How can I remove diacritics (umlauts) from a String?

谁说我不能喝, posted on 2019-12-23 15:11:14
Question: How can I convert a string such as "Příliš žluťoučký kůň úpěl ďábelské ódy." into "Prilis zlutoucky kun upel dabelske ody."? The source string is in Unicode, so in principle it should be possible to use normalization/decomposition to separate the umlaut. Unfortunately, I didn't see any library in Pharo (maybe hidden somewhere in Zinc?) that would support either stripping umlauts or decomposition. Answer 1: You can try the Diacriticals package. Installation: Metacello new smalltalkhubUser: 'Pharo' project:
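The normalization/decomposition idea mentioned in the question can be illustrated outside Pharo. The sketch below uses Python's standard `unicodedata` module (not the Diacriticals package from the answer) and a hypothetical function name, `strip_diacritics`: decompose to NFD, then drop every combining mark.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to NFD, then drop the combining marks that were split off."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Příliš žluťoučký kůň úpěl ďábelské ódy."))
```

This only works for letters that have a canonical decomposition; letters such as đ, ł, or ø are single code points without one and pass through unchanged.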

MySQL diacritic insensitive search (Arabic)

旧时模样, posted on 2019-12-23 12:36:45
Question: I am having trouble making a diacritic-insensitive search work with Arabic text. I have tested multiple setups for the table in question: encodings in utf8 and utf16, as well as the collations utf8_general_ci, utf16_general_ci and utf16_unicode_ci. The search works for the special characters å and ä. For example, select * from test where text like '%a%' returns rows where text is a, å or ä. But it won't work with the Arabic diacritics. For example, if the text is بِسْمِ and I search for بسم, I don't get any hits. Any
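The excerpt ends before any answer, so the following is only a sketch of one possible workaround, done in application code rather than through a MySQL collation: strip the Arabic vowel marks from both the stored text and the search term before comparing. Which marks to ignore is an assumption here (the harakat U+064B to U+0652 plus the superscript alef U+0670), and `strip_harakat` is a hypothetical helper name.

```python
import re

# Assumption: the marks to ignore are the common harakat U+064B..U+0652
# plus the superscript alef U+0670; extend the class if the data needs more.
ARABIC_MARKS = re.compile("[\u064B-\u0652\u0670]")


def strip_harakat(text: str) -> str:
    """Remove Arabic vowel marks so vocalized and bare spellings compare equal."""
    return ARABIC_MARKS.sub("", text)
```

One way to use it would be to store the stripped form in a separate, indexed column and run the LIKE query against that column with a stripped search term; that schema choice is likewise an assumption, not something stated in the question.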

Diacritic chars in a Jasper Report template

南笙酒味, posted on 2019-12-23 12:35:01
Question: I have to use the Polish language to fill my report content, so I have to use diacritic characters (ą, ć, ę, ł, ó, ż, ź). I have a problem with them: they are skipped after exporting the Jasper print to an output. When I write "lubię żółwie" in a template (it means "I like turtles" in Polish), the output PDF contains only "lubi wie" (by the way, that means "he likes he knows", so it changes a lot ;)). There are not even empty spaces in place of the missing letters; they are simply skipped. An additional hint is it doesn't

regex in Vietnamese characters

淺唱寂寞╮, posted on 2019-12-23 09:30:08
Question: I have a string and want to remove every character that is not one of the following: a character in this list: ÀÁÂÃÈÉÊÌÍÒÓÔÕÙÚĂĐĨŨƠàáâãèéêìíòóôõùúăđĩũơƯĂẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼỀỀỂ ưăạảấầẩẫậắằẳẵặẹẻẽềềểỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỦỨỪễệỉịọỏốồổỗộớờởỡợụủứừỬỮỰỲỴÝỶỸửữựỳỵỷỹ; a character in [a-z 0-9 A-Z]; an underscore (_) or white space. Can anyone help me with this regex in PHP? Answer 1: Try this regular expression: /[^a-z0-9A-Z
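The answer's PHP pattern is cut off above, so it is not reconstructed here. As a rough Python analogue of the same idea (a negated character class that deletes everything outside an allowed set), the sketch below relies on `\w` being Unicode-aware for `str` patterns. Note the assumption: this keeps every Unicode letter and digit, which is broader than the explicit Vietnamese list in the question. `keep_word_chars` is a hypothetical name.

```python
import re
import unicodedata

# For str patterns, \w matches Unicode letters, digits and the underscore.
NOT_ALLOWED = re.compile(r"[^\w\s]")


def keep_word_chars(text: str) -> str:
    """Delete every character that is not a letter, digit, underscore or space."""
    # Compose first: in decomposed input the combining marks are separate
    # code points that \w does not match, so they would be stripped.
    composed = unicodedata.normalize("NFC", text)
    return NOT_ALLOWED.sub("", composed)
```

If only the listed Vietnamese letters should survive, the class would have to enumerate them explicitly instead of using `\w`.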

PHP str_getcsv removes umlauts

倾然丶 夕夏残阳落幕, posted on 2019-12-23 09:26:09
Question: I encountered a little problem when parsing CSV strings that contain German umlauts (ä, ö, ü, Ä, Ö, Ü) in PHP. Assume the following CSV input string: w;x;y;z 48;OSL;Oslo Stock Exchange;B 49;OTB;Österreichische Termin- und Optionenbörse;C 50;VIE;Wiener Börse;D And the PHP code used to parse the string and create an array containing the data from the CSV string: public static function parseCSV($csvString) { $rows = str_getcsv($csvString, "\n"); // Remove headers .. $header =

Working with characters with accents in sql query and table name

不羁岁月, posted on 2019-12-23 03:11:25
Question: I'm working with PHP and SQL Server 2005 on a database that has accents (é, è, à) in table names, column names, and fields. Unfortunately, I'm not the owner/creator of this database, but I agree that the owner must be slapped :). I'm using the ODBC driver to connect to the SQL Server: odbc_connect($dsn,$user,$password). My problem is that no field with accents is recognized. For example: despite having 7000 fields with the name "Réseau" $query="Select * from dbo.Table where col1=

SQLALCHEMY ignore accents on query

て烟熏妆下的殇ゞ, posted on 2019-12-22 07:39:23
Question: Considering that my users can save data as "café" or "cafe", I need to be able to search on those fields with an accent-insensitive query. I've found https://github.com/djcoin/django-unaccent/, but I have no idea whether it is possible to implement something similar in SQLAlchemy. I'm using PostgreSQL, so a solution specific to this database is fine with me. A generic solution would be much better. Thanks for your help. Answer 1: First install the unaccent extension in PostgreSQL: create

What's the correct algorithm to determine number of user-perceived-characters?

≡放荡痞女, posted on 2019-12-22 03:32:22
Question: I have the task of counting the number of perceived characters in an input. The input is a group of ints (we can think of it as an int[]) which represent Unicode code points. java.text.BreakIterator.getCharacterInstance() is not allowed. (I mean their formula is allowed and is what I wanted, but weaving through their source code and state tables got me nowhere >.<) What is the correct algorithm to count the number of grapheme clusters given some code points? Initially, I'd
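The full rules for grapheme cluster boundaries are defined in Unicode Standard Annex #29 and are not reproduced here. As a deliberately simplified sketch in Python, the function below models only two of those rules: CR followed by LF stays together, and there is no break before an extending character or ZWJ. "Extending" is approximated by the general categories Mn and Me, which is an assumption; Hangul syllables, regional-indicator flags, emoji ZWJ sequences and the other rules are not handled. `count_clusters` is a hypothetical name.

```python
import unicodedata

ZWJ = 0x200D  # ZERO WIDTH JOINER


def count_clusters(code_points):
    """Approximate the number of user-perceived characters.

    Simplification: a code point starts a new cluster unless it is a
    combining mark (category Mn or Me), a ZWJ, or the LF of a CR LF pair.
    """
    count = 0
    previous = None
    for cp in code_points:
        category = unicodedata.category(chr(cp))
        extends = category in ("Mn", "Me") or cp == ZWJ
        crlf = previous == 0x0D and cp == 0x0A
        if previous is None or not (extends or crlf):
            count += 1
        previous = cp
    return count
```

For example, the code points for "e" followed by U+0301 (combining acute accent) count as one cluster, while three plain ASCII letters count as three.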