Search with Turkish characters

Submitted by 老子叫甜甜 on 2020-08-26 10:41:07

Question


I have a problem with Turkish upper and lower case when searching the database with LIKE, and also with Elasticsearch.

For example, I have a posts table that contains a post titled 'DENEME YAZI'.

If I run this query:

select * from posts where title like '%deneme%';

or:

select * from posts where title like '%YAZI%';

I get correct result but if I run:

select * from posts where title like '%yazı%';

it doesn't return any records. My database locale is tr_TR.UTF-8. How can I get correct results without entering the word in its exact case?


Answer 1:


Use ILIKE for case-insensitive matches:

select * from posts where title ilike '%yazı%';

However, there is an additional complication: the peculiar case rules of the Turkish locale. The upper case of 'ı' is 'I', but the mapping does not round-trip, because the lower case of 'I' comes back as 'i':

db=# SELECT lower(upper('ı'));
 lower
-------
 i

You could solve that by applying upper() to both sides of the LIKE expression:

select upper('DENEME YAZI') like ('%' || upper('yazı') || '%');



Answer 2:


Simply applying UPPER (or LOWER) to both sides of the expression is not a solution. You have to handle the problematic Turkish characters (ı/I and i/İ) yourself.

  • İ and i are the upper- and lower-case forms of the same letter in the Turkish alphabet.
  • I and ı are the upper- and lower-case forms of the same letter in the Turkish alphabet.

But even with UTF-8, Latin5, or Windows-1254 encoding and collation settings in PostgreSQL:

  • UPPER('İ') returns 'İ' (OK)
  • UPPER('i') returns 'I' (not OK)
  • UPPER('I') returns 'I' (OK)
  • UPPER('ı') returns 'İ' (not OK)

so

  • SELECT ... FROM ... WHERE ... UPPER('İZMİR') like UPPER('izmir') returns false.
  • SELECT ... FROM ... WHERE ... UPPER('ISPARTA') like UPPER('ısparta') returns false.
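
The results listed above depend on the server's locale and version, so it is worth checking what your own database does before picking a workaround. A minimal sketch, assuming PostgreSQL (the column aliases are only illustrative names):

```sql
-- Locale settings of the current database.
SELECT datname, datcollate, datctype
FROM pg_database
WHERE datname = current_database();

-- Case mappings for the four problematic letters.
-- Compare the output with what the Turkish alphabet expects:
-- i -> İ, ı -> I, I -> ı, İ -> i.
SELECT upper('i') AS upper_dotted_i,
       upper('ı') AS upper_dotless_i,
       lower('I') AS lower_capital_dotless_i,
       lower('İ') AS lower_capital_dotted_i;
```

If any of the four values differs from the Turkish mapping, a plain UPPER or LOWER comparison will miss matches for that letter.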

Here is a more precise solution, though not a perfect one because of its performance cost:

SELECT ... FROM ... WHERE ...
UPPER(REPLACE(REPLACE(COLUMNX, 'i', 'İ'), 'ı', 'I')) =
UPPER(REPLACE(REPLACE(myvalue, 'i', 'İ'), 'ı', 'I'))

or

SELECT ... FROM ... WHERE ...
UPPER(TRANSLATE(COLUMNX,'ıi','Iİ')) = UPPER(TRANSLATE(myvalue,'ıi','Iİ'))
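
To avoid repeating the TRANSLATE expression in every query, and to ease the performance cost for exact matches, the normalization could be wrapped in a function and indexed. This is only a sketch under assumptions: the function name tr_upper and the index name are hypothetical, and the posts table with its title column is taken from the question.

```sql
-- Hypothetical helper: map the Turkish lower-case letters to their
-- upper-case forms first, then upper-case everything else.
CREATE OR REPLACE FUNCTION tr_upper(txt text)
RETURNS text
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT upper(translate(txt, 'ıi', 'Iİ'));
$$;

-- Expression index so that equality comparisons on the normalized
-- title do not have to recompute tr_upper() for every row.
CREATE INDEX posts_title_tr_upper_idx ON posts (tr_upper(title));

-- Exact match on the normalized title (can use the index above).
SELECT * FROM posts WHERE tr_upper(title) = tr_upper('deneme yazı');

-- Substring match as in the question (still scans the table).
SELECT * FROM posts
WHERE tr_upper(title) LIKE '%' || tr_upper('yazı') || '%';
```

A B-tree expression index like this one helps only with equality (and similar) comparisons; a pattern with a leading '%' will still read every row.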


Source: https://stackoverflow.com/questions/24295566/search-with-turkish-characters
