How to make MySQL aware of multi-byte characters in LIKE and REGEXP?

后端 未结 3 737
野趣味
野趣味 2021-02-19 17:27

I have a MySQL table with two columns, both utf8_unicode_ci collated. It contains the following rows. Except for ASCII, the second field also contains Unicode codepoints like U+

3条回答
  •  梦毁少年i
    2021-02-19 18:07

    You have problems with UTF8? Eliminate them.

    How many special characters do you use? Are you using only locase letters, am I right? So, my tip is: Write a function, which converts spec chars to regular chars, e.g. "æ" ->"A" and so on, and add a column to the table which stores that converted value (you have to convert all values first, and upon each insert/update). When searching, you just have to convert search string with the same function, and use it on that field with regexp.

    If there're too many kind of special chars, you should convert it to multi-char. 1. Avoid finding "aa" in the "ba ab" sequence use some prefix, like "@ba@ab". 2. Avoid finding "@a" in "@ab" use fixed length tokens, say, 2.

提交回复
热议问题