How to make MySQL aware of multi-byte characters in LIKE and REGEXP?

灰色年华 2021-02-19 17:30

I have a MySQL table with two columns, both using the utf8_unicode_ci collation. It contains the following rows. Besides ASCII, the second field also contains Unicode code points like U+

3 answers
  •  北荒 (OP)
    2021-02-19 17:58

    EDITED to incorporate a fix for valid criticism.

    Use the HEX() function to render your bytes as hexadecimal, then use RLIKE on the result. For example:

    select * from mytable
    where hex(ipa) rlike concat('^(..)*', hex('needle'));
    -- looks for 'needle' in the haystack; the leading ^(..)* keeps the match aligned to hex pairs
    

    Non-ASCII Unicode characters render consistently as their hex byte values, so you are searching over the plain characters 0-9 and A-F.
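
    As a rough illustration of why pair alignment matters, the statements below are a minimal sketch using literal strings rather than a real table. The values 'ab', 'b' and '16' are made up for the demonstration. The sketch assumes a connection character set in which ASCII letters are single bytes (true for utf8 and latin1).

    ```sql
    -- HEX() turns each byte into two hex digits; 'ab' is the bytes 0x61 0x62.
    SELECT HEX('ab');                                   -- 6162

    -- Unanchored: '16' is found straddling the two bytes (6[16]2), a false positive.
    SELECT HEX('ab') RLIKE CONCAT('(..)*', '16');       -- 1

    -- Anchored at the start: only whole hex pairs can be skipped, so there is no match.
    SELECT HEX('ab') RLIKE CONCAT('^(..)*', '16');      -- 0

    -- A needle that sits on a byte boundary still matches.
    SELECT HEX('ab') RLIKE CONCAT('^(..)*', HEX('b'));  -- 1
    ```

    Because RLIKE matches anywhere in the string unless told otherwise, the ^ is what ties the skipped (..) pairs to the start of the hex string.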

    This works for "normal" columns too; you just don't need it there.

    P.S. @Kieren's (valid) point is addressed by using RLIKE to enforce character pairs.
