Does MySQL REGEXP support Unicode matching?

你的背包 2020-11-29 09:50

Does anyone know if MySQL's regexp supports Unicode? I've been doing some research and the majority of blogs etc. seem to indicate that there is a problem or it's not supported.

3 Answers
  • 2020-11-29 10:16

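    Starting with MySQL 8.0.4, regular expression support is implemented using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe.

    See also the documentation for compatibility issues.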

  • 2020-11-29 10:20

    MariaDB, starting with 10.0.5:

    REGEXP/RLIKE, and the new functions REGEXP_REPLACE(), REGEXP_INSTR() and REGEXP_SUBSTR(), now work correctly with all multi-byte character sets supported by MariaDB, including East-Asian character sets (big5, gb2312, gbk, eucjp, eucjpms, cp932, ujis, euckr), and Unicode character sets (utf8, utf8mb4, ucs2, utf16, utf16le, utf32). In earlier versions of MariaDB (and all MySQL versions) REGEXP/RLIKE works correctly only with 8-bit character sets.
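To see why a byte-wise regex engine "works correctly only with 8-bit character sets", here is a minimal Python sketch. It is not MySQL or MariaDB code: it only imitates the two behaviours by running Python's `re` module over `str` (character-wise) and over UTF-8 `bytes` (byte-wise). The helper names `matches_charwise` and `matches_bytewise` are made up for this illustration.

```python
import re


def matches_charwise(pattern: str, text: str) -> bool:
    """Match on Unicode code points, as a multi-byte-safe engine would."""
    return re.search(pattern, text) is not None


def matches_bytewise(pattern: str, text: str) -> bool:
    """Match on raw UTF-8 bytes, imitating a byte-wise engine."""
    return re.search(pattern.encode("utf-8"), text.encode("utf-8")) is not None


e_acute = "\u00e9"  # 'é', encoded as the two bytes C3 A9 in UTF-8
a_tilde = "\u00e3"  # 'ã', encoded as the two bytes C3 A3 in UTF-8

# '.' is meant to match exactly one character.
print(matches_charwise("^.$", e_acute))   # one code point
print(matches_bytewise("^.$", e_acute))   # two bytes, so a single '.' is not enough
print(matches_bytewise("^..$", e_acute))  # byte-wise, 'é' looks like two "characters"

# A bracket expression degrades into a set of bytes, so '[é]' also
# accepts any string containing the byte C3, such as 'ã'.
print(matches_charwise("[\u00e9]", a_tilde))
print(matches_bytewise("[\u00e9]", a_tilde))
```

The last call is the kind of false positive the byte-wise warning is about: the pattern and the subject share a lead byte even though they share no character.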

  • 2020-11-29 10:31
    1. Does anyone know if MySQL's regexp supports Unicode? I've been doing some research and the majority of blogs etc. seem to indicate that there is a problem or it's not supported.

      As documented under Regular Expressions in the MySQL manual for versions before 8.0.4:

      Warning

      The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.

    2. I'm wondering then is it best to use LIKE for unicode pattern matching and regexp for ASCII enhanced pattern matching?

      Yes, that would be best.

    3. I like the idea of being able to search for matches at the beginning or end of a string, but if regexp doesn't support Unicode then this could be difficult if my text is Unicode.

      One can do that with LIKE too:

      WHERE foo LIKE 'bar%'
      

      And:

      WHERE foo LIKE '%bar'
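The two LIKE patterns above can be tried on non-ASCII text with a small runnable sketch. Note the assumptions: it uses Python's built-in `sqlite3` module as a stand-in because it needs no server, so it shows the prefix and suffix idiom rather than MySQL's own collation behaviour. The table name `t`, the sample rows, and the helper `like` are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foo TEXT)")
rows = ["日本語のテキスト", "テキスト日本語", "plain ascii"]
conn.executemany("INSERT INTO t (foo) VALUES (?)", [(r,) for r in rows])


def like(pattern: str) -> list:
    """Return every foo value matching the given LIKE pattern, in insert order."""
    cur = conn.execute(
        "SELECT foo FROM t WHERE foo LIKE ? ORDER BY rowid", (pattern,)
    )
    return [row[0] for row in cur]


starts_with = like("日本語%")  # anchored at the beginning, like 'bar%'
ends_with = like("%日本語")    # anchored at the end, like '%bar'

print(starts_with)
print(ends_with)
```

Passing the pattern as a bound parameter keeps the example safe if the search term ever comes from user input. In that case any literal `%` or `_` in the term would still need escaping.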
      