Does a utf8_unicode_cs collation exist?

独厮守ぢ 2021-01-02 03:33

Does anyone know if a utf8_unicode_cs collation for MySQL exists? So far, my searches have come up dry. If it simply doesn't exist yet, is it fairly straightforward to cre

2 answers
  • 2021-01-02 04:13

    I came across the same issue, and after some Googling it seems that MySQL doesn't include it. To "simulate it", as you put it, there are two steps:

    1) To ensure case-sensitivity in the DB: set the table column to the utf8_bin collation.
    This allows:

    • strict SELECTs: a search for "Joe" will NOT return rows with "joe" / "joE" / "jOe" / etc.
    • strict UNIQUE index: a column with a UNIQUE index will treat case differences as different values. For example, if the utf8_unicode_ci collation is used, inserting "Joe" into a table that already has "joe" will trigger a "Duplicate entry" error. If utf8_bin is used, inserting "Joe" will work fine.

    2) To get the proper ordering in results: add the collation to the SQL query:

    SELECT ... ORDER BY column COLLATE utf8_unicode_ci
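
    The two steps above could be put together roughly as follows. This is only a sketch: the table name people, the column name and the index name uq_name are made up for illustration and are not from the original question.

    ```sql
    -- Hypothetical table; the column uses utf8_bin so comparisons are byte-wise.
    CREATE TABLE people (
        name VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
        UNIQUE KEY uq_name (name)
    );

    -- Both rows are accepted, because utf8_bin treats 'joe' and 'Joe' as different values.
    INSERT INTO people (name) VALUES ('joe'), ('Joe');

    -- Step 1: strict match, only the row spelled exactly 'Joe' qualifies.
    SELECT name FROM people WHERE name = 'Joe';

    -- Step 2: override the collation for sorting only, so results come back
    -- in utf8_unicode_ci order rather than in byte order.
    SELECT name FROM people ORDER BY name COLLATE utf8_unicode_ci;
    ```

    Without the COLLATE clause in the last query, utf8_bin would sort by character code, which places all uppercase ASCII letters before all lowercase ones.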
    
  • 2021-01-02 04:23

    This is an old question but does not seem to be superseded by any other, so I thought it worth posting that things have changed.

    MySQL version 8 now has the following collations for utf8mb4:

     utf8mb4_0900_ai_ci 
     utf8mb4_0900_as_ci
     utf8mb4_0900_as_cs
     ... and many language-specific variants of same.
    

    (no _ai_cs as far as I know, but that would in any case be less useful: few reasons to group [a] and [a-acute] and then separately group [A] and [A-acute]).

    The purpose of the original question's hypothetical "utf8_unicode_cs" is fulfilled by utf8mb4_0900_as_cs. (The 0900 means it is based on version 9.0.0 of the Unicode Collation Algorithm, as opposed to the 4.0.0 used by utf8_unicode_ci.)
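
    A quick way to see the difference, assuming a MySQL 8.0 server, is to compare two literals under each collation. The _utf8mb4 introducer is used here so the statements do not depend on the connection character set.

    ```sql
    -- Accent- and case-sensitive: 'Joe' and 'joe' are different strings.
    SELECT _utf8mb4'Joe' = _utf8mb4'joe' COLLATE utf8mb4_0900_as_cs AS same_as_cs;  -- 0

    -- Accent- and case-insensitive: the same comparison reports equality.
    SELECT _utf8mb4'Joe' = _utf8mb4'joe' COLLATE utf8mb4_0900_ai_ci AS same_ai_ci;  -- 1
    ```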

    To use these you'd need to change the field from utf8 to utf8mb4 character set - but that's a generally good idea anyway because the old 3-byte-max encoding can't handle e.g. emoji and other non-BMP characters.
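
    As a rough sketch of that change, either of the statements below could be used. The table and column names (people, name) are hypothetical, and the column definition must be adapted to the real schema.

    ```sql
    -- Option A: change a single column (repeat its full definition).
    ALTER TABLE people
        MODIFY name VARCHAR(100)
        CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_cs NOT NULL;

    -- Option B: convert every character column in the table at once.
    ALTER TABLE people
        CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_cs;
    ```

    Since utf8mb4 needs up to 4 bytes per character instead of 3, indexed columns take more space in the index, so it is worth checking index length limits before converting.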

    Source: https://dev.mysql.com/doc/refman/8.0/en/charset-mysql.html
