Error message on MySql:
Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation \'=\'
I have gone
Despite finding an enormous number of question about the same problem (1, 2, 3, 4) I have never found an answer that took performance into consideration, even here.
Although multiple working solutions has been already given I would like to do a performance consideration.
EDIT: Thanks to Manatax for pointing out that option 1 does not suffer of performance issues.
Using Option 1 and 2, aka the COLLATE cast approach, can lead to potential bottleneck, cause any index defined on the column will not be used causing a full scan.
Even though I did not try out Option 3, my hunch is that it will suffer the same consequences of option 1 and 2.
Lastly, Option 4 is the best option for very large tables when it is viable. I mean there are no other usage that rely on the original collation.
Consider this simplified query:
SELECT
*
FROM
schema1.table1 AS T1
LEFT JOIN
schema2.table2 AS T2 ON T2.CUI = T1.CUI
WHERE
T1.cui IN ('C0271662' , 'C2919021')
;
In my original example, I had many more joins. Of course, table1 and table2 have different collations. Using the collate operator to cast, it will lead to indexes not being used.
See sql explanation in the picture below.
Visual Query Explanation when using the COLLATE cast
On the other hand, option 4 can take advantages of possible index and led to fast queries.
In the picture below, you can see the same query being run after applied Option 4, aka altering the schema/table/column collation.
Visual Query Explanation after the collation has been changed, and therefore without the collate cast
In conclusion, if performance are important and you can alter the collation of the table, go for Option 4. If you have to act on a single column, you can use something like this:
ALTER TABLE schema1.table1 MODIFY `field` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
I had a similar problem, but it occurred to me inside procedure, when my query param was set using variable e.g. SET @value='foo'
.
What was causing this was mismatched collation_connection
and Database collation. Changed collation_connection
to match collation_database
and problem went away. I think this is more elegant approach than adding COLLATE after param/value.
To sum up: all collations must match. Use SHOW VARIABLES
and make sure collation_connection
and collation_database
match (also check table collation using SHOW TABLE STATUS [table_name]
).