I'm trying to rewrite an old website.
It's in Persian, which uses Perso-Arabic characters.
CREATE DATABASE `db
In short, because this has been discussed a thousand times before:
1. The application has the string "漢字", encoded in UTF-8. The bytes for this are E6 BC A2 E5 AD 97.
2. It sends these bytes over a database connection that is set to latin1.
3. The database receives the bytes E6 BC A2 E5 AD 97, thinking those represent latin1 characters.
4. It stores the characters æ¼¢å (the characters that E6 BC A2 E5 AD 97 maps to in latin1).
So the problem here was that the database connection was set incorrectly when the data was entered into the database. You'll have to convert the data in the database to the correct characters. Try this:
SELECT CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) FROM table_name
Maybe utf8 isn't what you need here; experiment. If that works, change this into an UPDATE statement to update the data permanently.
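To see the mechanism outside the database, here is a minimal Python sketch of the same round trip. It is only an illustration: Python's "latin-1" codec is ISO-8859-1, which I'm assuming is close enough to the database's latin1 for this example (the two can differ for bytes in the 0x80 to 0x9F range), and the variable names are made up.

```python
# Illustration only: Python's "latin-1" codec stands in for the
# database's latin1 handling (an assumption, see the note above).
original = "漢字"

# What the application actually sends: the UTF-8 bytes of the string.
utf8_bytes = original.encode("utf-8")
hex_bytes = utf8_bytes.hex(" ").upper()

# What a latin1 connection makes of those bytes: one character per byte.
garbled = utf8_bytes.decode("latin-1")

# The repair, mirroring CONVERT(BINARY CONVERT(... USING latin1) USING utf8):
# turn the characters back into the raw bytes, then read them as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(hex_bytes)
print(repaired == original)
```

The repair only works because every garbled character maps back to exactly one byte; text that was never mis-decoded in the first place will not survive the same treatment.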
deceze's answer is excellent, but I can add some information that may help when handling lots of records without testing them manually.
If the conversion CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) fails, it returns NULL instead of the field_name content.
So I use this query to find those records:
SELECT IFNULL(
CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8)
, '**************************************************')
FROM table_name
or this one:
SELECT id, field_name, CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8)
FROM table_name
WHERE CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) IS NULL
And the UPDATE, with a WHERE clause so that it only affects records on which the conversion succeeds:
UPDATE table_name
SET
field_name = CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8mb4)
WHERE
CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8mb4) IS NOT NULL
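The same "only touch rows that convert cleanly" idea can be sketched in Python, which may help if you want to dry-run the logic on exported data first. This is a rough analogue, not what the database does internally: repair_mojibake and the sample rows are hypothetical, None plays the role of NULL, and Python's "latin-1" codec is assumed to stand in for the database's latin1.

```python
from typing import Optional


def repair_mojibake(text: str) -> Optional[str]:
    """Return the repaired text, or None if the conversion fails.

    None plays the role of the NULL returned by the SQL conversion.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


# Hypothetical sample rows: id -> field_name content.
rows = {
    1: "漢字".encode("utf-8").decode("latin-1"),  # UTF-8 misread as latin1
    2: "café",                                    # genuine latin1 text
    3: "plain",                                   # plain ASCII
}

# Like the SELECT ... IS NULL query: ids whose conversion fails.
failed = [row_id for row_id, text in rows.items() if repair_mojibake(text) is None]

# Like the UPDATE ... IS NOT NULL: replace only the rows that convert.
fixed = {
    row_id: (repair_mojibake(text) if repair_mojibake(text) is not None else text)
    for row_id, text in rows.items()
}

print(failed)
print(fixed)
```

Note that plain ASCII converts to itself, so such rows are "updated" without changing, while genuine latin1 text with accented characters fails the conversion and is left alone.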