问题
I started working on a wordpress on my dev machine. mysql version is 5.6, and worpdress is 4.7 so its already using the utf8mb4_unicode_520_ci encoding if it detects its possible.
My problem is that on my hosting (mysql 5.5) utf8mb4_unicode_520_ci is not recognized as a valid encoding. So I'm trying to target utf8mb4_unicode_ci encoding as my hosting knows about this one, and if I understand correctly, this would - in opposition to going to utf8 - allow me to keep the 4 bytes.
I tried several different combinaison of encoding and collation set up for the db, but nothing successful (from here How to convert an entire MySQL database characterset and collation to UTF-8?).
I tried several combination of encoding and collation in the wp-config, but nothing.
Everything that is coming from the database (like post titles and post contents displays badly encoded char for all diatrics, anything else is displayed appropriately )
menu label from the database display incorrectly, where the hardcoded/translated label display correctly
I think I need to convert the actual content of the database, changing charset and collation does not seems to be enough.
I found this but it does not address my problem directly, or if it does I missed it.
Any help would be appreciated
————————————————————————————————
UPDATE :
here is the precise procedure I went through:
Initial situation:
I installed a wordpress (4.6.1) locally (on my dev machine, mysql 5.6.28). I worked on the theme and plugin locally
(at this moment I have, locally, a database that is utf8_general_ci
and tables that are utf8mb4_unicode_520_ci
Problem:
I want to deploy my wordpress on my hosting (mysql: 5.5 - db collation seems to be utf8mb4_unicode_ci
).
I mysqldump the db locally, then try to import it on my hostings' phpmyadmin.
This gives error :
Unknown collation: 'utf8mb4_unicode_520_ci'
solution 1 change the tables charset to utf8mb4_unicode_ci:
On my hosting sql server, utf8mb4_unicode_520_ci is not available and I can't get a more recent version of mysql.
utf8mb4_unicode_ci seems like the closest and is available on my hosting sql server.
from various so question, I adapt a bash script to change charset and collation of my tables
for tbl in wp_sij2017_commentmeta wp_sij2017_comments wp_sij2017_cwa wp_sij2017_links wp_sij2017_options wp_sij2017_postmeta wp_sij2017_posts wp_sij2017_term_relationships wp_sij2017_term_taxonomy wp_sij2017_termmeta wp_sij2017_terms wp_sij2017_usermeta wp_sij2017_users wp_sij2017_woocommerce_api_keys wp_sij2017_woocommerce_attribute_taxonomies wp_sij2017_woocommerce_downloadable_product_permissions wp_sij2017_woocommerce_order_itemmeta wp_sij2017_woocommerce_order_items wp_sij2017_woocommerce_payment_tokenmeta wp_sij2017_woocommerce_payment_tokens wp_sij2017_woocommerce_sessions wp_sij2017_woocommerce_shipping_zone_locations wp_sij2017_woocommerce_shipping_zone_methods wp_sij2017_woocommerce_shipping_zones wp_sij2017_woocommerce_tax_rate_locations wp_sij2017_woocommerce_tax_rates; do
mysql --execute="ALTER TABLE wp_sij_2017_original_copy.${tbl} CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"
done
I run this script on the local db
I now have all my tables set to collation utf8mb4_unicode_ci
My db collation is still utf8
I mysqldump the db, then import it to my hosting and...
Import is successful.
I search and replace siteurl in the db. I then visit the online website, I got SOME diatrics that renders a "question mark char"
Any text coming from the db has decoding issue AT SOME POINT
The source/html markup also has those "question mark char"
I have no idea where to look or what to do next
回答1:
Clarification: CHARACTER SETs
utf8
and utf8mb4
specify how characters are encoded into bytes. COLLATIONs
*_unicode_*
, etc, specify how those character compare.
The encoding for utf8mb4_unicode_ci
and utf8mb4_unicode_520_ci
are the same because they are encoded in the character set utf8mb4.
"database that is utf8_general_ci and tables that are utf8mb4_unicode_520_ci" -- that probably means that new tables in that database, unless specifically stated, will be CHARACTER SET utf8 COLLATION utf8_general_ci
. That is the database setting is just a default for CREATE TABLE
. Since your tables are already CHARACTER SET utf8mb4 COLLATION utf8mb4_unicode_520_ci
, the database default is not relevant to them.
As long as the CHARACTER SET
stays utf8mb4
, no Emoji, Chinese, etc will be lost or otherwise mangled.
Do not use mysql40
; it did not know about any CHARACTER SETs
. Do not use CONVERT
or CAST
. Etc.
I assume the 520
is coming from the output of mysqldump
? Do you have an editor that can handle a file that big? If so, simply edit it to change utf8mb4_unicode_520_ci
to utf8mb4_unicode_ci
throughout. Then load the dump. Problem solved?
Your fix
You did ALTER ... CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
on your local machine. That is probably an even better way -- since it will put your dev and prod machine in line with each other. That should have worked. Don't worry about what the "database" claims.
回答2:
I'm find 'utf8mb4_unicode_520_ci' and replace with 'utf8mb4_unicode_ci' in .sql file. Its simplest why to solve this.
来源:https://stackoverflow.com/questions/41850752/issue-when-deploying-mysql-db-utf8mb4-unicode-520-ci-utf8mb4-unicode-ci