Manipulating utf8mb4 data from MySQL with PHP

限于喜欢 提交于 2019-12-01 05:39:15

I'd simply guess that you are setting the table to utf8mb4, but your connection encoding is set to utf8. You have to set it to utf8mb4 as well, otherwise MySQL will convert the stored utf8mb4 data to utf8, the latter of which cannot encode "high" Unicode characters. (Yes, that's a MySQL idiosyncrasy.)

On a raw MySQL connection, it will have to look like this:

SET NAMES 'utf8mb4';
SELECT * FROM `my_table`;

You'll have to adapt that to the best way of the client, depending on how you connect to MySQL from PHP (mysql, mysqli or PDO).


To really clarify (yes, using the mysql_ extension for simplicity, don't do that at home):

mysql_connect(...);
mysql_select_db(...);
mysql_set_charset('utf8mb4');     // adapt to your mysql connector of choice

$r = mysql_query('SELECT * FROM `my_table`');

var_dump(mysql_fetch_assoc($r));  // data will be UTF8 encoded

Just to add to @deceze's answer, I recommend a well-configured MySQL server (for me, in /etc/mysql/mysql.conf.d/mysqld.cnf). Here are the configuration options to make sure you're using utfmb4, although I do recommend going through every MySQL configuration option though, daunting as it is, there are a lot of defaults that are are very non-optimal.

[client]

default-character-set           = utf8mb4

[mysql]

default_character_set           = utf8mb4

[mysqld]

init-connect                    = "SET NAMES utf8mb4"
character-set-client-handshake  = FALSE
character-set-server            = "utf8mb4"
collation-server                = "utf8mb4_unicode_ci"
autocommit                      = 1
block_encryption_mode           = "aes-256-cbc"

That last one is just one that should be default. Also, init-connect deals with not having to execute that every time. Keeps code clean. Now run:

SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

You should return something like the following:

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+

And looks like you're doing this already, but doesn't hurt to explicitly define on table creation:

CREATE TABLE `mysql_table` (
  `mysql_column` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`mysql_column`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4;

Hope this helps someone.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!