PHP/PDO/MySQL: inserting into MEDIUMBLOB stores bad data

小鲜肉 2020-12-02 01:15

I have a simple PHP web app that accepts icon images via file upload and stores them in a MEDIUMBLOB column.

On my machine (Windows) plus two Linux servers, this wo

1 answer
  • 2020-12-02 01:45

    This seems like a bug to me: why should the charset of the connection have any effect on data for a binary column, particularly when it's been identified as binary to PDO itself with PARAM_LOB?

    I don't think this has to be a bug. If the client tells the server that the statement it is sending is in UTF-8 and the server needs it in Latin-1, the statement may be re-encoded before it is parsed and executed. That makes this an encoding issue in how the data is transported: because the whole statement is re-encoded before parsing, the binary data destined for the BLOB column is changed along with it.
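    To illustrate the kind of damage such a re-encoding does to binary bytes, here is a minimal sketch in plain PHP. It does not reproduce what MySQL does internally; it only treats the first eight bytes of a PNG file as Latin-1 text and converts them to UTF-8. The variable names are made up for the illustration, and the mbstring extension is assumed to be available.

```php
<?php
// First eight bytes of a PNG file. The leading 0x89 is above 0x7F,
// so a Latin-1 to UTF-8 conversion has to expand it to two bytes.
$bytes = "\x89PNG\r\n\x1a\n";

// Pretend the binary data is Latin-1 text and re-encode it as UTF-8.
// Assumption: the mbstring extension is loaded.
$converted = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');

printf("original:  %d bytes, %s\n", strlen($bytes), bin2hex($bytes));
printf("converted: %d bytes, %s\n", strlen($converted), bin2hex($converted));
```

    The converted string is one byte longer, because 0x89 becomes the two-byte sequence 0xC2 0x89. An image stored that way no longer starts with a valid PNG signature.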

    From the MySQL manual:

    What character set should the server translate a statement to after receiving it?

    For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.

    Or on the way back: Latin-1 data from the store is converted to UTF-8 because the client told the server that it prefers UTF-8 for transport.

    The PDO flag you mention appears to do something entirely different:

    PDO::PARAM_LOB tells PDO to map the data as a stream, so that you can manipulate it using the PHP Streams API. (Ref)
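    As a rough sketch of what binding a stream with PDO::PARAM_LOB looks like, the following uses an in-memory SQLite database instead of MySQL so that it runs without a server. That swap is an assumption made for the example, as are the table name `icons` and the column name `data`. It shows only the stream binding and says nothing about how a MySQL connection charset treats the bytes.

```php
<?php
// Assumption: the pdo_sqlite extension is loaded. SQLite stands in for
// MySQL here only to keep the example self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE icons (data BLOB)'); // hypothetical table

// Put some binary bytes into a stream, as an uploaded file would be.
$bytes = "\x89PNG\r\n\x1a\n";
$fp = fopen('php://memory', 'r+b');
fwrite($fp, $bytes);
rewind($fp); // the driver reads from the current stream position

// Bind the stream itself; PARAM_LOB tells PDO the value is a large object.
$stmt = $pdo->prepare('INSERT INTO icons (data) VALUES (?)');
$stmt->bindParam(1, $fp, PDO::PARAM_LOB);
$stmt->execute();

// Read the value back; some drivers return a stream, others a string.
$stored = $pdo->query('SELECT data FROM icons')->fetchColumn();
if (is_resource($stored)) {
    $stored = stream_get_contents($stored);
}

var_dump($stored === $bytes);
```

    Comparing the bytes read back against the bytes written, as the last line does, is also a simple way to check whether a given connection setup alters BLOB data.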

    I'm no MySQL expert, but that is how I would explain it. Client and server need to negotiate which charsets they are using, and I assume they do so for a reason.
