Question
I'm going crazy over these encoding problems...
I use json_decode and json_encode to store and retrieve data. What I found out is that JSON always needs UTF-8. No problem there. I give JSON 'hellö' in UTF-8, and in my DB it looks like hellu00f6. OK, a code point. But when I use json_decode, it won't decode the code point back, so I still have hellu00f6.
Also, in PHP 5.2.13 it seems like there are still no optional tags in JSON. How can I convert the code point characters back to the correct special characters for display in the browser?
Greetz and thanks
Maenny
Answer 1:
It could be because of the backslash preceding the code point in the JSON Unicode escape: ö is represented as \u00f6. When it is stored in your DB, the DBMS doesn't know how to interpret \u00f6, so I guess it reads (and stores) it as u00f6.
Are you using an escaping function?
Try adding a backslash before the Unicode-escaped characters:
$json = str_replace("\\u", "\\\\u", $json);
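To make the failure mode concrete, here is a minimal sketch of the round trip. It assumes the backslash is lost in a single unescaping step on the way into the database; stripslashes() is only a hypothetical stand-in for that step, since the question does not say what actually removes it.

```php
<?php
// json_encode escapes non-ASCII characters as \uXXXX by default.
$json = json_encode("hell\xC3\xB6");        // "hell\xC3\xB6" is 'hellö' in UTF-8

// Assumption: something strips the backslash before the value is stored.
$stored = str_replace('\\', '', $json);     // now contains hellu00f6
$broken = json_decode($stored);             // plain text, the umlaut is gone

// Doubling the backslash first lets it survive one unescaping step.
$protected = str_replace("\\u", "\\\\u", $json);
$restored  = stripslashes($protected);      // hypothetical stand-in for the DB layer
$ok        = json_decode($restored);        // 'hellö' again, as UTF-8
```

If the value reaches the database through a prepared statement instead, no manual doubling should be needed, because the driver then takes care of the escaping.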
Answer 2:
The preceding post already explains why your example did not work as expected. However, there are some good coding practices for working with databases that improve the security of your application (e.g., by preventing SQL injection).
The following example is intended to show some of these practices and assumes PHP 5.2 and MySQL 5.1. (Note that all files and database entries are stored using UTF-8 encoding.)
The database used in this example is called test, and the table was created as follows:
CREATE TABLE `test`.`entries` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`data` VARCHAR( 100 ) NOT NULL
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_bin
(Note that the character set is utf8 and the collation is utf8_bin.)
The following PHP code is used both for adding new entries and for producing JSON:
<?php
$conn = new PDO('mysql:host=localhost;dbname=test','root','xxx');
$conn->exec("SET NAMES 'utf8'"); // Enable UTF-8 charset for db-communication ..
if(isset($_GET['add_entry'])) {
    header('Content-Type: text/plain; charset=UTF-8');
    // Add new DB-Entry:
    $data = $conn->quote($_GET['add_entry']); // quote() escapes the value and adds the surrounding quotes
    if($conn->exec('INSERT INTO `entries` (`data`) VALUES ('.$data.')')) {
        $id = $conn->lastInsertId();
        echo 'Created entry '.$id.': '.$_GET['add_entry'];
    } else {
        $info = $conn->errorInfo();
        echo 'Unable to create entry: '. $info[2];
    }
} else {
    header('Content-Type: application/json; charset=UTF-8');
    // Output DB-Entries as JSON:
    $entries = array();
    if($res = $conn->query('SELECT * FROM `entries`')) {
        $res->setFetchMode(PDO::FETCH_ASSOC);
        foreach($res as $row) {
            $entries[] = $row;
        }
    }
    echo json_encode($entries);
}
?>
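The JSON this script emits still contains \uXXXX escapes for non-ASCII characters; that is valid JSON, and any JSON parser turns them back into the real characters. As a rough sketch of the decoding and display side, the snippet below uses hypothetical rows shaped like the PDO::FETCH_ASSOC results above (no database involved) and escapes the decoded text for HTML output:

```php
<?php
// Hypothetical rows, shaped like the FETCH_ASSOC results of the script above.
$entries = array(
    array('id' => '1', 'data' => "hell\xC3\xB6"),               // 'hellö' in UTF-8
    array('id' => '2', 'data' => "<b>gr\xC3\xBC\xC3\x9F</b>"),  // 'grüß' wrapped in markup
);

$json = json_encode($entries);        // non-ASCII characters become \uXXXX escapes

// Passing true makes json_decode return associative arrays instead of objects.
$decoded = json_decode($json, true);

// Escape for HTML and tell htmlspecialchars that the input is UTF-8.
$html = array();
foreach ($decoded as $row) {
    $html[] = htmlspecialchars($row['data'], ENT_QUOTES, 'UTF-8');
}
```

For the special characters to show up correctly, the page itself also has to be served as UTF-8, as the Content-Type headers in the script above do.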
Note the use of the method $conn->quote(..) before passing data to the database. As mentioned in the preceding post, it would be even better to use prepared statements, since they already take care of all the escaping. Thus, it would be better to write:
$prepStmt = $conn->prepare('INSERT INTO `entries` (`data`) VALUES (:data)');
if($prepStmt->execute(array('data'=>$_GET['add_entry']))) {...}
instead of
$data = $conn->quote($_GET['add_entry']);
if($conn->exec('INSERT INTO `entries` (`data`) VALUES ('.$data.')')) {...}
Conclusion: Using UTF-8 for all character data that is stored or transmitted to the user is reasonable. It makes developing internationalized web applications much easier. To make sure user input is sent to the database properly, using an escape function is a good idea. Alternatively, prepared statements make life and development even easier and also improve your application's security, since they prevent SQL injection.
Source: https://stackoverflow.com/questions/3238855/json-specialchars-json-php-5-2-13