Is there any reason why PHP's json_encode function does not escape all JSON control characters in a string?
For example, let's take a string which spans two rows a
Control characters have no special meaning in HTML, except for a newline in textarea.value. json_encode (PHP 5.2 and later) will do it as you expected.
If you just want to show text, you don't need JSON. JSON is for arrays and objects in JavaScript (and for indexed and associative arrays in PHP).
If you need a line feed for the textarea tag:
$s=preg_replace('/\r */','',$s);
echo preg_replace('/ *\n */',' ',$s);
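// Escape the backslash first so the backslashes added by later rules are not doubled again.
// "\x08" is the backspace character; PHP double-quoted strings have no "\b" escape.
$search = array("\\", "\n", "\r", "\t", "\f", "\x08", "/", '"');
$replace = array("\\\\", "\\n", "\\r", "\\t", "\\f", "\\b", "\\/", '\\"');
$encoded_string = str_replace($search, $replace, $json);
This is the correct way.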
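For anyone who wants to sanity-check a search/replace table like this, here is a rough JavaScript sketch of the same idea. escapeForJson is a hypothetical name, and the rule order (backslash first) is my assumption about what such a table needs. It covers only the characters listed, not every control character below U+0020, so in real code JSON.stringify is the safer route.

```javascript
// Hypothetical helper: escape a raw string so it can sit between double quotes in JSON text.
// Assumption: only the characters in this table occur; other control characters are not handled.
function escapeForJson(raw) {
  // Backslash goes first so the backslashes added by later rules are not doubled.
  const rules = [
    ['\\', '\\\\'],
    ['\n', '\\n'],
    ['\r', '\\r'],
    ['\t', '\\t'],
    ['\f', '\\f'],
    ['\b', '\\b'],
    ['/', '\\/'],
    ['"', '\\"'],
  ];
  let out = raw;
  for (const [search, replacement] of rules) {
    out = out.split(search).join(replacement);
  }
  return out;
}

// Round trip: wrap the escaped text in quotes and let JSON.parse undo the escaping.
const sample = 'line one\nline "two"\tC:\\temp/file';
const jsonText = '"' + escapeForJson(sample) + '"';
console.log(jsonText);
console.log(JSON.parse(jsonText) === sample);
```

If the round trip does not give back the original string, the table is missing a character or applies the rules in the wrong order.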
Converting to and from PHP should not be an issue. PHP's json_encode does proper encoding, but reinterpreting its output inside JavaScript can cause issues. For example:
1) The original string is [string with nnn newline in it] (where nnn is an actual newline character).
2) json_encode converts this to [string with \n newline in it]: the control character becomes the two literal characters \ and n.
3) However, when you echo this from PHP into a quoted JavaScript string literal, the JavaScript parser interprets \n as an escape and turns it back into a real newline (nnn). JSON.parse then receives a raw control character inside a string, which is invalid JSON, and that causes heartache.
To work around this:
A) First encode the object in PHP using json_encode to get a string. Then run it through a filter that makes it safe to use inside HTML and JavaScript.
B) Use the JSON string coming from PHP as a "literal" and put it inside single quotes instead of double quotes.
<?php
function form_safe_json($json) {
    $json = empty($json) ? '[]' : $json ;
    // Double the backslashes and escape single quotes so the JSON survives inside a single-quoted JavaScript literal.
    $search = array('\\',"\n","\r","\f","\t","\b","'") ;
    $replace = array('\\\\',"\\n", "\\r","\\f","\\t","\\b", "\\'");
    $json = str_replace($search,$replace,$json);
    return $json;
}
$title = "Tiger's /new \\found \/freedom " ;
$description = <<<END
Tiger was caged
in a Zoo
And now he is in jungle
with freedom
END;
$book = new \stdClass ;
$book->title = $title ;
$book->description = $description ;
$strBook = json_encode($book);
$strBook = form_safe_json($strBook);
?>
<!DOCTYPE html>
<html>
<head>
<title> title</title>
<meta charset="utf-8">
<script type="text/javascript" src="/3p/jquery/jquery-1.7.1.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
    var strBookObj = '<?php echo $strBook; ?>' ;
    try {
        // Declare bookObj locally instead of leaking an implicit global.
        var bookObj = JSON.parse(strBookObj) ;
        console.log(bookObj.title);
        console.log(bookObj.description);
        $("#title").html(bookObj.title);
        $("#description").html(bookObj.description);
    } catch(ex) {
        console.log("Error parsing book object json");
    }
});
</script>
</head>
<body>
<h2> Json parsing test page </h2>
<div id="title"> </div>
<div id="description"> </div>
</body>
</html>
Put the string inside single quotes in JavaScript. Putting the JSON string inside double quotes would cause the parser to fail at the attribute markers (something like { "id" : "value" } ). No other escaping should be required if you put the string in as a "literal" and let the JSON parser do the work.
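To see the difference without a PHP server, the two cases can be simulated directly in JavaScript. This is only a sketch: the two literals below stand for what the browser would see after PHP has echoed the JSON into the page, and tryParse is a hypothetical helper.

```javascript
// `good`: backslashes were doubled before the echo, so the JS literal still holds backslash + n.
const good = '{"description":"Tiger was caged\\nin a Zoo"}';
// `bad`: the JSON was echoed as-is, so the JS parser turned \n into a real newline.
const bad = '{"description":"Tiger was caged\nin a Zoo"}';

// Hypothetical helper: report success or the error name instead of throwing.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

console.log(tryParse(good).ok);    // the escaped form parses
console.log(tryParse(bad).ok);     // the raw newline is rejected
console.log(tryParse(bad).error);
```

The second literal fails because JSON does not allow an unescaped control character inside a string.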
When using any form of Ajax, detailed documentation for the format of responses received from the CGI server seems to be lacking on the Web. Some notes here and entries at stackoverflow.com point out that newlines in returned text or JSON data must be escaped to prevent infinite loops (hangs) in JSON conversion (possibly created by throwing an uncaught exception), whether the conversion is done automatically by jQuery or manually using JavaScript system or library JSON parsing calls.
In each case where programmers post this problem, inadequate solutions are presented (most often replacing \n by \\n on the sending side) and the matter is dropped. Their inadequacy is revealed when passing string values that accidentally embed backslash escape sequences, such as Windows pathnames. An example is "C:\Chris\Roberts.php", where the unescaped backslashes form the sequences \C and \R. These are not valid JSON escapes, so JSON conversion of the string {"file":"C:\Chris\Roberts.php"} fails, and the script appears to hang if the resulting exception goes uncaught. One way of generating such values is deliberately to attempt to pass PHP warning and error messages from server to client, a reasonable idea.
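The pathname case is easy to reproduce in JavaScript alone. The sketch below assumes a standards-compliant JSON.parse (as in current browsers and Node); older hand-written parsers may behave differently. JSON.stringify stands in for a sender that escapes backslashes properly.

```javascript
// The JSON text exactly as quoted above, with single (unescaped) backslashes.
const brokenText = '{"file":"C:\\Chris\\Roberts.php"}';

let brokenError = null;
try {
  JSON.parse(brokenText);
} catch (err) {
  brokenError = err.name; // \C is not a valid JSON escape
}
console.log(brokenError);

// A sender that escapes properly doubles each backslash in the JSON text.
const fixedText = JSON.stringify({ file: 'C:\\Chris\\Roberts.php' });
console.log(fixedText);
console.log(JSON.parse(fixedText).file);
```

If nothing catches the exception, the rest of the handler never runs, which may be what some of the reports describe as a hang.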
By definition, Ajax uses HTTP connections behind the scenes. Such connections pass data using GET and POST, both of which require encoding sent data to avoid incorrect syntax, including control characters.
This gives enough of a hint to construct what seems to be a solution (it needs more testing): use rawurlencode on the PHP (sending) side to encode the data, and decodeURIComponent on the JavaScript (receiving) side to decode it (the older unescape is deprecated and does not decode multi-byte UTF-8 sequences correctly). In some cases you will apply these to entire text strings; in other cases you will apply them only to values inside JSON.
If this idea turns out to be correct, simple examples can be constructed to help programmers at all levels solve this problem once and for all.
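Here is one such simple example, as a sketch only. Since it has to run without PHP, encodeURIComponent plays the role of rawurlencode on the sending side; that is a stand-in, and the two differ slightly in which punctuation they leave unencoded, though decodeURIComponent decodes the output of either. The message text is made up for illustration.

```javascript
// Made-up server message containing backslashes, a newline, quotes and a non-ASCII character.
const message = 'Warning: include("C:\\Chris\\Roberts.php") failed\non line 3 \u2713';

// Sending side (stand-in for PHP's rawurlencode): percent-encode the value only.
const wire = '{"error":"' + encodeURIComponent(message) + '"}';
console.log(wire);

// Receiving side: the encoded value holds no quotes, backslashes or control
// characters, so JSON.parse accepts it; decodeURIComponent then restores the text.
const received = JSON.parse(wire);
const decoded = decodeURIComponent(received.error);
console.log(decoded === message);
```

The cost is that the payload is no longer readable as plain JSON, and every consumer has to remember to decode the values.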
There are 2 solutions unless AJAX is used:
Write the data into a hidden input like this and read it in JS:
<input type="hidden" value="<?= htmlspecialchars(json_encode($data)) ?>"/>
Use addslashes:
var json = '<?= addslashes(json_encode($data)) ?>';
D'oh - you need to double-encode: JSON.parse is expecting a string of course:
<script type="text/javascript">
JSON.parse(<?php echo json_encode($s) ?>);
</script>
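The double-encoding idea can be checked in JavaScript alone. In this sketch JSON.stringify stands in for PHP's json_encode (an assumption: the two differ in details such as json_encode escaping "/" by default, but both emit valid JSON), and new Function simulates the browser evaluating the echoed script line.

```javascript
// Sample data with an apostrophe, a backslash and a newline.
const data = { title: "Tiger's /new \\found freedom", lines: 'one\ntwo' };

const once = JSON.stringify(data);   // the JSON text
const twice = JSON.stringify(once);  // that text encoded again: a quoted string literal

// Simulates the page containing: JSON.parse(<?php echo json_encode($s) ?>);
const parsed = new Function('return JSON.parse(' + twice + ');')();

console.log(parsed.title === data.title);
console.log(parsed.lines === data.lines);
```

Because the second encoding produces the quotes and all the escaping, no hand-written replace table is needed.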