htmlspecialchars(): Invalid multibyte sequence in argument

谁说我不能喝 posted on 2019-11-29 01:29:07
Tatu Ulmanen

Be sure to specify UTF-8 as the encoding if your files are encoded as such:

htmlspecialchars($str, ENT_COMPAT, 'UTF-8');

Before PHP 5.4 the default charset for htmlspecialchars was ISO-8859-1 (as of PHP 5.4 the default is 'UTF-8'), which might explain why things go haywire when it meets multibyte characters.

I ran into this error in production and found this great post about it:

http://insomanic.me.uk/post/191397106/php-htmlspecialchars-htmlentities-invalid

It appears to be a bug in PHP (on CentOS at least) that displays this error only when display_errors is Off!

You are feeding corrupted character data into the function, or not specifying the right encoding.

I had this issue a while ago. The old behavior (prior to PHP 5.2.7, I believe) was to return the string despite the corruption, but since that version the function raises this error instead.

My solution involved writing a script to feed my strings through iconv using the //IGNORE modifier to remove corrupted data.

(We had a corrupted database which had some strings in UTF-8 and some in Latin-1, usually with incorrectly defined character types on the columns.)

Looking at the comment on Tatu's answer, I would start by looking at (and playing with) the contents of the $charset variable.
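The clean-up step described above could look roughly like the sketch below. The helper name clean_utf8 is hypothetical (it is not from the original script), and the code assumes the iconv and mbstring extensions are loaded. Because some iconv builds return false when //IGNORE meets an illegal sequence, the sketch falls back to mb_convert_encoding, which replaces the bad bytes rather than dropping them.

```php
<?php
// Hypothetical helper, not taken from the original answer: returns $s
// with byte sequences that are invalid in UTF-8 dropped or replaced.
function clean_utf8(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already valid, nothing to do
    }

    // //IGNORE asks iconv to discard characters it cannot convert.
    // Some iconv builds return false here instead, so the notice is
    // suppressed and the result is checked explicitly.
    $cleaned = @iconv('UTF-8', 'UTF-8//IGNORE', $s);

    if ($cleaned === false || !mb_check_encoding($cleaned, 'UTF-8')) {
        // Fallback: mbstring swaps invalid bytes for a substitute character.
        $cleaned = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    }

    return $cleaned;
}

$broken = "caf\xC3"; // ends in a truncated two-byte sequence
echo htmlspecialchars(clean_utf8($broken), ENT_COMPAT, 'UTF-8'), "\n";
```

Dropping bytes loses data. If a string is known to be Latin-1, converting it with iconv('ISO-8859-1', 'UTF-8', $s) keeps the characters instead of discarding them.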

To avoid the error, pass ENT_IGNORE, which silently discards invalid sequences:

htmlentities($string, ENT_IGNORE, 'UTF-8');
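As a hedged alternative to the call above: the PHP manual discourages ENT_IGNORE because silently dropping bytes can have security implications, and since PHP 5.4 the ENT_SUBSTITUTE flag replaces invalid sequences with U+FFFD instead. The sketch below compares the two; the sample string is made up for illustration, and ENT_COMPAT is added so that double quotes are still escaped.

```php
<?php
// Made-up sample: \xE4 is Latin-1 "ä", which is invalid as a lone byte in UTF-8.
$string = "a < b \xE4 c";

// No error-handling flag: on PHP 5.4+ an invalid sequence yields an empty string.
$plain = htmlentities($string, ENT_COMPAT, 'UTF-8');

// ENT_IGNORE: the invalid byte is silently discarded.
$ignored = htmlentities($string, ENT_COMPAT | ENT_IGNORE, 'UTF-8');

// ENT_SUBSTITUTE: the invalid byte is replaced with U+FFFD.
$substituted = htmlentities($string, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($plain === '');
var_dump($ignored, $substituted);
```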

Besides this, you can also use str_replace to replace bad characters as needed and then call htmlentities.

Have a look at this RSS feed: it replaced the greater-than sign with the &gt; entity, which might not look nice when reading the feed. You can replace it with something like a "-" sign or ")".

I had the same problem because I was using substr on a UTF-8 string.
The error was infrequent and seemingly random: it occurred only when the string was cut in the middle of a multibyte character!

mb_substr solved the problem :)
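A minimal sketch of the difference, assuming the mbstring extension is available. The sample text is written as byte escapes so that the result does not depend on how this file itself is saved.

```php
<?php
// "äöü" as UTF-8 bytes: 3 characters, 6 bytes.
$text = "\xC3\xA4\xC3\xB6\xC3\xBC";

// substr counts bytes: 3 bytes is "ä" plus the first half of "ö".
$byteCut = substr($text, 0, 3);

// mb_substr counts characters: 2 characters is "äö".
$charCut = mb_substr($text, 0, 2, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
```

Passing $byteCut to htmlspecialchars with 'UTF-8' is what triggers the invalid multibyte sequence problem; $charCut is safe.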

That's actually one of the most frequent errors I get.

Sometimes I don't use the __() translation function, just plain German text containing äöü. In that case it is especially important to mind the encoding of the files.

So make sure you save the files that contain special characters as UTF-8.
