iconv UTF-8//IGNORE still produces “illegal character” error

Submitted by ╄→гoц情女王★ on 2019-12-18 19:01:12

Question


$string = iconv("UTF-8", "UTF-8//IGNORE", $string);

I thought this code would remove invalid UTF-8 characters, but it produces an E_NOTICE: "iconv(): Detected an illegal character in input string". What am I missing, and how do I properly strip illegal characters from a string?


Answer 1:


The output character set (the second parameter) should be different from the input character set (the first parameter). If they are the same and the string contains illegal UTF-8 characters, iconv will reject them as illegal according to the input character set.




Answer 2:


I know two ways to fix a UTF-8 string that contains illegal characters:

  1. Illegal characters will be replaced by question marks ("?"):

$message = mb_convert_encoding($message, 'UTF-8', 'UTF-8');

  2. Illegal characters will be removed:

$message = iconv('UTF-8', 'UTF-8//IGNORE', $message);

The second method is the one described in the question, but it does not produce any E_NOTICE in my case. I tested various corrupted UTF-8 strings with error_reporting(E_ALL); and the result was always as expected. Possibly something has changed since 2012. I tested on PHP 7.2.9 on Windows.
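The mbstring approach above can be sketched as follows. This is a minimal example, assuming the mbstring extension is loaded; the sample string is made up for illustration, and the "none" substitute-character setting is an addition not mentioned in the answer. It makes mb_convert_encoding drop invalid bytes instead of replacing them with "?".

```php
<?php
// Hypothetical sample: "abc", a lone 0xFF byte (never valid in UTF-8), then "def".
$message = "abc\xFFdef";

// Detect the problem first: mb_check_encoding() reports whether the bytes are valid UTF-8.
$wasValid = mb_check_encoding($message, 'UTF-8');

// Method 1 from the answer: invalid bytes are replaced by the current
// substitute character (a question mark by default).
$replaced = mb_convert_encoding($message, 'UTF-8', 'UTF-8');

// Variant (an assumption, not from the answer): set the substitute character
// to "none" so invalid bytes are dropped, then restore the previous setting.
$previous = mb_substitute_character();
mb_substitute_character('none');
$stripped = mb_convert_encoding($message, 'UTF-8', 'UTF-8');
mb_substitute_character($previous);

var_dump($wasValid);                              // validity of the raw input
var_dump(mb_check_encoding($replaced, 'UTF-8'));  // validity after replacing
var_dump(mb_check_encoding($stripped, 'UTF-8'));  // validity after stripping
```

Because this path does not go through iconv at all, it should not depend on how the platform's iconv library handles //IGNORE.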




Answer 3:


To simply suppress the notice, you can use the "@" error-control operator:

$string = @iconv("UTF-8", "UTF-8//IGNORE", $string);
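Suppressing the notice hides the message but not a failed conversion: iconv() is documented to return false on failure. A hedged sketch of a wrapper that checks for that case is shown below. The function name stripInvalidUtf8 is hypothetical, and the fallback to mb_convert_encoding (which replaces invalid bytes with "?" by default rather than removing them) assumes the mbstring extension is available.

```php
<?php
/**
 * Hypothetical helper: try iconv with //IGNORE, and fall back to mbstring
 * if iconv reports failure. Not part of the original answers.
 */
function stripInvalidUtf8(string $string): string
{
    // "@" silences the notice; the return value still tells us what happened.
    $clean = @iconv('UTF-8', 'UTF-8//IGNORE', $string);

    if ($clean === false) {
        // iconv gave up on the input. mb_convert_encoding substitutes
        // invalid bytes instead of failing, so the result is valid UTF-8.
        $clean = mb_convert_encoding($string, 'UTF-8', 'UTF-8');
    }

    return $clean;
}

$result = stripInvalidUtf8("abc\xFFdef");
var_dump(mb_check_encoding($result, 'UTF-8'));
```

Which branch runs depends on the PHP version and the underlying iconv implementation, so the exact output string may differ between systems; in either case the result should be valid UTF-8.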



Source: https://stackoverflow.com/questions/9375909/iconv-utf-8-ignore-still-produces-illegal-character-error
