Question
$string = iconv("UTF-8", "UTF-8//IGNORE", $string);
I thought this code would remove invalid UTF-8 characters, but it produces [E_NOTICE] "iconv(): Detected an illegal character in input string". What am I missing? How do I properly strip illegal characters from a string?
Answer 1:
The output character set (the second parameter) should be different from the input character set (the first parameter). If they are the same and the string contains illegal UTF-8 characters, iconv will reject them as illegal according to the input character set.
Answer 2:
I know two ways to fix a UTF-8 string that contains illegal characters:
- Illegal characters will be replaced by question marks ("?"):
$message = mb_convert_encoding($message, 'UTF-8', 'UTF-8');
- Illegal characters will be removed:
$message = iconv('UTF-8', 'UTF-8//IGNORE', $message);
The second method is the one described in the question, but it doesn't produce any E_NOTICE in my case. I tested with various corrupted UTF-8 strings with error_reporting(E_ALL); and the result was always as expected. Possibly something has changed since 2012. I tested on PHP 7.2.9 on Windows.
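As a possible variant of the first method, here is a minimal sketch that removes illegal bytes instead of replacing them with "?". It assumes the mbstring extension is loaded; the function name strip_invalid_utf8 is hypothetical and not part of the answers above.

```php
<?php
// Hypothetical helper (name is illustrative, not from the thread).
// Drops bytes that are not valid UTF-8, using mbstring only.
function strip_invalid_utf8(string $input): string
{
    // Already valid: return unchanged.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();

    // "none" tells mbstring to drop illegal bytes instead of emitting "?".
    mb_substitute_character('none');
    try {
        return mb_convert_encoding($input, 'UTF-8', 'UTF-8');
    } finally {
        mb_substitute_character($previous);
    }
}

$message = strip_invalid_utf8("a\xFFb"); // \xFF can never appear in valid UTF-8
```

Because this route does not call iconv, it should not depend on how the local iconv library handles //IGNORE.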
Answer 3:
To simply suppress the notice, you can use the "@" operator:
$string = @iconv("UTF-8", "UTF-8//IGNORE", $string);
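Note that "@" only hides the notice. iconv() is documented to return false on failure, so it may be worth checking the result before using it. The sketch below assumes the mbstring extension is available as a fallback; the function name iconv_strip_or_fallback is hypothetical.

```php
<?php
// Hypothetical helper (name is illustrative, not from the thread).
function iconv_strip_or_fallback(string $input): string
{
    // Suppress the notice, but keep the return value so failure is visible.
    $result = @iconv('UTF-8', 'UTF-8//IGNORE', $input);

    if ($result === false) {
        // iconv gave up; fall back to mbstring, which replaces
        // illegal bytes with its substitute character ("?" by default).
        return mb_convert_encoding($input, 'UTF-8', 'UTF-8');
    }

    return $result;
}

$string = iconv_strip_or_fallback("a\xFFb");
```

Whether the illegal byte ends up removed or replaced depends on which branch runs, so the only thing this sketch aims to guarantee is that the returned string is valid UTF-8.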
Source: https://stackoverflow.com/questions/9375909/iconv-utf-8-ignore-still-produces-illegal-character-error