Question
I want to convert HTML entities to UTF-8, but mb_convert_encoding
destroys characters that are already UTF-8 encoded. What's the correct way?
$text = "äöü &auml; &ouml; &uuml; &szlig;";
var_dump(mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES'));
// string(24) "Ã¤Ã¶Ã¼ ä ö ü ß"
Answer 1:
mb_convert_encoding() isn't the correct function for what you're trying to achieve. Use html_entity_decode() instead: it converts only the actual HTML entities to UTF-8 and leaves the existing UTF-8 characters in the string untouched.
$text = "äöü &auml; &ouml; &uuml; &szlig;";
var_dump(html_entity_decode($text, ENT_COMPAT | ENT_HTML401, 'UTF-8'));
which gives
string(18) "äöü ä ö ü ß"
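The 18-byte result can be checked independently of how the script file itself is saved. The following is a minimal sketch, assuming the input is the UTF-8 string from the question; the UTF-8 bytes are written as escapes, and the hex dump is only there to make the bytes visible.

```php
<?php
// "äöü" as explicit UTF-8 bytes, followed by four named entities.
// Using byte escapes means the file's own encoding cannot affect the test.
$text = "\xC3\xA4\xC3\xB6\xC3\xBC &auml; &ouml; &uuml; &szlig;";

// Decode the entities to UTF-8; bytes that are already UTF-8 pass through.
$decoded = html_entity_decode($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');

echo strlen($decoded), "\n";   // 18
echo bin2hex($decoded), "\n";  // c3a4c3b6c3bc20c3a420c3b620c3bc20c39f
```

Each umlaut and the ß occupy two bytes in the output, whether they came from the literal characters or from an entity.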
Answer 2:
On my localhost I get string(18) "äöü ä ö ü ß".
I think it's related to your page encoding. Open the file in Notepad++, go to the Encoding menu, and choose 'Encode in ANSI'. If that doesn't work, try 'Encode in UTF-8 without BOM'.
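Before changing the file encoding, it may help to check which bytes the string actually contains. The sketch below uses a hypothetical helper, describe_bytes(), which is not part of the answer; it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper (an assumption for illustration): show the raw bytes
// of a string and whether they form valid UTF-8.
function describe_bytes(string $s): string
{
    $kind = mb_check_encoding($s, 'UTF-8') ? 'valid UTF-8' : 'not valid UTF-8';
    return bin2hex($s) . ' (' . $kind . ')';
}

echo describe_bytes("\xC3\xA4"), "\n"; // "ä" as stored in a UTF-8 file: c3a4 (valid UTF-8)
echo describe_bytes("\xE4"), "\n";     // "ä" as stored in an ANSI file: e4 (not valid UTF-8)
```

Passing the string literal from your own script to the helper shows which of the two cases applies to your file.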
Answer 3:
If that still isn't working, try this:
html_entity_decode($html, ENT_QUOTES, 'cp1252');
This is what was needed on a Windows IIS system for things to start working correctly.
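Note that with 'cp1252' as the charset, html_entity_decode() returns Windows-1252 bytes, not UTF-8. If UTF-8 is still the goal, a conversion step can follow. This is a minimal sketch, assuming the input really is Windows-1252 encoded and that mbstring is available; the sample string is made up for illustration.

```php
<?php
// Assumed input: "ä" as a single Windows-1252 byte, plus one entity.
$html = "\xE4 &auml;";

// Decode entities into Windows-1252 bytes, as in the answer.
$cp1252 = html_entity_decode($html, ENT_QUOTES, 'cp1252');

// Convert the whole string from Windows-1252 to UTF-8.
$utf8 = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');

echo bin2hex($cp1252), "\n"; // e420e4
echo bin2hex($utf8), "\n";   // c3a420c3a4
```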
Source: https://stackoverflow.com/questions/31338277/convert-html-entities-to-utf-8-but-keep-existing-utf-8