Convert HTML entities and special characters to UTF8 text in PHP

Submitted by 只谈情不闲聊 on 2019-12-08 05:57:43

Question


There are a lot of questions and documentation about converting HTML entities and special characters to UTF-8 text in PHP, including the PHP documentation itself for htmlspecialchars_decode() and html_entity_decode(). However, I could not find any function or solution that clearly describes how to convert arbitrary HTML entities and special characters to UTF-8 text. All of them say something like "if you want to do this, then do that". None of them says "to get pure UTF-8 text that can be read by humans, do this".

The reason I am asking is that I don't really have a test case. I am reading from a database, and the content is multilingual. The only guarantee is that the characters are HTML-encoded, and I need to convert them to UTF-8 in a way that can be read by people who understand those languages. How can I do that? What is the proper way to sanitize/decode the input so that it is pure text?

Thanks.


Update

Here is an update, since it is clear from the comments that I was not asking the question properly. My DB contains text. I would like to convert that text (which contains HTML entities and special characters) to UTF-8 text that I can display to the end user on a web page. The text in the database is written in multiple languages (such as French, Arabic, English, etc.), and any of it can contain HTML entities for special characters. So how can I convert all of that to UTF-8 text that can be read by people who understand those languages? I would like to replace those entities with the actual characters they represent.


Answer 1:


This works for me for decoding entities to UTF-8:

html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');

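Edit: The "trick" is the combination of flags in the second parameter, together with passing the encoding explicitly as the third parameter. ENT_QUOTES decodes both double and single quotes, and ENT_HTML5 selects the HTML5 entity table. If you just call html_entity_decode($str); the output encoding falls back to PHP's default, which was ISO-8859-1 before PHP 5.4, so the result is not guaranteed to be UTF-8.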
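As a minimal sketch of the call above applied to multilingual input: the helper name entities_to_utf8() and the sample strings are assumptions made up for illustration, not part of the original answer. It covers named entities, decimal and hexadecimal numeric references, and quotes.

```php
<?php
// Hypothetical helper (the name is an assumption): wraps the call from the answer.
function entities_to_utf8(string $text): string
{
    // ENT_QUOTES: decode both &quot; and &#039;
    // ENT_HTML5:  use the HTML5 named-entity table
    // 'UTF-8':    encoding of the decoded output
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Made-up sample values standing in for rows read from the database.
$samples = [
    'caf&eacute;',                   // named entity (French)
    '&#1593;&#1585;&#1576;&#1610;',  // decimal numeric references (Arabic letters)
    '&#x20AC;5 &amp; up',            // hexadecimal reference plus &amp;
    'It&#039;s &quot;ok&quot;',      // single and double quotes
];

foreach ($samples as $sample) {
    echo entities_to_utf8($sample), PHP_EOL;
}
```

If the decoded text is then printed into an HTML page, it would normally be passed through htmlspecialchars($text, ENT_QUOTES, 'UTF-8') at output time, so that characters such as < and & are escaped again for the browser.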



Source: https://stackoverflow.com/questions/25372068/convert-html-entities-and-special-characters-to-utf8-text-in-php
