PHP problem with Russian language

断了今生、忘了曾经 submitted on 2019-11-30 16:06:22
Asif Mulla

I suggest calling mb_convert_encoding before loading a UTF-8 page.

    $dom = new DOMDocument();
    // Turn non-ASCII characters into HTML entities so the parser cannot misread them
    $html = mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8');
    @$dom->loadHTML($html);
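
Note that passing 'HTML-ENTITIES' to mb_convert_encoding is deprecated as of PHP 8.2. A minimal sketch of the same idea using mb_encode_numericentity instead (a swapped-in function, not what the answer above uses) might look like the following. The helper names encodeNonAscii and firstParagraphText are made up for illustration, and the sample markup is an assumption.

```php
<?php
// Hypothetical helper: encode every non-ASCII character as a numeric entity.
function encodeNonAscii(string $html): string
{
    // Conversion map: [first code point, last code point, offset, mask]
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($html, $convmap, 'UTF-8');
}

// Hypothetical helper: parse UTF-8 HTML and return the text of the first <p>.
function firstParagraphText(string $html): string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML(encodeNonAscii($html));
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

// Sample input is an assumption; any UTF-8 page with Cyrillic text would do.
echo firstParagraphText('<html><body><p>Привет, мир</p></body></html>'), PHP_EOL;
```

Because only ASCII reaches the parser, the result should not depend on which encoding libxml guesses for the document.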

Or else you could try this:

    $dom = new DOMDocument('1.0', 'UTF-8');
    // Parser properties such as preserveWhiteSpace are read at load time, so set them first
    $dom->preserveWhiteSpace = false;
    @$dom->loadHTML($html);
    ..........
    echo html_entity_decode($cols->item(2)->nodeValue, ENT_QUOTES, 'UTF-8');
    ..........

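DOMDocument cannot detect the HTML's encoding on its own; without a charset declaration it falls back to ISO-8859-1, which garbles Cyrillic text. You can try something like: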

    $doc = new DOMDocument();
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    // taken from http://php.net/manual/en/domdocument.loadhtml.php#95251
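
For a fuller picture, here is a rough sketch of that prefix trick wrapped in a function. The name loadUtf8Html and the sample markup are assumptions for illustration, and dropping the leftover processing-instruction node afterwards is an optional cleanup step, not something the snippet above requires.

```php
<?php
// Hypothetical helper: load UTF-8 HTML by prepending an XML encoding hint.
function loadUtf8Html(string $html): DOMDocument
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Optional cleanup: remove the hint if it was kept as a top-level node.
    // iterator_to_array() takes a snapshot so removal does not disturb the loop.
    foreach (iterator_to_array($doc->childNodes) as $node) {
        if ($node->nodeType === XML_PI_NODE) {
            $doc->removeChild($node);
        }
    }

    return $doc;
}

// Sample input is an assumption; a fragment is enough for a quick check.
$doc = loadUtf8Html('<p>Привет, мир</p>');
$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;
```

If the text still comes out garbled, it is worth checking that $html really is UTF-8 before it reaches loadHTML, for example with mb_check_encoding($html, 'UTF-8').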

    // Assign the result; mb_convert_encoding returns the converted string
    $html = mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8');

The same thing worked for phpQuery.

P.S. I use phpQuery::newDocument($html); instead of $dom->loadHTML($html);
