PHP problem with Russian language

野趣味 2021-01-03 10:45

I fetch a UTF-8 page containing Russian text using cURL. If I echo the text, it displays correctly. Then I use the following code:

$dom = new domDocument; 

        /*** load the html          


        
3 Answers
  • 2021-01-03 11:42

    I suggest calling mb_convert_encoding on the UTF-8 page before loading it.

        $dom = new DomDocument();   
        $html = mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");
        @$dom->loadHTML($html);
    

    Or else you could try this:

        $dom = new DomDocument('1.0', 'UTF-8');
        @$dom->loadHTML($html);
        $dom->preserveWhiteSpace = false;
        ..........
        echo html_entity_decode($cols->item(2)->nodeValue,ENT_QUOTES,"UTF-8");
        .......... 
    
  • 2021-01-03 11:42

    DOMDocument cannot detect the HTML's encoding unless the markup declares it. You can try something like:

    $doc = new DOMDocument();
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    
    // taken from http://php.net/manual/en/domdocument.loadhtml.php#95251
    
  • 2021-01-03 11:51

    mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");

    The same approach worked for phpQuery.

    P.S. I use phpQuery::newDocument($html); instead of $dom->loadHTML($html);

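The entity-conversion idea from the answers above can be sketched end to end as follows. This is only an illustration under some assumptions: the mbstring and dom extensions are loaded, the sample string stands in for the cURL response, and `utf8ToNumericEntities()` is a hypothetical helper name. It uses `mb_encode_numericentity` rather than the `'HTML-ENTITIES'` pseudo-encoding, because that mbstring encoding is deprecated as of PHP 8.2; on older versions the `mb_convert_encoding` call shown in the answers should behave similarly.

```php
<?php
// Sketch only: assumes the mbstring and dom extensions are available.
// utf8ToNumericEntities() is a hypothetical helper, not a PHP built-in.

function utf8ToNumericEntities(string $html): string
{
    // Map format: [first code point, last code point, offset, mask].
    // Every character from U+0080 upward becomes a decimal entity (e.g. &#1055;),
    // so the string handed to the parser is plain ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($html, $map, 'UTF-8');
}

// Sample input standing in for the page fetched with cURL.
$html = '<p>Привет, мир</p>';

libxml_use_internal_errors(true);   // collect parser warnings instead of silencing with @
$dom = new DOMDocument();
$dom->loadHTML(utf8ToNumericEntities($html));
libxml_clear_errors();

// DOM string properties are returned as UTF-8.
$text = $dom->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;
```

Because the parser only ever sees ASCII plus numeric entities, the result does not depend on which encoding it would otherwise assume for undeclared input.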