PHP DOMDocument loadHTML not encoding UTF-8 correctly

梦如初夏 2020-11-22 15:11

I'm trying to parse some HTML using DOMDocument, but when I do, I suddenly lose my encoding (at least that is how it appears to me).

$profile = "

        
13 Answers
  •  失恋的感觉
    2020-11-22 15:19

    The only thing that worked for me was the approach from the accepted answer:

    $profile = '<p>イリノイ州シカゴにて、アイルランド系の家庭に、9</p>';
    $dom = new DOMDocument();
    $dom->loadHTML('<?xml encoding="utf-8" ?>' . $profile);
    echo $dom->saveHTML();

    However, this brought about a new issue: the <?xml encoding="utf-8" ?> processing instruction now appeared in the output of the document.

    The solution for me was to remove it again after loading:

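    // Copy the node list first; removing nodes from the live list while iterating can skip entries.
    foreach (iterator_to_array($dom->childNodes) as $node) {
        if ($node instanceof \DOMProcessingInstruction) {
            $node->parentNode->removeChild($node);
        }
    }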
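
Putting the two steps together, a minimal sketch might look like the following. It assumes the input is a UTF-8 HTML fragment and that the `<?xml encoding="utf-8" ?>` prefix is used to signal the encoding to the parser. The helper name `loadUtf8Html` is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper (not part of DOMDocument): parse a UTF-8 HTML
// fragment, then strip the XML prolog that was added only as an encoding hint.
function loadUtf8Html(string $html): DOMDocument
{
    $dom = new DOMDocument();
    // Assumption: the prolog prefix makes the parser read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);

    // Iterate over a copy of the node list, because removing nodes from the
    // live childNodes list during iteration can skip entries.
    foreach (iterator_to_array($dom->childNodes) as $node) {
        if ($node instanceof DOMProcessingInstruction) {
            $node->parentNode->removeChild($node);
        }
    }

    return $dom;
}

$dom = loadUtf8Html('<p>イリノイ州シカゴにて</p>');
echo $dom->saveHTML();
```

Wrapping the hack in one function keeps the prefix and its cleanup together, so callers never see the processing instruction.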
    

    Some solutions suggested that, to remove the XML header, I should instead call:

    $dom->saveXML($dom->documentElement);
    

    This didn't work for me because, for a partial document (e.g. a document with two <p> tags), only one of the <p> tags was returned.
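
One possible way to get every top-level element of a fragment back is to serialize the children of `<body>` one by one, since `saveHTML()` accepts a node argument. This is a sketch, not the method from the answer above. It assumes `loadHTML()` is called without `LIBXML_HTML_NOIMPLIED`, so the parser wraps the fragment in `<html><body>`. The helper name `fragmentHtml` is hypothetical.

```php
<?php
// Hypothetical helper: return the markup of everything inside <body>,
// so a fragment with several top-level elements is kept whole.
function fragmentHtml(DOMDocument $dom): string
{
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // nothing was parsed into a body element
    }

    $html = '';
    foreach ($body->childNodes as $child) {
        // saveHTML($node) serializes only the given subtree.
        $html .= $dom->saveHTML($child);
    }

    return $html;
}

$dom = new DOMDocument();
$dom->loadHTML('<?xml encoding="utf-8" ?><p>first</p><p>second</p>');
echo fragmentHtml($dom);
```

Because only the contents of `<body>` are serialized, the document-level XML prolog does not appear in the result either.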
