DOM in PHP: Decoded entities and setting nodeValue

刺人心 2021-01-19 08:21

I want to perform certain manipulations on an XML document with PHP, using the DOM part of its standard library. As others have already discovered, one has to deal with decode

2 answers
  •  逝去的感伤
    2021-01-19 08:31

    As hakre explained, the problem is that in PHP's DOM library, the behaviour of setting nodeValue w.r.t. entities depends on the class of the node, in particular DOMText and DOMElement differ in this regard. To illustrate this, an example:

    $doc = new DOMDocument();
    $doc->formatOutput = true;
    $doc->loadXML('<root/>'); // root element assumed; the original literal was lost
    
    $s = 'text &amp;&lt;<"\'&text;&text';
    
    $root = $doc->documentElement;
    
    $node = $doc->createElement('tag1', $s); #line 10
    $root->appendChild($node);
    
    $node = $doc->createElement('tag2');
    $text = $doc->createTextNode($s);
    $node->appendChild($text);
    $root->appendChild($node);
    
    $node = $doc->createElement('tag3');
    $text = $doc->createCDATASection($s);
    $node->appendChild($text);
    $root->appendChild($node);
    
    echo $doc->saveXML();
    

    outputs

    Warning: DOMDocument::createElement(): unterminated entity reference            text in /tmp/DOMtest.php on line 10
    <?xml version="1.0"?>
    <root>
      <tag1>text &amp;&lt;&lt;"'&text;</tag1>
      <tag2>text &amp;amp;&amp;lt;&lt;"'&amp;text;&amp;text</tag2>
      <tag3><![CDATA[text &amp;&lt;<"'&text;&text]]></tag3>
    </root>
    
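    The same split shows up when nodeValue is assigned directly instead of going through createElement. A minimal sketch, assuming the classic DOMDocument classes; the element names a and b and the sample string are made up for illustration:

```php
<?php
// Sketch: assign the same string to nodeValue on a DOMText and on a DOMElement.
// The document and the sample string are hypothetical.
$doc = new DOMDocument();
$doc->loadXML('<root><a>x</a><b>x</b></root>');

$raw = 'fish &amp; chips';

// DOMText: the string is stored literally; the ampersand is escaped only on output.
$a = $doc->getElementsByTagName('a')->item(0);
$a->firstChild->nodeValue = $raw;

// DOMElement: the string is read as XML character data, so &amp; is taken as an entity.
$b = $doc->getElementsByTagName('b')->item(0);
$b->nodeValue = $raw;

$textResult    = $a->textContent;
$elementResult = $b->textContent;
```

    Under these assumptions $textResult should still be the literal string passed in, while $elementResult should come back with the entity decoded to a plain ampersand.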

    In this particular case, it is appropriate to alter the nodeValue of DOMText nodes. Combining hakre's two answers, one gets quite an elegant solution.

    $doc = new DOMDocument();
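    $doc->loadXML($xml); // $xml: your XML source (placeholder; the argument is missing from the post)
    
    $xpath     = new DOMXPath($doc);
    $node_list = $xpath->query($expression); // $expression: your XPath expression (placeholder, likewise missing)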
    
    $visitTextNode = function (DOMText $node) {
        $text = $node->textContent;
        /*
            do something with $text
        */
        $node->nodeValue = $text;
    };
    
    foreach ($node_list as $node) {
        if ($node->nodeType == XML_TEXT_NODE) {
            $visitTextNode($node);
        } else {
            foreach ($node->childNodes as $child) {
                if ($child->nodeType == XML_TEXT_NODE) {
                    $visitTextNode($child);
                }
            }
        }
    }
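
    For a concrete run of the loop above, here is a small variation that selects the text nodes directly with text() in the XPath expression, which makes the inner loop over childNodes unnecessary. The input document, the expression //p/text() and the use of strtoupper are assumptions chosen only for illustration:

```php
<?php
// Sketch: rewrite text nodes in place via DOMText::nodeValue.
// Input XML and XPath expression are hypothetical.
$doc = new DOMDocument();
$doc->loadXML('<root><p>fish &amp; chips</p><p>a &lt; b</p></root>');

$xpath = new DOMXPath($doc);

// query() returns a snapshot list, so changing node values inside the loop is safe.
foreach ($xpath->query('//p/text()') as $textNode) {
    $text = $textNode->textContent;            // decoded text, e.g. "fish & chips"
    $textNode->nodeValue = strtoupper($text);  // stored literally, escaped again on save
}

// Serialize only the root element, without the XML declaration.
$result = $doc->saveXML($doc->documentElement);
```

    Because the values are written to DOMText nodes, the ampersand and the less-than sign in the modified text are escaped again when the document is serialized.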
    
