How to get the first level of DOM elements with DOMDocument?

轻奢々 2020-12-28 18:22

How do I get the first level of DOM elements with PHP's DOMDocument?

Example with code that does not work, taken from the Q&A: How to get nodes in first level using PHP DOMDocument?

1 answer
  • 2020-12-28 19:06

    The first level of elements below the root node can be accessed with

    $dom->documentElement->childNodes
    

    The childNodes property contains a DOMNodeList, which you can iterate with foreach.

    See DOMDocument::documentElement

    This is a convenience attribute that allows direct access to the child node that is the document element of the document.

    and DOMNode::childNodes

    A DOMNodeList that contains all children of this node. If there are no children, this is an empty DOMNodeList.

    Since childNodes is a property of DOMNode, any class extending DOMNode (which is most of the classes in DOM) has this property. To get the first level of nodes below a DOMElement, access that DOMElement's childNodes property.
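
    As a minimal sketch of the point above, the following reads childNodes first from the document element and then from one of its children. The XML string and its element names (root, a, a1, ...) are made up for illustration; XML without whitespace is used here so that no text nodes or HTML skeleton get in the way.

    ```php
    <?php
    // Hypothetical sample document; the element names are invented for this sketch.
    $dom = new DOMDocument;
    $dom->loadXML('<root><a><a1/><a2/></a><b/><c/></root>');

    // First level below the root element.
    $firstLevel = [];
    foreach ($dom->documentElement->childNodes as $node) {
        $firstLevel[] = $node->nodeName;
    }

    // First level below an arbitrary element, here the <a> element.
    $a = $dom->documentElement->firstChild;
    $belowA = [];
    foreach ($a->childNodes as $node) {
        $belowA[] = $node->nodeName;
    }

    echo implode(', ', $firstLevel), PHP_EOL; // a, b, c
    echo implode(', ', $belowA), PHP_EOL;     // a1, a2
    ```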


    Note that if you use DOMDocument::loadHTML() on invalid HTML or partial documents, the HTML parser module will add an HTML skeleton with html and body tags, so in the DOM tree, the HTML in your example will be

    <!DOCTYPE html … ">
    <html><body><div id="header">
    </div>
    <div id="content">
        <div id="sidebar">
        </div>
        <div id="info">
        </div>
    </div>
    <div id="footer">
    </div></body></html>
    

    which you have to take into account when traversing or using XPath. Consequently, using

    $dom = new DOMDocument;
    $dom->loadHTML($str);
    foreach ($dom->documentElement->childNodes as $node) {
        echo $node->nodeName; // body
    }
    

    will only iterate the <body> DOMElement node. Knowing that libxml will add the skeleton, you will have to iterate over the childNodes of the <body> element to get the div elements from your example code, e.g.

    $dom->getElementsByTagName('body')->item(0)->childNodes
    

    However, doing so will also take into account any whitespace nodes, so you either have to make sure to set preserveWhiteSpace to false or query for the right element nodeType if you only want to get DOMElement nodes, e.g.

    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $node) {
        if ($node->nodeType === XML_ELEMENT_NODE) {
            echo $node->nodeName;
        }
    }
    

    or use XPath

    $dom = new DOMDocument;
    $dom->loadHTML($str);
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('/html/body/*') as $node) {
        echo $node->nodeName;
    }
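
    The same two techniques also work one level further down. The sketch below is only an illustration: $str is a hand-written copy of the markup from the example, and childElementIds() is a hypothetical helper, not part of the DOM extension. It collects the element children of a node with the nodeType check, then repeats the lookup with a relative XPath query evaluated against that node.

    ```php
    <?php
    // Markup modelled on the example in this answer (assumed input).
    $str = '<div id="header"></div>'
         . '<div id="content"><div id="sidebar"></div><div id="info"></div></div>'
         . '<div id="footer"></div>';

    $dom = new DOMDocument;
    $dom->loadHTML($str);
    $xpath = new DOMXPath($dom);

    // Hypothetical helper: ids of the element children of $parent,
    // skipping text nodes and other non-element nodes.
    function childElementIds(DOMNode $parent): array
    {
        $ids = [];
        foreach ($parent->childNodes as $node) {
            if ($node->nodeType === XML_ELEMENT_NODE) {
                $ids[] = $node->getAttribute('id');
            }
        }
        return $ids;
    }

    $body    = $dom->getElementsByTagName('body')->item(0);
    $content = $xpath->query('//div[@id="content"]')->item(0);

    // Relative XPath query: './*' selects the element children of the context node.
    $viaXPath = [];
    foreach ($xpath->query('./*', $content) as $node) {
        $viaXPath[] = $node->getAttribute('id');
    }

    echo implode(', ', childElementIds($body)), PHP_EOL;    // header, content, footer
    echo implode(', ', childElementIds($content)), PHP_EOL; // sidebar, info
    echo implode(', ', $viaXPath), PHP_EOL;                 // sidebar, info
    ```

    Passing a context node as the second argument to DOMXPath::query() keeps the expression independent of where the element sits in the tree, so the html and body wrappers added by the parser do not need to be spelled out.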
    

    Additional information:

    • DOMDocument in php
    • Printing content of a XML file using XML DOM