PHP HTML DomDocument getElementById problems

佛祖请我去吃肉 2020-11-27 07:40

A little new to PHP parsing here, but I can't seem to get PHP's DomDocument to return what is clearly an identifiable node. The HTML loaded will come from the 'net so ca

2 Answers
  • 2020-11-27 07:47

    Well, you should check whether $dom->loadHTML($html); returns true (success), and I would try

     var_dump($belement->nodeValue);
    

    to print the value and get a clue about what might be wrong.
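
    A minimal sketch of that check, assuming a small sample document ($html and the id "bid" are illustrative placeholders, not taken from the question). It collects libxml's parse messages instead of letting them surface as PHP warnings, then tests the lookup result before dereferencing it:

    ```php
    <?php
    // Illustrative markup; replace with the HTML you actually fetch.
    $html = '<html><body>Hello <b id="bid">World</b>.</body></html>';

    // Buffer libxml errors so they can be inspected explicitly.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $ok  = $dom->loadHTML($html);   // true on success, false on failure
    var_dump($ok);

    // Print whatever the parser complained about, if anything.
    foreach (libxml_get_errors() as $error) {
        echo 'line ', $error->line, ': ', trim($error->message), "\n";
    }
    libxml_clear_errors();

    // getElementById() returns null when no element is registered under that ID,
    // so check before touching ->nodeValue.
    $el = $dom->getElementById('bid');
    if ($el === null) {
        echo "no element registered under id 'bid'\n";
    } else {
        var_dump($el->nodeValue);
    }
    ```

    Whether the lookup succeeds here can depend on the PHP and libxml versions in use, which is why the null branch is worth keeping.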

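    EDIT: http://www.php-editors.com/php_manual/function.domdocument-get-element-by-id.html - it seems that the old DOM XML implementation documented there does an XPath search internally. Those functions belong to the PHP 4 extension; with the current DOM classes the same lookup goes through DOMXPath.

    Example:

    $xpath = new DOMXPath($dom);
    // The HTML parser lowercases attribute names, so match @id rather than @ID
    var_dump($xpath->query("//*[@id = 'YOURIDGOESHERE']")->item(0));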
    
  • 2020-11-27 08:08

    The Manual explains why:

    For this function to work, you will need either to set some ID attributes with DOMElement->setIdAttribute() or a DTD which defines an attribute to be of type ID. In the latter case, you will need to validate your document with DOMDocument->validate() or DOMDocument->validateOnParse before using this function.

    By all means, go for valid HTML & provide a DTD.

    Quick fixes:

    1. Call $dom->validate(); and put up with the errors (or fix them). Afterwards you can use $dom->getElementById(), regardless of the errors, for some reason.
    2. Use XPath if you don't feel like validating: $x = new DOMXPath($dom); $el = $x->query("//*[@id='bid']")->item(0);
    3. Come to think of it: if you just set validateOnParse to true before loading the HTML, it would also work ;P


    $dom = new DOMDocument();
    $html ='<html>
    <body>Hello <b id="bid">World</b>.</body>
    </html>';
    $dom->validateOnParse = true; // set this first,
    $dom->loadHTML($html);        // because loading is when parsing happens
    
    $dom->preserveWhiteSpace = false;
    
    $belement = $dom->getElementById("bid");
    echo $belement->nodeValue;
    

    Outputs 'World' here.
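
    The other route the manual mentions, DOMElement->setIdAttribute(), can be sketched as follows. This is only an illustration under assumptions: the sample markup and the id "bid" are reused from above, and the elements are located with an XPath query first so that each id attribute can be flagged as an ID.

    ```php
    <?php
    $html = '<html><body>Hello <b id="bid">World</b>.</body></html>';

    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Find every element carrying an id attribute and mark that
    // attribute as being of type ID.
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//*[@id]') as $element) {
        $element->setIdAttribute('id', true);
    }

    // With the attributes registered, the lookup by ID can be used.
    $b = $dom->getElementById('bid');
    echo $b->nodeValue;   // prints "World"
    ```

    This costs one pass over the document, so for a single lookup the plain XPath query from quick fix 2 is simpler; registering the IDs pays off mainly when getElementById() is called many times.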
