Removing DOMDocument warning while parsing page content

灰色年华 2020-12-22 04:38

I am trying to parse the content of any URL. The result should not contain any HTML code. This works fine, but it produces a bunch of errors while reading the content at the given URL. How to

1 answer
  • 2020-12-22 05:28

    You can use libxml_use_internal_errors() and do the following:

    libxml_use_internal_errors(true);
    $doc->loadHTMLFile($url);
    libxml_clear_errors();
    

    As Peehaa noted in the comments below, it's a good idea to restore the previous error-handling setting afterwards. You can do it like this:

    $errors = libxml_use_internal_errors(true); // returns the previous setting; store it
    $doc->loadHTMLFile($url);
    libxml_clear_errors();
    libxml_use_internal_errors($errors); // restore the previous setting
    

    Here's how it works:

    • libxml_use_internal_errors(true) tells libxml to collect errors and warnings in an internal buffer instead of emitting them as PHP warnings. It returns the previous setting, which is stored in a variable
    • loadHTMLFile() then loads the HTML document
    • libxml_clear_errors() clears the error buffer
    • the final libxml_use_internal_errors($errors) call restores the previous setting
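
    The steps above can be sketched as one small helper. This is only an illustration: the function name extractText() and the sample markup are made up for this example, and it parses a string with loadHTML() instead of fetching a URL so that it runs without network access. It also reads the buffered errors with libxml_get_errors() before clearing them, in case you want to log them rather than discard them.

```php
<?php
// Hypothetical helper (not part of the original answer): parse an HTML
// string without emitting warnings and return its text plus the errors
// that libxml buffered while parsing.
function extractText(string $html): array
{
    // Switch to internal error handling; the return value is the previous setting.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Copy the buffered LibXMLError objects before clearing the buffer.
    $errors = libxml_get_errors();
    libxml_clear_errors();

    // Restore whatever setting was active before this function ran.
    libxml_use_internal_errors($previous);

    $root = $doc->documentElement;
    $text = $root !== null ? trim($root->textContent) : '';

    return ['text' => $text, 'errors' => $errors];
}

// Sample markup is an assumption for illustration only.
$result = extractText('<html><body><p>Hello <b>world</b></p><section>!</section></body></html>');
echo $result['text'], PHP_EOL;

// Which errors (if any) are reported depends on the installed libxml2 version.
foreach ($result['errors'] as $error) {
    echo trim($error->message), ' (line ', $error->line, ')', PHP_EOL;
}
```

    To work with a remote page as in the question, the same pattern applies with $doc->loadHTMLFile($url) in place of loadHTML($html).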

