Weird error using PHP Simple HTML DOM parser

梦谈多话 2020-11-29 10:30

I am using this library (PHP Simple HTML DOM parser) to parse a link. Here's the code:

function getSemanticRelevantKeywords($keyword){
    $results = array(         


        
9 answers
  • 2020-11-29 10:55

    The error means that the find() function is either not defined yet or not available. Make sure you have loaded or included the file that provides it.

  • 2020-11-29 10:56

    For those arriving here via a search engine (as I did): after reading the information (and the linked bug report) above, I prodded the code and ended up fixing my problem with two extra checks after loading the DOM:

    $html = file_get_html('<your url here>');
    // first check that $html->find exists
    if (method_exists($html, "find")) {
        // then check that an html element exists, to avoid parsing non-HTML
        if ($html->find('html')) {
            // and only then start searching (and manipulating) the DOM
        }
    }
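
    The two checks above can be folded into one small helper. This is only a sketch: find_or_empty is a hypothetical name, not part of Simple HTML DOM, and it assumes that find() called with just a selector returns an array of matches.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): run a selector only
// when $dom is an object exposing find(), and always return an array.
function find_or_empty($dom, string $selector): array
{
    // A failed load yields a non-object (for example false), so reject it first.
    if (!is_object($dom) || !method_exists($dom, 'find')) {
        return [];
    }

    $result = $dom->find($selector);

    // Normalize anything that is not an array of matches to an empty array.
    return is_array($result) ? $result : [];
}
```

    With it, a loop such as foreach (find_or_empty($html, 'span') as $span) simply runs zero times when the page failed to load or the selector matches nothing.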
    
  • 2020-11-29 10:59

    You just need to increase the MAX_FILE_SIZE constant in simple_html_dom.php.

    For example:

    define('MAX_FILE_SIZE', 999999999999999);
    
  • 2020-11-29 11:04

    The reason for this error is that Simple HTML DOM does not return an object if the response from the URL is larger than 600000 bytes (MAX_FILE_SIZE).
    You can avoid this by editing simple_html_dom.php: remove strlen($contents) > MAX_FILE_SIZE from the if condition in the file_get_html function.
    This will solve your issue.
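
    If you would rather not edit the library, a rough alternative is to detect the oversized response yourself and handle it explicitly. The sketch below assumes the 600000 limit quoted above; ASSUMED_MAX_FILE_SIZE and exceeds_parser_limit are hypothetical names, so check the value against your own copy of simple_html_dom.php.

```php
<?php
// Assumed default taken from the answer above; your copy of
// simple_html_dom.php may define a different MAX_FILE_SIZE.
const ASSUMED_MAX_FILE_SIZE = 600000;

// Hypothetical helper: true when the fetched body is larger than the limit,
// which is the case in which the parser refuses to return an object.
function exceeds_parser_limit(string $contents, int $limit = ASSUMED_MAX_FILE_SIZE): bool
{
    // strlen() counts bytes, matching the strlen($contents) check in the library.
    return strlen($contents) > $limit;
}
```

    Fetching the page body first and passing it through this check lets you log or skip large pages instead of hitting the find() error later.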

  • 2020-11-29 11:05

    I'm having the same error come up in my logs and apart from the solutions mentioned above, it could also be that there is no 'span' in the document. I get the same error when searching for divs with a particular class that doesn't exist on the page, but when searching for something that I know exists on the page, the error doesn't pop up.

  • 2020-11-29 11:07

    Before calling file_get_html() or load_file(), first check whether the URL exists.

    If it does, the first check passes.
    (Some servers serve their 404 page as a valid HTML page with a proper structure such as head and body, but containing only text like "This page couldn't be found. 404 error...".)

    If the URL returns 200 OK, then check that what you fetched is an object and that its nodes are set.

    This is the code I use in my pages.

    function url_exists($url){
        // Prepend a scheme when the URL does not contain one
        if (strpos($url, "http") === false) $url = "http://" . $url;
        // @ suppresses the warning get_headers() raises on failure
        $headers = @get_headers($url);
        if (!is_array($headers)) return false;
        // $headers[0] is the status line, e.g. "HTTP/1.1 404 Not Found"
        return strpos($headers[0], '404 Not Found') === false;
    }
    
    $pageAddress = 'http://www.google.com';
    $htmlPage = new simple_html_dom(); // must exist before load_file() is called
    if ( url_exists($pageAddress) ) {
        $htmlPage->load_file( $pageAddress );
    } else {
        echo "URL doesn't exist, stopping";
        return;
    }
    
    if( $htmlPage && is_object($htmlPage) && isset($htmlPage->nodes) )
    {
        // do your work here...
    } else {
        echo 'fetched page is not ok, i stop';
        return;
    }
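
    The check in url_exists only recognizes the exact text "404 Not Found". A slightly more general approach, sketched here under the assumption that the first entry returned by get_headers() is a standard HTTP status line, is to parse the numeric code. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 404 Not Found". Returns null when the line does not
// look like an HTTP status line.
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: treat only 2xx codes as "the page is there".
function is_ok_status(?int $code): bool
{
    return $code !== null && $code >= 200 && $code < 300;
}
```

    For example, is_ok_status(status_code_from_line($headers[0])) rejects 404 as well as other error codes such as 403 or 500, which the substring check would let through.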
    