DOM and XPath scraping - what's wrong here?

不思量自难忘° 2021-01-27 05:46

I need to scrape a piece of text from a webpage. I am using DOM and XPath to find the data, but I can't seem to select the exact information I need. He

2 Answers
  • 2021-01-27 06:23

    Your XPath works when I use it in Firefox, but it won't work with PHP's DOM extension, which is not surprising. I assume you got your XPath from a browser plugin that returns the path to a selected element. However, you should not trust XPaths returned by browser plugins, because browsers modify the DOM through JavaScript and add implied values where necessary. Write your XPath against the raw source code instead.

    Your XPath evaluates to "Home delivery within 2 days" in Firefox, which is not what I would expect in a variable called "stock_data". But anyway, this should do it:

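    $dom = new DOMDocument;
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTMLFile('http://www.argos.co.uk/static/Product/partNumber/9282197/Trail/searchtext%3EIPOD+TOUCH.htm');
    // Discard the collected warnings once the document is loaded.
    libxml_clear_errors();
    
    $xpath = new DOMXPath($dom);
    // Path written against the raw source, anchored on the id attribute.
    $nodes = $xpath->query(
        '/html/body//div[@id="deliveryInformation"]/ul/li[@class="home"]/span'
    );
    echo $nodes->item(0)->nodeValue; // "Home delivery within 2 days"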
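
To illustrate the warning about browser-derived XPaths, here is a minimal sketch. It assumes a made-up HTML snippet (not the Argos page) and uses the commonly seen case of the `<tbody>` element, which browsers insert into tables even when the raw markup omits it. The table id `stock` and the variable names are hypothetical.

```php
<?php
// Hypothetical raw markup as a server might send it: there is no <tbody> tag.
$html = '<html><body><table id="stock"><tr><td>In stock</td></tr></table></body></html>';

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Path as a browser tool would typically report it, including the implied <tbody>.
$fromBrowser = $xpath->query('//table[@id="stock"]/tbody/tr/td');

// Path written against the raw source, without <tbody>.
$fromSource = $xpath->query('//table[@id="stock"]/tr/td');

echo $fromBrowser->length, "\n";
echo $fromSource->length, "\n";
```

With the libxml-based DOMDocument, the first query should match nothing and the second should match the single cell, which is why a path copied from the browser can come back empty.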
    
  • 2021-01-27 06:30

    Running your code, I first get:

    Notice: Undefined variable: expr_argos
    Warning: DOMXPath::query() [domxpath.query]: Invalid expression
    

    So, first of all, make sure the variable you pass to query() is actually defined. For example, you should have this:

    $nodes_argos = $xpath_argos->query($expr_currys);
    

    instead of what you currently have:

    $nodes_argos = $xpath_argos->query($expr_argos);
    


    Then, you get the following error:

    Notice: Trying to get property of non-object
    

    on the following line:

    $argos_stock_data = $nodes_argos->item(0)->nodeValue;
    

    Basically, this means you are trying to read a property, nodeValue, on something that is not an object: $nodes_argos->item(0).

    I'm guessing your XPath query doesn't match anything, so the node list returned by query() is empty and item(0) returns null.

    You should check your XPath query (which is too long to be easy to understand), making sure it matches something in your HTML page.
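
One way to avoid the "non-object" notice is to check the query result before reading nodeValue. The following is only a sketch: `firstNodeValue` is a hypothetical helper name, and the HTML snippet is made up for illustration rather than taken from the real page.

```php
<?php
/**
 * Hypothetical helper: returns the text of the first node matched by $expr,
 * or null when the expression is invalid or matches nothing.
 */
function firstNodeValue(DOMXPath $xpath, string $expr): ?string
{
    // query() returns false for an invalid expression; @ hides its warning.
    $nodes = @$xpath->query($expr);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

// Made-up markup for illustration only.
$dom = new DOMDocument;
$dom->loadHTML(
    '<html><body><ul><li class="home"><span>Home delivery within 2 days</span></li></ul></body></html>'
);
$xpath = new DOMXPath($dom);

$found   = firstNodeValue($xpath, '//li[@class="home"]/span');
$missing = firstNodeValue($xpath, '//li[@class="store"]/span');

echo $found ?? 'no match', "\n";
echo $missing ?? 'no match', "\n";
```

Returning null for "no match" keeps the calling code explicit about the failure case instead of letting it surface as a notice further down.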
