XPath Query Problem using HTML Agility Pack

Submitted by 末鹿安然 on 2020-01-04 13:30:14

Question


I'm trying to scrape the price field from this website using the HTML Agility Pack.

My code is as follows:

    var web = new HtmlWeb();
    var doc = web.Load(String.Format(overClockersURL, componentID));
    var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");

I obtained the XPath query by using Firebug's "Copy as XPath" feature.

The problem is that SelectSingleNode returns null: it doesn't seem to find the element specified by the query. I'm a bit stumped as to why, but I don't have much experience with XPath, so I would appreciate some pointers as to what I've done wrong.


Answer 1:


When that happens, first check whether the page is being loaded correctly (you said you're behind an HTTP proxy?).

Try writing the content of doc.DocumentNode.OuterHtml to a text file so you can see if the page is being loaded correctly. Maybe you're getting an error page instead of the original page.
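
The check described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name PageDump, the helper Dump, and the output file name page-dump.html are made up for this example, and the URL is taken from the command line rather than from the asker's overClockersURL format string.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

static class PageDump
{
    // Hypothetical helper: saves the HTML that HtmlAgilityPack actually parsed,
    // so it can be opened and inspected by hand.
    public static void Dump(HtmlDocument doc, string path)
    {
        File.WriteAllText(path, doc.DocumentNode.OuterHtml);
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: PageDump <url>");
            return;
        }

        var web = new HtmlWeb();
        var doc = web.Load(args[0]);

        // Assumed file name; any writable path will do.
        Dump(doc, "page-dump.html");

        // SelectSingleNode returns null when nothing matches, so test before use.
        var node = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
        Console.WriteLine(node == null
            ? "prodprice not found; inspect page-dump.html"
            : "price=" + node.InnerText);
    }
}
```

If the saved file turns out to be a proxy error page or a login page, the XPath query itself is probably not at fault.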




Answer 2:


If I run this code:

    var web = new HtmlWeb();
    var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");
    var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
    Console.WriteLine("price=" + priceContent.InnerHtml);

It outputs:

price=529.99

So it seems to be working. You can also use "//span[@id=\"prodprice\"]", which is more specific because it matches only span elements.
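
To see how the element-specific query behaves, here is a small sketch that runs against an inline HTML string instead of the live site. The markup is hypothetical and is not the real page; PriceLookup and FindText are names invented for this example. The XPath strings use single quotes around the attribute value, which avoids escaping inside a C# string.

```csharp
using System;
using HtmlAgilityPack;

static class PriceLookup
{
    // Hypothetical helper: returns the trimmed text of the first node
    // matching xpath, or null if nothing matches.
    public static string FindText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText.Trim();
    }

    static void Main()
    {
        // Assumed sample markup, NOT copied from the real product page.
        const string html = "<div><span id='prodprice'>529.99</span></div>";

        // Matches: the element is a span with the expected id.
        Console.WriteLine(FindText(html, "//span[@id='prodprice']") ?? "(not found)");

        // Does not match: the id exists, but not on a p element.
        Console.WriteLine(FindText(html, "//p[@id='prodprice']") ?? "(not found)");
    }
}
```

Returning null instead of dereferencing the node directly avoids the NullReferenceException that priceContent.InnerHtml would throw when the element is missing.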



Source: https://stackoverflow.com/questions/5980670/xpath-query-problem-using-html-agility-pack
