Question
I'm trying to scrape the price field from this website using the HTML Agility Pack.
My code is as follows:
var web = new HtmlWeb();
var doc = web.Load(String.Format(overClockersURL, componentID));
var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
I obtained the XPath query by using Firebug's "Copy as XPath" feature.
The problem I'm having is that SelectSingleNode is returning null, so it doesn't seem to find the element specified by the query. I'm a bit stumped as to why, but I don't have much experience with XPath, so I would appreciate some pointers as to what I've done wrong.
Answer 1:
When that happens, you should check whether the page is actually being loaded correctly (you said you're behind an HTTP proxy?).
Try writing the contents of doc.DocumentNode.OuterHtml to a text file so you can see exactly what was downloaded. Maybe you're getting an error page instead of the original page.
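As a rough sketch of that check, the snippet below saves whatever HTML Agility Pack actually received so it can be opened in an editor or browser. The URL is the product page quoted elsewhere in this thread, and the output file name page-dump.html is just a placeholder chosen for illustration.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class DumpPage
{
    static void Main()
    {
        var web = new HtmlWeb();
        var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");

        // Report the HTTP status of the last request; anything other than OK
        // suggests an error or proxy page was returned instead of the product page.
        Console.WriteLine("HTTP status: " + web.StatusCode);

        // Save the markup exactly as it was parsed (placeholder file name).
        File.WriteAllText("page-dump.html", doc.DocumentNode.OuterHtml);
    }
}
```

If the saved file does not contain an element with id="prodprice", the XPath query is not the problem; the page content is.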
Answer 2:
If I run this code:
var web = new HtmlWeb();
var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");
var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
Console.WriteLine("price=" + priceContent.InnerHtml);
It outputs:
price=529.99
So it seems to be working. You can also use "//span[@id=\"prodprice\"]", which is better as it only matches SPAN elements and skips every other tag.
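To try the query without touching the network, one option is to parse a small HTML string and guard against a null result. This is only a sketch: the markup below is a made-up fragment, not the real page, and it assumes the price sits in a span with id="prodprice".

```csharp
using System;
using HtmlAgilityPack;

class XPathCheck
{
    static void Main()
    {
        // Hypothetical fragment standing in for the real product page.
        const string html =
            "<html><body><div><span id=\"prodprice\">529.99</span></div></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var priceContent = doc.DocumentNode.SelectSingleNode("//span[@id=\"prodprice\"]");

        // SelectSingleNode returns null when nothing matches, so check before use.
        if (priceContent == null)
        {
            Console.WriteLine("prodprice element not found");
        }
        else
        {
            Console.WriteLine("price=" + priceContent.InnerHtml);
        }
    }
}
```

If the query matches here but returns null against the live site, the difference is in the downloaded page rather than in the XPath expression.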
Source: https://stackoverflow.com/questions/5980670/xpath-query-problem-using-html-agility-pack