Question
I have an HTML document and I'm selecting elements based on a class. Once I have them, I loop over each element and select further elements inside it:
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(content);
var rows = doc.DocumentNode.SelectNodes("//tr[contains(@class, 'row')]");
foreach (var row in rows)
{
    var name = row.SelectSingleNode("//span[contains(@class, 'name')]").InnerText;
    var surname = row.SelectSingleNode("//span[contains(@class, 'surname')]").InnerText;
    customers.Add(new Customer(name, surname));
}
However, although the loop iterates over every row, it always retrieves the text from the first row.
Is the XPath wrong?
Answer 1:
This is a FAQ in XPath. Whenever an XPath expression starts with /, it ignores the context element (the element referenced by the row variable in this case) and searches for matching elements starting from the document root. That is why SelectSingleNode() always returns the same element: the first match in the entire document.
You only need to prepend a dot (.) to make the expression relative to the current context element:
foreach (var row in rows)
{
    // The leading "." makes the search start at the current row
    var name = row.SelectSingleNode(".//span[contains(@class, 'name')]").InnerText;
    var surname = row.SelectSingleNode(".//span[contains(@class, 'surname')]").InnerText;
    customers.Add(new Customer(name, surname));
}
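For anyone who wants to try the relative-path behaviour in isolation, here is a minimal, self-contained sketch. The Customer record, the ParseCustomers helper, and the sample HTML are assumptions made up for illustration, since the original post does not show them. The null checks are there because, as far as I know, HtmlAgilityPack's SelectNodes and SelectSingleNode return null when nothing matches. Also note that contains(@class, 'name') is a substring test, so it would match class='surname' as well; it picks the right span here only because the name span comes first in each row.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical type: the original post does not show how Customer is defined.
public record Customer(string Name, string Surname);

public static class Program
{
    // Hypothetical helper that wraps the loop from the answer above.
    public static List<Customer> ParseCustomers(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var customers = new List<Customer>();

        // Absolute path is fine here: the context is the whole document.
        var rows = doc.DocumentNode.SelectNodes("//tr[contains(@class, 'row')]");
        if (rows == null)
        {
            return customers; // no matching rows in the document
        }

        foreach (var row in rows)
        {
            // ".//" searches descendants of the current row only.
            var name = row.SelectSingleNode(".//span[contains(@class, 'name')]");
            var surname = row.SelectSingleNode(".//span[contains(@class, 'surname')]");

            if (name == null || surname == null)
            {
                continue; // skip rows that lack one of the two spans
            }

            customers.Add(new Customer(name.InnerText.Trim(), surname.InnerText.Trim()));
        }

        return customers;
    }

    public static void Main()
    {
        // Made-up sample markup with two rows.
        const string html = @"<table>
  <tr class='row'><td><span class='name'>Ada</span> <span class='surname'>Lovelace</span></td></tr>
  <tr class='row'><td><span class='name'>Alan</span> <span class='surname'>Turing</span></td></tr>
</table>";

        foreach (var customer in ParseCustomers(html))
        {
            Console.WriteLine($"{customer.Name} {customer.Surname}");
        }
    }
}
```

Changing ".//span" back to "//span" in this sketch should reproduce the symptom from the question, with every iteration reading from the first row.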
Answer 2:
What about using LINQ? The XPath expressions still need the leading dot so that they stay relative to each row:
var customers = rows
    .Select(row => new Customer(
        row.SelectSingleNode(".//span[contains(@class, 'name')]").InnerText,
        row.SelectSingleNode(".//span[contains(@class, 'surname')]").InnerText))
    .ToList();
Source: https://stackoverflow.com/questions/39675240/foreach-not-iterating-through-elements