Question
I'm having an issue with my code. I'm trying to pull every paragraph from a page, but at the moment it only selects the last paragraph.
Here is my code:
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@id='body']/p"))
{
    string text = node.InnerText;
    lblTest2.Text = text;
}
Answer 1:
In your loop you take the current node's InnerText and assign it to the label. Each assignment overwrites the previous one, so after the loop finishes the label holds only the last paragraph. You need to append to the label's text instead of replacing it.
Try this:
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@id='body']/p"))
{
    string text = node.InnerText;
    lblTest2.Text += text + Environment.NewLine;
}
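If you'd rather not concatenate inside the loop, here is a rough sketch of the same idea that builds the string in one go. It also guards against SelectNodes returning null, which Html Agility Pack does when the XPath matches nothing. The class name ParagraphExtractor, the method JoinParagraphs, and the sample HTML are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class ParagraphExtractor
{
    // Hypothetical helper: returns the text of every <p> directly under
    // <div id="body">, one paragraph per line.
    public static string JoinParagraphs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection paragraphs =
            doc.DocumentNode.SelectNodes("//div[@id='body']/p");
        if (paragraphs == null)
        {
            return string.Empty;
        }

        // Join all paragraph texts at once instead of appending in a loop.
        return string.Join(Environment.NewLine,
                           paragraphs.Select(p => p.InnerText));
    }

    public static void Main()
    {
        // Sample markup, assumed for the example only.
        string html = "<div id='body'><p>First</p><p>Second</p></div>";
        Console.WriteLine(JoinParagraphs(html));
    }
}
```

In the original page code you would then assign the result once, e.g. lblTest2.Text = ParagraphExtractor.JoinParagraphs(html);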
Answer 2:
IMO, XPath is no fun. I'd recommend using LINQ syntax instead:
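// Requires: using System.Linq;
// Note: unlike the XPath "/p", this matches <p> elements at any depth under the div.
foreach (var node in doc.DocumentNode
    .DescendantNodes()
    .Single(x => x.Id == "body")
    .DescendantNodes()
    .Where(x => x.Name == "p"))
{
    string text = node.InnerText;
    // Append rather than assign, otherwise only the last paragraph is kept.
    lblTest2.Text += text + Environment.NewLine;
}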
Source: https://stackoverflow.com/questions/4752840/html-agility-pack-c-sharp-paragraph-parsing-problem