Html Agility Pack C# paragraph parsing problem

广开言路 2021-01-23 23:10

I am having a couple of issues with my code. I am trying to pull every paragraph from a page, but at the moment it only selects the last paragraph.

Here is my code.

2 Answers
  • 2021-01-23 23:13

    In your loop you take the current node's InnerText and assign it to the label. You do this for each node, so each assignment overwrites the previous one and you only ever see the last paragraph.

    Try this:

    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@id='body']/p"))
    {
      string text = node.InnerText;
      lblTest2.Text += text + Environment.NewLine;
    }
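
    SelectNodes returns null rather than an empty collection when the XPath matches nothing, so the loop above throws on a page without a matching div. A minimal sketch of a null-tolerant variant follows; the class and method names (ParagraphExtractor, GetParagraphs) are hypothetical and not part of Html Agility Pack, and the HTML is assumed to be available as a string.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using HtmlAgilityPack;

    public static class ParagraphExtractor
    {
        // Hypothetical helper: text of every <p> that is a direct child of <div id="body">.
        public static List<string> GetParagraphs(string html)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // SelectNodes yields null (not an empty collection) when nothing matches.
            HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@id='body']/p");
            if (nodes == null)
            {
                return new List<string>();
            }

            // Decode entities such as &amp; and trim surrounding whitespace.
            return nodes
                .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
                .ToList();
        }
    }
    ```

    The label can then be set once, for example with lblTest2.Text = string.Join(Environment.NewLine, ParagraphExtractor.GetParagraphs(html)); which avoids repeated string concatenation inside the loop.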
    
  • 2021-01-23 23:22

    IMO, XPath is no fun. I'd recommend using LINQ syntax instead:

    foreach (var node in doc.DocumentNode
        .DescendantNodes()
        .Single(x => x.Id == "body")
        .DescendantNodes()
        .Where(x => x.Name == "p"))
    {
        string text = node.InnerText;
        // Append rather than assign, so earlier paragraphs are not overwritten.
        lblTest2.Text += text + Environment.NewLine;
    }
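
    Single throws an InvalidOperationException when zero elements, or more than one, have the id "body". A hedged sketch of a more forgiving LINQ version is shown here; LinqParagraphs and JoinParagraphs are hypothetical names used only for illustration, and the HTML is assumed to be available as a string.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using HtmlAgilityPack;

    public static class LinqParagraphs
    {
        // Hypothetical helper: joins the text of all <p> descendants of the element with id="body".
        public static string JoinParagraphs(string html)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // FirstOrDefault returns null instead of throwing when no element matches.
            HtmlNode body = doc.DocumentNode
                .Descendants()
                .FirstOrDefault(n => n.GetAttributeValue("id", "") == "body");
            if (body == null)
            {
                return string.Empty;
            }

            // Descendants("p") matches <p> at any depth, in document order.
            IEnumerable<string> paragraphs = body
                .Descendants("p")
                .Select(p => p.InnerText.Trim());

            return string.Join(Environment.NewLine, paragraphs);
        }
    }
    ```

    Note that Descendants("p") also picks up paragraphs nested deeper inside the div, whereas the XPath //div[@id='body']/p matches direct children only. If only direct children are wanted, body.Elements("p") is the closer equivalent.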
    