C#, Html Agility, Selecting every paragraph within a div tag

Submitted by 纵然是瞬间 on 2019-12-19 03:42:14

Question


How can I select every paragraph inside a div tag? For example:

<div id="body_text">
<p>Hi</p>
<p>Help Me Please</p>
<p>Thankyou</p>
</div>

I have downloaded Html Agility Pack and referenced it in my program. All I need is the paragraphs. The number of paragraphs varies, and the page contains many other div tags, but I only need the content inside body_text. I assume this can be stored as a string, which I then want to write to a .txt file for later reference. Thank you.


Answer 1:


A valid XPath expression for your case is //div[@id='body_text']/p, which selects the p elements that are direct children of the div whose id is body_text.

// Note: SelectNodes returns null when nothing matches, so check for null in real code.
foreach (HtmlNode node in yourHTMLAgilityPackDocument.DocumentNode.SelectNodes("//div[@id='body_text']/p"))
{
    string text = node.InnerText; // that's the text you are looking for
}
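
The question also asks about saving the text to a .txt file, which the loop above does not cover. The following is a minimal sketch of one way to do it, not a definitive implementation. The class name ParagraphExtractor, the method name ExtractParagraphs, and the output file name body_text.txt are hypothetical and chosen only for illustration. It assumes the HtmlAgilityPack package is referenced and that the HTML is already available as a string.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using HtmlAgilityPack;

public static class ParagraphExtractor // hypothetical helper, not part of HtmlAgilityPack
{
    // Returns the text of each <p> that is a direct child of <div id="body_text">.
    public static List<string> ExtractParagraphs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//div[@id='body_text']/p");
        if (nodes == null)
        {
            return new List<string>();
        }

        // Decode HTML entities such as &amp; and trim surrounding whitespace.
        return nodes
            .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
            .ToList();
    }

    public static void Main()
    {
        string html =
            "<div id=\"body_text\"><p>Hi</p><p>Help Me Please</p><p>Thankyou</p></div>";

        List<string> paragraphs = ExtractParagraphs(html);

        // One paragraph per line; the file name is an assumption.
        File.WriteAllLines("body_text.txt", paragraphs);
    }
}
```

Returning an empty list when the div is missing keeps the caller free of null checks. If a single string is preferred over one line per paragraph, string.Join(Environment.NewLine, paragraphs) combined with File.WriteAllText would serve the same purpose.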



Answer 2:


Here's a solution that grabs the paragraphs as an enumeration of HtmlNodes:

using System.Linq;       // required for Where
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load("your.html");
var div = doc.GetElementbyId("body_text"); // null if no element has this id
var paragraphs = div.ChildNodes.Where(item => item.Name == "p");

Without an explicit LINQ call:

var paragraphs = doc.GetElementbyId("body_text").Elements("p");  


Source: https://stackoverflow.com/questions/4737757/c-html-agility-selecting-every-paragraph-within-a-div-tag
