Question
I am trying to use HtmlAgilityPack to scrape child elements from a list of divs. The outermost div is //div[@class='cell in-area-cell middle-cell'], and if I simply iterate through the list I can display all the child content of each parent fine.
But I don't want to display all the content. I would like to pick certain div, p and a elements from each of the children, but the code below only gives me the first //a[@class='listing-name'] repeated. lstRecords has the correct number of entries, but they all have the same value.
Here is my code:
Model:
public class TempSearch
{
    public string listing_name { get; set; }
}
View:
@model List<tempsearch.Models.TempSearch>
@foreach (var ps in Model)
{
    <h4>@Html.Raw(ps.listing_name)</h4>
}
Controller:
public ActionResult TempSearch()
{
    string html = Server.MapPath("~/Content/tempsearch.html");
    HtmlWeb web = new HtmlWeb();
    HtmlDocument document = web.Load(html);
    List<TempSearch> lstRecords = new List<TempSearch>();
    foreach (HtmlNode node in document.DocumentNode.SelectNodes("//div[@class='cell in-area-cell middle-cell']"))
    {
        TempSearch tempSearch = new TempSearch();
        HtmlNode node2 = document.DocumentNode.SelectSingleNode("//a[@class='listing-name']");
        tempSearch.listing_name += node2.InnerHtml.Trim();
        lstRecords.Add(tempSearch);
    }
    return View(lstRecords);
}
I guess it has something to do with the way I'm populating the list?
Answer 1:
You want to use an XPath relative to the element currently referenced by the node variable, like this:
HtmlNode node2 = node.SelectSingleNode(".//a[@class='listing-name']");
Notice the . at the beginning of the XPath, which indicates that the XPath is relative to the current context element, and that SelectSingleNode() is called on the node variable to make node the current context element. Otherwise the search starts from the document root, so you'll get the same first matching element on every iteration.
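To make the difference concrete, here is a minimal sketch of the loop with the relative XPath applied. It assumes the HtmlAgilityPack NuGet package is installed. The class name RelativeXPathDemo, the helper ExtractListingNames and the inline sample markup are hypothetical, used only for illustration; the null checks are an addition that the original code does not have.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is referenced

public static class RelativeXPathDemo
{
    // Hypothetical helper: returns one listing name per matching parent div.
    public static List<string> ExtractListingNames(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var names = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection cells = document.DocumentNode.SelectNodes(
            "//div[@class='cell in-area-cell middle-cell']");
        if (cells == null)
        {
            return names;
        }

        foreach (HtmlNode cell in cells)
        {
            // The leading "." makes the query relative to this cell,
            // so only its own descendants are searched.
            HtmlNode link = cell.SelectSingleNode(".//a[@class='listing-name']");
            if (link != null)
            {
                names.Add(link.InnerHtml.Trim());
            }
        }

        return names;
    }

    public static void Main()
    {
        // Hypothetical sample markup standing in for tempsearch.html.
        const string html = @"
<div class='cell in-area-cell middle-cell'><a class='listing-name'>First</a></div>
<div class='cell in-area-cell middle-cell'><a class='listing-name'>Second</a></div>";

        foreach (string name in ExtractListingNames(html))
        {
            Console.WriteLine(name);
        }
    }
}
```

The same pattern should extend to the other child elements mentioned in the question: call SelectSingleNode (or SelectNodes) on the loop variable with an XPath that starts with ".//", once per field you want to fill on the model.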
Source: https://stackoverflow.com/questions/30960858/htmlagilitypack-select-individual-elements-from-a-list-of-divs