My program uses HtmlAgilityPack to grab an HTML web page and store it in a variable. I'm trying to get two tables from the HTML, which are under specific div class tags (board
Try:
    foreach (HtmlNode table in
        htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table"))
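This is an XPath expression that matches on the class attribute: it selects every table that is a direct child of a div whose class is exactly 'boardcontainer'. See here for more info: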
http://www.exampledepot.com/egs/org.w3c.dom/xpath_getelembyattr.html
The following XPath expression selects the tables directly under a specific div
(the one with the class 'boardcontainer') in your HTML document:
    //div[@class='boardcontainer']/table
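Note that @class='boardcontainer' compares the whole attribute value, so a div written as class="boardcontainer wide" would not match. Here is a minimal sketch of a token-based match, assuming the page may put several classes on the div. The sample markup and the FindBoardTables helper are made up for illustration and are not from the question:

```csharp
using System;
using HtmlAgilityPack;

static class XPathClassMatch
{
    // Matches a div whose class list contains the token 'boardcontainer',
    // even when other classes are present on the same element.
    const string TokenMatch =
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' boardcontainer ')]/table";

    // Hypothetical helper: returns the matching tables, or null when none match.
    public static HtmlNodeCollection FindBoardTables(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectNodes(TokenMatch);
    }

    static void Main()
    {
        // Made-up sample markup, for illustration only.
        string html =
            "<div class='boardcontainer wide'><table id='t1'><tr><td>a</td></tr></table></div>";
        HtmlNodeCollection tables = FindBoardTables(html);
        Console.WriteLine("Tables found: " + (tables == null ? 0 : tables.Count));
    }
}
```

If the div only ever carries the single class, the simpler @class='boardcontainer' form is enough.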
To handle empty rows, simply check whether or not the returned HtmlNodeCollection is null.
Here is a complete example:
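    // Requires: using System; using HtmlAgilityPack;
    // 'html' is the string holding the downloaded page.
    HtmlDocument htmlDoc = new HtmlDocument();
    htmlDoc.LoadHtml(html);

    // Every table that is a direct child of <div class="boardcontainer">.
    foreach (HtmlNode table in htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table"))
    {
        Console.WriteLine("Found: " + table.Id);

        // Rows that are direct children of the table.
        foreach (HtmlNode row in table.SelectNodes("tr"))
        {
            Console.WriteLine("row");

            // Header and data cells; SelectNodes returns null for an empty row.
            HtmlNodeCollection cells = row.SelectNodes("th|td");
            if (cells == null)
            {
                continue;
            }

            foreach (HtmlNode cell in cells)
            {
                Console.WriteLine("cell: " + cell.InnerText);
            }
        }
    }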
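You should also check that a table was found and that each found table contains rows at all: SelectNodes returns null when nothing matches, so the two outer foreach loops throw a NullReferenceException in that case.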
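Those extra checks could look roughly like the sketch below. It assumes the same 'boardcontainer' markup as the question. The ReadRows helper, its return type and the sample HTML are hypothetical and only serve the illustration:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class BoardTableReader
{
    // Hypothetical helper: collects the cell texts of every non-empty row
    // of every table found under <div class="boardcontainer">.
    public static List<List<string>> ReadRows(string html)
    {
        var result = new List<List<string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection tables =
            doc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table");
        if (tables == null)
        {
            return result; // no matching table in the document
        }

        foreach (HtmlNode table in tables)
        {
            HtmlNodeCollection rows = table.SelectNodes("tr");
            if (rows == null)
            {
                continue; // table has no direct tr children
            }

            foreach (HtmlNode row in rows)
            {
                HtmlNodeCollection cells = row.SelectNodes("th|td");
                if (cells == null)
                {
                    continue; // skip rows without cells
                }

                var texts = new List<string>();
                foreach (HtmlNode cell in cells)
                {
                    texts.Add(cell.InnerText);
                }
                result.Add(texts);
            }
        }
        return result;
    }

    static void Main()
    {
        // Made-up sample markup, for illustration only.
        string html =
            "<div class='boardcontainer'><table><tr><th>h</th></tr><tr></tr><tr><td>a</td><td>b</td></tr></table></div>";
        foreach (List<string> row in ReadRows(html))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

One caveat: the relative path "tr" only matches rows that are direct children of the table. If the page wraps its rows in tbody, ".//tr" is one alternative, though it also picks up the rows of any nested tables.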