HtmlAgilityPack - Grab data from html table

别跟我提以往 2021-01-22 20:55

My program uses HtmlAgilityPack to grab an HTML web page and store it in a variable. I'm trying to extract two tables from that HTML, both of which sit under specific div class tags (board

2 answers
  • 2021-01-22 21:26

    Try:

    foreach (HtmlNode table in
             htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table"))
    {
        // Work with each matched table here.
    }

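    This is an XPath expression that matches on the attribute: it selects every table that is a direct child of a div whose class attribute is exactly 'boardcontainer'. See here for more info: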

    http://www.exampledepot.com/egs/org.w3c.dom/xpath_getelembyattr.html

  • 2021-01-22 21:34

    The following XPath selects the tables that are direct children of a specific div (the one with the class 'boardcontainer') in your HTML document:

    //div[@class='boardcontainer']/table
    

    To handle empty rows, check whether the HtmlNodeCollection returned by SelectNodes is null. SelectNodes returns null, not an empty collection, when nothing matches.

    Here is a complete example:

    HtmlDocument htmlDoc = new HtmlDocument();
    htmlDoc.LoadHtml(html);
    
    foreach (HtmlNode table in htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table"))
    {
      Console.WriteLine("Found: " + table.Id);
    
      foreach (HtmlNode row in table.SelectNodes("tr"))
      {
        Console.WriteLine("row");
    
        HtmlNodeCollection cells = row.SelectNodes("th|td");
    
        if (cells == null)
        {
          continue;
        }
    
        foreach (HtmlNode cell in cells)
        {                        
          Console.WriteLine("cell: " + cell.InnerText);
        }
      }
    } 
    

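    You should also check that a table was found and that the found table contains any rows: both SelectNodes calls in the foreach headers above return null when nothing matches, and iterating over null throws a NullReferenceException.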
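
    A minimal sketch of those extra checks might look like the following. The sample html string, the Program class, and the messages printed are made up for illustration; only the XPath and the HtmlAgilityPack calls come from the example above.

    using System;
    using HtmlAgilityPack;

    class Program
    {
        static void Main()
        {
            // Hypothetical sample input; replace with the page you downloaded.
            string html = "<div class='boardcontainer'><table id='t1'>" +
                          "<tr><th>Name</th><th>Score</th></tr>" +
                          "<tr></tr>" +
                          "<tr><td>Alice</td><td>10</td></tr>" +
                          "</table></div>";

            HtmlDocument htmlDoc = new HtmlDocument();
            htmlDoc.LoadHtml(html);

            // SelectNodes returns null when no node matches, so test before looping.
            HtmlNodeCollection tables =
                htmlDoc.DocumentNode.SelectNodes("//div[@class='boardcontainer']/table");
            if (tables == null)
            {
                Console.WriteLine("No matching table found.");
                return;
            }

            foreach (HtmlNode table in tables)
            {
                HtmlNodeCollection rows = table.SelectNodes("tr");
                if (rows == null)
                {
                    continue; // This table has no rows.
                }

                foreach (HtmlNode row in rows)
                {
                    HtmlNodeCollection cells = row.SelectNodes("th|td");
                    if (cells == null)
                    {
                        continue; // Empty row: no header or data cells.
                    }

                    foreach (HtmlNode cell in cells)
                    {
                        Console.WriteLine("cell: " + cell.InnerText);
                    }
                }
            }
        }
    }

    Storing each SelectNodes result in a variable before the loop keeps the null test and the iteration on the same collection, so the XPath query is not evaluated twice.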
