htmlagilitypack: Find second table within a div

Posted by 别来无恙 on 2019-12-13 01:16:49

Question


I'm trying to parse information from a div that has 3 tables within it. I can get information from the first one without problem.

My code so far is as follows:

HtmlAgilityPack.HtmlWeb doc = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument htmldocObject = doc.Load(URL);
var res = htmldocObject.DocumentNode.SelectSingleNode("//div[@class='BoxContent']");

var firstTable = res.SelectSingleNode("//table");
var charName = firstTable.ChildNodes[i++].InnerText.Substring(5).Trim();

<div class="BoxContent">
    <table>
        <tr bgcolor=#505050>
            <td colspan=2 class=white>
            <b>I'm getting this text</b>
            </td>
        </tr>
        <tr bgcolor=#F1E0C6>
            <td>I get this too</td>
            <td>I'm getting this as well</td>
        </tr>
    </table>
    <table>
        <tr>
            <td>Trying to retrieve this</td>
        </tr>
    </table>
</div>

How can I get the information in the second table with HAP?

I've read a bit about the NextSibling property, but I can't get it to work.


Answer 1:


// Use a path relative to res; a leading "//" would search from the document root.
var secondTable = res.SelectSingleNode("./table[2]");
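
A note on why the path matters: in XPath, an expression that starts with // is evaluated from the document root even when it is called on a child node, so a relative path such as ./table[2] is the safer way to say "the second table directly inside this div". The sketch below is a minimal, self-contained illustration under a few assumptions: the markup is loaded from a string instead of a URL, and the names Demo, SampleHtml, LoadBox, SecondTableByIndex and SecondTableBySibling are hypothetical helpers, not part of HtmlAgilityPack. It also shows one likely reason NextSibling appears not to work: the node right after the first table is usually a whitespace text node, so it has to be skipped.

```csharp
using System;
using HtmlAgilityPack;

public static class Demo
{
    // Sample markup mirroring the question (assumption: loaded from a string, not a URL).
    public const string SampleHtml = @"<div class='BoxContent'>
  <table><tr><td>I get this too</td></tr></table>
  <table><tr><td>Trying to retrieve this</td></tr></table>
</div>";

    // Parses the markup and returns the BoxContent div (null if it is absent).
    static HtmlNode LoadBox(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode("//div[@class='BoxContent']");
    }

    // Relative XPath: the second <table> that is a direct child of the div.
    public static string SecondTableByIndex(string html)
    {
        var res = LoadBox(html);
        var second = res?.SelectSingleNode("./table[2]");
        return second?.InnerText.Trim();
    }

    // Sibling walk: skip text nodes (whitespace) until the next <table> element.
    public static string SecondTableBySibling(string html)
    {
        var res = LoadBox(html);
        var node = res?.SelectSingleNode("./table")?.NextSibling;
        while (node != null && node.Name != "table")
        {
            node = node.NextSibling;
        }
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        Console.WriteLine(SecondTableByIndex(SampleHtml));   // Trying to retrieve this
        Console.WriteLine(SecondTableBySibling(SampleHtml)); // Trying to retrieve this
    }
}
```

If the tables were nested deeper inside the div rather than being direct children, (.//table)[2] would pick the second descendant table in document order instead.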



Answer 2:


You can iterate over the collection of tables inside the div this way:

// SelectNodes returns null when nothing matches, so check before looping.
var tables = res.SelectNodes("./table");
if (tables != null)
{
  foreach (HtmlNode table in tables)
  {
    var charName = table.InnerText.Substring(5).Trim();
  }
}



Answer 3:


You could try going directly for the <td> tags instead by changing your XPath string:

HtmlNodeCollection tdNodeCollection = htmldocObject
                                     .DocumentNode
                                     .SelectNodes("//div[@class = 'BoxContent']//td");

foreach (HtmlNode tdNode in tdNodeCollection)
{
     Console.WriteLine(tdNode.InnerText);
}


Source: https://stackoverflow.com/questions/11675630/htmlagilitypack-find-second-table-within-a-div
