Retrieve parts of text inside <li>

Question

I have HTML like this:

<li class="in-ttl-b">(a) kanji; a Chinese character [ideograph]
    <ul class="list-data-b-in"><li class="text-jejp text-c"><span class="ex">漢字で書く</span></li><li class="text-jeen text-c">write in <i>kanji</i> [<i>Chinese characters</i>]</li></ul>
    <ul class="list-data-b-in"><li class="text-jejp text-c"><span class="ex">常用漢字</span></li><li class="text-jeen text-c"><i>Chinese characters</i> for everyday use (in Japan)</li></ul>
</li>

How can I get only the text "kanji; a Chinese character [ideograph]"?

Answer 1:

You can get that by selecting the first text node that is a direct child of the outer li element. For example, assuming there can be more than one li with class="in-ttl-b":

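Dim lis = HTMLDoc.DocumentNode.SelectNodes("//li[@class='in-ttl-b']")
For Each li As HtmlNode In lis
    'select the first text node that is a direct child of <li>:
    Dim txt = li.SelectSingleNode("text()[1]")
    'print only that text node (not li.InnerText, which includes the nested <ul> text)
    Console.WriteLine(txt.InnerText.Trim())
Next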

dotnetfiddle demo

Output:

(a) kanji; a Chinese character [ideograph]
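
For anyone who wants to try the approach outside a fiddle, here is a minimal, self-contained sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the module name Program and the shortened sample markup are illustrative and not taken from the original answer. It also guards against SelectNodes returning Nothing, which HtmlAgilityPack does when no node matches.

```vbnet
Imports System
Imports HtmlAgilityPack

Module Program
    Sub Main()
        ' Sample markup from the question, shortened to a single nested <ul> (illustrative).
        Dim html As String =
            "<li class=""in-ttl-b"">(a) kanji; a Chinese character [ideograph]" &
            "<ul class=""list-data-b-in""><li class=""text-jejp text-c"">" &
            "<span class=""ex"">漢字で書く</span></li></ul></li>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        Dim lis = doc.DocumentNode.SelectNodes("//li[@class='in-ttl-b']")
        If lis Is Nothing Then Return

        For Each li As HtmlNode In lis
            ' text()[1] is the first text node that is a direct child of this <li>.
            Dim txt = li.SelectSingleNode("text()[1]")
            If txt IsNot Nothing Then
                ' DeEntitize decodes HTML entities; Trim drops surrounding whitespace.
                Console.WriteLine(HtmlEntity.DeEntitize(txt.InnerText).Trim())
            End If
        Next
    End Sub
End Module
```

Because the XPath is relative to each li, the text inside the nested ul elements is never selected. With the sample markup this should print the same line as the output shown above.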

Source: https://stackoverflow.com/questions/39569210/retrieve-parts-of-text-inside-li

Tags

html

vb.net

html-agility-pack
