Question
I have HTML like this:
<li class="in-ttl-b">(a) kanji; a Chinese character [ideograph]
<ul class="list-data-b-in"><li class="text-jejp text-c"><span class="ex">漢字で書く</span></li><li class="text-jeen text-c">write in <i>kanji</i> [<i>Chinese characters</i>]</li></ul>
<ul class="list-data-b-in"><li class="text-jejp text-c"><span class="ex">常用漢字</span></li><li class="text-jeen text-c"><i>Chinese characters</i> for everyday use (in Japan)</li></ul>
</li>
How can I get only the text "kanji; a Chinese character [ideograph]"?
Answer 1:
You can get that by selecting the first text node that is a direct child of the outer li element. For example, assuming there can be more than one li element with class="in-ttl-b":
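Dim lis = HTMLDoc.DocumentNode.SelectNodes("//li[@class='in-ttl-b']")
For Each li As HtmlNode In lis
    ' Select the first text node that is a direct child of this <li>:
    Dim txt = li.SelectSingleNode("text()[1]")
    ' Print only that text node; li.InnerText would also include the nested <ul> text.
    Console.WriteLine(txt.InnerText.Trim())
Next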
Output:
(a) kanji; a Chinese character [ideograph]
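For anyone who wants to try this outside the original project, here is a minimal self-contained sketch of the same idea. It assumes the HtmlAgilityPack NuGet package is referenced. The module name FirstTextNodeDemo and the shortened sample markup (wrapped in an outer ul) are made up for illustration and are not from the question. It also guards against SelectNodes and SelectSingleNode returning Nothing when there is no match, which the snippet above does not do.

```vbnet
Imports System
Imports HtmlAgilityPack

' Hypothetical demo module; the name is an assumption for illustration.
Module FirstTextNodeDemo
    Sub Main()
        ' Sample markup, shortened from the question and wrapped in a <ul>.
        Dim html As String =
            "<ul><li class=""in-ttl-b"">(a) kanji; a Chinese character [ideograph]" &
            "<ul class=""list-data-b-in""><li class=""text-jejp text-c"">" &
            "<span class=""ex"">漢字で書く</span></li></ul></li></ul>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        Dim items = doc.DocumentNode.SelectNodes("//li[@class='in-ttl-b']")
        If items Is Nothing Then Return

        For Each li As HtmlNode In items
            ' text()[1] is the first text node that is a direct child of this <li>;
            ' text inside the nested <ul> is not a direct child, so it is skipped.
            Dim txt = li.SelectSingleNode("text()[1]")
            If txt IsNot Nothing Then
                Console.WriteLine(txt.InnerText.Trim())
            End If
        Next
    End Sub
End Module
```

The relative XPath (no leading slashes) matters here: "text()[1]" is evaluated from the current li, whereas "//text()[1]" would search from the document root.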
Source: https://stackoverflow.com/questions/39569210/retrieve-parts-of-text-inside-li