How to extract text outside tags in XML

Submitted by 一笑奈何 on 2019-12-30 20:38:10

Question


I want to extract text outside tags. For example,

<body>
    This is an example
    <p>
        blablabla
    </p>
    <references>
        refer 1
        refer 2
    </references>
</body>

I want to get only the text "This is an example", without the text inside the other tags (p or references). I tried several methods, but none of them worked. Can anyone help? Many thanks.


Answer 1:


Think of the text inside an element as a node in its own right. A text node is selected with the node test text(). For example, given:

<body>
    This is an example
    <p>
        blablabla
    </p>
    <references>
        refer 1
        refer 2
    </references>
    another example
</body>

the XPath:

"/body/text()"

will retrieve all the text nodes that are direct children of body, such as "This is an example" and "another example" (each with its surrounding whitespace; depending on the parser, a whitespace-only text node between child elements may also be returned), while:

"/body/text()[1]"

will retrieve only the first one, "This is an example". If you want all the descendant text nodes you can use:

"/body//text()"

or, if you want all the text nodes inside the first p:

"/body/p[1]//text()"
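
For readers working in Python, here is a minimal sketch of the same four selections using only the standard library. It rests on one assumption worth stating: xml.etree.ElementTree implements only a subset of XPath and does not support the text() node test, so the sketch reads the .text and .tail attributes instead. The helper name child_text_nodes is hypothetical, and whitespace-only nodes are dropped, so the results only approximate what a full XPath engine would return.

```python
import xml.etree.ElementTree as ET

# A copy of the sample document from this answer.
XML = """<body>
    This is an example
    <p>
        blablabla
    </p>
    <references>
        refer 1
        refer 2
    </references>
    another example
</body>"""


def child_text_nodes(elem):
    """Return the direct-child text of elem, roughly like elem/text().

    ElementTree stores the text before the first child in elem.text and
    the text following each child in that child's .tail attribute.
    """
    nodes = [elem.text] + [child.tail for child in elem]
    # Trim surrounding whitespace and drop empty or whitespace-only nodes.
    return [t.strip() for t in nodes if t and t.strip()]


body = ET.fromstring(XML)

# Roughly /body/text()
direct = child_text_nodes(body)

# Roughly /body/text()[1]
first = direct[0]

# Roughly /body//text(): itertext() walks every descendant in document order.
descendants = [t.strip() for t in body.itertext() if t.strip()]

# Roughly /body/p[1]//text(): find("p") returns the first p child.
first_p = [t.strip() for t in body.find("p").itertext() if t.strip()]

print(direct)
print(first)
print(first_p)
```

If a full XPath 1.0 engine is available in your environment, the expressions above can be used directly instead of this workaround.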



Answer 2:


Use this XPath: /body/text(). It selects the text node "This is an example".
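
As a rough illustration of what that expression does on the question's document, the sketch below uses Python's standard xml.dom.minidom module and keeps only the child nodes of body whose node type is a text node. The function name text_outside_child_tags is hypothetical, and trimming whitespace and discarding whitespace-only nodes is an added assumption about what the asker wants.

```python
from xml.dom import minidom

# The document from the question.
QUESTION_XML = """<body>
    This is an example
    <p>
        blablabla
    </p>
    <references>
        refer 1
        refer 2
    </references>
</body>"""


def text_outside_child_tags(xml_string):
    """Return the trimmed text nodes that are direct children of the root."""
    root = minidom.parseString(xml_string).documentElement
    pieces = [
        node.data.strip()
        for node in root.childNodes
        if node.nodeType == node.TEXT_NODE  # skip element nodes such as p
    ]
    # Drop the whitespace-only nodes that sit between child elements.
    return [piece for piece in pieces if piece]


result = text_outside_child_tags(QUESTION_XML)
print(result)
```

Text inside p and references is skipped because those are element nodes; their own text lives one level deeper in the tree.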



Source: https://stackoverflow.com/questions/6871273/how-to-extract-text-outside-tags-xml
