select an xml element, ignore element name, print newline

人走茶凉 submitted on 2019-12-07 15:32:39

Question


I'd like to select the first element, but ignore its name in the output.

This is what I'm getting, after requesting the first url element from each input xml file:

% xmllint \
 --xpath '(//yandexsearch/response/results/grouping/group/doc/url)[1]' \
 *.response.ya.xml
<url>https://example.com/</url><url>https://example.net/</url><url>https://example.org/</url>

But this is what I want instead:

https://example.com/
https://example.net/
https://example.org/

Note that the idea is to select the value of the first <url> element from each input Yandex.XML (Я Feel Lucky).

How do I do that with xpath?


Answer 1:


I ended up using awk to remove <url> and </url>, and print the text from each element on a separate line, ignoring all the empty lines:

xmllint \
 --xpath '(//yandexsearch/response/results/grouping/group/doc/url)[1]' \
 *.response.ya.xml \
| awk -F'</?url>' '{for(i=2;i<=NF;i++) if ($i != "") print $i}'
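To see what the awk filter does without running xmllint, here is a minimal sketch that feeds it the single-line output shown in the question. The function name `strip_url_tags` and the variables `sample` and `result` are hypothetical names introduced only for this illustration.

```shell
# Hypothetical helper wrapping the awk filter from the answer.
# The field separator is a regex matching either <url> or </url>.
strip_url_tags() {
  # Start at field 2: field 1 is whatever precedes the first tag (empty here).
  # Empty fields also appear between </url> and the next <url>; skip them.
  awk -F'</?url>' '{ for (i = 2; i <= NF; i++) if ($i != "") print $i }'
}

# Sample input copied from the question: one line, no newline between elements.
sample='<url>https://example.com/</url><url>https://example.net/</url><url>https://example.org/</url>'

result=$(printf '%s' "$sample" | strip_url_tags)
printf '%s\n' "$result"
```

Each URL comes out on its own line. Note that this is plain text splitting on the literal tag name, so it assumes the element carries no attributes and the text contains no nested markup.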



Answer 2:


Try instead:

(//yandexsearch/response/results/grouping/group/doc/url)[1]/text()

XPath normally only selects nodes, and you would do concatenation in the code surrounding the xpath extraction.

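That being said, XPath 2.0 can do the joining itself, if that's available to you (xmllint only implements XPath 1.0). XPath string literals have no backslash escapes, so the newline separator is written as codepoints-to-string(10) rather than '\n':

string-join((//yandexsearch/response/results/grouping/group/doc/url)[1]/text(), codepoints-to-string(10))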

Also, this answer provides a couple of XSLT-based solutions.



Source: https://stackoverflow.com/questions/21053097/select-an-xml-element-ignore-element-name-print-newline
