select an xml element, ignore element name, print newline

Question

I'd like to select the first element, but ignore its name in the output.

This is what I'm getting, after requesting the first url element from each input xml file:

% xmllint \
 --xpath '(//yandexsearch/response/results/grouping/group/doc/url)[1]' \
 *.response.ya.xml
<url>https://example.com/</url><url>https://example.net/</url><url>https://example.org/</url>

But this is what I want instead:

https://example.com/
https://example.net/
https://example.org/

Note that the idea is to select the value of the first <url> element from each input Yandex.XML file (Я Feel Lucky).

How do I do that with xpath?

Answer 1:

I ended up using awk to strip <url> and </url> and print the text of each element on a separate line, skipping the empty fields between adjacent tags:

xmllint \
 --xpath '(//yandexsearch/response/results/grouping/group/doc/url)[1]' \
 *.response.ya.xml \
 | awk -F'</?url>' '{for(i=2;i<=NF;i++) if ($i != "") print $i}'
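The awk step can be tried on its own, without xmllint or any Yandex.XML files. The sketch below wraps it in a function and feeds it a line shaped like the xmllint output shown in the question; the function name strip_url_tags is made up for this illustration.

```shell
# Hypothetical helper: split each input line on <url> or </url> and print
# every non-empty field on its own line.
strip_url_tags() {
  awk -F'</?url>' '{ for (i = 2; i <= NF; i++) if ($i != "") print $i }'
}

# Sample input mimicking the concatenated xmllint output from the question.
printf '%s\n' '<url>https://example.com/</url><url>https://example.net/</url>' \
  | strip_url_tags
```

With that separator, the text between adjacent tags (</url><url>) becomes an empty field, which is why the empty-field check is needed.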

Answer 2:

Try instead:

(//yandexsearch/response/results/grouping/group/doc)[1]/url/text()

XPath 1.0 normally only selects nodes, so you would do the concatenation in the code surrounding the XPath extraction.

That said, XPath 2.0 can do the joining itself, if it is available to you (xmllint is built on libxml2, which only implements XPath 1.0):

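string-join((//yandexsearch/response/results/grouping/group/doc)[1]/url/text(), codepoints-to-string(10))

XPath string literals have no backslash escapes, so the newline separator is written as codepoints-to-string(10).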

Also, this answer provides a couple of XSLT-based solutions.
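The "do it in the surrounding code" route can be sketched with plain XPath 1.0: string() returns the text of the first node in a node-set, so xmllint prints no tags, and a shell loop supplies one line per file. This is a minimal sketch, assuming an xmllint build that has the --xpath option; the function name first_urls is made up for this illustration.

```shell
# Hypothetical helper: print the text of the first <url> in each file,
# one result per line.
first_urls() {
  for f in "$@"; do
    # string(...) yields the string value of the first selected node.
    # Command substitution strips trailing newlines; printf adds exactly one.
    printf '%s\n' "$(xmllint --xpath \
      'string((//yandexsearch/response/results/grouping/group/doc/url)[1])' "$f")"
  done
}

# Usage (file names as in the question):
#   first_urls *.response.ya.xml
```

Running xmllint once per file is slower than a single invocation, but it keeps each result separate without any post-processing of the markup.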

Source: https://stackoverflow.com/questions/21053097/select-an-xml-element-ignore-element-name-print-newline

Tags

xml

xpath

xmllint

yandex-api
