Question
I'd like to select the first element, but ignore its name in the output.
This is what I'm getting, after requesting the first url
element from each input xml file:
% xmllint \
--xpath '(//yandexsearch/response/results/grouping/group/doc/url)[1]' \
*.response.ya.xml
<url>https://example.com/</url><url>https://example.net/</url><url>https://example.org/</url>
But this is what I want instead:
https://example.com/
https://example.net/
https://example.org/
Note that the idea is to select the text value of the first <url>
element from each input Yandex.XML file (Я Feel Lucky).
How do I do that with xpath?
Answer 1:
I ended up using awk to remove <url> and </url>, and to print the text from each element on a separate line, ignoring all the empty fields:
xmllint \
--xpath '(//yandexsearch/response/results/grouping/group/doc/url)[1]' \
*.response.ya.xml \
| awk -F'</?url>' '{for(i=2;i<=NF;i++) if ($i != "") print $i}'
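To see how the awk field splitting behaves without needing the XML files, a small sketch can feed it the kind of concatenated output shown in the question. The function name strip_url_tags is made up for this illustration; the awk program itself is the one from the answer.

```shell
#!/bin/sh
# strip_url_tags is a hypothetical helper name wrapping the awk command above.
# -F'</?url>' makes both <url> and </url> act as field separators, so the
# element text ends up in the fields, with empty fields between the tags.
strip_url_tags() {
    awk -F'</?url>' '{ for (i = 2; i <= NF; i++) if ($i != "") print $i }'
}

# Demo input: two elements on one line, as xmllint prints them.
printf '%s\n' '<url>https://example.com/</url><url>https://example.net/</url>' \
    | strip_url_tags
```

The loop starts at field 2 because the line begins with a separator, which makes field 1 empty. The empty-string check skips the empty fields that appear between a closing tag and the next opening tag.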
Answer 2:
Try instead:
(//yandexsearch/response/results/grouping/group/doc)[1]/url/text()
XPath normally only selects nodes, and you would do concatenation in the code surrounding the xpath extraction.
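That being said, XPath 2.0 can do the joining itself, if that's available to you (xmllint is built on libxml2, which implements only XPath 1.0):
string-join((//yandexsearch/response/results/grouping/group/doc)[1]/url/text(), codepoints-to-string(10))
XPath string literals have no backslash escapes, so the newline is written as codepoints-to-string(10) rather than '\n'.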
Also, this answer provides a couple of XSLT-based solutions.
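The "concatenation in the surrounding code" mentioned above could be sketched in shell by running xmllint once per file and wrapping the expression in the XPath 1.0 string() function, which returns the string value of the first node in the set, without any markup. This is only a sketch: print_first_urls is a made-up name, and it assumes an xmllint build that supports --xpath.

```shell
#!/bin/sh
# print_first_urls is a hypothetical helper: one xmllint call per input file.
# string(...) converts the selected node to its text, so no <url> tags appear.
# The command substitution strips any trailing newline xmllint may emit, and
# printf then writes exactly one newline per file.
print_first_urls() {
    for f in "$@"; do
        printf '%s\n' "$(xmllint --xpath \
            'string((//yandexsearch/response/results/grouping/group/doc/url)[1])' \
            "$f")"
    done
}

# Usage (file pattern taken from the question):
#   print_first_urls *.response.ya.xml
```

Running one process per file is slower than a single xmllint call over all files, but it keeps the results separated without any post-processing of tags.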
Source: https://stackoverflow.com/questions/21053097/select-an-xml-element-ignore-element-name-print-newline