i am trying to parse an xml like
XBV
ElementTree's XPath support is limited. Use other library like lxml:
import lxml.etree
root = lxml.etree.parse('test.xml')
path="./pages/page/paragraph[text()='GHF']"
print root.xpath(path)
As @falsetru mentioned, ElementTree
doesn't support text()
predicate, but it supports matching child element by text, so in this example, it is possible to search for a page
that has a paragraph
with specific text, using the path ./pages/page[paragraph='GHF']
. The problem here is that there are multiple paragraph
tags in a page
, so one would have to iterate for the specific paragraph
. In my case, I needed to find the version
of a dependency
in a maven pom.xml, and there is only a single version
child so the following worked:
In [1]: import xml.etree.ElementTree as ET
In [2] ns = {"pom": "http://maven.apache.org/POM/4.0.0"}
In [3] print ET.parse("pom.xml").findall(".//pom:dependencies/pom:dependency[pom:artifactId='some-artifact-with-hardcoded-version']/pom:version", ns)[0].text
Out[1]: '1.2.3'