Question
Given this XML:
<Departments orgID="123" name="xmllist">
  <Department>
    <orgID>124</orgID>
    <name>A</name>
    <type>type a</type>
    <status>Active</status>
    <Department>
      <orgID>125</orgID>
      <name>B</name>
      <type>type b</type>
      <status>Active</status>
      <Department>
        <orgID>126</orgID>
        <name>C</name>
        <type>type c</type>
        <status>Active</status>
      </Department>
    </Department>
  </Department>
  <Department>
    <orgID>109449</orgID>
    <name>D</name>
    <type>type d</type>
    <status>Active</status>
  </Department>
</Departments>
How can I get all the parents of a node using lxml etree in Python?
Expected output: for the input orgID=126, it should return all the parents, like this:
{'A':124,'B':125,'C':126}
Answer 1:
Using lxml and XPath:
>>> s = '''
... <Departments orgID="123" name="xmllist">
... <Department>
... <orgID>124</orgID>
... <name>A</name>
... <type>type a</type>
... <status>Active</status>
... <Department>
... <orgID>125</orgID>
... <name>B</name>
... <type>type b</type>
... <status>Active</status>
... <Department>
... <orgID>126</orgID>
... <name>C</name>
... <type>type c</type>
... <status>Active</status>
... </Department>
... </Department>
... </Department>
... <Department>
... <orgID>109449</orgID>
... <name>D</name>
... <type>type d</type>
... <status>Active</status>
... </Department>
... </Departments>
... '''
Using the ancestor-or-self axis, you can find the node itself, its parent, its grandparent, and so on:
>>> import lxml.etree as ET
>>> root = ET.fromstring(s)
>>> for target in root.xpath('.//Department/orgID[text()="126"]'):
...     d = {
...         dept.find('name').text: int(dept.find('orgID').text)
...         for dept in target.xpath('ancestor-or-self::Department')
...     }
...     print(d)
...
{'A': 124, 'C': 126, 'B': 125}
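The lookup above can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original answer: the function name ancestor_departments and the shortened sample document are made up for illustration, and the org ID is passed as an XPath variable ($oid) instead of being spliced into the expression. It returns a list of pairs so the root-to-target order is explicit.

```python
import lxml.etree as ET

# Shortened version of the document from the question (type/status omitted).
SAMPLE = """
<Departments orgID="123" name="xmllist">
  <Department>
    <orgID>124</orgID><name>A</name>
    <Department>
      <orgID>125</orgID><name>B</name>
      <Department>
        <orgID>126</orgID><name>C</name>
      </Department>
    </Department>
  </Department>
  <Department>
    <orgID>109449</orgID><name>D</name>
  </Department>
</Departments>
"""


def ancestor_departments(root, org_id):
    """Return (name, orgID) pairs from the outermost Department down to the match."""
    # $oid is an XPath variable; lxml fills it from the keyword argument.
    matches = root.xpath('.//Department[orgID=$oid]', oid=str(org_id))
    if not matches:
        return []
    # XPath results come back in document order, so outer departments come first.
    chain = matches[0].xpath('ancestor-or-self::Department')
    return [(d.findtext('name'), int(d.findtext('orgID'))) for d in chain]


root = ET.fromstring(SAMPLE)
print(ancestor_departments(root, 126))
print(ancestor_departments(root, 109449))
```

An unknown ID simply yields an empty list, so callers do not need to handle an exception for the "not found" case.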
Answer 2:
Use lxml's iterancestors() method.
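from lxml import etree

# xml is assumed to hold the XML document from the question, as a string.
doc = etree.fromstring(xml)
rval = {}
for org in doc.xpath('//orgID[text()="126"]'):
    # Walk up from the matched orgID element through its enclosing Departments.
    for ancestor in org.iterancestors('Department'):
        org_id = ancestor.find('./orgID').text
        name = ancestor.find('./name').text
        rval[name] = org_id
print(rval)
Output: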
{'A': '124', 'C': '126', 'B': '125'}
If you actually need to preserve the order of the elements, don't rely on a plain dict: before Python 3.7, a dict does not guarantee any key order, which is why the keys above come out shuffled. Use an OrderedDict or just a list of tuples:
doc = etree.fromstring(xml)
a = []
for org in doc.xpath('//orgID[text()="126"]'):
    for ancestor in org.iterancestors():
        if ancestor.find('./orgID') is not None:
            # A Department: ID and name are child elements.
            org_id = ancestor.find('./orgID').text
            name = ancestor.find('./name').text
        elif ancestor.get('orgID'):
            # The root Departments element: ID and name are attributes.
            org_id = ancestor.get('orgID')
            name = ancestor.get('name')
        else:
            continue
        print(org_id, name)
        a.append((name, org_id))
print("In order of discovery:\n ", a)
print("From root to child\n ", list(reversed(a)))
print("dict keys are not sorted\n ", dict(a))
Output:
126 C
125 B
124 A
123 xmllist
In order of discovery:
[('C', '126'), ('B', '125'), ('A', '124'), ('xmllist', '123')]
From root to child
[('xmllist', '123'), ('A', '124'), ('B', '125'), ('C', '126')]
dict keys are not sorted
{'A': '124', 'xmllist': '123', 'C': '126', 'B': '125'}
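For the OrderedDict option mentioned above, here is a minimal sketch. The pairs list is copied from the "From root to child" output, and the variable names are only illustrative.

```python
from collections import OrderedDict

# (name, orgID) pairs, root first, as printed in the "From root to child" output.
pairs = [('xmllist', '123'), ('A', '124'), ('B', '125'), ('C', '126')]

# An OrderedDict remembers insertion order on every Python version.
ordered = OrderedDict(pairs)

print(list(ordered.keys()))
print(list(ordered.items()) == pairs)
```

Lookups by name still work as with a plain dict (for example ordered['B']), while iteration follows the root-to-child order of the list.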
Source: https://stackoverflow.com/questions/21746525/get-all-parents-of-xml-node-using-python