Question
I'm working on parsing an XML-Sheet in Python. The XML has a structure like this:
<layer1>
    <layer2>
        <element>
            <info1></info1>
        </element>
        <element>
            <info1></info1>
        </element>
        <element>
            <info1></info1>
        </element>
    </layer2>
</layer1>
Without layer2, I have no problem accessing the data in info1. But with layer2, I'm really in trouble. Without it, I can address info1 with: root.firstChild.childNodes[0].childNodes[0].data
So my thought was that I could do it similarly, like this: root.firstChild.firstChild.childNodes[0].childNodes[0].data
So this is how I solved my problem:
from xml.etree import ElementTree as ET  # cElementTree was removed in Python 3.9

tree = ET.parse("test.xml")
root = tree.getroot()
# Drop every <element> whose <info1> text is not "abc".
# No trailing slash in the paths: ElementTree reads 'layer2/' as 'layer2/*'.
for elem in root.findall('./layer2'):
    for node in elem.findall('element'):
        x = node.find('info1').text
        if x != "abc":
            elem.remove(node)
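For anyone who wants to try the filtering idea without a test.xml file, here is a minimal sketch that parses an inline string instead. The info1 values ("abc", "def") are made up for illustration; the XML in the question leaves them empty.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample data: the info1 values are invented for this sketch.
XML = """<layer1>
  <layer2>
    <element><info1>abc</info1></element>
    <element><info1>def</info1></element>
    <element><info1>abc</info1></element>
  </layer2>
</layer1>"""

root = ET.fromstring(XML)

for layer in root.findall('./layer2'):
    # findall() returns a list, so removing children while looping is safe.
    for node in layer.findall('element'):
        if node.find('info1').text != "abc":
            layer.remove(node)

# Collect the text of the info1 elements that survived the filter.
remaining = [info.text for info in root.iter('info1')]
print(remaining)
```

Only the elements whose info1 text equals "abc" should be left in the tree afterwards.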
Answer 1:
Don't use the minidom API if you can help it. Use the ElementTree API instead; the xml.dom.minidom documentation explicitly states that:
"Users who are not already proficient with the DOM should consider using the xml.etree.ElementTree module for their XML processing instead."
Here is a short sample that uses the ElementTree API to access your elements:
from xml.etree import ElementTree as ET

tree = ET.parse('inputfile.xml')
for info in tree.findall('.//element/info1'):
    print(info.text)
This uses an XPath expression to list all info1 elements that are contained inside an element element, regardless of their position in the overall XML document.
If all you need is the first info1 element, use .find():
print(tree.find('.//info1').text)
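To see .findall() and .find() side by side without needing a file on disk, here is a small sketch that parses an inline string. The XML has the same shape as in the question, but the text values ("first", "second", "third") are invented for the example.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample: same shape as the XML in the question, made-up text.
XML = """<layer1>
  <layer2>
    <element><info1>first</info1></element>
    <element><info1>second</info1></element>
    <element><info1>third</info1></element>
  </layer2>
</layer1>"""

root = ET.fromstring(XML)

# findall() returns every match, in document order.
texts = [info.text for info in root.findall('.//element/info1')]
print(texts)

# find() returns only the first match, or None if nothing matches.
first = root.find('.//info1')
print(first.text if first is not None else None)
```

Note that .find() returns None when there is no match, so it is worth checking the result before reading .text on real input.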
With the DOM API, .firstChild could easily be a Text node instead of an Element node; you always need to loop over the .childNodes sequence to find the first Element match:
def findFirstElement(node):
    for child in node.childNodes:
        if child.nodeType == node.ELEMENT_NODE:
            return child
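But for your case, perhaps using .getElementsByTagName() suffices. It returns a list of elements, so index into it and read the text from the element's Text child:
root.getElementsByTagName('info1')[0].firstChild.data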
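If you do stay with minidom, the following sketch illustrates both points: the whitespace from indentation shows up as a Text node under .firstChild, and .getElementsByTagName() hands back a list whose items hold their text in a child node. The sample string and the value "abc" are assumptions for illustration only.

```python
from xml.dom import minidom

# Hypothetical sample: the indentation produces whitespace-only Text nodes.
XML = """<layer1>
  <layer2>
    <element>
      <info1>abc</info1>
    </element>
  </layer2>
</layer1>"""

root = minidom.parseString(XML).documentElement

# The first child of <layer1> is the whitespace before <layer2>, not <layer2>.
first_is_text = root.firstChild.nodeType == root.TEXT_NODE
print(first_is_text)

# getElementsByTagName() returns a list of elements; the text lives in a
# Text child of each element, which is missing when the element is empty.
values = [
    info.firstChild.data if info.firstChild is not None else None
    for info in root.getElementsByTagName('info1')
]
print(values)
```

The None check matters for the XML in the question, where the info1 elements are empty and therefore have no Text child at all.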
Answer 2:
Does this work? (I'm not amazing at Python, just a quick thought.)
name[0].firstChild.nodeValue
Source: https://stackoverflow.com/questions/16196501/python-minidom-how-to-access-an-element