Question
How do I find an XML node by its name and get the value between its tags?
I'm currently doing it the following way:
import xml.etree.ElementTree as ET
from xml.dom import minidom
dom = minidom.parseString(ET.tostring(ET.fromstring(some_xml), "utf-8"))
self.a1 = dom.childNodes[0].childNodes[4].childNodes[0].nodeValue
self.a2 = dom.childNodes[0].childNodes[5].childNodes[0].nodeValue
I want to do that using the name of the tag instead of its index in the childNodes array. How?
Update:
<ReconnectResponse xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://ccc.aaa.bbb/api/v1">
<ErrorMessage />
<ErrorCode>0</ErrorCode>
<ServerTime>aaa</ServerTime>
<OAuthToken>bbb</OAuthToken>
<OAuthTokenSecret>ccc</OAuthTokenSecret>
</ReconnectResponse>
and the code:
dom.getElementsByTagName("ServerTime") # => []
Update 2:
dom.toxml()
u'<?xml version="1.0" ?><ns0:ReconnectResponse xmlns:ns0="http://ccc.aaa.bbb/api/v1">\n <ns0:ErrorMessage/>\n <ns0:ErrorCode>0</ns0:ErrorCode>\n <ns0:ServerTime>aaa</ns0:ServerTime>\n <ns0:OAuthToken>bbb</ns0:OAuthToken>\n <ns0:OAuthTokenSecret>ccc</ns0:OAuthTokenSecret>\n</ns0:ReconnectResponse>'
But how do I get the value? I tried this:
dom.getElementsByTagName("ns0:OAuthToken")
[<DOM Element: ns0:OAuthToken at 0x10635a878>]
(Pdb) dom.getElementsByTagName("ns0:OAuthToken")[0]
<DOM Element: ns0:OAuthToken at 0x10635a878>
(Pdb) dom.getElementsByTagName("ns0:OAuthToken")[0].nodeValue
(Pdb) dom.getElementsByTagName("ns0:OAuthToken")[0].toxml()
u'<ns0:OAuthToken>aaaaaa</ns0:OAuthToken>'
Answer 1:
You need to use getElementsByTagNameNS, because you don't have a tag named ServerTime; you have one named {http://ccc.aaa.bbb/api/v1}ServerTime (where {http://ccc.aaa.bbb/api/v1} indicates the default namespace):
getElementsByTagNameNS("http://ccc.aaa.bbb/api/v1", "ServerTime")
This namespace is implicitly applied to every unprefixed tag in your XML body because of the last attribute of the document element:
<ReconnectResponse ... xmlns="http://ccc.aaa.bbb/api/v1">
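As a minimal sketch of the namespace-aware lookup described above, assuming the response is parsed directly with minidom: the sample document is a trimmed copy of the one in the question, and `text_by_name` and `API_NS` are hypothetical names introduced only for illustration. The lookup matches on namespace URI and local name, so it should not depend on whichever prefix (none, or ns0) the serialized document happens to use.

```python
from xml.dom import minidom

# Trimmed copy of the response shown in the question.
SAMPLE_XML = """<ReconnectResponse xmlns="http://ccc.aaa.bbb/api/v1">
  <ErrorMessage />
  <ErrorCode>0</ErrorCode>
  <ServerTime>aaa</ServerTime>
  <OAuthToken>bbb</OAuthToken>
  <OAuthTokenSecret>ccc</OAuthTokenSecret>
</ReconnectResponse>"""

# Namespace URI taken from the xmlns attribute of the document element.
API_NS = "http://ccc.aaa.bbb/api/v1"


def text_by_name(dom, local_name, namespace=API_NS):
    """Return the text of the first matching element, or None.

    Hypothetical helper: None is returned both when no element matches
    and when the matched element is empty (e.g. <ErrorMessage />).
    """
    matches = dom.getElementsByTagNameNS(namespace, local_name)
    if not matches or matches[0].firstChild is None:
        return None
    return matches[0].firstChild.nodeValue


dom = minidom.parseString(SAMPLE_XML)
server_time = text_by_name(dom, "ServerTime")
oauth_token = text_by_name(dom, "OAuthToken")
error_message = text_by_name(dom, "ErrorMessage")
print(server_time, oauth_token, error_message)
```

In the question's code, the two index-based assignments could then become something like `self.a1 = text_by_name(dom, "OAuthToken")`, if that is the element the index was pointing at.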
Answer 2:
Using lxml with XPath is a common approach in Python.
Since you explicitly want to use minidom, you can use the following to get all XML elements with a particular tag name and print their text:
matches = dom.getElementsByTagName("foo")
for e in matches:
    print(e.firstChild.nodeValue)
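The loop above reads firstChild.nodeValue rather than nodeValue on the element itself, which is why the pdb session in the question printed nothing: in minidom an element's own nodeValue is None, and the text lives in a child text node. The sketch below illustrates this on a shortened version of the prefixed document from the second update; `element_text` is a hypothetical helper, added because firstChild is None for an empty element such as ErrorMessage.

```python
from xml.dom import minidom

# Shortened version of the ns0-prefixed document from the second update.
PREFIXED_XML = (
    '<ns0:ReconnectResponse xmlns:ns0="http://ccc.aaa.bbb/api/v1">'
    "<ns0:ErrorMessage/>"
    "<ns0:OAuthToken>bbb</ns0:OAuthToken>"
    "</ns0:ReconnectResponse>"
)


def element_text(element):
    """Join the data of all direct text and CDATA children.

    Hypothetical helper: returns "" for an empty element instead of
    failing on a missing firstChild.
    """
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


dom = minidom.parseString(PREFIXED_XML)
token = dom.getElementsByTagName("ns0:OAuthToken")[0]

element_node_value = token.nodeValue      # None: elements carry no value themselves
token_text = token.firstChild.nodeValue   # text of the child text node
empty_text = element_text(dom.getElementsByTagName("ns0:ErrorMessage")[0])

print(element_node_value, token_text, repr(empty_text))
```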
Source: https://stackoverflow.com/questions/26522798/finding-an-xml-node-by-its-name-rather-than-by-its-index