Question
I am trying to easily access values from an XML file.
<artikelen>
<artikel nummer="121">
<code>ABC123</code>
<naam>Highlight pen</naam>
<voorraad>231</voorraad>
<prijs>0.56</prijs>
</artikel>
<artikel nummer="123">
<code>PQR678</code>
<naam>Nietmachine</naam>
<voorraad>587</voorraad>
<prijs>9.99</prijs>
</artikel>
..... etc
If I want to access the value ABC123, how do I get it?
import xmltodict
with open('8_1.html') as fd:
    doc = xmltodict.parse(fd.read())
print(doc[fd]['code'])
Answer 1:
Using your example:
import xmltodict
with open('artikelen.xml') as fd:
    doc = xmltodict.parse(fd.read())
If you examine doc, you'll see it's an OrderedDict, ordered by tag:
>>> doc
OrderedDict([('artikelen',
OrderedDict([('artikel',
[OrderedDict([('@nummer', '121'),
('code', 'ABC123'),
('naam', 'Highlight pen'),
('voorraad', '231'),
('prijs', '0.56')]),
OrderedDict([('@nummer', '123'),
('code', 'PQR678'),
('naam', 'Nietmachine'),
('voorraad', '587'),
('prijs', '9.99')])])]))])
The root node is called artikelen, and it has a subnode artikel, which is a list of OrderedDict objects. So if you want the code for every article, you would do:
codes = []
for artikel in doc['artikelen']['artikel']:
    codes.append(artikel['code'])
# >>> codes
# ['ABC123', 'PQR678']
If you specifically want the code only when nummer is 121, you could do this:
code = None
for artikel in doc['artikelen']['artikel']:
    if artikel['@nummer'] == '121':
        code = artikel['code']
        break
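As a variation on the loop above, the same lookup can be sketched as a dictionary keyed by nummer. This is only a minimal sketch and not part of the original answer: the inline XML string and the name code_by_nummer are illustrative, and it assumes your installed xmltodict supports the force_list option, which keeps artikel a list even when the file contains a single <artikel> element.

```python
import xmltodict

# Inline stand-in for the XML file from the question (illustrative)
XML = """<artikelen>
  <artikel nummer="121">
    <code>ABC123</code>
    <naam>Highlight pen</naam>
  </artikel>
  <artikel nummer="123">
    <code>PQR678</code>
    <naam>Nietmachine</naam>
  </artikel>
</artikelen>"""

# force_list makes 'artikel' a list regardless of how many elements exist
doc = xmltodict.parse(XML, force_list=("artikel",))

# Map each nummer attribute to its code so lookups need no loop
code_by_nummer = {a["@nummer"]: a["code"] for a in doc["artikelen"]["artikel"]}

print(code_by_nummer["121"])
```

Using .get("121") instead of indexing would return None for a missing nummer rather than raising KeyError.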
That said, if you're parsing XML documents and want to search for a specific value like that, I would consider using XPath expressions, which are supported by ElementTree.
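As a rough illustration of that XPath approach, the standard library's xml.etree.ElementTree accepts a limited XPath subset that includes attribute predicates. The inline XML string below stands in for the file and is an assumption made to keep the example self-contained:

```python
import xml.etree.ElementTree as ET

# Inline stand-in for the XML file from the question (illustrative)
XML = """<artikelen>
  <artikel nummer="121">
    <code>ABC123</code>
  </artikel>
  <artikel nummer="123">
    <code>PQR678</code>
  </artikel>
</artikelen>"""

root = ET.fromstring(XML)  # root is the <artikelen> element

# Attribute predicate: <code> of the <artikel> whose nummer is "121"
node = root.find("artikel[@nummer='121']/code")

# find() returns None when nothing matches, so guard before reading .text
code = node.text if node is not None else None
print(code)
```

For a file on disk, ET.parse(path).getroot() would replace ET.fromstring(XML).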
Answer 2:
This uses xml.etree. You can try this:
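import xml.etree.ElementTree as ET
root = ET.parse('artikelen.xml').getroot()  # root is the <artikelen> element
for artikelobj in root.findall('artikel'):
    print(artikelobj.find('code').text)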
If you want to extract a specific code based on the nummer attribute of artikel, you can try this:
for artikelobj in root.findall('artikel'):
    if artikelobj.get('nummer') == '121':  # attribute values are strings
        print(artikelobj.find('code').text)
This prints only the code you want.
Answer 3:
To read .xml files:
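from lxml import objectify  # attribute-style child access needs lxml.objectify, not lxml.etree
root = objectify.parse(filename).getroot()
value = root.node1.node2.variable_name.text  # e.g. root.artikel.code.text for the first artikel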
Answer 4:
You can use the lxml package with an XPath expression.
from lxml import etree
tree = etree.parse("8_1.html")  # lxml opens and closes the file itself
expression = "/artikelen/artikel[1]/code"
l = tree.xpath(expression)
code = next(i.text for i in l)
print(code)
# ABC123
The thing to notice here is the expression. /artikelen is the root element. /artikel[1] selects the first artikel element under the root (note that XPath positions start at 1, not 0). /code is the child element under artikel[1]. You can read more in the lxml and XPath syntax documentation.
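To make the 1-based indexing concrete, here is a small sketch comparing an XPath position with a Python list index. It deliberately swaps lxml for the standard library's xml.etree.ElementTree, which also accepts positional predicates in its XPath subset, and the inline XML string is an illustrative stand-in for the file:

```python
import xml.etree.ElementTree as ET

# Inline stand-in for the XML file from the question (illustrative)
XML = """<artikelen>
  <artikel nummer="121"><code>ABC123</code></artikel>
  <artikel nummer="123"><code>PQR678</code></artikel>
</artikelen>"""

root = ET.fromstring(XML)  # root is the <artikelen> element

# XPath positions start at 1: artikel[1] is the first <artikel>
first_by_xpath = root.find("artikel[1]/code").text

# Python lists start at 0: index 0 is the same first <artikel>
first_by_index = root.findall("artikel")[0].find("code").text

# artikel[2] therefore matches list index 1
second_by_xpath = root.find("artikel[2]/code").text

print(first_by_xpath, first_by_index, second_by_xpath)
```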
Source: https://stackoverflow.com/questions/40154727/how-to-use-xmltodict-to-get-items-out-of-an-xml-file