Reading an XML file and fetching its attribute values in Python

情歌与酒 2020-12-03 15:56

I have this XML file:


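<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <uuid>66523dfdf555dfd</uuid>
  <os>
    <type arch='xintel' machine='ubuntu'>hvm</type>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>270336</currentMemory>
  <vcpu placement='static'>10</vcpu>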
        
7 Answers
  • 2020-12-03 16:33

    XML

    <data>
        <items>
            <item name="item1">item1</item>
            <item name="item2">item2</item>
            <item name="item3">item3</item>
            <item name="item4">item4</item>
        </items>
    </data>
    

    Python :

    from xml.dom import minidom

    # Parse the file and collect every <item> element
    xmldoc = minidom.parse('items.xml')
    itemlist = xmldoc.getElementsByTagName('item')
    print("Len : ", len(itemlist))
    print("Attribute Name : ", itemlist[0].attributes['name'].value)
    print("Text : ", itemlist[0].firstChild.nodeValue)
    for s in itemlist:
        print("Attribute Name : ", s.attributes['name'].value)
        print("Text : ", s.firstChild.nodeValue)
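
    For readers who want something that runs without items.xml on disk, here is a minimal sketch of the same minidom approach working on an in-memory string. The helper name item_names_to_text and the shortened sample document are assumptions made for illustration; they are not part of the answer above.

```python
from xml.dom import minidom

# Shortened copy of the sample document, inlined so no file is needed.
ITEMS_XML = """<data>
    <items>
        <item name="item1">item1</item>
        <item name="item2">item2</item>
    </items>
</data>"""


def item_names_to_text(xml_text):
    """Map each <item>'s name attribute to its text content (hypothetical helper)."""
    doc = minidom.parseString(xml_text)
    result = {}
    for item in doc.getElementsByTagName("item"):
        # getAttribute returns "" when the attribute is missing, whereas
        # item.attributes["name"] would raise KeyError.
        name = item.getAttribute("name")
        # An empty element such as <item/> has no child nodes at all.
        text = item.firstChild.nodeValue if item.firstChild else ""
        result[name] = text
    return result


print(item_names_to_text(ITEMS_XML))
```

    Using getAttribute instead of indexing into attributes is a design choice here: it keeps the loop from failing on an element that lacks the attribute.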
    
  • 2020-12-03 16:34

    Here's an lxml snippet that extracts an attribute as well as element text (your question was a little ambiguous about which one you needed, so I'm including both):

    from lxml import etree
    doc = etree.parse(filename)  # filename: path to the XML file

    memoryElem = doc.find('memory')
    print(memoryElem.text)         # element text
    print(memoryElem.get('unit'))  # attribute
    

    You asked (in a comment on Ali Afshar's answer) whether minidom (2.x, 3.x) is a good alternative. Here's the equivalent code using minidom; judge for yourself which is nicer:

    import xml.dom.minidom as minidom
    doc = minidom.parse(filename)  # filename: path to the XML file

    memoryElem = doc.getElementsByTagName('memory')[0]
    # Join the child text nodes to get the element text
    print(''.join(node.data for node in memoryElem.childNodes))
    print(memoryElem.getAttribute('unit'))
    

    lxml seems like the winner to me.
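
    If installing lxml is not an option, the standard library's xml.etree.ElementTree offers find, text, and get calls with the same shape. The following is a sketch under that assumption, using a shortened copy of the domain XML inlined as a string; it is not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the domain XML, inlined so the sketch is self-contained.
DOMAIN_XML = """<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <memory unit='KiB'>524288</memory>
</domain>"""

root = ET.fromstring(DOMAIN_XML)   # root is the <domain> element
memory = root.find("memory")       # first direct child named <memory>

memory_text = memory.text          # element text
memory_unit = memory.get("unit")   # attribute value; None if absent
print(memory_text, memory_unit)
```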

  • 2020-12-03 16:38

    You can try parsing it with a parser created using recover=True. You can do something like this:

    from lxml import etree

    parser = etree.XMLParser(recover=True)
    tree = etree.parse('your xml file', parser)
    

    I used this recently and it worked for me, so give it a try. If you need to do any more complicated XML data extraction, you can take a look at this code I wrote for a project that handles complex XML data extraction.

  • 2020-12-03 16:40

    The XML above does not have a closing tag, so parsing it will give:

    etree parse error: Premature end of data in tag

    The correct XML is:

    <domain type='kmc' id='007'>
      <name>virtual bug</name>
      <uuid>66523dfdf555dfd</uuid>
      <os>
        <type arch='xintel' machine='ubuntu'>hvm</type>
        <boot dev='hd'/>
        <boot dev='cdrom'/>
      </os>
      <memory unit='KiB'>524288</memory>
      <currentMemory unit='KiB'>270336</currentMemory>
      <vcpu placement='static'>10</vcpu>
    </domain>
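
    The effect of the missing closing tag can be checked with the standard library as well. This is a sketch using xml.etree.ElementTree rather than lxml, so the error text differs from the message quoted above; the helper try_parse and the shortened documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened documents: BROKEN lacks the closing </domain> tag.
BROKEN = "<domain type='kmc' id='007'><name>virtual bug</name>"
FIXED = BROKEN + "</domain>"


def try_parse(xml_text):
    """Return the root element, or None if the text is not well-formed."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return None


print(try_parse(BROKEN))        # None: the closing tag is missing
root = try_parse(FIXED)
print(root.tag, root.attrib)    # tag name and the attributes of <domain>
```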
    
  • 2020-12-03 16:42

    Other people can tell you how to do it with the Python standard library. I'd recommend my own mini-library, which makes this completely straightforward.

    >>> obj = xml2obj.xml2obj("""<domain type='kmc' id='007'>
    ... <name>virtual bug</name>
    ... <uuid>66523dfdf555dfd</uuid>
    ... <os>
    ... <type arch='xintel' machine='ubuntu'>hvm</type>
    ... <boot dev='hd'/>
    ... <boot dev='cdrom'/>
    ... </os>
    ... <memory unit='KiB'>524288</memory>
    ... <currentMemory unit='KiB'>270336</currentMemory>
    ... <vcpu placement='static'>10</vcpu>
    ... </domain>""")
    >>> obj.uuid
    u'66523dfdf555dfd'
    

    http://code.activestate.com/recipes/534109-xml-to-python-data-structure/

  • 2020-12-03 16:42

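    I would use lxml and parse it out using the XPath //uuid (XML element names are case-sensitive, and the element in the question is the lower-case uuid).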
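
    As a rough illustration of the path lookup, here is a sketch that uses the standard library's xml.etree.ElementTree, which supports only a limited XPath subset and needs the relative form .//uuid. With lxml the corresponding call would be doc.xpath('//uuid'). The shortened inline document is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the domain XML, inlined so the sketch is self-contained.
DOMAIN_XML = """<domain type='kmc' id='007'>
  <name>virtual bug</name>
  <uuid>66523dfdf555dfd</uuid>
</domain>"""

root = ET.fromstring(DOMAIN_XML)

# ".//uuid" searches all descendants of the root for <uuid> elements.
uuid_value = root.findtext(".//uuid")
# Tag names are case-sensitive, so the upper-case spelling matches nothing.
upper_case_value = root.findtext(".//UUID")

print(uuid_value)
print(upper_case_value)
```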
