Get list of XML attribute values in Python

有刺的猬 2020-12-31 07:37

I need to get a list of attribute values from child elements in Python.

It's easiest to explain with an example.

Given some XML like this:

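[XML sample lost in extraction. The answers below assume `parent` elements carrying a `name` attribute (e.g. "CategoryA", "CategoryB"), each containing `child` elements with a `value` attribute (e.g. "a1", "a2", "a3").]
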
7 Answers
  • 2020-12-31 07:37

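    Using a standard W3C DOM such as the stdlib's minidom, or pxdom:

    # `document` is assumed to be an already-parsed DOM document,
    # e.g. the result of xml.dom.minidom.parse(...)
    def getValues(category):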
        for parent in document.getElementsByTagName('parent'):
            if parent.getAttribute('name')==category:
                return [
                    el.getAttribute('value')
                    for el in parent.getElementsByTagName('child')
                ]
        raise ValueError('parent not found')
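
For reference, here is a minimal self-contained sketch of the same approach using only the standard library's minidom. The sample XML is an assumption reconstructed from the answers in this thread (the root element name `elements` in particular is hypothetical), and `get_values` takes the document as a parameter instead of relying on a global.

```python
from xml.dom import minidom

# Assumed sample data; the root element name `elements` is hypothetical.
SAMPLE_XML = """\
<elements>
    <parent name="CategoryA">
        <child value="a1"/>
        <child value="a2"/>
        <child value="a3"/>
    </parent>
    <parent name="CategoryB">
        <child value="b1"/>
        <child value="b2"/>
        <child value="b3"/>
    </parent>
</elements>
"""


def get_values(document, category):
    """Return the `value` attributes of the <child> elements under the
    first <parent> whose `name` attribute equals `category`."""
    for parent in document.getElementsByTagName("parent"):
        if parent.getAttribute("name") == category:
            return [
                el.getAttribute("value")
                for el in parent.getElementsByTagName("child")
            ]
    raise ValueError("parent not found: %r" % category)


document = minidom.parseString(SAMPLE_XML)
print(get_values(document, "CategoryA"))  # ['a1', 'a2', 'a3']
print(get_values(document, "CategoryB"))  # ['b1', 'b2', 'b3']
```

Note that `getElementsByTagName` searches all descendants, so nested `child` elements at any depth under the matching parent would be included.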
    
  • 2020-12-31 07:43

    In Python 3.x, you can fetch an element's attributes by calling its items() method.

    The snippet below uses ElementTree to list the attributes of each matching element. Note that this example does not account for namespaces; if the document uses them, the tag names passed to findall() must be namespace-qualified.

        import xml.etree.ElementTree as ET
    
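        flName = 'test.xml'
        tree = ET.parse(flName)
        root = tree.getroot()
        # Replace the placeholder with the tag name of the root's direct children
        for element in root.findall('<child-node-of-root>'):
            # items() yields the element's attributes as (name, value) pairs
            attrList = element.items()
            print(len(attrList), " : [", attrList, "]" )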
    

    REFERENCE:

    Element.items()
    Returns the element attributes as a sequence of (name, value) pairs.
    The attributes are returned in an arbitrary order.

    Python manual
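
As a rough illustration of the namespace caveat above, the sketch below parses a small in-memory document that declares a default namespace. The namespace URI, the tag names, and the `ex` prefix are all made up for the example; only `findall`, `items`, and `get` come from ElementTree itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the namespace URI and tag names are invented.
SAMPLE_XML = """\
<root xmlns="http://example.com/ns">
    <item id="1" value="a1"/>
    <item id="2" value="a2"/>
</root>
"""

# Prefix-to-URI map passed to findall(); the prefix "ex" is arbitrary.
NS = {"ex": "http://example.com/ns"}

root = ET.fromstring(SAMPLE_XML)

# A bare tag name does not match, because the real tag is
# "{http://example.com/ns}item".
unqualified = root.findall("item")

# A qualified name matches the namespaced elements.
items = root.findall("ex:item", NS)

# items() gives (name, value) pairs; dict() makes them easy to inspect.
attrs = [dict(element.items()) for element in items]

# get() reads a single attribute by name.
values = [element.get("value") for element in items]

print(unqualified)
print(attrs)
print(values)
```

Unprefixed attributes such as `id` and `value` are not placed in the default namespace, so they can still be read by their plain names.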

  • 2020-12-31 07:46

    I must admit I'm a fan of xmltramp due to its ease of use.

    Accessing the above becomes:

      import xmltramp
    
      values = xmltramp.parse('''...''')
    
      def getValues( values, category ):
        # element('attr') reads an attribute; element['tag':] lists all children with that tag
        cat = [ parent for parent in values['parent':] if parent('name') == category ]
        cat_values = [ child('value') for parent in cat for child in parent['child':] ]
        return cat_values
    
      getValues( values, "CategoryA" )
      getValues( values, "CategoryB" )
    
  • 2020-12-31 07:47

    ElementTree 1.3 (bundled with Python since 2.7 and 3.2 as xml.etree.ElementTree; earlier releases shipped 1.2, which lacks this) supports XPath attribute predicates like this:

    import xml.etree.ElementTree as xml
    
    def getValues(tree, category):
        parent = tree.find(".//parent[@name='%s']" % category)
        return [child.get('value') for child in parent]
    

    Then you can do

    >>> tree = xml.parse('data.xml')
    >>> getValues(tree, 'CategoryA')
    ['a1', 'a2', 'a3']
    >>> getValues(tree, 'CategoryB')
    ['b1', 'b2', 'b3']
    

    lxml.etree (which also provides the ElementTree interface) will also work in the same way.
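
A self-contained variant of this answer, sketched against the standard library's `xml.etree.ElementTree`, might look like the following. The sample XML is an assumption reconstructed from the outputs shown above (the root element name `elements` is hypothetical), and returning an empty list for an unknown category is a design choice of this sketch, not something the original answer does.

```python
import xml.etree.ElementTree as ET

# Assumed sample data; the root element name `elements` is hypothetical.
SAMPLE_XML = """\
<elements>
    <parent name="CategoryA">
        <child value="a1"/>
        <child value="a2"/>
        <child value="a3"/>
    </parent>
    <parent name="CategoryB">
        <child value="b1"/>
        <child value="b2"/>
        <child value="b3"/>
    </parent>
</elements>
"""


def get_values(root, category):
    """Return the `value` attributes of the <child> elements directly under
    the first <parent> whose `name` attribute equals `category`."""
    # The category is interpolated into the path as-is, so this sketch
    # assumes it contains no single quotes.
    parent = root.find(".//parent[@name='%s']" % category)
    if parent is None:
        # find() returns None when nothing matches.
        return []
    return [child.get("value") for child in parent.findall("child")]


root = ET.fromstring(SAMPLE_XML)
print(get_values(root, "CategoryA"))  # ['a1', 'a2', 'a3']
print(get_values(root, "CategoryB"))  # ['b1', 'b2', 'b3']
print(get_values(root, "Missing"))    # []
```

Guarding against `None` avoids the `TypeError` that iterating over a missing parent would otherwise raise.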

  • 2020-12-31 07:50

    You can do this with BeautifulSoup:

    >>> from BeautifulSoup import BeautifulStoneSoup
    >>> # `xml` holds the XML markup to parse
    >>> soup = BeautifulStoneSoup(xml)
    >>> def getValues(name):
    ...     return [child['value'] for child in soup.find('parent', attrs={'name': name}).findAll('child')]
    

    If you're doing work with HTML/XML I would recommend you take a look at BeautifulSoup. It's similar to the DOM tree but contains more functionality.

  • 2020-12-31 07:57

    My preferred Python XML library is lxml, which wraps libxml2.
    XPath does seem the way to go here, so I'd write this as something like:

    from lxml import etree
    
    def getValues(xml, category):
        return [x.attrib['value'] for x in 
                xml.findall('/parent[@name="%s"]/*' % category)]
    
    xml = etree.parse(open('filename.xml'))
    
    >>> print(getValues(xml, 'CategoryA'))
    ['a1', 'a2', 'a3']
    >>> print(getValues(xml, 'CategoryB'))
    ['b1', 'b2', 'b3']
    