How can I convert XML into a Python object?

醉话见心 2020-12-02 10:46

I need to load an XML file and convert the contents into an object-oriented Python structure. I want to take this:

7 Answers
  • 2020-12-02 10:59

    There are three common XML parsers for Python: xml.dom.minidom, ElementTree (xml.etree.ElementTree), and BeautifulSoup.

    IMO, BeautifulSoup is by far the best.

    http://www.crummy.com/software/BeautifulSoup/
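
    For comparison, here is a minimal sketch of the standard-library route using xml.etree.ElementTree. It assumes you want attribute-style access with names taken from the XML at parse time. `to_object` is a hypothetical helper written for this illustration, not part of any library, and it ignores corner cases such as an XML attribute and a child element sharing a name, or an attribute literally called `text`.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace


def to_object(element):
    """Recursively convert an Element into a SimpleNamespace.

    XML attributes and child elements become Python attributes named
    after the XML names; repeated child tags are collected into a list.
    Leaf elements expose their character data as ``.text``.
    """
    obj = SimpleNamespace(**element.attrib)
    children = list(element)
    if not children:
        obj.text = (element.text or "").strip()
        return obj
    for child in children:
        value = to_object(child)
        if hasattr(obj, child.tag):
            # Second or later occurrence of this tag: promote to a list.
            existing = getattr(obj, child.tag)
            if not isinstance(existing, list):
                existing = [existing]
                setattr(obj, child.tag, existing)
            existing.append(value)
        else:
            setattr(obj, child.tag, value)
    return obj


xml = """<main>
<object1 attr="name">content</object1>
<object1 attr="foo">contenbar</object1>
<test>me</test>
</main>"""

main = to_object(ET.fromstring(xml))
print(main.object1[0].text)   # content
print(main.object1[1].attr)   # foo
print(main.test.text)         # me
```

    A tag that occurs once stays a single object and a repeated tag becomes a list, so calling code has to know which shape to expect.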

  • 2020-12-02 11:00

    How about this:

    http://evanjones.ca/software/simplexmlparse.html

  • 2020-12-02 11:04

    It's worth looking at lxml.objectify.

    xml = """<main>
    <object1 attr="name">content</object1>
    <object1 attr="foo">contenbar</object1>
    <test>me</test>
    </main>"""
    
    from lxml import objectify
    
    main = objectify.fromstring(xml)
    main.object1[0]             # content
    main.object1[1]             # contenbar
    main.object1[0].get("attr") # name
    main.test                   # me
    

    Or the other way around to build xml structures:

    item = objectify.Element("item")
    item.title = "Best of python"
    item.price = 17.98
    item.price.set("currency", "EUR")
    
    order = objectify.Element("order")
    order.append(item)
    order.item.quantity = 3
    order.price = sum(item.price * item.quantity for item in order.item)
    
    import lxml.etree
    objectify.deannotate(order, cleanup_namespaces=True)  # strip py:pytype annotations
    print(lxml.etree.tostring(order, pretty_print=True).decode())
    

    Output:

    <order>
      <item>
        <title>Best of python</title>
        <price currency="EUR">17.98</price>
        <quantity>3</quantity>
      </item>
      <price>53.94</price>
    </order>
    
  • 2020-12-02 11:13

    David Mertz's gnosis.xml.objectify would seem to do this for you. Documentation's a bit hard to come by, but there are a few IBM articles on it, including this one (text only version).

    from gnosis.xml import objectify
    
    xml = "<root><nodes><node>node 1</node><node>node 2</node></nodes></root>"
    root = objectify.make_instance(xml)
    
    print(root.nodes.node[0].PCDATA)  # node 1
    print(root.nodes.node[1].PCDATA)  # node 2
    

    Creating xml from objects in this way is a different matter, though.
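
    To give an idea of what the reverse direction involves, here is a rough sketch that uses only the standard library's xml.etree.ElementTree rather than gnosis. `to_element`, `Root` and `Nodes` are hypothetical names made up for this example. The sketch assumes that every instance attribute should become a child element and that a list means "repeat this tag".

```python
import xml.etree.ElementTree as ET


def to_element(tag, value):
    """Build an Element from a value.

    Objects with a __dict__ recurse over their instance attributes,
    lists repeat the attribute's tag, and anything else becomes text.
    """
    element = ET.Element(tag)
    if hasattr(value, "__dict__"):
        for name, child in vars(value).items():
            items = child if isinstance(child, list) else [child]
            for item in items:
                element.append(to_element(name, item))
    else:
        element.text = str(value)
    return element


class Nodes:
    def __init__(self, node):
        self.node = node


class Root:
    def __init__(self, nodes):
        self.nodes = nodes


root = Root(Nodes(["node 1", "node 2"]))
xml = ET.tostring(to_element("root", root), encoding="unicode")
print(xml)
# <root><nodes><node>node 1</node><node>node 2</node></nodes></root>
```

    XML attributes, mixed content and type information are all lost in this scheme, which is part of why serialising objects is a separate problem from parsing.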

  • 2020-12-02 11:13
    # @Stephen:
    # "can't hardcode the element names, so I need to collect them
    # at parse and use them somehow as the object names."

    # I don't think that's possible. Instead, you can do this,
    # which lets you retrieve any object by its name.

    from bs4 import BeautifulSoup  # BeautifulSoup 4 (pip install beautifulsoup4)


    class Coll(object):
        """A class that holds your Foo objects and retrieves
        them by name, abstracting the storage and retrieval logic.
        """
        def __init__(self):
            self.foos = {}

        def add(self, fooobj):
            self.foos[fooobj.name] = fooobj

        def get(self, name):
            return self.foos[name]


    class Foo(object):
        """The required class."""
        def __init__(self, name, attr1=None, attr2=None):
            self.name = name
            self.attr1 = attr1
            self.attr2 = attr2


    s = """<main>
             <object name="somename">
                 <attr name="attr1">value1</attr>
                 <attr name="attr2">value2</attr>
             </object>
             <object name="someothername">
                 <attr name="attr1">value3</attr>
                 <attr name="attr2">value4</attr>
             </object>
         </main>
    """

    soup = BeautifulSoup(s, "html.parser")

    bars = Coll()
    for each in soup.find_all('object'):
        bar = Foo(each['name'])
        # Copy every <attr name="...">value</attr> child onto the object.
        for attr in each.find_all('attr'):
            setattr(bar, attr['name'], attr.get_text())
        bars.add(bar)

    # Retrieve objects by name.
    print(vars(bars.get('somename')))
    print()
    print(vars(bars.get('someothername')))

    Output:

    {'name': 'somename', 'attr1': 'value1', 'attr2': 'value2'}

    {'name': 'someothername', 'attr1': 'value3', 'attr2': 'value4'}
    
  • 2020-12-02 11:17

    If googling around for a code-generator doesn't work, you could write your own that uses XML as input and outputs objects in your language of choice.

    It's not terribly difficult. However, the three-step process of Parse XML, Generate Code, Compile/Execute Script does make debugging a bit harder.
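
    The three steps can be sketched in Python roughly as follows. This is only an illustration: `generate_class_source` is a hypothetical helper, the sample `<book>` element is made up, and the sketch assumes tag and attribute names are valid Python identifiers. Since it calls `exec` on generated source, it should only ever be fed XML you trust.

```python
import xml.etree.ElementTree as ET


def generate_class_source(element):
    """Return Python source for a class named after the element's tag,
    with one optional constructor argument per XML attribute."""
    class_name = element.tag.capitalize()
    fields = sorted(element.attrib)
    args = "".join(f", {name}=None" for name in fields)
    lines = [f"class {class_name}:", f"    def __init__(self{args}):"]
    lines += [f"        self.{name} = {name}" for name in fields] or ["        pass"]
    return "\n".join(lines) + "\n"


# Step 1: parse the XML.
element = ET.fromstring('<book title="Best of python" price="17.98"/>')

# Step 2: generate code from the parsed element.
source = generate_class_source(element)
print(source)

# Step 3: compile and execute the generated code, then use the class.
namespace = {}
exec(compile(source, "<generated>", "exec"), namespace)
book = namespace["Book"](**element.attrib)
print(book.title, book.price)  # Best of python 17.98
```

    Printing or saving the generated source, as step 2 does here, is the simplest way to ease the debugging problem, because a traceback from the generated class can then be matched to real lines of code.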
