Editing XML as a dictionary in Python?

深忆病人

I'm trying to generate customized XML files from a template XML file in Python.

Conceptually, I want to read in the template xml, remove some elements, change some

8 answers

礼貌的吻别

2020-12-31 18:16

The most direct way, to me:

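import xml.etree.ElementTree as ET

tree = ET.parse(xh)    # xh: path or file object of the XML template
data = tree.getroot()
xdic = {}
if data is not None:
    # Element.getchildren() was removed in Python 3.9; iterate the element directly
    for part in data:
        xdic[part.tag] = part.text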
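
To connect this back to the question, here is a minimal sketch of a round trip: build the flat dictionary, edit it, then push the changes back into the tree. The sample XML string and the edits are assumptions made up for illustration, and the approach only works for one level of uniquely named children.

```python
import xml.etree.ElementTree as ET

# Assumed sample template (not from the original question)
template = "<root><level1>leaf1</level1><level2>leaf2</level2></root>"
root = ET.fromstring(template)

# Flat view of the direct children: tag -> text
xdic = {child.tag: child.text for child in root}

# Edit the dictionary as plain Python data
xdic["level1"] = "changed"
del xdic["level2"]

# Write the edits back: update surviving children, drop deleted ones.
# Iterate over a copy because we remove elements while looping.
for child in list(root):
    if child.tag in xdic:
        child.text = xdic[child.tag]
    else:
        root.remove(child)

output = ET.tostring(root, encoding="unicode")
print(output)
```

Attributes, nesting, and repeated tags are all lost in the flat dictionary, so anything beyond this simple shape is probably easier to edit on the tree itself.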

陌清茗

2020-12-31 18:18

For easy manipulation of XML in python, I like the Beautiful Soup library. It works something like this:

Sample XML File:

<root>
  <level1>leaf1</level1>
  <level2>leaf2</level2>
</root>

Python code:

from bs4 import BeautifulSoup  # Beautiful Soup 4; the 'xml' parser requires lxml

# Pass the file contents; a bare file name would be parsed as markup itself
with open('config-template.xml') as f:
    soup = BeautifulSoup(f, 'xml')
soup.contents[0].name
# 'root'

You can use the node names as attributes:

soup.root.level1.name
# 'level1'

It is also possible to use regexes:

import re
tags_starting_with_level = soup.find_all(re.compile('^level'))
for tag in tags_starting_with_level:
    print(tag.name)
# level1
# level2

Adding and inserting new nodes is pretty straightforward:

# build and append a new level with a new leaf
level3 = soup.new_tag('level3')
level3.string = 'leaf3'
soup.root.append(level3)

print(soup.root.prettify())
# <root>
#  <level1>
#   leaf1
#  </level1>
#  <level2>
#   leaf2
#  </level2>
#  <level3>
#   leaf3
#  </level3>
# </root>

[愿得一人]

2020-12-31 18:21
I'm not sure if converting the info set to nested dicts first is easier. Using ElementTree, you can do this:
```
import xml.etree.ElementTree as ET

doc = ET.parse("template.xml")
lvl1 = doc.findall("level1-name")[0]
# remove() takes the child element itself, so look it up first
lvl1.remove(lvl1.find("leaf1"))
lvl1.remove(lvl1.find("leaf2"))
# or use del lvl1[idx]
doc.write("config-new.xml")
```
ElementTree was designed so that you don't have to convert your XML trees to lists and attributes first, since it uses exactly that internally.

It also supports a small subset of XPath.
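
As a rough illustration of both points, the sketch below treats elements as lists of children with a dict-like attribute store, then runs two queries from the XPath subset. The sample document and its tag and attribute names are assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Assumed sample document (tag and attribute names are made up)
root = ET.fromstring(
    "<root>"
    "<level1 name='a'><leaf1>x</leaf1><leaf2>y</leaf2></level1>"
    "<level1 name='b'><leaf1>z</leaf1></level1>"
    "</root>"
)

first = root[0]                          # elements index like lists
child_tags = [c.tag for c in first]      # and iterate like lists
del first[1]                             # drop leaf2 by position
first.set("name", "renamed")             # attributes live in the dict-like .attrib

# XPath subset: descendant search and attribute predicates
leaf_texts = [e.text for e in root.findall(".//leaf1")]
b_nodes = root.findall("./level1[@name='b']")

print(child_tags, leaf_texts, len(b_nodes), first.attrib)
```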
梦谈多话

2020-12-31 18:27
Have you tried this?
```
import xml.etree.ElementTree as ET
print(ET.tostring(conf_new, encoding="unicode"))  # "unicode" yields str, not bytes
```

一生所求

2020-12-31 18:31

My modification of Daniel's answer, to give a marginally neater dictionary:

def xml_to_dictionary(element):
    # namespace is a module-level string: the xmlns prefix, braces included
    tag = element.tag[len(namespace):]
    text = element.text
    # Missing or whitespace-only text counts as "no content";
    # always creating the key avoids a KeyError further down
    dictionary = {tag: text if text and text.strip() else {}}
    children = list(element)  # getchildren() was removed in Python 3.9
    if children:
        subdictionary = {}
        for child in children:
            for k, v in xml_to_dictionary(child).items():
                if k in subdictionary:
                    # repeated subtags are collected into a list
                    if isinstance(subdictionary[k], list):
                        subdictionary[k].append(v)
                    else:
                        subdictionary[k] = [subdictionary[k], v]
                else:
                    subdictionary[k] = v
        if dictionary[tag] == {}:
            dictionary[tag] = subdictionary
        else:
            dictionary[tag] = [dictionary[tag], subdictionary]
    if element.attrib:
        attribs = dict(element.attrib)
        if dictionary[tag] == {}:
            dictionary[tag] = attribs
        else:
            dictionary[tag] = [dictionary[tag], attribs]
    return dictionary

namespace is the xmlns string, including braces, that ElementTree prepends to every tag. I strip it here because the whole document uses a single namespace.

Note that I also adjusted the raw XML, so that 'empty' tags produce at most a ' ' text property in the ElementTree representation:

import re
from xml.etree import ElementTree
spacepattern = re.compile(r'\s+')
mydictionary = xml_to_dictionary(ElementTree.XML(spacepattern.sub(' ', content)))  # content: raw XML string

This would give, for instance:

{'note': {'to': 'Tove',
         'from': 'Jani',
         'heading': 'Reminder',
         'body': "Don't forget me this weekend!"}}

It's designed for XML that is basically equivalent to JSON. It should also handle element attributes such as

<elementName attributeName='attributeContent'>elementContent</elementName>

The attribute dictionary and the subtag dictionary could be merged, much as repeated subtags are merged, although nesting the lists seems kind of appropriate :-)
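
For comparison, here is a minimal sketch of that merging idea, where attributes and subtags share one dictionary. The function name element_to_merged_dict, the "#text" key, and the sample XML are all assumptions made up for illustration; this is not the code from the answer above, and it ignores namespaces.

```python
import xml.etree.ElementTree as ET

def element_to_merged_dict(element):
    """Hypothetical variant: attributes and child tags share one dict."""
    merged = dict(element.attrib)
    for child in element:
        value = element_to_merged_dict(child)
        if child.tag in merged:
            existing = merged[child.tag]
            # Repeated keys are collected into a list
            if isinstance(existing, list):
                existing.append(value)
            else:
                merged[child.tag] = [existing, value]
        else:
            merged[child.tag] = value
    text = (element.text or "").strip()
    if not merged:
        return text              # plain leaf: just its text
    if text:
        merged["#text"] = text   # assumed key for text next to attributes/children
    return merged

# Assumed sample input
sample = ET.fromstring(
    "<config mode='fast'><item>a</item><item>b</item><name>demo</name></config>"
)
result = {sample.tag: element_to_merged_dict(sample)}
print(result)
```

One trade-off of merging: an attribute and a subtag with the same name end up in the same list, so the distinction between them is lost.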

悲&欢浪女

2020-12-31 18:36

This'll get you a dict minus attributes. I don't know whether it's useful to anyone; I was looking for an XML-to-dict solution myself when I came up with this.



import xml.etree.ElementTree as etree

tree = etree.parse('test.xml')
root = tree.getroot()

def xml_to_dict(el):
    d = {}
    if el.text:
        d[el.tag] = el.text
    else:
        d[el.tag] = {}
    children = list(el)  # getchildren() was removed in Python 3.9
    if children:
        # a list comprehension, because map() returns a lazy iterator in Python 3
        d[el.tag] = [xml_to_dict(child) for child in children]
    return d

This: http://www.w3schools.com/XML/note.xml

<note>
 <to>Tove</to>
 <from>Jani</from>
 <heading>Reminder</heading>
 <body>Don't forget me this weekend!</body>
</note>

Would equal this:


{'note': [{'to': 'Tove'},
          {'from': 'Jani'},
          {'heading': 'Reminder'},
          {'body': "Don't forget me this weekend!"}]}
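
Since the question is about editing and then regenerating XML, here is a rough sketch of going the other way from this list-of-dicts shape. The helper dict_to_xml is hypothetical (it is not part of the answer above) and assumes every dictionary has exactly one key, as in the output shown.

```python
import xml.etree.ElementTree as ET

def dict_to_xml(d):
    """Hypothetical inverse: rebuild an Element from a one-key dict."""
    (tag, value), = d.items()      # exactly one key is assumed
    el = ET.Element(tag)
    if isinstance(value, list):
        for child in value:
            el.append(dict_to_xml(child))
    elif value:                    # skip {} and None (empty elements)
        el.text = str(value)
    return el

data = {'note': [{'to': 'Tove'},
                 {'from': 'Jani'},
                 {'heading': 'Reminder'},
                 {'body': "Don't forget me this weekend!"}]}

data['note'][0]['to'] = 'Bob'      # change a value
del data['note'][2]                # remove the heading element

xml_text = ET.tostring(dict_to_xml(data), encoding='unicode')
print(xml_text)
```

Attributes and mixed content are not represented in this shape, so they cannot be restored.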
