Parsing XML with namespace in Python via 'ElementTree'

臣服心动 2020-11-21 09:48

I have the following XML, which I want to parse using Python's ElementTree:



        
6 Answers
  •  旧巷少年郎
    2020-11-21 10:31

    Here's how to do this with lxml without having to hard-code the namespaces or scan the text for them (as Martijn Pieters mentions):

    from lxml import etree
    tree = etree.parse("filename")
    root = tree.getroot()
    # root.nsmap maps each prefix in scope on the root element to its namespace URI
    root.findall('owl:Class', root.nsmap)
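
    If lxml is not available, the same idea can be approximated with the standard library. The following is a minimal sketch, assuming a small hypothetical document (the SAMPLE string and the collect_namespaces helper are illustrative names, not part of the original answer). It gathers the prefix declarations through iterparse's "start-ns" events and then passes the resulting dictionary to findall:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are for illustration only.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#A"/>
  <owl:Class rdf:about="#B"/>
</rdf:RDF>"""


def collect_namespaces(xml_text):
    """Return a {prefix: uri} dict built from the xmlns declarations in xml_text."""
    namespaces = {}
    # Each "start-ns" event carries a (prefix, uri) tuple.
    for _event, (prefix, uri) in ET.iterparse(io.StringIO(xml_text), events=("start-ns",)):
        namespaces[prefix] = uri
    return namespaces


namespaces = collect_namespaces(SAMPLE)
root = ET.fromstring(SAMPLE)

# Same search as the lxml version, but with the collected mapping.
classes = root.findall("owl:Class", namespaces)
print(len(classes), sorted(namespaces))
```

    Note that this parses the input twice, which is fine for small files but may not suit large ones.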
    

    UPDATE:

    5 years later I'm still running into variations of this issue. lxml helps as I showed above, but not in every case. The commenters may have a valid point regarding this technique when it comes to merging documents, but I think most people are having difficulty simply searching documents.

    Here's another case and how I handled it:

    
    content
    

    An xmlns declaration without a prefix sets a default namespace, so unprefixed tags belong to that namespace. This means that when you search for Tag2, you need to include the namespace to find it. However, lxml stores the default namespace in nsmap under the key None, and I couldn't find a way to search with it. So I created a new namespace dictionary like this:

    namespaces = {}
    # The response uses a default namespace, and the tags don't mention it,
    # so build a new namespace map using a prefix of our choice.
    # Note: dict.iteritems() existed only in Python 2; use items() in Python 3.
    for prefix, uri in root.nsmap.items():
        if prefix is None:
            namespaces['myprefix'] = uri
    e = root.find('myprefix:Tag2', namespaces)
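
    The default-namespace case can also be reproduced with the standard library alone. This is a sketch under stated assumptions: the namespace URI and the document below are placeholders chosen for illustration, not the data from the original post. It shows that a bare tag name finds nothing, while either a prefix of our choice or the fully qualified {uri}tag form does:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the namespace URI and the content are placeholders.
NS_URI = "http://example.com/ns"
SAMPLE = f'<Tag1 xmlns="{NS_URI}"><Tag2>content</Tag2></Tag1>'

root = ET.fromstring(SAMPLE)

# A bare tag name does not match, because Tag2 lives in the default namespace.
bare = root.find("Tag2")

# Option 1: bind the URI to a prefix of our choice (the prefix name is arbitrary).
namespaces = {"myprefix": NS_URI}
by_prefix = root.find("myprefix:Tag2", namespaces)

# Option 2: spell out the qualified name in {uri}tag form.
by_qname = root.find(f"{{{NS_URI}}}Tag2")

print(bare, by_prefix.text, by_qname.text)
```

    The prefix used in the search does not have to match anything in the document; only the URI it maps to matters.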
    
