Parsing CDATA in XML with Python

醉酒成梦 2020-12-30 08:46

I need to parse an XML file with a number of blocks of CDATA that I need to retain for later plotting:

1 Answer
  • 2020-12-30 09:15

    Here are two examples of how to do it:

    from lxml import etree
    import xml.etree.ElementTree as ElementTree
    
    CONTENT = """
    <process id="process1">
     <log name="name1" device="device1"><![CDATA[timestamp value]]></log>
     <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
    </process>
    """
    
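    def parse_with_lxml():
        # lxml: select every <log> element with an XPath query
        root = etree.fromstring(CONTENT)
        for log in root.xpath("//log"):
            # .text holds the CDATA content as an ordinary string
            print(log.text)
    
    def parse_with_stdlib():
        # Standard library: iterate over every <log> element
        root = ElementTree.fromstring(CONTENT)
        for log in root.iter('log'):
            print(log.text)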
    
    if __name__ == '__main__':
        parse_with_lxml()
        parse_with_stdlib()
    

    Output:

    timestamp value
    timestamp value, timestamp value, timestamp
    timestamp value
    timestamp value, timestamp value, timestamp
    

    In both cases the element's text attribute returns the CDATA content as a plain string.
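Since the question asks to retain the CDATA blocks for later use, here is a minimal sketch of one way to store them instead of printing them. The helper name collect_logs is hypothetical, and splitting on commas is an assumption based only on the sample data above; adjust it to the real format of your logs.

```python
import xml.etree.ElementTree as ElementTree

CONTENT = """
<process id="process1">
 <log name="name1" device="device1"><![CDATA[timestamp value]]></log>
 <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
"""

def collect_logs(xml_text):
    """Hypothetical helper: map each log's name attribute to its CDATA entries."""
    root = ElementTree.fromstring(xml_text)
    logs = {}
    for log in root.iter("log"):
        raw = log.text or ""  # .text is None when the element is empty
        # Assumption: entries inside a CDATA block are separated by commas.
        logs[log.get("name")] = [
            entry.strip() for entry in raw.split(",") if entry.strip()
        ]
    return logs

if __name__ == "__main__":
    print(collect_logs(CONTENT))
```

The resulting dictionary keeps the raw entries grouped by log name, so they can be converted to numbers or dates later, once the actual timestamp and value formats are known.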
