Parsing a file with multiple XML documents in it

半阙折子戏 2021-01-15 11:43

Is there a way to parse a file that contains multiple XML documents?

For example, I have a file called stocks.xml, and it contains more than one XML document.

4 answers
  •  一整个雨季
    2021-01-15 12:25

    So you have a file containing multiple XML documents one after the other? Here is an example that strips out the XML declarations (the <?xml ... ?> lines) and wraps the data in a single root tag, so the whole file can be parsed as one XML document:

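    import re
    import lxml.etree

    # Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
    re_strip_pi = re.compile(rb'<\?xml [^?>]+\?>')
    # Read bytes so lxml can honor the encoding named in the declaration
    with open('stocks.xml', 'rb') as f:
        data = f.read()
    match = re_strip_pi.search(data)
    # Strip every declaration, then wrap everything in one root element
    # (the tag name "root" is arbitrary)
    data = b'<root>' + re_strip_pi.sub(b'', data) + b'</root>'
    # Restore the first declaration, if there was one, at the very start
    declaration = match.group() if match else b''
    tree = lxml.etree.fromstring(declaration + data)
    for prod in tree.xpath('//PRODUCT'):
        print(prod)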
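
If each document should keep its own root instead of sharing a synthetic one, a variation is to split the data at each XML declaration and parse the pieces separately. The following is only a minimal sketch using the standard library's xml.etree.ElementTree rather than lxml. The helper name parse_concatenated, the STOCKS root element, and the sample data are hypothetical. It assumes every document starts with its own declaration, that the text "<?xml " never appears inside comments or CDATA, and that Python 3.7 or later is used (needed for splitting on an empty match).

```python
import re
import xml.etree.ElementTree as ET

# Zero-width match just before each XML declaration, so the declaration
# stays attached to the document that follows it.
DECL_START = re.compile(rb'(?=<\?xml )')

def parse_concatenated(data: bytes):
    """Parse bytes holding several XML documents into a list of root elements.

    Hypothetical helper: assumes each document begins with a declaration.
    """
    chunks = (chunk.strip() for chunk in DECL_START.split(data))
    # Skip the empty chunk that precedes the first declaration.
    return [ET.fromstring(chunk) for chunk in chunks if chunk]

# Hypothetical sample standing in for the contents of stocks.xml.
sample = (
    b'<?xml version="1.0"?><STOCKS><PRODUCT>A</PRODUCT></STOCKS>\n'
    b'<?xml version="1.0"?><STOCKS><PRODUCT>B</PRODUCT></STOCKS>\n'
)

roots = parse_concatenated(sample)
names = [prod.text for root in roots for prod in root.iter('PRODUCT')]
print(names)
```

Because each piece is parsed with its declaration intact, documents with different declared encodings can coexist in one file, which the single-root approach cannot handle.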
    
