Preserving original doctype and declaration of an lxml.etree parsed xml

情话喂你 2021-01-01 18:28

I'm using Python's lxml to read an XML document, modify it, and write it back, but the original doctype and XML declaration disappear. I'm wondering if there…

2 Answers
  •  野趣味 (OP)
    2021-01-01 19:11

    You can also preserve DOCTYPE and the XML declaration with fromstring():

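    import sys
    from lxml import etree

    # Sample input: any document with an XML declaration and a DOCTYPE works;
    # an XHTML 1.0 Strict page is assumed here. It is a bytes literal because
    # lxml rejects str input that carries an encoding declaration.
    xml = b'''<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head>
      <title>example</title>
     </head>
     <body>
      <p>This is an example</p>
     </body>
    </html>'''

    tree = etree.fromstring(xml).getroottree()  # or etree.parse(file)
    # lxml serializes to bytes, so write to the binary stdout buffer
    tree.write(sys.stdout.buffer, xml_declaration=True, encoding=tree.docinfo.encoding)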

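    Output:

    <?xml version='1.0' encoding='UTF-8'?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      <title>example</title>
     </head>
     <body>
      <p>This is an example</p>
     </body>
    </html>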

    Note that the XML declaration (with the correct encoding) and the doctype are both present. The output even (possibly incorrectly) uses ' instead of " in the XML declaration and adds a Content-Type meta element to the <head>.
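
The declaration and doctype survive because lxml records them on the tree's docinfo object while parsing. A minimal sketch for inspecting what was captured before writing anything back; the SAMPLE document, the "note.dtd" system identifier, and the describe_prolog() helper are hypothetical names used only for illustration:

```python
from lxml import etree

# Hypothetical minimal input. Bytes are used because lxml rejects str input
# that carries an encoding declaration.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE note SYSTEM "note.dtd">\n'
    b'<note>This is an example</note>'
)


def describe_prolog(xml_bytes):
    """Return the prolog details lxml recorded while parsing (illustrative helper)."""
    tree = etree.fromstring(xml_bytes).getroottree()
    info = tree.docinfo
    return {
        "xml_version": info.xml_version,  # version from the XML declaration
        "encoding": info.encoding,        # encoding from the XML declaration
        "doctype": info.doctype,          # the DOCTYPE as a string
        "system_url": info.system_url,    # system identifier of the DTD, if any
    }


prolog = describe_prolog(SAMPLE)
print(prolog)
```

If docinfo.doctype comes back empty here, the input had no DOCTYPE to preserve in the first place, which is worth ruling out before debugging the write step.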

    For @John Keyes' example input, this approach produces the same result as etree.tostring() in that answer.
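
For comparison, here is a sketch of the etree.tostring() route applied to the read, modify, write-back cycle from the question. It is not taken from the other answer; the SAMPLE document, its "note.dtd" identifier, and the round_trip() helper are assumptions made for illustration, and a BytesIO object stands in for a real file:

```python
from io import BytesIO

from lxml import etree

# Hypothetical input document standing in for a file on disk.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<!DOCTYPE note SYSTEM "note.dtd">\n'
    b'<note><body>draft</body></note>'
)


def round_trip(source):
    """Parse, modify one element, and serialize with the prolog kept (illustrative helper)."""
    tree = etree.parse(source)
    tree.getroot().find("body").text = "final"  # the modification step
    # Pass the whole tree (not just the root element) so the DOCTYPE is
    # written, and reuse the encoding recorded at parse time.
    return etree.tostring(
        tree,
        xml_declaration=True,
        encoding=tree.docinfo.encoding,
    )


result = round_trip(BytesIO(SAMPLE))
print(result.decode("utf-8"))
```

Serializing the tree rather than tree.getroot() is the design choice that matters here: the DOCTYPE belongs to the document, not to the root element.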
