Empty lines while using minidom.toprettyxml

末鹿安然 提交于 2019-11-29 11:31:48

问题


I've been using a minidom.toprettyxml for prettify my xml file. When I'm creating XML file and using this method, all works grate, but if I use it after I've modified the xml file (for examp I've added an additional nodes) and then I'm writing it back to XML, I'm getting empty lines, each time I'm updating it, I'm getting more and more empty lines...

my code :

file.write(prettify(xmlRoot))


def prettify(elem):
    rough_string = xml.tostring(elem, 'utf-8') //xml as ElementTree
    reparsed = mini.parseString(rough_string) //mini as minidom
    return reparsed.toprettyxml(indent=" ")

and the result :

<?xml version="1.0" ?>
<testsuite errors="0" failures="3" name="TestSet_2013-01-23 14_28_00.510935" skip="0"     tests="3" time="142.695" timestamp="2013-01-23 14:28:00.515460">




    <testcase classname="TC test" name="t1" status="Failed" time="27.013"/>




    <testcase classname="TC test" name="t2" status="Failed" time="78.325"/>


    <testcase classname="TC test" name="t3" status="Failed" time="37.357"/>
</testsuite>

any suggestions ?

thanks.


回答1:


I found a solution here: http://code.activestate.com/recipes/576750-pretty-print-xml/

Then I modified it to take a string instead of a file.

from xml.dom.minidom import parseString

pretty_print = lambda data: '\n'.join([line for line in parseString(data).toprettyxml(indent=' '*2).split('\n') if line.strip()])

Output:

<?xml version="1.0" ?>
<testsuite errors="0" failures="3" name="TestSet_2013-01-23 14_28_00.510935" skip="0" tests="3" time="142.695" timestamp="2013-01-23 14:28:00.515460">
  <testcase classname="TC test" name="t1" status="Failed" time="27.013"/>
  <testcase classname="TC test" name="t2" status="Failed" time="78.325"/>
  <testcase classname="TC test" name="t3" status="Failed" time="37.357"/>
</testsuite>

This may help you work it into your function a little be easier:

def new_prettify():
    reparsed = parseString(CONTENT)
    print '\n'.join([line for line in reparsed.toprettyxml(indent=' '*2).split('\n') if line.strip()])



回答2:


I found an easy solution for this problem, just with changing the last line of you prettify() so it will be:

def prettify(elem):
rough_string = xml.tostring(elem, 'utf-8') //xml as ElementTree
reparsed = mini.parseString(rough_string) //mini as minidom
return reparsed.toprettyxml(indent=" ", newl='')



回答3:


use this to resolve problem with the lines

toprettyxml(indent=' ', newl='\r', encoding="utf-8")




回答4:


I am having the same issue with Python 2.7 (32b) in a Windows 10 machine. The issue seems to be that when python parses an XML text to an ElementTree object, it adds some annoying line feeds to either the "text" or "tail" attributes of each element.

This script removes such line break characters:

def removeAnnoyingLines(elem):
    hasWords = re.compile("\\w")
    for element in elem.iter():
        if not re.search(hasWords,str(element.tail)):
            element.tail=""
        if not re.search(hasWords,str(element.text)):
            element.text = ""

Use this function before "pretty-printing" your tree:

removeAnnoyingLines(element)
myXml = xml.dom.minidom.parseString(xml.etree.ElementTree.tostring(element))
print myXml.toprettyxml()

It worked for me. I hope it works for you!



来源:https://stackoverflow.com/questions/14479656/empty-lines-while-using-minidom-toprettyxml

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!