Memory leak parsing XML using xml.dom.minidom

我们两清 posted on 2019-12-01 11:06:09

Question


I'm using xml.dom.minidom to parse XML files, somewhat like this:

import xml.dom.minidom as dom

# Close the file handle as soon as parsing is done
with open('file.xml') as file:
    doc = dom.parse(file)
# SNIP
doc.unlink()  # break the DOM's reference cycles

Even after unlinking the document, memory usage stays at about 120 MiB. In actual use, when the program parses multiple XML files, memory usage climbs to about 300 MiB, which is unacceptable.

I'm sure the memory leak isn't caused by my code, but by minidom, because even doing just

doc = dom.parse(file)
doc.unlink()

produces the same result.
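One way to narrow this down is to check whether Python itself still holds the DOM objects after `unlink()`. The following is a minimal sketch, assuming Python 3.4+ for the standard `tracemalloc` module; the helper name `measure_parse` and the sample document are hypothetical and not part of the original program. Note that `tracemalloc` only sees allocations made through Python's allocator, so it cannot by itself explain the figure the operating system reports for the process.

```python
import gc
import tracemalloc
import xml.dom.minidom as dom


def measure_parse(xml_text):
    """Return (bytes traced after parsing, bytes traced after cleanup).

    Hypothetical helper for illustration only.
    """
    tracemalloc.start()
    doc = dom.parseString(xml_text)
    after_parse, _peak = tracemalloc.get_traced_memory()

    doc.unlink()   # break reference cycles inside the DOM tree
    del doc        # drop the last reference to the document
    gc.collect()   # collect anything still caught in a cycle
    after_cleanup, _peak = tracemalloc.get_traced_memory()

    tracemalloc.stop()
    return after_parse, after_cleanup


# Synthetic document standing in for 'file.xml' (an assumption)
sample = "<root>" + "<item id='1'>text</item>" * 10000 + "</root>"
after_parse, after_cleanup = measure_parse(sample)
print("traced after parse:  ", after_parse)
print("traced after cleanup:", after_cleanup)
```

If the second number drops close to zero, the DOM objects are being freed at the Python level, and the remaining process footprint would more likely be memory that the allocator has not handed back to the operating system rather than live objects.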

Am I doing something wrong, or is this a bug in minidom?

P.S.: I'd prefer to stick to minidom, because there's a lot of XML parsing happening in my code, and I'd rather not rewrite all of it, but I will if there's no other choice.


Answer 1:


I am also observing the same issues with minidom! And we are not alone. See for example here.

The suggestion there is to use another XML implementation with Python bindings, such as:

  • xml.etree.ElementTree: alternative implementation in the Python standard library
  • libxml2: XML C parser with Python bindings
  • lxml: a more Pythonic binding to libxml2
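As an illustration of the first option, here is a minimal sketch using `xml.etree.ElementTree.iterparse`, which processes a document incrementally instead of building the whole tree up front. The function name `count_items` and the `item` tag are hypothetical; the original post does not show what the parsing code does with each document, so this only demonstrates the pattern.

```python
import io
import xml.etree.ElementTree as ET


def count_items(source, tag="item"):
    """Count elements named `tag`, clearing each one after it is handled.

    `source` may be a filename or a binary file object.
    Hypothetical helper for illustration only.
    """
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's text, attributes, and children so they
            # do not accumulate while the rest of the file is parsed.
            elem.clear()
    return count


# Synthetic input standing in for a real file (an assumption)
sample = io.BytesIO(b"<root>" + b"<item>text</item>" * 1000 + b"</root>")
print(count_items(sample))  # prints 1000
```

The cleared elements are still referenced by their parent until parsing finishes, so this reduces memory held per element rather than eliminating it. Whether it is enough for the 120 MiB case described in the question would need to be measured.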


Source: https://stackoverflow.com/questions/26787026/memory-leak-parsing-xml-using-xml-dom-minidom
