Why is elementtree.ElementTree.iterparse using so much memory?
I am using elementtree.ElementTree.iterparse to parse a large (371 MB) xml file. My code is basically this: outf = open('out.txt', 'w') context = iterparse('copyright.xml') context = iter(context) dummy, root = context.next() for event, elem in context: if elem.tag == 'foo': author = elem.text elif elem.tag == 'bar': if elem.text is not None and 'bat' in elem.text.lower(): outf.write(elem.text + '\n') elem.clear() #line A root.clear() #line B My question is two-fold: First - Do I need both A and B (see code snippet comments)? I was told that root.clear() clears unnecessary children so memory