XML parsing in Python for big data

陌清茗 2021-01-24 07:27

I am trying to parse an XML file using Python. The problem is that the file is around 30 GB, so it's taking hours to execute:

tree = ET.parse('Pos


        
1 Answer
  • 2021-01-24 08:13

    You'll want an XML parsing mechanism that doesn't load everything into memory.

    You can use ElementTree.iterparse, or you could use SAX (the xml.sax module in the standard library).
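    As a rough illustration of the SAX route, here is a minimal sketch that counts elements as they stream past, without ever building a tree. The tag name "record", the RecordCounter class, and the in-memory sample document are all made up for the example; for a real file you would pass the file path (or an open file object) to xml.sax.parse instead.

```python
import io
import xml.sax


class RecordCounter(xml.sax.ContentHandler):
    """Counts elements with a given tag; 'record' is a hypothetical tag name."""

    def __init__(self, tag="record"):
        super().__init__()
        self.tag = tag
        self.count = 0

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing is kept in memory afterwards.
        if name == self.tag:
            self.count += 1


# Tiny in-memory stand-in for the real file (an assumption for this sketch).
sample = io.BytesIO(b"<root><record/><record/></root>")
handler = RecordCounter()
xml.sax.parse(sample, handler)
print(handler.count)
```

    With SAX you only get callbacks, so any state you need (current element, accumulated text) has to be tracked in the handler yourself.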

    Here is a page with some XML processing tutorials for Python.

    UPDATE: As @marbu said in the comment, if you use ElementTree.iterparse, be sure to clear elements from memory once you've finished processing them. Otherwise the tree keeps growing and you gain nothing over ET.parse.
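    One common way to do that clearing is sketched below. It assumes the records of interest are direct children of the root element; the tag name "record", the count_records helper, and the in-memory sample are hypothetical and only there to make the snippet runnable.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Stream through `source`, handling each <tag> element as soon as it is complete."""
    count = 0
    # Request "start" events as well, so the very first event hands us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1    # do the real per-record processing here
            root.clear()  # drop already-processed children from the root
    return count


# Tiny in-memory stand-in for the real file (an assumption for this sketch).
sample = io.BytesIO(b"<root><record id='1'/><record id='2'/><record id='3'/></root>")
print(count_records(sample))
```

    Calling elem.clear() alone empties each record but leaves the empty element attached to the root, which still adds up over millions of records; clearing the root removes those references too.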
