How can I get the text between tags using the Python SAX parser?

南方客 2021-01-18 05:37

What I need is just to get the text of the corresponding tag and persist it into a database. Since the XML file is big (4.5 GB), I'm using SAX. I used the characters meth

1 answer
  •  走了就别回头了
    2021-01-18 06:15

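    The SAX processor delivers the text inside a tag in chunks, so characters() might be called multiple times for a single element. You have to accumulate the chunks and read the full text only once the element ends.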

    You need to do something like:

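    import xml.sax

    class TextHandler(xml.sax.ContentHandler):
        def __init__(self):
            super().__init__()
            self.map = {}    # tag name -> text accumulated so far
            self.tag = None  # element currently being read

        def startElement(self, name, attrs):
            self.map[name] = ''  # start a fresh buffer for this tag
            self.tag = name

        def characters(self, content):
            if self.tag is not None:  # skip text that follows a closed element
                self.map[self.tag] += content  # may run several times per element

        def endElement(self, name):
            print(self.map[name])  # the complete text is available only here
            self.tag = None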
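
For nested elements, a variant that keeps one buffer per open element may be easier to reason about. The following is only a sketch under assumptions: the class name TextCollector, the on_text callback, and the collect helper are hypothetical names introduced for illustration, not part of xml.sax. The callback is where a database insert could go.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Joins the chunks SAX delivers so each element's text is seen once, whole."""

    def __init__(self, on_text):
        super().__init__()
        self.on_text = on_text  # hypothetical callback: on_text(tag_name, text)
        self.stack = []         # one list of chunks per currently open element

    def startElement(self, name, attrs):
        self.stack.append([])

    def characters(self, content):
        # May be called several times for one element; just store the chunk.
        if self.stack:
            self.stack[-1].append(content)

    def endElement(self, name):
        text = ''.join(self.stack.pop()).strip()
        if text:  # skip elements that hold only whitespace or child elements
            self.on_text(name, text)


def collect(xml_bytes):
    """Hypothetical helper: parse a small document and return (tag, text) pairs."""
    rows = []
    handler = TextCollector(lambda tag, text: rows.append((tag, text)))
    xml.sax.parseString(xml_bytes, handler)
    return rows
```

For a file of several gigabytes, the same handler can be passed to xml.sax.parse(filename_or_stream, handler), which reads the input incrementally instead of loading it all into memory.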
    
