lxml.etree, element.text doesn't return the entire text from an element

梦毁少年i 2021-02-07 10:39

I scraped some HTML via XPath, which I then converted into an etree. Something similar to this:

 text1  link  text2 
        
8 answers
  •  梦毁少年i
    2021-02-07 10:56

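    This looked like an lxml bug to me, but it is by design according to the documentation: element.text holds only the text before the first child element, and any text that follows a child is stored in that child's tail. I solved it like this: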

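    def node_text(node):
        """Return the node's own text plus the tails of its direct children."""
        # node.text is the text before the first child (None if absent).
        result = node.text or ''
        for child in node:
            # child.tail is the text that follows the child's closing tag.
            if child.tail is not None:
                result += child.tail
        return result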
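
For illustration, here is a minimal sketch of how the text ends up split between text and tail. The markup and its tag names (div, a) are assumptions, since the original snippet did not survive, and the fallback to the standard library's xml.etree.ElementTree (which uses the same text/tail model) is only there so the sketch runs without lxml installed. If the child's own text is wanted as well, itertext() may be a simpler option than walking the tails by hand.

```python
try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: stdlib fallback, same text/tail model as lxml.etree
    import xml.etree.ElementTree as etree

# Hypothetical markup; the tag names are made up for this example.
root = etree.fromstring("<div>text1 <a>link</a> text2</div>")

print(repr(root.text))     # text before the first child
print(repr(root[0].text))  # text inside the child element
print(repr(root[0].tail))  # text after the child, stored on the child

# itertext() walks text and tail of every descendant in document order.
full_text = "".join(root.itertext())
print(repr(full_text))
```

Note that node_text above skips the child's own text ("link" here), whereas itertext() includes it, so which one fits depends on what you need.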
    
