How to remove content in nested tags with BeautifulSoup?

青春惊慌失措 2021-01-20 15:34

How to remove content in nested tags with BeautifulSoup? These posts show the reverse, how to retrieve the content in nested tags: How to get contents of nested ta

3 Answers
  •  一个人的身影
    2021-01-20 16:02

    You can check for bs4.element.NavigableString on children:

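    import bs4
    from bs4 import BeautifulSoup as bs

    # <foo> holds direct text plus two nested tags (nested tag names are illustrative)
    html = "<foo>Something something <bar> blah blah</bar> something <baz>GONE!</baz> else</foo>"

    def get_only_text(elem):
        # Yield only the strings that are direct children of elem, skipping nested tags
        for item in elem.children:
            if isinstance(item, bs4.element.NavigableString):
                yield item

    print(''.join(get_only_text(bs(html, "html.parser").find_all('foo')[0])))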
    

    Output:

    Something something  something  else
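
If the nested tags should actually be removed from the tree rather than skipped while reading, one possible variant is to call `decompose()` on each child tag. This is only a sketch: the helper name `strip_nested_tags` and the sample markup (apart from the `foo` wrapper) are assumptions made for illustration, not part of the answer above.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical sample markup; only the <foo> wrapper is taken from the answer.
html = "<foo>Something something <bar> blah blah</bar> something <baz>GONE!</baz> else</foo>"

def strip_nested_tags(elem):
    """Remove every direct child tag of elem in place, keeping its own text."""
    # Copy the children into a list first, because decompose() mutates the tree.
    for child in list(elem.children):
        if isinstance(child, Tag):
            child.decompose()
    return elem

soup = BeautifulSoup(html, "html.parser")
foo = strip_nested_tags(soup.find("foo"))
text = foo.get_text()
print(text)
```

Unlike the generator approach, this modifies the parsed document, so later calls such as `foo.get_text()` or `str(foo)` no longer see the nested tags.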
    
