Finding next occurring tag and its enclosed text with Beautiful Soup

太阳男子 2021-01-08 00:54

I'm trying to parse the text between <blockquote> tags. When I type soup.blockquote.get_text(), I get the result I want for the first...

1 Answer
  • 2021-01-08 01:40

    Use find_next_sibling. (If the next tag is not a sibling, use find_next instead.)

    >>> html = '''
    ... <html>
    ... <head>header
    ... </head>
    ... <blockquote>blah blah
    ... </blockquote>
    ... <p>eiaoiefj</p>
    ... <blockquote>capture this next
    ... </blockquote>
    ... <p></p><strong>don'tcapturethis</strong>
    ... <blockquote>
    ... capture this too but separately after "capture this next"
    ... </blockquote>
    ... </html>
    ... '''
    
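    >>> from bs4 import BeautifulSoup
    >>> soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
    >>> quote1 = soup.blockquote  # first <blockquote> in the document
    >>> quote1.text
    'blah blah\n'
    >>> # find_next_sibling (singular) returns one tag;
    >>> # find_next_siblings (plural) returns a list, which has no .text
    >>> quote2 = quote1.find_next_sibling('blockquote')
    >>> quote2.text
    'capture this next\n'
    >>> quote3 = quote2.find_next_sibling('blockquote')
    >>> quote3.text
    '\ncapture this too but separately after "capture this next"\n'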
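
For the case mentioned above where the next tag is not a sibling, here is a minimal sketch of the find_next fallback. The HTML snippet and the variable names (first, no_sibling, quotes, node) are made up for illustration and are not from the question; it assumes each blockquote sits inside its own wrapper element, so find_next_sibling finds nothing.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: every <blockquote> has a different parent,
# so no blockquote is a sibling of another.
html = """
<html><body>
<div><blockquote>first</blockquote></div>
<p>filler</p>
<div><blockquote>second</blockquote></div>
<section><div><blockquote>third</blockquote></div></section>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
first = soup.blockquote

# find_next_sibling only looks at tags sharing the same parent.
no_sibling = first.find_next_sibling("blockquote")  # None here

# find_next searches everything that follows in document order,
# regardless of nesting, so it can be used to walk from one
# blockquote to the next and collect each text separately.
quotes = []
node = first
while node is not None:
    quotes.append(node.get_text(strip=True))
    node = node.find_next("blockquote")

print(quotes)
```

If all the later matches are needed at once, first.find_all_next("blockquote") should return them as a list in the same document order, and soup.find_all("blockquote") returns every match including the first.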
    