I'm trying to parse the text between <blockquote> tags. When I type
soup.blockquote.get_text()
I get the result I want for the fir
Use find_next_sibling (if the next blockquote is not a sibling, use find_next instead):
>>> html = '''
... <html>
... <head>header
... </head>
... <blockquote>blah blah
... </blockquote>
... <p>eiaoiefj</p>
... <blockquote>capture this next
... </blockquote>
... <p></p><strong>don'tcapturethis</strong>
... <blockquote>
... capture this too but separately after "capture this next"
... </blockquote>
... </html>
... '''
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(html, 'html.parser')
>>> quote1 = soup.blockquote
>>> quote1.text
u'blah blah\n'
>>> quote2 = quote1.find_next_sibling('blockquote')
>>> quote2.text
u'capture this next\n'