I'm trying to parse a website to pull out some data that is stored in the body, such as this:
INFORMATION
Hookups: No
BeautifulSoup's find_all searches for tags by default. For more structured pages you can call find_all and then read the .text of each result, but if the HTML is this simple, a plain regex is enough to get what you need:
import re
re.findall(r"Hookups: (.*)", open('doc.html').read())
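As a rough sketch of the regex approach, the snippet below runs against an inline string instead of doc.html. The sample markup is an assumption modeled on the question, and the pattern is slightly tighter than `(.*)` so that it stops at the next tag or line break.

```python
import re

# Hypothetical sample standing in for the contents of doc.html.
html = """<body>
INFORMATION
Hookups: No
</body>"""

# Capture everything after "Hookups:" up to the next tag or newline.
values = [v.strip() for v in re.findall(r"Hookups:\s*([^<\n]*)", html)]
print(values)
```

This only holds up while the label and its value sit on the same line of the raw HTML; once the markup gets more involved, a parser is the safer choice.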
You can also search by text content instead of tag name by passing the text argument to find_all, as of BeautifulSoup 4.2:
soup.find_all(text=re.compile("Hookups:(.*)Group"))
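A minimal sketch of the text-search approach, assuming the value sits in the same text node as the label. The sample HTML and the pattern are made up for illustration. It uses `string=`, which is the newer name for the `text` argument in Beautiful Soup 4.4.0 and later; on older versions, pass `text=` instead.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the question; the real page may differ.
html = "<html><body><p>INFORMATION</p><p>Hookups: No</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

pattern = re.compile(r"Hookups:\s*(.*)")

# With a compiled regex, find_all returns the text nodes that match it.
nodes = soup.find_all(string=pattern)

# Re-apply the pattern to each node to pull out just the captured value.
values = [pattern.search(node).group(1).strip() for node in nodes]
print(values)
```

Note that find_all returns the whole matching text node, not the regex group, which is why the pattern is applied a second time to extract the value.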