Need to find text with RegEx and BeautifulSoup

轻奢々 2021-01-12 14:06

I'm trying to parse a website to pull out some data that is stored in the body, such as this:


    INFORMATION
    Hookups: No         


        
1 answer
  • 2021-01-12 14:43

    By default, BeautifulSoup's find_all matches tags, not bare text. If the HTML really is this simple, a plain regex is enough to get what you need. Otherwise, use find_all to locate the enclosing tags and then read their .text.

    import re
    re.findall(r"Hookups: (.*)", open('doc.html').read())
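
To make the regex-only route concrete, here is a minimal sketch. The `html` string is a made-up stand-in, since the question's actual markup is not shown, and the pattern is tightened slightly so that it stops at the next tag instead of swallowing the rest of the line.

```python
import re

# Hypothetical sample markup; the real page from the question is not shown.
html = """
<div>
  <b>INFORMATION</b><br>
  Hookups: No<br>
  Group: Family
</div>
"""

# [^<\n]* captures up to the next tag or the end of the line,
# so trailing markup such as <br> is not included in the value.
matches = [m.strip() for m in re.findall(r"Hookups:\s*([^<\n]*)", html)]
print(matches)
```

This only holds up while the label and its value sit on one line with no nested tags between them; for anything messier, a parser is the safer choice.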
    

    You can also search by text content with the text argument, as of BeautifulSoup 4.2:

    soup.find_all(text=re.compile("Hookups:(.*)Group"))
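
As a rough end-to-end sketch of the text search, the example below parses a made-up snippet (the question's real markup is not shown) and pulls out the value after the label. It uses `string=`, which is the name newer BeautifulSoup releases give to the `text` argument; on older versions, substitute `text=`.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page from the question is not shown.
html = "<div><b>INFORMATION</b><br>Hookups: No<br>Group: Family</div>"
soup = BeautifulSoup(html, "html.parser")

pattern = re.compile(r"Hookups:\s*(.*)")

values = []
# find_all(string=...) returns the matching text nodes, not tags.
for node in soup.find_all(string=pattern):
    # Each node is a str subclass, so the same pattern can extract the value.
    match = pattern.search(node)
    values.append(match.group(1).strip())

print(values)
```

The regex is tested against each text node separately, so a pattern that has to span several nodes (for example, text on both sides of a `<br>`) will not match this way.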
    