Parsing HTML using Python

爱一瞬间的悲伤 2020-11-22 00:35

I'm looking for an HTML parser module for Python that can give me the tags in the form of Python lists/dictionaries/objects.

If I have a document of the form:

7 Answers
  • 2020-11-22 01:22

    So that I can ask it for the content/text of the div tag with class='container' contained within the body tag, or something similar.

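    try:
        # BeautifulSoup 3 (legacy package, Python 2 only)
        from BeautifulSoup import BeautifulSoup
    except ImportError:
        # BeautifulSoup 4
        from bs4 import BeautifulSoup
    # Placeholder: substitute the HTML document from the question here.
    html = "..."
    # With bs4 you can also name a parser: BeautifulSoup(html, "html.parser")
    parsed_html = BeautifulSoup(html)
    # Text of the first <div class="container"> found inside <body>
    print(parsed_html.body.find('div', attrs={'class': 'container'}).text)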
    

    I don't think you need a description of its performance; just read how BeautifulSoup works in its official documentation.
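
    If installing BeautifulSoup is not an option, roughly the same lookup can be sketched with the standard library's html.parser module instead. This is a different technique from the snippet above, not how BeautifulSoup works internally. The SAMPLE_HTML document, the ContainerText class and the container_text helper are all hypothetical names made up for illustration, since the question's original HTML is not shown here.

```python
from html.parser import HTMLParser

# Stand-in document (an assumption; the question's own HTML is not shown).
SAMPLE_HTML = """
<html>
<body>
  <div class="container">
    <p>Hello</p>
    <div>World</div>
  </div>
  <div class="footer">Ignored</div>
</body>
</html>
"""


class ContainerText(HTMLParser):
    """Collect the text inside the first <div class="container">."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # div nesting depth inside the target; 0 = outside
        self.done = False   # True once the target div has been closed
        self.chunks = []    # non-empty text fragments found inside the target

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth:
            # A nested div inside the target: track it so we know when we leave.
            self.depth += 1
        elif "container" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def container_text(html):
    """Return the target div's text fragments joined by single spaces."""
    parser = ContainerText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


print(container_text(SAMPLE_HTML))
```

    This only tracks nested div tags and assumes reasonably well-formed markup; for messy real-world pages a dedicated library such as BeautifulSoup is usually the more forgiving choice.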
