Rendered HTML to plain text using Python

爱一瞬间的悲伤 2020-12-29 18:49

I'm trying to convert a chunk of HTML to plain text with BeautifulSoup. Here is an example:

Some text more text

2 answers
  •  时光说笑
    2020-12-29 19:16

    BeautifulSoup is a scraping library, so it's probably not the best choice for doing HTML rendering. If it's not essential to use BeautifulSoup, you should take a look at html2text. For example:

    import html2text

    # Read the HTML file, then render it as plain text (Python 3 syntax)
    with open("foobar.html") as f:
        html = f.read()
    print(html2text.html2text(html))
    

    This outputs:

    Some text more text even more text
    
      * list item
      * yet another list item
    
    Some other text
    
      * list item
      * yet another list item
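
If installing html2text is not an option, roughly the same idea can be sketched with only the standard library's `html.parser`. This is a minimal illustration, not how html2text works internally: the `TextRenderer` class, the `html_to_text` helper, the set of block-level tags, and the sample markup are all hypothetical names and inputs chosen for this example, and the sketch makes no attempt to skip `<script>` or `<style>` content.

```python
import re
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    """Hypothetical minimal renderer: block tags start new lines, <li> gets '* '."""

    # Assumed set of tags that should break the line; extend as needed.
    BLOCK_TAGS = {"p", "div", "ul", "ol", "li", "br", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")
        if tag == "li":
            self.parts.append("* ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Collapse whitespace runs (including source newlines) to one space.
        self.parts.append(re.sub(r"\s+", " ", data))

    def text(self):
        # Normalize spacing on each line and drop the empty ones.
        lines = (" ".join(line.split()) for line in "".join(self.parts).split("\n"))
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    parser = TextRenderer()
    parser.feed(html)
    parser.close()
    return parser.text()


# Sample markup invented for this sketch (the question's original HTML is not shown).
sample = (
    "<p>Some text <em>more text</em></p>"
    "<ul><li>list item</li><li>yet another list item</li></ul>"
)
print(html_to_text(sample))
# Some text more text
# * list item
# * yet another list item
```

For anything beyond simple markup (links, tables, nested lists, emphasis), html2text remains the more practical choice.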
    
