Question
I'm working on a project that involves converting a large amount of HTML content to plain text. I have a custom-written module that does the job OK, but I'm wondering if there are standard tools that would help get the job done.
Answer 1:
Html2Text seems to be a good option.
Answer 2:
Here's a Python library that does HTML parsing:
- lxml.html
BeautifulSoup is another option.
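As a rough sketch of the BeautifulSoup route, assuming the `bs4` package is installed: the function name `html_to_text` and the decision to drop `<script>` and `<style>` elements are my own choices for illustration, not something the answer specifies.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def html_to_text(html: str) -> str:
    """Extract visible text from an HTML string (hypothetical helper)."""
    # "html.parser" is the parser bundled with Python; "lxml" also works if installed.
    soup = BeautifulSoup(html, "html.parser")

    # Script and style contents are not visible text, so remove them first.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # Join the remaining text nodes with single spaces, trimming each one.
    return soup.get_text(separator=" ", strip=True)


if __name__ == "__main__":
    sample = "<html><body><p>Hello <b>world</b></p><script>alert(1)</script></body></html>"
    print(html_to_text(sample))
```

With lxml.html, a comparable result should be obtainable via `lxml.html.fromstring(html).text_content()`, though whitespace handling differs between the two libraries.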
Source: https://stackoverflow.com/questions/1668081/best-way-to-convert-html-to-plaintext-using-python