Strip HTML from strings in Python

难免孤独 2020-11-22 02:50
from mechanize import Browser

br = Browser()
br.open('http://somewebpage')
html = br.response().readlines()  # raw lines of the page, HTML tags included
for line in html:
    print(line)

When p

26 answers
  •  梦毁少年i
    2020-11-22 03:08

    Here's a solution similar to the currently accepted answer (https://stackoverflow.com/a/925630/95989), except that it uses the standard library's HTMLParser class directly (i.e. no subclassing), which makes it considerably shorter:

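    from html.parser import HTMLParser  # Python 3 location of HTMLParser

    def strip_html(text):
        parts = []
        parser = HTMLParser()
        parser.handle_data = parts.append  # collect every text node
        parser.feed(text)
        parser.close()  # flush any text still buffered by the parser
        return ''.join(parts)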
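
For illustration, here is a minimal, self-contained sketch of how the function above might be called. It assumes Python 3 (where HTMLParser lives in html.parser), and the sample string is a made-up input, not something from the question:

```python
from html.parser import HTMLParser


def strip_html(text):
    """Return only the text nodes of an HTML fragment, concatenated."""
    parts = []
    parser = HTMLParser()
    # Route every text node into the list instead of subclassing HTMLParser.
    parser.handle_data = parts.append
    parser.feed(text)
    parser.close()  # flush any text the parser may still be holding back
    return ''.join(parts)


sample = '<p>Hello, <b>world</b> &amp; friends</p>'  # hypothetical input
print(strip_html(sample))  # prints: Hello, world & friends
```

Note that handle_data also receives the contents of <script> and <style> elements, so this sketch keeps that text too. If that matters for your pages, a subclass that tracks the current tag is probably the simpler route.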
    
