Strip HTML from strings in Python

前端未结

关注

 26  2354

难免孤独 2020-11-22 02:50

from mechanize import Browser
br = Browser()
br.open(\'http://somewebpage\')
html = br.response().readlines()
for line in html:
  print line

When p

26条回答

梦毁少年i (楼主)

2020-11-22 03:08

Here's a solution similar to the currently accepted answer (https://stackoverflow.com/a/925630/95989), except that it uses the internal HTMLParser class directly (i.e. no subclassing), thereby making it significantly more terse:

def strip_html(text):
    parts = []                                                                      
    parser = HTMLParser()                                                           
    parser.handle_data = parts.append                                               
    parser.feed(text)                                                               
    return ''.join(parts)

0 讨论(0)

查看其它26个回答