Question
Hey, I'm working on a Python project that needs to search through a webpage for a specific piece of text. If it finds the text, it should print something out; if not, it should print an error message. I've already tried a few different modules, such as libxml, but I can't figure out how to do it.
Could anybody help?
Answer 1:
You could do something simple like:
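import urllib2  # Python 2 only; Python 3 moved this to urllib.request
import re

# Download the raw HTML of the page
html_content = urllib2.urlopen('http://www.domain.com').read()
# Collect every match of the pattern in the page source
matches = re.findall('regex of string to find', html_content)
if len(matches) == 0:
    print 'I did not find anything'
else:
    print 'My string is in the html'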
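On Python 3, where urllib2 no longer exists, the same idea might look roughly like the sketch below. The helper names fetch_html, page_contains and report are made up for illustration, and the code assumes UTF-8 whenever the server does not declare a charset.

```python
import re
import urllib.request


def fetch_html(url):
    """Download a page and decode it to text.

    Assumption: fall back to UTF-8 if the response names no charset.
    """
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def page_contains(html_content, pattern):
    """Return True if the regular expression matches anywhere in the page source."""
    return re.search(pattern, html_content) is not None


def report(url, pattern):
    """Print one of the two messages from the answer above."""
    if page_contains(fetch_html(url), pattern):
        print("My string is in the html")
    else:
        print("I did not find anything")
```

Calling report('http://www.domain.com', 'regex of string to find') would then print one of the two messages. If you are looking for a literal string rather than a pattern, pass it through re.escape() first, or simply test it with the in operator.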
Answer 2:
lxml is awesome: http://lxml.de/parsing.html
I use it regularly with XPath for extracting data from HTML.
The other option is http://www.crummy.com/software/BeautifulSoup/ which is great as well.
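To show why a parser can beat a regex over the raw source, here is a rough sketch that searches only the text a reader would see, so tags and script contents don't produce false hits. It deliberately uses the standard library's html.parser as a stand-in rather than lxml or BeautifulSoup, so it runs without installing anything. The names TextCollector and visible_text_contains are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text_contains(html_content, needle):
    """Return True if needle occurs in the page's text, ignoring markup."""
    parser = TextCollector()
    parser.feed(html_content)
    parser.close()  # flush any text still buffered by the parser
    return needle in "".join(parser.chunks)
```

Because the text nodes are joined before searching, a phrase split across inline tags, such as Hello <b>world</b>, still matches "Hello world". lxml and BeautifulSoup give you the same kind of text access, plus far more forgiving handling of broken markup.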
Source: https://stackoverflow.com/questions/4925966/searching-through-webpage