Question
Hi, I am working on a school project that involves scraping data from an HTML page.
However, I get None back when I look for tables. Here is the segment that has the issue.
If you need more info, I'd be happy to provide it.
from bs4 import BeautifulSoup
import urllib2
import datetime
#This section determines the date of the next Saturday which will go onto the end of the URL
d = datetime.date.today()
while d.weekday() != 5:
    d += datetime.timedelta(1)
#temporary logic for testing when next webpage isn't out
d = "2013-06-01"
#Section that scrapes the data off the webpage
url = "http://www.sydgram.nsw.edu.au/co-curricular/sport/fixtures/" + str(d) + ".php"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
print soup
#Section that grabs the table with stuff in it
table = soup.find('table', {"class": "excel1"})
print table
Answer 1:
BeautifulSoup is expecting a string of HTML. What you provide is the response object returned by urllib2.urlopen.
Fetch the HTML from the response:
html = page.read()
Then hand html over to BeautifulSoup, either through that variable or by passing page.read() directly.
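The read-then-parse step can be sketched as follows, written so that it runs without network access. This is only an illustration under stated assumptions: an in-memory io.BytesIO object stands in for the response returned by urllib2.urlopen(url), the sample markup is made up rather than taken from the real page, and the helper name find_fixture_table is hypothetical. Passing "html.parser" explicitly is also an addition; it selects Python's built-in parser.

```python
import io

from bs4 import BeautifulSoup


def find_fixture_table(response):
    """Read the HTML out of a response-like object and return the first
    <table class="excel1"> element, or None if there is no such table.

    find_fixture_table is a hypothetical helper name used for illustration.
    """
    html = response.read()  # the page markup, read out of the response
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("table", {"class": "excel1"})


# Assumption: this made-up markup stands in for the real fixtures page,
# and io.BytesIO stands in for page = urllib2.urlopen(url).
sample = (
    b'<html><body><table class="excel1">'
    b"<tr><td>Rugby</td></tr></table></body></html>"
)
page = io.BytesIO(sample)

table = find_fixture_table(page)
if table is None:
    print("No table with class excel1 was found")
else:
    print(table.td.get_text())
```

If the helper still returns None on the real page, printing the first few hundred characters of html is a quick way to check whether the expected table is present in what the server actually sent.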
In addition, it would be advisable to read the following two pieces of documentation:
urllib2 documentation
BeautifulSoup documentation
Source: https://stackoverflow.com/questions/16917124/beautiful-soup-returning-nothing