Question
I've tried many times to retrieve the table at this website: http://www.whoscored.com/Players/845/History/Tomas-Rosicky (the one under "Historical Participations")
import urllib2
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib2.urlopen('http://www.whoscored.com/Players/845/').read())
This is the Python code I am using to retrieve the table's HTML, but I am getting an empty string. Help me out!
Answer 1:
The desired table is built from an asynchronous API call to the http://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics endpoint, which returns a JSON response. In other words, urllib2 gives you only the initial HTML content of the page, without the "dynamic" part. urllib2 is not a browser, so it never runs the JavaScript that fills in the table.
You can study the request using the browser developer tools.
Now, you need to simulate this request in your code. The requests package is something you should consider using.
Here is a similar question about whoscored.com that I've answered before; it contains sample working code you can use as a starting point:
- XHR request URL says does not exist when attempting to parse it's content
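As a rough sketch of what simulating the request could look like with requests: the endpoint URL is the one mentioned above, but the query parameter (`playerId`) and the extra headers are assumptions for illustration only. Copy the real parameters and headers from the request you see in the developer tools, since the site may reject anything that does not look like its own XHR call.

```python
import requests

# Endpoint named in the answer above.
FEED_URL = "http://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics"


def build_feed_request(player_id, referer):
    """Prepare (but do not send) a GET request that mimics the page's XHR call."""
    # ASSUMPTION: "playerId" is a hypothetical parameter name; replace it with
    # the query string shown in the browser developer tools.
    params = {"playerId": player_id}
    # ASSUMPTION: these headers are a guess at what an XHR call typically sends.
    headers = {
        "X-Requested-With": "XMLHttpRequest",
        "Referer": referer,
        "User-Agent": "Mozilla/5.0",
    }
    return requests.Request("GET", FEED_URL, params=params, headers=headers).prepare()


def fetch_feed(prepared, timeout=10):
    """Send the prepared request and decode the JSON body."""
    with requests.Session() as session:
        response = session.send(prepared, timeout=timeout)
        response.raise_for_status()  # fail loudly on 4xx/5xx responses
        return response.json()
```

To try it, call `fetch_feed(build_feed_request(845, "http://www.whoscored.com/Players/845/History/Tomas-Rosicky"))` and inspect the returned JSON. Keeping the build step separate from the send step lets you print `prepared.url` and `prepared.headers` and compare them against the browser's request before hitting the site.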
Source: https://stackoverflow.com/questions/29375475/extracting-a-table-from-a-website