Extracting a table from a website

Posted by 假如想象 on 2019-12-24 07:24:42

Question


I've tried many times to retrieve the table on this page: http://www.whoscored.com/Players/845/History/Tomas-Rosicky (the one under "Historical Participations").

import urllib2 
from bs4 import BeautifulSoup 
soup = BeautifulSoup(urllib2.urlopen('http://www.whoscored.com/Players/845/').read())

This is the Python code I am using to retrieve the table's HTML, but I am getting an empty string. Help me out!


Answer 1:


The desired table is built from an asynchronous API call to the http://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics endpoint, which returns a JSON response. urllib2 only gives you the initial HTML of the page, without the "dynamic" part that the browser fills in afterwards. In other words, urllib2 is not a browser.

You can study the request using your browser's developer tools.

Now you need to simulate this request in your code. The requests package is something you should consider using.
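As a rough sketch of what simulating the request could look like with requests: the endpoint URL is the one named above, but everything else here is an assumption. The helper names build_feed_request and fetch_feed are made up for illustration, the playerId query parameter is hypothetical, and the headers are only a guess at what an XHR from the page might send. Copy the real query string and headers from the request shown in the developer tools.

```python
import requests

# Endpoint named in the answer above.
FEED_URL = "http://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics"


def build_feed_request(params, referer):
    """Prepare (but do not send) a GET request that mimics the page's XHR.

    `params` should be the query parameters copied from the developer tools.
    The headers below are assumptions about what the site may expect.
    """
    headers = {
        "X-Requested-With": "XMLHttpRequest",  # marks the request as an XHR
        "Referer": referer,                    # the player page that issues the call
        "User-Agent": "Mozilla/5.0",           # placeholder browser-like agent
    }
    request = requests.Request("GET", FEED_URL, params=params, headers=headers)
    return request.prepare()


def fetch_feed(prepared, timeout=10):
    """Send the prepared request and decode the JSON body."""
    with requests.Session() as session:
        response = session.send(prepared, timeout=timeout)
        response.raise_for_status()  # fail loudly on 4xx/5xx responses
        return response.json()


# Hypothetical parameter name; replace it with the real query string.
prepared = build_feed_request(
    params={"playerId": 845},
    referer="http://www.whoscored.com/Players/845/History/Tomas-Rosicky",
)
print(prepared.url)
```

Calling fetch_feed(prepared) would then send the request and return the decoded JSON, provided the site accepts these parameters and headers. The table rows would have to be read from that JSON rather than from the page HTML.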

Here is a similar question about whoscored.com that I've answered before; it includes working sample code you can use as a starting point:

  • XHR request URL says does not exist when attempting to parse it's content


Source: https://stackoverflow.com/questions/29375475/extracting-a-table-from-a-website
