Page scraping to get prices from Google Finance

瘦欲@ submitted on 2019-12-03 09:12:11

The question is quite old, and the selected answer is no longer valid.
The API it relied on has been deprecated.

There is an open-source project that scrapes all companies from Google Finance and matches them with their current price at
The project solves most of the issues: it includes caching and IP management, and it runs stably without getting blocked.
It uses the internal Google Finance company-matching API to scrape companies and the chart API to get prices. It is PHP code, not Python, but you can still study how it solves these tasks and adapt the approach.

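To get around much of the rate limiting or bot detection used by sites such as Google, Wikipedia, or Yahoo, spoof your User-Agent header.

This makes your script's requests appear to come from Google Chrome. The version string below is an old one (Chrome 11), so substitute a current browser's string if the site rejects it.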

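import urllib2  # Python 2 only; Python 3 moved this into urllib.request

# Send a browser-like User-Agent header with the request.
headers = {'User-Agent': "Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.16 Safari/534.24"}
req = urllib2.Request(url, None, headers)  # url is the page you want to fetch
content = urllib2.urlopen(req).read()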
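
On Python 3, where urllib2 no longer exists, the same idea can be sketched with the standard-library urllib.request module. The helper names build_request and fetch are made up for this illustration, and the User-Agent string is simply the one from the snippet above; whether a given site accepts it is an assumption you would need to check.

```python
import urllib.request

# Same User-Agent string as in the Python 2 snippet; swap in a current one as needed.
USER_AGENT = ("Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/534.24 "
              "(KHTML, like Gecko) Chrome/11.0.696.16 Safari/534.24")


def build_request(url):
    """Return a Request object that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, timeout=10):
    """Download the page body as bytes (hypothetical helper, performs network I/O)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

Calling fetch(url) returns the raw bytes of the page; decode them yourself, since the encoding depends on the site.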

Yahoo Finance is also a good source of financial information, and it covers more countries and stocks.

For Python 2, you can use ystockquote. For Python 3, you can use yfq, which I rewrote from ystockquote.

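To get current quotes for several symbols at once, join them with '+'. The example uses GOOG and INTL (note that Intel itself trades under INTC, so use that symbol if Intel is what you want):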

>>> import yfq
>>> yfq.get_price('GOOG+INTL')
{'GOOG': '600.25', 'INTL': '22.25'}

To get historical quotes for Yahoo from March 1, 2012 to March 3, 2012:

>>> yfq.get_historical_prices('YHOO','20120301','20120303')
[['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], ['2012-03-02', '14.89', '14.92', '14.66', '14.72', '9164900', '14.72'], ['2012-03-01', '14.89', '14.96', '14.79', '14.93', '12283300', '14.93']]
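
The result is a list whose first row is the header and whose values are all strings. A minimal sketch of turning it into typed records follows; the rows are copied from the output above, and to_records is a hypothetical helper, not part of yfq.

```python
# Output of the historical query shown above: header row first, then data rows.
rows = [
    ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],
    ['2012-03-02', '14.89', '14.92', '14.66', '14.72', '9164900', '14.72'],
    ['2012-03-01', '14.89', '14.96', '14.79', '14.93', '12283300', '14.93'],
]


def to_records(rows):
    """Convert header + string rows into a list of dicts with numeric values."""
    header, *data = rows
    records = []
    for row in data:
        record = dict(zip(header, row))
        for key in header:
            if key == 'Date':
                continue  # keep the date as an ISO-formatted string
            # Volume is a share count; every other column is a price.
            record[key] = int(record[key]) if key == 'Volume' else float(record[key])
        records.append(record)
    return records


records = to_records(rows)
```

Each record can then be used directly in arithmetic, for example records[0]['High'] - records[0]['Low'] for the day's trading range.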