I'm trying to scrape data from the public site asx.com.au
The page http://www.asx.com.au/asx/research/company.do#!/ACB/details contains a div
with clas
This page uses JavaScript to read data from the server and fill in the page.
I see you use the developer tools in Chrome; look in the Network tab at the XHR or JS requests.
There I found this URL:
http://data.asx.com.au/data/1/company/ACB?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices&callback=angular.callbacks._0
This URL returns all the data in almost-JSON format (it is JSONP: the JSON is wrapped in a call to the callback function).
But if you request this link without &callback=angular.callbacks._0
then you get the data as pure JSON, and you can use the json
module to convert it to a Python dictionary.
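Removing the parameter by hand works fine for a single link. If you collect several such URLs from the Network tab, a small helper can do it for you. This is only a sketch for Python 3: `drop_query_param` is a name I made up, and it assumes the parameter is always called `callback`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_query_param(url, name="callback"):
    """Return url with every query parameter called `name` removed."""
    parts = urlsplit(url)
    # Keep every (key, value) pair except the one we want to drop.
    pairs = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != name]
    # safe="," keeps the commas in the `fields` value unescaped.
    return urlunsplit(parts._replace(query=urlencode(pairs, safe=",")))


jsonp_url = ("http://data.asx.com.au/data/1/company/ACB"
             "?fields=primary_share,latest_annual_reports,last_dividend,"
             "primary_share.indices&callback=angular.callbacks._0")

json_url = drop_query_param(jsonp_url)
print(json_url)
```

The printed URL is the same link with the `&callback=...` part removed.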
EDIT: working code
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

# new url (without the callback parameter)
url = 'http://data.asx.com.au/data/1/company/ACB?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices'
# read all data and decode the bytes to text
page = urlopen(url).read().decode('utf-8')
# convert json text to python dictionary
data = json.loads(page)
print(data['principal_activities'])
Output:
Mineral exploration in Botswana, China and Australia.
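If some endpoint only answers with the callback wrapper, you can also strip the wrapper yourself before parsing. A rough sketch: `loads_jsonp` is a hypothetical helper, and the `sample` string is a hand-made, shortened response (not the real server output) that only reuses the `principal_activities` value shown above.

```python
import json
import re

# Matches callback_name( ... ) with an optional trailing semicolon.
JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def loads_jsonp(text):
    """Parse JSONP text; plain JSON text is parsed unchanged."""
    match = JSONP_RE.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Hand-made sample, trimmed to one key for illustration.
sample = ('angular.callbacks._0({"principal_activities": '
          '"Mineral exploration in Botswana, China and Australia."});')

data = loads_jsonp(sample)
print(data["principal_activities"])
```

Requesting the URL without the callback is still the simpler option when the server allows it.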