Web scraping - how to access content rendered in JavaScript via Angular.js?

既然无缘 2020-12-03 07:41

I'm trying to scrape data from the public site asx.com.au.

The page http://www.asx.com.au/asx/research/company.do#!/ACB/details contains a div with clas

1 Answer
  • 2020-12-03 07:44

    This page uses JavaScript to read data from the server and fill in the page.

    Since you are already using Chrome's developer tools, look at the XHR or JS requests in the Network tab.

    I found this URL:

    http://data.asx.com.au/data/1/company/ACB?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices&callback=angular.callbacks._0

    This URL returns all the data in almost-JSON form (JSONP: the JSON is wrapped in a call to angular.callbacks._0).

    But if you request it without &callback=angular.callbacks._0, you get the data as pure JSON, and you can use the json module to convert it to a Python dictionary.
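
Dropping the callback parameter can also be done in code rather than by hand. The following is only a sketch, assuming Python 3; the helper name `strip_callback` is made up for illustration and is not part of the original answer or of any library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_callback(url, param="callback"):
    """Return url with the JSONP callback query parameter removed.

    Hypothetical helper: the name and the default parameter name are
    assumptions made for this example.
    """
    parts = urlsplit(url)
    # Keep every query parameter except the callback one.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != param]
    # safe="," leaves the comma-separated field list unescaped.
    return urlunsplit(parts._replace(query=urlencode(query, safe=",")))


jsonp_url = ("http://data.asx.com.au/data/1/company/ACB"
             "?fields=primary_share,latest_annual_reports,last_dividend,"
             "primary_share.indices&callback=angular.callbacks._0")
print(strip_callback(jsonp_url))
```

This only rewrites the URL string; it makes no network request, so the result can be checked before fetching anything.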


    EDIT: working code

    import json

    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2, as in the original answer

    # same URL as above, without the callback parameter
    url = 'http://data.asx.com.au/data/1/company/ACB?fields=primary_share,latest_annual_reports,last_dividend,primary_share.indices'

    # read the raw response body
    page = urlopen(url).read()

    # convert the JSON text to a Python dictionary
    data = json.loads(page.decode('utf-8'))

    print(data['principal_activities'])
    

    Output:

    Mineral exploration in Botswana, China and Australia.
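
If an endpoint only answers in the wrapped "almost JSON" form, the wrapper can be removed before parsing. This is a rough sketch under assumptions: the function name `unwrap_jsonp` is hypothetical, the sample string is hand-written to mimic the wrapped response (it is not real output from the server), and the regular expression only handles the simple `name(...)` shape.

```python
import json
import re


def unwrap_jsonp(text):
    """Parse the JSON payload inside a simple JSONP body such as cb({...});"""
    # Callback name (letters, digits, _, ., $, brackets), then the payload
    # between the first "(" and the last ")", then an optional semicolon.
    match = re.match(r"^\s*[\w.$\[\]]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


# Hand-written sample for illustration only.
sample = ('angular.callbacks._0({"principal_activities": '
          '"Mineral exploration in Botswana, China and Australia."});')
data = unwrap_jsonp(sample)
print(data["principal_activities"])
```

Requesting the URL without the callback parameter, as in the code above, remains the simpler route when the server allows it.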
    