CNN scraper sporadically working in Python

旧巷少年郎 2021-01-27 00:34

I've tried to create a web scraper for CNN. My goal is to scrape all news articles returned by the search query. Sometimes I get an output for some of the scraped pages and sometimes

1 answer
  • 2021-01-27 01:06

    Call the back-end search API directly instead of scraping the search result pages. For more details, see my previous answer.

    import requests
    
    
    def main(url):
        # Reuse one session so the connection stays open across requests.
        with requests.Session() as req:
            # Step the "from" offset in pages of 100, matching size=100 in the URL.
            for item in range(1, 1000, 100):
                r = req.get(url.format(item)).json()
                # Each entry in "result" is one article.
                for a in r['result']:
                    print("Headline: {}, Url: {}".format(
                        a['headline'], a['url']))
    
    
    main("https://search.api.cnn.io/content?q=coronavirus&sort=newest&category=business,us,politics,world,opinion,health&size=100&from={}")
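
    Since the question is about output that only sometimes appears, here is a slightly more defensive sketch of the same idea. It is not a definitive implementation: the helper names (`build_params`, `extract_articles`, `collect_articles`, `fetch_page`, `main`) are hypothetical, and it assumes that `from` is a zero-based offset and that the response keeps the `result` / `headline` / `url` shape used above. It collects the articles into a list, skips entries with missing fields, and stops as soon as a page comes back empty.

```python
SEARCH_URL = "https://search.api.cnn.io/content"
CATEGORIES = "business,us,politics,world,opinion,health"


def build_params(query, offset, size=100):
    """Query parameters for one page (same fields as the URL in the answer)."""
    return {
        "q": query,
        "sort": "newest",
        "category": CATEGORIES,
        "size": size,
        "from": offset,  # assumption: zero-based offset into the results
    }


def extract_articles(payload):
    """Return (headline, url) pairs, skipping entries with missing fields."""
    articles = []
    for item in payload.get("result", []):
        headline = item.get("headline")
        url = item.get("url")
        if headline and url:
            articles.append((headline, url))
    return articles


def collect_articles(query, fetch, max_results=1000, size=100):
    """Page through results until a page yields nothing or the cap is hit.

    `fetch(query, offset, size)` must return the decoded JSON for one page.
    """
    collected = []
    for offset in range(0, max_results, size):
        page = extract_articles(fetch(query, offset, size))
        if not page:
            break
        collected.extend(page)
    return collected[:max_results]


def fetch_page(session, query, offset, size=100):
    """Request one page; raise on HTTP errors instead of failing silently."""
    response = session.get(
        SEARCH_URL, params=build_params(query, offset, size), timeout=10
    )
    response.raise_for_status()
    return response.json()


def main(query="coronavirus"):
    # Imported here so the parsing helpers work without requests installed.
    import requests

    with requests.Session() as session:
        def fetch(q, offset, size):
            return fetch_page(session, q, offset, size)

        for headline, url in collect_articles(query, fetch):
            print("Headline: {}, Url: {}".format(headline, url))
```

    Calling `main()` prints the headlines as before. Passing the fetch function in as an argument keeps the paging logic testable without touching the network.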
    