CNN scraper works only sporadically in Python

I've tried to create a web scraper for CNN. My goal is to scrape all news articles returned by the search query. Sometimes I get an output for some of the scraped pages and sometimes

1 Answer

野的像风

2021-01-27 01:06

Call the back-end API directly. For more details, check my previous answer.

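import requests


def main(url):
    # Reuse one session so the connection stays open across requests.
    with requests.Session() as req:
        # "from" is the result offset; step by the page size (size=100).
        for item in range(1, 1000, 100):
            r = req.get(url.format(item))
            r.raise_for_status()  # stop on an HTTP error instead of parsing a bad body
            for a in r.json()['result']:
                print("Headline: {}, Url: {}".format(
                    a['headline'], a['url']))


main("https://search.api.cnn.io/content?q=coronavirus&sort=newest&category=business,us,politics,world,opinion,health&size=100&from={}")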
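The loop above always requests the same ten offsets, whether or not the query has that many results. As a rough sketch, the paging could instead stop as soon as a page comes back empty. This assumes the API returns an empty `result` list past the last article, which the answer does not confirm. `iter_articles` and `fake_fetch` are hypothetical names; `fake_fetch` stands in for the HTTP call so the example runs offline, and its data is one-based only to mirror the loop starting at 1.

```python
from typing import Callable, Dict, Iterator, List


def iter_articles(fetch: Callable[[int], Dict], start: int = 1,
                  page_size: int = 100, max_offset: int = 1000) -> Iterator[Dict]:
    """Yield articles page by page until a page comes back empty."""
    for offset in range(start, max_offset, page_size):
        page: List[Dict] = fetch(offset).get("result", [])
        if not page:  # assumed end-of-results signal, not confirmed by the source
            break
        yield from page


def fake_fetch(offset: int) -> Dict:
    """Hypothetical stand-in for the HTTP request: 150 fake articles in total."""
    total = 150
    items = [{"headline": f"Story {i}", "url": f"https://example.com/{i}"}
             for i in range(offset, min(offset + 100, total + 1))]
    return {"result": items}


articles = list(iter_articles(fake_fetch))
print(len(articles), articles[0]["headline"], articles[-1]["headline"])
```

Against the real endpoint, `fetch` could be something like `lambda offset: req.get(url.format(offset)).json()`, using the session and URL template from the answer.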
