Scrapy request+response+download time

忘了有多久 2021-02-14 01:28

UPD: I am not closing this question yet because I think my approach is not as clear as it should be.

Is it possible to get the request, response, and download times for the current request so that I can save them?

3 answers
  •  予麋鹿 (original poster)
    2021-02-14 01:52

    I think the best solution is to use Scrapy signals. When a request reaches the downloader, Scrapy emits the request_reached_downloader signal; once the download finishes, it emits the response_downloaded signal. You can catch both signals in the spider, record the time at each one, and store the timestamps and their difference in the request's meta.

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super(SignalSpider, cls).from_crawler(crawler, *args, **kwargs)
        crawler.signals.connect(spider.item_scraped, signal=signals.item_scraped)
        return spider
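
The snippet above only shows how a handler is connected (to item_scraped). As a rough sketch of the timing idea itself, the handlers for the two downloader signals could look like the following. The class name DownloadTimer, the injectable clock argument, and the meta keys download_started_at, download_finished_at and download_seconds are all made up for illustration; only the signal names and the handler arguments come from Scrapy.

```python
import time


class DownloadTimer:
    """Hypothetical Scrapy extension that stores per-request timings in request.meta."""

    def __init__(self, clock=time.time):
        # The clock is injectable (an assumption of this sketch) so the
        # timing logic can be exercised without running a crawl.
        self.clock = clock

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so the class can be loaded without Scrapy installed.
        from scrapy import signals

        ext = cls()
        crawler.signals.connect(ext.request_reached_downloader,
                                signal=signals.request_reached_downloader)
        crawler.signals.connect(ext.response_downloaded,
                                signal=signals.response_downloaded)
        return ext

    def request_reached_downloader(self, request, spider):
        # Wall-clock timestamp taken when the request is handed to the downloader.
        request.meta["download_started_at"] = self.clock()

    def response_downloaded(self, response, request, spider):
        started = request.meta.get("download_started_at")
        if started is None:
            return  # the start signal was never seen for this request
        finished = self.clock()
        request.meta["download_finished_at"] = finished
        request.meta["download_seconds"] = finished - started
```

If this were enabled through the EXTENSIONS setting (the module path would depend on your project), a spider callback could read response.meta["download_seconds"] and save it with the item. Scrapy also fills in a download_latency key in the request meta on its own, which may already be enough if only the fetch duration is needed.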
    

    A more detailed answer is available here.
