getting Forbidden by robots.txt: scrapy

Frontend · unresolved · 3 answers · 1705 views
忘了有多久 · 2020-12-01 05:58

While crawling a website such as https://www.netflix.com, I am getting Forbidden by robots.txt: <GET https://www.netflix.com/>

ERROR: No response downloaded for: https://www.netfli

3 Answers
  •  有刺的猬
    2020-12-01 06:39

    Since scrapy 1.1 (released 2016-05-11), the crawler downloads robots.txt before crawling and obeys it by default. To change this behavior, set ROBOTSTXT_OBEY in your settings.py:

    ROBOTSTXT_OBEY = False
    

    Here are the release notes
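
    If you would rather keep `ROBOTSTXT_OBEY = True` project-wide and only ignore robots.txt for one spider, Scrapy's `custom_settings` class attribute overrides project settings per spider. A minimal sketch (the spider name and parse logic are placeholders, not from the question):

    ```python
    import scrapy

    class NetflixSpider(scrapy.Spider):
        # Hypothetical spider name and start URL for illustration.
        name = "netflix"
        start_urls = ["https://www.netflix.com"]

        # Override the project-wide setting for this spider only:
        # robots.txt is ignored here but still obeyed by other spiders.
        custom_settings = {"ROBOTSTXT_OBEY": False}

        def parse(self, response):
            # Placeholder: log the fetched URL.
            self.log(response.url)
    ```

    Note that `custom_settings` must be a class attribute, not set in `__init__`, because Scrapy reads it before the spider is instantiated.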
