While crawling through a site with scrapy, I get redirected to a user-blocked page about 1/5th of the time. I lose the pages that I get redirected from when that happe
You can handle 302 responses by adding handle_httpstatus_list = [302] at the beginning of your spider like so:
class MySpider(CrawlSpider):
handle_httpstatus_list = [302]
def parse(self, response):
if response.status == 302:
# Store response.url somewhere and go back to it later