I have a problem with Scrapy. If a request fails (e.g. 404, 500), how can I issue an alternative request? For example, two links can provide the price info; if one fails, request the other.
Just set handle_httpstatus_list = [404, 500] and check for the status code in the parse method. Here's an example:
from scrapy.http import Request
from scrapy.spider import BaseSpider

class MySpider(BaseSpider):
    handle_httpstatus_list = [404, 500]
    name = "my_crawler"
    start_urls = ["http://github.com/illegal_username"]

    def parse(self, response):
        if response.status in self.handle_httpstatus_list:
            # The first URL failed, so request the alternative URL instead
            return Request(url="https://github.com/kennethreitz/", callback=self.after_404)

    def after_404(self, response):
        print(response.url)
        # parse the page and extract items
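If your primary and alternative links come in pairs (say, two pages listing the same product's price), you can carry the fallback URL in the request's meta so one callback handles both cases. Below is a minimal sketch using the current scrapy.Spider API; the URLs, the url_pairs attribute, and the span.price selector are hypothetical placeholders you would replace with your own.

import scrapy

class PriceSpider(scrapy.Spider):
    name = "price_spider"
    handle_httpstatus_list = [404, 500]

    # Hypothetical primary/fallback URL pairs; substitute your real product pages
    url_pairs = [
        ("http://example.com/shop-a/item-1", "http://example.com/shop-b/item-1"),
    ]

    def start_requests(self):
        for primary, fallback in self.url_pairs:
            # Stash the alternative URL on the request so parse() can use it on failure
            yield scrapy.Request(primary, callback=self.parse,
                                 meta={"fallback": fallback})

    def parse(self, response):
        if response.status in self.handle_httpstatus_list:
            fallback = response.meta.get("fallback")
            if fallback:
                # Primary link failed, so try the alternative price page
                yield scrapy.Request(fallback, callback=self.parse)
            return
        # Extract the price with a selector that matches the target page (placeholder)
        yield {"price": response.css("span.price::text").get()}
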
Hope that helps.