I'm using Scrapy to crawl my sitemap and check for 404, 302 and 200 pages, but I can't seem to get the response code. This is my code so far:
fr
http://readthedocs.org/docs/scrapy/en/latest/topics/spider-middleware.html#module-scrapy.contrib.spidermiddleware.httperror
Assuming the default spider middleware is enabled, responses with status codes outside the 2xx range (200-299) are filtered out by HttpErrorMiddleware before they reach your callbacks. You can tell the middleware that you want to handle 404s by setting the handle_httpstatus_list attribute on your spider:
class TothegoSitemapHomesSpider(SitemapSpider):
    # Pass 404 responses to the spider's callbacks instead of dropping them.
    handle_httpstatus_list = [404]
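To make the filtering rule concrete, here is a rough sketch of the decision the middleware makes for each response. It is a simplified model written for illustration, not Scrapy's actual source: the function name `reaches_spider` is made up, and it ignores per-request overrides and other settings.

```python
def reaches_spider(status, handle_httpstatus_list=()):
    """Simplified model of the HttpErrorMiddleware check (hypothetical helper).

    A response is passed on to the spider if its status is in the 2xx range
    or if the spider explicitly listed that status as one it wants to handle.
    """
    if 200 <= status < 300:
        return True
    return status in handle_httpstatus_list


# Compare the default behaviour with a spider that sets [404].
for status in (200, 302, 404):
    default = reaches_spider(status)
    with_404 = reaches_spider(status, [404])
    print(status, "default:", default, "with [404]:", with_404)
```

With 404 in the list, your callback receives the 404 response and can read response.status as usual. As far as I know, 302 responses are normally followed by Scrapy's redirect middleware before this check ever runs, so seeing them in the spider may need extra configuration depending on your Scrapy version.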