Scrapy retry or redirect middleware


执念已碎 2021-02-03 14:27

While crawling a site with Scrapy, I get redirected to a user-blocked page about one-fifth of the time. I lose the pages that I get redirected from when that happe

2 answers

[愿得一人] (original poster)

2021-02-03 14:56
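You can handle 302 responses yourself by setting handle_httpstatus_list = [302] as a class attribute of your spider. With it set, Scrapy's redirect middleware hands 302 responses to your callback instead of following them, like so: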
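```
from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    name = "my_spider"  # placeholder name
    # Deliver 302 responses to the callback instead of following the redirect
    handle_httpstatus_list = [302]

    def parse(self, response):
        if response.status == 302:
            # Store response.url somewhere and go back to it later
            self.logger.info("Redirected away from %s", response.url)
```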
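If you would rather retry the lost page automatically than store it by hand, one option is a small downloader middleware. The following is only a rough sketch under a few assumptions: the class name, the `blocked_retries` meta key, and the `BLOCKED_MARKER` and `MAX_RETRIES` values are all made up for illustration, and it assumes the block page can be recognised by a substring of the redirect's Location header. It relies only on the standard `process_response` hook and `Request.replace`.

```python
class RetryBlockedRedirectMiddleware:
    """Sketch of a downloader middleware that retries requests which
    were redirected to a block page, instead of following the redirect."""

    # Hypothetical values: adjust to how the target site signals a block.
    BLOCKED_MARKER = "blocked"
    MAX_RETRIES = 3

    def process_response(self, request, response, spider):
        # Only redirects are of interest; pass everything else through.
        if response.status not in (301, 302):
            return response

        location = response.headers.get("Location", b"")
        if isinstance(location, bytes):
            location = location.decode("latin-1")

        # An ordinary redirect: leave it for RedirectMiddleware to follow.
        if self.BLOCKED_MARKER not in location:
            return response

        retries = request.meta.get("blocked_retries", 0)
        if retries >= self.MAX_RETRIES:
            spider.logger.warning(
                "Giving up on %s after %d blocked retries", request.url, retries
            )
            return response

        # Re-schedule the original request. dont_filter=True keeps the
        # duplicate filter from dropping the repeated URL.
        retry = request.replace(dont_filter=True)
        retry.meta["blocked_retries"] = retries + 1
        return retry
```

To try it, the class would need to be listed in DOWNLOADER_MIDDLEWARES with an order number above 600 (the built-in RedirectMiddleware's slot), since process_response hooks run in decreasing order and this one has to see the 302 first. With this approach handle_httpstatus_list is not needed on the spider.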