Scrapy Spider not Following Links

前端未结

I'm writing a Scrapy spider to crawl for today's NYT articles from the homepage, but for some reason it doesn't follow any links. When I instantiate the link extractor i

1 answer

南笙

2021-01-06 20:06
I found the solution to my problem. I was doing two things wrong:
1. I needed to subclass CrawlSpider rather than Spider if I wanted it to automatically follow links.
2. When using CrawlSpider, I needed to give each Rule its own callback function rather than overriding parse. As per the docs, CrawlSpider uses parse internally to implement its rule logic, so overriding it breaks link following.