I'm writing a Scrapy spider to crawl for today's NYT articles from the homepage, but for some reason it doesn't follow any links. When I instantiate the link extractor i
I have found the solution to my problem. I was doing 2 things wrong:
1. I needed to subclass CrawlSpider rather than Spider if I wanted it to automatically crawl sublinks.
2. With CrawlSpider, I needed to use a callback function rather than overriding parse. As per the docs, overriding parse breaks CrawlSpider functionality.