I am scraping 23770 webpages with a pretty simple web scraper built with Scrapy. I am quite new to Scrapy and even to Python, but I managed to write a spider that does the jo
Here's a collection of things to try:
- CONCURRENT_REQUESTS_PER_DOMAIN, CONCURRENT_REQUESTS settings (docs)
- LOG_ENABLED = False (docs)
- yielding an item in a loop instead of collecting items into the items list and returning them
- Scrapy on PyPy, see Running Scrapy on PyPy

Hope that helps.
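For illustration, here is a rough sketch of the settings and of the yield-in-a-loop idea from the list above. It is plain Python with no Scrapy import, so it only shows the shape of the change. The numeric values, the `CUSTOM_SETTINGS` name, the `rows` input and the `title` field are all assumptions made up for the example, not taken from the question's spider.

```python
# Illustrative values only (assumptions); tune them for the target site.
# In a real project these keys would go in settings.py or in the
# spider's custom_settings attribute.
CUSTOM_SETTINGS = {
    "CONCURRENT_REQUESTS": 32,
    "CONCURRENT_REQUESTS_PER_DOMAIN": 16,
    "LOG_ENABLED": False,
}


def parse_collecting(rows):
    """Build the whole list first, then return it in one go."""
    items = []
    for row in rows:
        items.append({"title": row["title"].strip()})
    return items


def parse_yielding(rows):
    """Hand each item over as soon as it is built."""
    for row in rows:
        yield {"title": row["title"].strip()}


if __name__ == "__main__":
    # Hypothetical stand-in for the rows extracted from a response.
    rows = [{"title": " first "}, {"title": "second "}]
    for item in parse_yielding(rows):
        print(item)
```

In an actual spider, the body of `parse_yielding` would live in the `parse` callback, iterating over selectors from the response rather than over a list of dicts. Both versions produce the same items; the generator version just avoids holding all of them in memory before anything is passed on.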