I am scraping 23,770 webpages with a pretty simple web scraper using scrapy. I am quite new to scrapy and even Python, but managed to write a spider that does the job.
One workaround to speed up your scrapy crawl is to configure your start_urls appropriately.
For example, if our target data lives at http://apps.webofknowledge.com/doc=1, where the doc number ranges from 1 to 1000, you can configure your start_urls as follows:
start_urls = [
"http://apps.webofknowledge.com/doc=250",
"http://apps.webofknowledge.com/doc=750",
]
In this way, the crawl starts from doc 250 (following links outward to 251, 249, and so on) and from doc 750 (outward to 751, 749, and so on) simultaneously. With two start points each crawling in two directions, you get roughly 4 times the throughput compared to start_urls = ["http://apps.webofknowledge.com/doc=1"], which can only crawl forward from one end.
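Rather than hard-coding the midpoints, you can compute them for any document count and any number of start points. This is a minimal sketch (the helper name spaced_start_urls and the /doc= URL pattern are assumptions based on the example above): it splits the range into equal slices and returns a URL at the midpoint of each slice.

```python
def spaced_start_urls(base_url, total_docs, num_starts):
    """Return one URL at the midpoint of each equal slice of the doc range,
    so each start point can crawl outward in both directions."""
    slice_size = total_docs // num_starts
    return [
        f"{base_url}/doc={i * slice_size + slice_size // 2}"
        for i in range(num_starts)
    ]

# For 1000 docs and 2 start points, this reproduces the list above:
urls = spaced_start_urls("http://apps.webofknowledge.com", 1000, 2)
# urls == ["http://apps.webofknowledge.com/doc=250",
#          "http://apps.webofknowledge.com/doc=750"]
```

You could then assign the result directly to start_urls in your spider class; raising num_starts adds more parallel crawl fronts, subject to scrapy's CONCURRENT_REQUESTS settings.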