How do I improve scrapy's download speed?

粉色の甜心 2021-01-31 23:21

I'm using scrapy to download pages from many different domains in parallel. I have hundreds of thousands of pages to download, so performance is important.

Unfortunate

1 Answer
  • 2021-02-01 00:21

    I had this problem in the past, and I solved a large part of it with a 'dirty' old trick.

    Set up a local caching DNS server.

    When you see this kind of high CPU usage while accessing many remote sites simultaneously, it is mostly because scrapy is trying to resolve the hostnames in the URLs.

    And please remember to change the DNS settings on the host (/etc/resolv.conf) so that they point to your LOCAL caching DNS server.
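
To double-check that the host really points at the local cache, something like the following rough sketch could help. It assumes the caching server listens on a loopback address such as 127.0.0.1; the helper names (`parse_nameservers`, `uses_local_resolver`) and the `SAMPLE` text are made up for illustration and are not part of scrapy.

```python
import ipaddress


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def uses_local_resolver(resolv_conf_text):
    """True if the first listed nameserver is a loopback address."""
    servers = parse_nameservers(resolv_conf_text)
    if not servers:
        return False
    # Drop an IPv6 zone id such as "%eth0" before parsing the address.
    address = servers[0].split("%", 1)[0]
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        return False


# Hypothetical contents of /etc/resolv.conf after the change (an assumption).
SAMPLE = """\
# local caching DNS server
nameserver 127.0.0.1
"""

print(parse_nameservers(SAMPLE), uses_local_resolver(SAMPLE))
```

On a real host you would pass in the text read from /etc/resolv.conf instead of `SAMPLE`. Only the first `nameserver` entry is checked here because resolvers normally try the entries in order.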

    The first requests will be slow, but as soon as it starts caching and resolving becomes more efficient, you are going to see HUGE improvements.
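
Why the first lookups are slow and the later ones fast can be shown with a toy cache. This is only a sketch of the idea, not how a real caching DNS server works (a real one also honours record TTLs, which this ignores). `CachingResolver` and `fake_lookup` are hypothetical names, and the fake lookup stands in for a real DNS query so the example runs without a network.

```python
import socket


class CachingResolver:
    """Remember hostname -> address answers so each host is looked up once."""

    def __init__(self, lookup=socket.gethostbyname):
        self._lookup = lookup      # the slow upstream lookup
        self._cache = {}           # hostname -> resolved address
        self.upstream_calls = 0    # how often we had to go upstream

    def resolve(self, hostname):
        if hostname not in self._cache:
            # Cache miss: pay the cost of one real lookup.
            self.upstream_calls += 1
            self._cache[hostname] = self._lookup(hostname)
        # Cache hit (or freshly stored answer): no upstream traffic.
        return self._cache[hostname]


def fake_lookup(hostname):
    # Hypothetical stand-in for a DNS query; 192.0.2.1 is a documentation address.
    return "192.0.2.1"


resolver = CachingResolver(lookup=fake_lookup)
for host in ["a.example", "b.example", "a.example", "a.example"]:
    resolver.resolve(host)

print(resolver.upstream_calls)  # 2: one miss per distinct host, the rest are hits
```

With hundreds of thousands of pages spread over a smaller set of domains, most requests fall into the "hit" case, which is where the improvement comes from.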

    I hope this helps with your problem!
