Rotating Proxies for web scraping

Backend · unresolved · 3 answers · 742 views
感动是毒 2021-01-30 18:50

I've got a Python web crawler and I want to distribute the download requests among many different proxy servers, probably running Squid (though I'm open to alternatives). For

3 Answers
  •  春和景丽
    2021-01-30 19:42

    Make your crawler keep a list of proxies and, for each HTTP request, use the next proxy from the list in round-robin fashion. The downside is that this prevents you from using HTTP/1.1 persistent connections. Modifying the proxy list takes effect naturally: added proxies start receiving requests and removed ones stop.
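    A minimal sketch of that round-robin scheme, assuming a hypothetical list of Squid proxies (the `proxyN.example.com` hosts are placeholders) and Python's standard `urllib`:

    ```python
    import itertools
    import urllib.request

    # Hypothetical proxy list -- replace with your own Squid instances.
    PROXIES = [
        "http://proxy1.example.com:3128",
        "http://proxy2.example.com:3128",
        "http://proxy3.example.com:3128",
    ]

    # itertools.cycle yields the proxies in order, wrapping around forever.
    proxy_cycle = itertools.cycle(PROXIES)

    def fetch_via_next_proxy(url):
        """Route one request through the next proxy in round-robin order."""
        proxy = next(proxy_cycle)
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        )
        return opener.open(url, timeout=30)
    ```

    Because a fresh opener is built per request, each request can go to a different proxy, which is exactly why connection reuse is lost.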

    Or keep several connections open in parallel, one per proxy, and distribute your crawling requests across the open connections. Dynamic membership can be implemented by having each connector register itself with the request dispatcher.
