Question
Let's say I have a list that contains 10,000+ proxies
proxy_list = ['ip:port','ip:port',.....10,000+ items]
How do I iterate over it to find the proxies that work for my PC? With the following code it is possible, but in the worst case it takes 5 * 10,000 seconds to complete. How can I iterate through the list faster?
import requests

result = []
for proxy in proxy_list:
    try:
        # A response within 5 seconds means the proxy works.
        requests.get('http://www.httpbin.org/ip',
                     proxies={'https': proxy, 'http': proxy},
                     timeout=5)
        result.append(proxy)
    except requests.RequestException:
        pass
Answer 1:
You could use threading, which lets the program check multiple proxies at once.
import requests
import threading
import concurrent.futures

# Lock to keep multiple threads from appending to the result list at the same time.
appendLock = threading.Lock()

# Number of threads that will work through your proxy list.
# In my experience, increasing this number higher than 30 causes problems.
workers = 10

proxy_list = ['ip:port','ip:port',.....10,000+ items]
result = []

def proxyCheck(proxy):
    try:
        requests.get('http://www.httpbin.org/ip',
                     proxies={'https': proxy, 'http': proxy},
                     timeout=5)
        with appendLock:
            result.append(proxy)
    except requests.RequestException:
        pass

with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
    for proxy in proxy_list:
        # Pass the function and its argument separately; calling proxyCheck(proxy)
        # here would run it immediately in the main thread instead of submitting it.
        executor.submit(proxyCheck, proxy)
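If you prefer to avoid the lock entirely, here is a minimal variant of the same idea (the name check_proxy is illustrative, not from the original answer): let the executor collect return values via map(), so the workers never touch a shared list.

import concurrent.futures
import requests

def check_proxy(proxy):
    # Return the proxy if it answers within 5 seconds, otherwise None.
    try:
        requests.get('http://www.httpbin.org/ip',
                     proxies={'https': proxy, 'http': proxy},
                     timeout=5)
        return proxy
    except requests.RequestException:
        return None

with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    # map() gathers each worker's return value, so no lock is needed.
    result = [p for p in executor.map(check_proxy, proxy_list) if p is not None]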
Source: https://stackoverflow.com/questions/61012763/fastest-proxy-iteration-in-python