How to write a proxy pool server (when a request comes in, choose a proxy to fetch the URL content) in Python?

旧巷老猫 submitted on 2019-12-08 15:48:46

Question


I do not know the proper name for this kind of proxy server, so feel free to fix the question title.

When I search for "proxy server" on Google, I find many implementations such as maproxy or a-python-proxy-in-less-than-100-lines-of-code. Those proxy servers seem to simply ask the remote server for a given URL.

I want to build a proxy server that contains a proxy pool (a list of HTTP/HTTPS proxies) and exposes only one IP address and one port for incoming requests. When a request arrives, it chooses a proxy from the pool, performs the request through it, and returns the result.

For example, I have a VPS whose IP is '192.168.1.66'. I start the proxy server on this VPS with IP '127.0.0.1' and port '8080'.

I can then use this proxy as shown below.

import requests
url = 'http://www.google.com'
headers = {
    ...
}
proxies = {
    'http': 'http://192.168.1.66:8080'
}

r = requests.get(url, headers=headers, proxies=proxies)

I have seen some implementations like:

from twisted.web import proxy, http
from twisted.internet import reactor
from twisted.python import log
import sys
log.startLogging(sys.stdout)

class ProxyFactory(http.HTTPFactory):
    protocol = proxy.Proxy

reactor.listenTCP(8080, ProxyFactory())
reactor.run()

It works, but it is so simple that I have no idea how it works or how to extend it to use a proxy pool.

An example flow, from hidu/proxy-manager, which is written in Go:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
+ client (want visit http://www.baidu.com/)              +  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
                        |  
                        |  via proxy 127.0.0.1:8090  
                        |  
                        V  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
+                       +         proxy pool             +  
+ proxy manager listen  ++++++++++++++++++++++++++++++++++  
+ on (127.0.0.1:8090)   +  http_proxy1,http_proxy2,      +  
+                       +  socks5_proxy1,socks5_proxy2   +  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
                        |  
                        |  choose one proxy visit 
                        |  www.baidu.com  
                        |  
                        V  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
+        site:www.baidu.com                              +  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  

Answer 1:


Your proxy pool concept is not hard to implement. If I understand correctly, you want the following:

  1. YOUR PROXY SERVER listens for requests on 192.168.1.66:8080.
  2. CLIENT requests http://www.google.com.
  3. YOUR PROXY SERVER forwards CLIENT's request to ANOTHER PROXY SERVER, chosen from a list of such servers (the PROXY POOL).
  4. YOUR PROXY SERVER receives the response from ANOTHER PROXY SERVER and returns it to CLIENT.

So, I've written a simple proxy server using Flask and Requests.

from flask import Flask, Response, stream_with_context
import random
import requests

app = Flask(__name__)

@app.route('/p/<path:url>')
def proxy(url):
    """ Request to this like /p/www.google.com
    """
    # Forward the scheme-less URL so the upstream path stays /p/www.google.com
    r = get_response(url)

    # Stream the upstream body back to the client in chunks
    return Response(stream_with_context(r.iter_content(chunk_size=8192)),
                    content_type=r.headers.get('content-type'))

def get_proxy():
    # This is your "Proxy Pool"
    proxies = [
        'http://proxy-server-1.com',
        'http://proxy-server-2.com',
        'http://proxy-server-3.com',
    ]

    return random.choice(proxies)

def get_response(target_url):
    proxy_url = get_proxy()
    url = "{}/p/{}".format(proxy_url, target_url)
    # The line above generates e.g. http://proxy-server-1.com/p/www.google.com,
    # which assumes each pool entry exposes a /p/<url> route like this one

    return requests.get(url, stream=True)

if __name__ == '__main__':
    app.run()

You can use this as a starting point to improve your proxy server.
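
The example above assumes every pool entry serves the same /p/ route. If the pool instead holds ordinary HTTP forward proxies, as in the question, one possible approach is to hand the chosen proxy to the HTTP client directly. The sketch below uses only the standard library; the pool URLs, ports, and the helper names `proxy_mapping`, `build_opener_for`, and `fetch_via_pool` are made up for illustration.

```python
import random
import urllib.request

# Hypothetical pool entries: placeholders, not real servers.
PROXY_POOL = [
    'http://proxy-server-1.com:8080',
    'http://proxy-server-2.com:8080',
    'http://proxy-server-3.com:8080',
]


def proxy_mapping(proxy_url):
    """Scheme -> proxy mapping, the same shape requests accepts as proxies=."""
    return {'http': proxy_url, 'https': proxy_url}


def build_opener_for(proxy_url):
    """Return a urllib opener that routes traffic through proxy_url."""
    handler = urllib.request.ProxyHandler(proxy_mapping(proxy_url))
    return urllib.request.build_opener(handler)


def fetch_via_pool(target_url, pool=PROXY_POOL, timeout=10):
    """Fetch target_url through one randomly chosen proxy from the pool."""
    proxy_url = random.choice(pool)
    opener = build_opener_for(proxy_url)
    with opener.open(target_url, timeout=timeout) as resp:
        return resp.status, resp.read()
```

With Requests, the same mapping can be passed as `requests.get(url, proxies=proxy_mapping(proxy_url))`, which matches the `proxies` dict shown in the question.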

A typical proxy pool or proxy manager can check the availability, speed, and other stats of its proxies and select the best one for each request. Of course, this example handles only simple requests; you can add features to handle request arguments, methods, and protocols.
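
As a rough sketch of the availability and speed tracking mentioned above, the pool could record failures and the last observed latency per proxy and prefer the fastest healthy one. The `ProxyPool` class, its method names, and the `max_failures` threshold are hypothetical, not part of any library.

```python
import random


class ProxyPool:
    """Tracks per-proxy failures and latency (a hypothetical helper)."""

    def __init__(self, proxies, max_failures=3):
        self.max_failures = max_failures
        # proxy URL -> consecutive failure count and last latency in seconds
        self.stats = {p: {'failures': 0, 'latency': None} for p in proxies}

    def report_success(self, proxy, latency):
        """Reset the failure count and remember how long the request took."""
        self.stats[proxy]['failures'] = 0
        self.stats[proxy]['latency'] = latency

    def report_failure(self, proxy):
        """Count one more consecutive failure for this proxy."""
        self.stats[proxy]['failures'] += 1

    def alive(self):
        """Proxies that have not yet reached the failure threshold."""
        return [p for p, s in self.stats.items()
                if s['failures'] < self.max_failures]

    def choose(self):
        """Pick an untested proxy if any, else the fastest healthy one."""
        candidates = self.alive()
        if not candidates:
            raise RuntimeError('no healthy proxies left in the pool')
        untested = [p for p in candidates
                    if self.stats[p]['latency'] is None]
        if untested:
            return random.choice(untested)
        return min(candidates, key=lambda p: self.stats[p]['latency'])
```

In the Flask example, `get_proxy()` could return `pool.choose()`, and `get_response()` could call `report_success(proxy, r.elapsed.total_seconds())` after a successful request or `report_failure(proxy)` when Requests raises an exception.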

Hope this helps!



Source: https://stackoverflow.com/questions/33139697/how-to-write-a-proxy-pool-server-when-a-request-comes-choose-a-proxy-to-get-ur
