I'm trying to use a Tor proxy for scraping, and everything works fine in one thread, but this is slow. I tried to do something simple:
If you want different IPs for each connection, you can also use Stream Isolation over SOCKS by specifying a different proxy username:password combination for each connection.
With this method, you only need one Tor instance, and each requests client can use a different stream with a different exit node.
To set this up, add unique proxy credentials for each requests.session object like so: socks5h://username:password@localhost:9050
import random
from multiprocessing import Pool

import requests  # needs the SOCKS extra: pip install requests[socks]


def check_ip():
    # A random username per session makes Tor isolate this stream onto its own circuit.
    session = requests.session()
    creds = str(random.randint(10000, 0x7fffffff)) + ":" + "foobar"
    session.proxies = {
        'http': 'socks5h://{}@localhost:9050'.format(creds),
        'https': 'socks5h://{}@localhost:9050'.format(creds),
    }
    r = session.get('http://httpbin.org/ip')
    print(r.text)


if __name__ == '__main__':  # guard needed where multiprocessing uses the spawn start method
    with Pool(processes=8) as pool:
        for _ in range(9):
            pool.apply_async(check_ip)
        pool.close()
        pool.join()
Tor Browser isolates streams on a per-domain basis by setting the credentials to firstpartydomain:randompassword, where randompassword is a random nonce for each unique first-party domain.
If you're crawling the same site and you want random IPs, then use a random username:password combination for each session. If you are crawling random domains and want to use the same circuit for requests to a domain, use Tor Browser's method of domain:randompassword for the credentials.
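A minimal sketch of that per-domain variant, assuming the same local SOCKS port (9050) as above; session_for is a hypothetical helper that keys the credentials on the target URL's hostname and caches one random password per domain:

import random
from urllib.parse import urlsplit

import requests  # needs the SOCKS extra: pip install requests[socks]

# Cache of domain -> random password so that every request to the same
# first-party domain reuses the same Tor circuit (Tor Browser-style isolation).
_domain_passwords = {}


def session_for(url):
    # Hypothetical helper: derive SOCKS credentials from the URL's hostname.
    domain = urlsplit(url).hostname
    password = _domain_passwords.setdefault(domain, str(random.randint(10000, 0x7fffffff)))
    proxy = 'socks5h://{}:{}@localhost:9050'.format(domain, password)
    session = requests.session()
    session.proxies = {'http': proxy, 'https': proxy}
    return session


if __name__ == '__main__':
    # Sessions keyed to different domains should normally exit from different IPs,
    # while repeated calls for the same domain keep the same circuit.
    for url in ['http://httpbin.org', 'http://httpbin.org', 'http://example.com']:
        s = session_for(url)
        print(url, '->', s.get('http://httpbin.org/ip').text.strip())

Requests whose URLs share a first-party domain then reuse the same circuit, while different domains normally get different exit IPs.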