What can I do to improve socket performance in Python 3?

Posted by 戏子无情 on 2019-12-23 04:01:05

Question


Initial Post

I have a very long-running program where about 97% of the run time is spent in socket objects created by ftp.retrlines and ftp.retrbinary calls. I have already used processes and threads to parallelize the program. Is there anything else I can do to eke out some more speed?

Example code:

# Get file list
ftpfilelist = []
ftp.retrlines('NLST %s' % ftp_directory, ftpfilelist.append)
# ... filter file list, this part takes almost no time ...
# Download a file
with open(path, 'wb') as fout:
    ftp.retrbinary('RETR %s' % ftp_path, fout.write)

Output from the cProfiler:

5890792 function calls (5888775 primitive calls) in 548.883 seconds

Ordered by: internal time
List reduced from 843 to 50 due to restriction <50>

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  9166  249.154    0.027  249.154    0.027 {method 'recv_into' of '_socket.socket' objects}
 99573  230.489    0.002  230.489    0.002 {method 'recv' of '_socket.socket' objects}
  1767   53.113    0.030   53.129    0.030 {method 'connect' of '_socket.socket' objects}
 98808    2.839    0.000    2.839    0.000 {method 'write' of '_io.BufferedWriter' objects}
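
For anyone who wants to produce a report in the same shape, here is a minimal sketch using only the standard library. The helper name profile_call is made up for illustration, and the original post does not show how its profile was actually collected. Sorting by 'tottime' gives the "Ordered by: internal time" header, and passing 50 to print_stats gives the "restriction <50>" line.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, limit=50, **kwargs):
    """Run func under cProfile and return (result, report_text).

    profile_call is a hypothetical helper, not part of the original post.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # 'tottime' is the sort key that the report header calls "internal time".
    stats.sort_stats('tottime').print_stats(limit)
    return result, buffer.getvalue()
```

Calling profile_call(main) and printing the second element of the returned tuple should give a table comparable to the one above, where main stands for whatever entry point your program has.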

Follow Up

Results for a gevent fork (https://github.com/fantix/gevent) supporting Python 3.4.1:

7645675 function calls (7153156 primitive calls) in 301.813 seconds

Ordered by: internal time
List reduced from 948 to 50 due to restriction <50>

ncalls       tottime  percall  cumtime  percall filename:lineno(function)
107541/4418  281.228    0.003  296.499    0.067 gevent/hub.py:354(wait)
99885/59883    4.466    0.000  405.922    0.007 gevent/_socket3.py:248(recv)
99097          2.244    0.000    2.244    0.000 {method 'write' of '_io.BufferedWriter' objects}
111125/2796    1.036    0.000    0.017    0.000 gevent/hub.py:345(switch)
107543/2788    1.000    0.000    0.039    0.000 gevent/hub.py:575(get)

Results for concurrent.futures.ThreadPoolExecutor:

5319963 function calls (5318875 primitive calls) in 359.541 seconds

Ordered by: internal time
List reduced from 872 to 50 due to restriction <50>

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    31  349.876   11.286  349.876   11.286 {method 'acquire' of '_thread.lock' objects}
  2652    3.293    0.001    3.293    0.001 {method 'recv' of '_socket.socket' objects}
310270    0.790    0.000    0.790    0.000 {method 'timetuple' of 'datetime.date' objects}
    25    0.661    0.026    0.661    0.026 {method 'recv_into' of '_socket.socket' objects}
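
The code behind the thread pool run is not shown in the post, so the following is only a sketch of what such a variant might look like. The names download and download_all, the host argument, the anonymous login, and the worker count are all assumptions. Each job opens its own FTP connection, since a single ftplib.FTP object should not be used from several threads at once.

```python
import ftplib
from concurrent.futures import ThreadPoolExecutor


def download(host, remote_path, local_path):
    """Fetch one file over its own connection (hypothetical helper)."""
    with ftplib.FTP(host) as ftp:
        ftp.login()  # anonymous login; an assumption about the server
        with open(local_path, 'wb') as fout:
            ftp.retrbinary('RETR %s' % remote_path, fout.write)
    return local_path


def download_all(jobs, worker=download, max_workers=8):
    """Run worker(*job) for every job in a thread pool.

    Results come back in the same order as jobs. An exception raised
    in a worker is re-raised here by result().
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(worker, *job) for job in jobs]
        return [future.result() for future in futures]
```

Note that cProfile only records the thread it was started in. That would explain why the table above is dominated by lock acquire calls: the main thread is waiting for the workers, and the time the workers spend in recv is mostly not recorded.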

Conclusion: For my use case, gevent improved performance by about 20%.


Answer 1:


Take a look at gevent. It can monkey patch the blocking modules of the standard library (such as socket), so that libraries built on top of them (such as your FTP lib) run as cooperative threads, which can improve socket performance.

The general premise is that threaded programs are not very efficient for I/O-heavy work: the scheduler doesn't know whether a thread is waiting on a network operation, so a thread may be scheduled only to sit waiting on I/O while other threads could be doing useful work.

With gevent, as soon as your thread (called a greenlet) hits a blocking network call, it automatically switches to another greenlet. Through this mechanism, your threads/greenlets are used to their fullest potential.
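
A rough sketch of how this could be applied to FTP downloads, assuming the third-party gevent package is installed. The names fetch_one and fetch_many, the host argument, the anonymous login, and the concurrency of 10 are made up for illustration and are not from the question. monkey.patch_all() is called before ftplib is imported, so ftplib ends up using gevent's cooperative sockets.

```python
# Requires the third-party gevent package.
from gevent import monkey
monkey.patch_all()  # patch socket etc. before ftplib is imported

import ftplib
from gevent.pool import Pool


def fetch_one(host, remote_path, local_path):
    """Download one file over its own connection (hypothetical helper)."""
    with ftplib.FTP(host) as ftp:
        ftp.login()  # anonymous login; an assumption about the server
        with open(local_path, 'wb') as fout:
            ftp.retrbinary('RETR %s' % remote_path, fout.write)
    return local_path


def fetch_many(jobs, worker=fetch_one, concurrency=10):
    """Run worker(*job) for each job, at most `concurrency` at a time.

    Each job runs in its own greenlet. Whenever one blocks on the
    network, gevent switches to another one that is ready to run.
    """
    pool = Pool(concurrency)
    greenlets = [pool.spawn(worker, *job) for job in jobs]
    pool.join()
    # get() returns the worker's result or re-raises its exception.
    return [g.get() for g in greenlets]
```

Whether this beats plain threads depends on the server and the network, so it is worth measuring both.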

Here's a great introduction to this library: http://www.gevent.org/intro.html#example




Answer 2:


It looks to me like cProfile is accounting for the total time spent in functions, i.e. the time in user space plus the time spent waiting in the kernel. This means that functions like retrbinary and retrlines include the time needed to get the data from the network, and the slower your FTP server delivers the data, the more time will be spent in these functions.

I would recommend that you sanity-check your profiler results against a run under time(1), or use os.times(). You will probably see that the process spends most of its time waiting for data in the kernel rather than executing your code, so there is not much you could optimize.
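
A minimal sketch of such a check with os.times(), where measure is a made-up helper name. It compares the wall-clock time of a call with the CPU time (user plus system) the process used during that call. If the wall time is much larger than the CPU time, the process was mostly waiting.

```python
import os
import time


def measure(func, *args, **kwargs):
    """Return (result, wall_seconds, cpu_seconds) for one call of func.

    cpu_seconds is user + system CPU time of this process, taken from
    os.times(). Time spent blocked on I/O does not count towards it.
    """
    cpu_before = os.times()
    wall_before = time.perf_counter()
    result = func(*args, **kwargs)
    wall = time.perf_counter() - wall_before
    cpu_after = os.times()
    cpu = ((cpu_after.user - cpu_before.user)
           + (cpu_after.system - cpu_before.system))
    return result, wall, cpu
```

For example, wrapping the download step from the question in measure and seeing a wall time of minutes against a CPU time of a few seconds would confirm that the program is network-bound.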



Source: https://stackoverflow.com/questions/24236271/what-can-i-do-to-improve-socket-performance-in-python-3
