Reading file opened with Python Paramiko SFTPClient.open method is slow

Posted by ↘锁芯ラ on 2020-01-13 18:01:10

Question


I am trying to read a NetCDF file from a remote host.
I use the paramiko package to read the file, like this:

import paramiko
from netCDF4 import Dataset

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(hostname='hostname', username='usrname', password='mypassword')

sftp_client = client.open_sftp()
ncfile = sftp_client.open('mynetCDFfile')
b_ncfile = ncfile.read()    # **** this is the slow call

nc = Dataset('test.nc', memory=b_ncfile)

But ncfile.read() runs VERY SLOWLY.
So my question is: is there an alternative way to read a NetCDF file remotely, or is there a way to speed up paramiko.sftp_file.SFTPFile.read()?


Answer 1:


Calling SFTPFile.prefetch should increase the read speed:

ncfile = sftp_client.open('mynetCDFfile')
ncfile.prefetch()
b_ncfile = ncfile.read()
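
The prefetch approach can be wrapped in a small helper. This is only a sketch and not part of the original answer: read_remote_bytes is a hypothetical name, and sftp_client is assumed to be the object returned by client.open_sftp(). The idea is that, without prefetching, read() fetches the file chunk by chunk and waits for each reply, whereas prefetch() sends the read requests ahead of time and so avoids the per-chunk round-trip latency.

```python
def read_remote_bytes(sftp_client, remote_path):
    """Return the full contents of remote_path as bytes.

    Hypothetical helper; sftp_client is assumed to behave like
    paramiko.SFTPClient (as returned by SSHClient.open_sftp()).
    """
    # The file object is used as a context manager so the remote
    # handle is closed even if the read raises.
    with sftp_client.open(remote_path, "rb") as remote_file:
        # Ask for the remaining contents in the background before reading.
        remote_file.prefetch()
        return remote_file.read()
```

With the names from the question, this would be called as b_ncfile = read_remote_bytes(sftp_client, 'mynetCDFfile'), and the result passed to Dataset('test.nc', memory=b_ncfile).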

Another option is to enable read buffering, using the bufsize parameter of SFTPClient.open:

ncfile = sftp_client.open('mynetCDFfile', bufsize=32768)
b_ncfile = ncfile.read()

(32768 is the value of SFTPFile.MAX_REQUEST_SIZE)


Yet another option is to explicitly specify the amount of data to read (it makes BufferedFile.read take a more efficient code path):

ncfile = sftp_client.open('mynetCDFfile')
b_ncfile = ncfile.read(ncfile.stat().st_size)
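
Which of these options is fastest probably depends on the file size and the network latency, so it may be worth measuring on the actual connection. Below is a rough sketch for doing that, not taken from the original answer: read_with_known_size and time_read are hypothetical names, and sftp_client is assumed to be the object returned by client.open_sftp().

```python
import time


def read_with_known_size(sftp_client, path):
    """Read the whole file by passing its size to read() explicitly."""
    with sftp_client.open(path) as remote_file:
        return remote_file.read(remote_file.stat().st_size)


def time_read(strategy, sftp_client, path):
    """Run one reader function and return (elapsed_seconds, byte_count).

    strategy is any callable taking (sftp_client, path) and returning bytes.
    """
    start = time.perf_counter()
    data = strategy(sftp_client, path)
    return time.perf_counter() - start, len(data)
```

For example, seconds, size = time_read(read_with_known_size, sftp_client, 'mynetCDFfile') times the explicit-size variant; any other reader function with the same two arguments can be timed the same way. A single run is noisy, so repeating the measurement a few times is advisable.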

Obligatory warning: Do not use AutoAddPolicy this way. You lose protection against MITM attacks by doing so. For a correct solution, see Paramiko "Unknown Server".
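
One possible way to drop AutoAddPolicy is to accept only host keys that are already known. The following is a minimal sketch under the assumption that the server's key is already listed in the user's known_hosts file; connect_verified is a hypothetical name, and the hostname, username and password values are the placeholders from the question. It is not necessarily the approach described in the linked question.

```python
def connect_verified(hostname, username, password):
    """Connect only if the server's host key is already known.

    Hypothetical helper; assumes the key is in the user's known_hosts file.
    """
    # Imported inside the function so that defining the helper
    # does not itself require paramiko to be importable.
    import paramiko

    client = paramiko.SSHClient()
    # Load host keys from the user's OpenSSH known_hosts file.
    client.load_system_host_keys()
    # Reject servers with an unknown key. This is paramiko's default
    # policy; it is set explicitly here to make the intent visible.
    client.set_missing_host_key_policy(paramiko.RejectPolicy())
    client.connect(hostname=hostname, username=username, password=password)
    return client
```

With this variant, connecting to a server whose key is not in known_hosts raises an exception instead of silently trusting it.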



Source: https://stackoverflow.com/questions/58433996/reading-file-opened-with-python-paramiko-sftpclient-open-method-is-slow
