Writing to a file on SFTP server opened using pysftp “open” method is slow

陌路散爱 提交于 2019-12-08 08:55:52

问题


I have a piece of Python code that works, but is very slow to write a Dataframe directly to an SFTP location. I am using pysftp and pandas.to_csv() to achieve the task of reading an Excel file from a remote location, run a few simple transformations and write it over to an SFTP location.

The code snippet is shared below which, takes 4 minutes 30 seconds precisely, to write 100 records to the SFTP location. An average Dataframe that I process has a maximum of 20 columns.

def dataframe_sftp_transfer(df,destination_path):
    cnopts = CnOpts()
    cnopts.hostkeys = None
    sftp = Connection('sftp3.server.com'
                    ,username= 'user'
                    ,password = 'pwd123'
                    ,cnopts=cnopts)
    with sftp.open(destination_path,'w+') as f:
        chunksize = 100
        with tqdm(total=len(df)) as progbar:
            df.to_csv(f,sep='~',index=False,chunksize=chunksize)
            progbar.update(chunksize)

Is there a better/faster way to achieve the aforesaid? Shouldn't writing files of the stated magnitude take only a couple of minutes?

Using a tool like FileZilla to put files in the remote SFTP location works much faster but, that sadly takes away any form of automation.


回答1:


You open the remote file without buffering. That way, every time the df.to_csv writes to the file, Paramiko/pysftp sends a request to the SFTP server and waits for a response. I do not know internals of df.to_csv, but it's likely it does one write per line (if not more). That would explain, why the upload is so slow. Particularly, if your connection to the server has high latency.

To enable buffered writes, use bufsize parameter of Connection.open:

with sftp.open(destination_path, 'w+', 32768) as f:

Obligatory warning: Do not set cnopts.hostkeys = None, unless you do not care about security. For the correct solution see Verify host key with pysftp.



来源:https://stackoverflow.com/questions/58111798/writing-to-a-file-on-sftp-server-opened-using-pysftp-open-method-is-slow

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!