How to prevent my app from hanging when parallelising paramiko.SFTPClient.get requests?

Submitted by 心已入冬 on 2021-01-01 09:11:36

Question


I am trying to parallelise retrieving files from a server via SFTP and uploading them to AWS. I am using Python multi-threading. The upload part works fine; however, I noticed that the get operation of paramiko.SFTPClient leaves the program hanging at the end. In fact, all of the files are retrieved and uploaded, but the program doesn't exit. I tried many things from similar posts, but nothing worked. My pseudo-code is the following; any help would be welcome:

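import threading
import paramiko

def create_sftp_connection(host, port, username, password):
    transport = paramiko.Transport((host, port))
    # Transport.connect() takes hostkey as its first parameter, so pass credentials by keyword
    transport.connect(username=username, password=password)
    return paramiko.SFTPClient.from_transport(transport)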

def get_and_upload_file(s3, sftp_client, file, local_full_path, destination_bucket, cloud_path):
    sftp_client.get(file, local_full_path)
    upload_file_to_s3(s3, local_full_path, destination_bucket, cloud_path)

def transfer_files(sftp_client, remote_path, local_path, destination_bucket):
    all_files = get_files_to_transfer(sftp_client, remote_path)
    s3 = init_s3()
    threads = list()

    for file in all_files:
        ....
        thread = threading.Thread(target=get_and_upload_file, args=(s3, sftp_client, file, local_full_path, destination_bucket, cloud_path))
        thread.daemon = True
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()


if __name__ == "__main__":
    sftp_client = create_sftp_connection(host, port, username, password)
    transfer_files(sftp_client, remote_path, local_path, destination_bucket)

Note: I also tried waiting for the threads to finish using:

for thread in threads:
    while thread.is_alive():
        thread.join(timeout=0.1)

Answer 1:


I'm pretty sure that Paramiko is not thread safe.

You will most likely need to create a separate connection (Transport) for each thread.


Create a few connections and let them pick files from a shared list/queue. It makes no sense to upload more than a few files in parallel anyway.
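
As a rough illustration of that idea, here is a minimal sketch of a small worker pool in which every thread opens its own connection and pulls file names from a shared queue. The names run_workers, make_connection and handle_file are hypothetical and not part of Paramiko; the sketch only assumes that make_connection returns an object with a close() method and that handle_file does the actual get and upload for one file.

```python
import queue
import threading


def run_workers(files, make_connection, handle_file, num_workers=4):
    """Process files with a few threads, each owning its own connection.

    make_connection and handle_file are hypothetical callables supplied by
    the caller; nothing here is specific to Paramiko.
    """
    tasks = queue.Queue()
    for name in files:
        tasks.put(name)

    errors = []  # (file name, exception) pairs collected from all workers

    def worker():
        conn = make_connection()  # one connection per thread, never shared
        try:
            while True:
                try:
                    name = tasks.get_nowait()
                except queue.Empty:
                    return  # queue drained, this worker is done
                try:
                    handle_file(conn, name)
                except Exception as exc:
                    # Record the failure and keep going with the next file
                    errors.append((name, exc))
        finally:
            conn.close()  # always release the connection

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors
```

With Paramiko, make_connection would presumably build a new Transport and SFTPClient on each call, much like create_sftp_connection in the question, and the cleanup step would need to close both the SFTPClient and its Transport. Because every worker exits once the queue is empty and closes its connection, the threads do not need to be daemon threads.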



Source: https://stackoverflow.com/questions/58604999/how-to-prevent-my-app-from-hanging-when-parallelising-paramiko-sftpclient-get-re
