Can I resume a download from aws s3?

Posted by 戏子无情 on 2020-12-13 06:31:08

Question


I am using the Python boto3 library to download files from S3 to an IoT device on a cellular connection, which is often slow and unreliable.

Some files are quite large (250 MB, which is large for this scenario), and sometimes the network fails or the device reboots during a download.

I would like to resume the download from the place it ended when the device rebooted. Is there any way to do it?

The aborted download does seem to leave the data it already fetched in a temporary file, so the data is there.

The goal is to economize data transfer and make the download more resilient.

I am using multipart uploads but no resume happens by itself.

What I'm doing is something like this:

import boto3
from boto3.s3.transfer import TransferConfig
from botocore.config import Config

# profile, bucket, key and localfile are defined elsewhere
s3 = boto3.resource('s3')  # not used below
session = boto3.session.Session(region_name='eu-central-1', profile_name=profile)
s3client = session.client('s3', config=Config(signature_version='s3v4'))
MB = 1024 ** 2

# Use multipart transfers above 10 MB and retry failed downloads up to 100 times
config = TransferConfig(
    multipart_threshold=10 * MB,
    num_download_attempts=100)

def upload():
    s3client.upload_file(Filename=localfile, Bucket=bucket, Key=key, Config=config)

def download():
    s3client.download_file(bucket, key, localfile, Config=config)

# upload from server...
upload()

# .... later, from the IoT device
download()

Answer 1:


I don't believe that boto3 has a resumable download feature.

You could potentially implement one yourself by making use of ranged gets. Find the size of the object upfront using head_object, then split that into N ranges, download them individually (maybe K chunks in parallel, depending on your hardware), store them on the local file system as chunks, and re-compose them into the final download when all chunks complete.

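import boto3

client = boto3.client('s3')

# Range bounds are inclusive, so this requests 10,000 bytes
response = client.get_object(
    Bucket='mybucket',
    Key='mykey',
    Range='bytes=10001-20000'
)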
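The chunked approach described above could be sketched roughly as follows. This is a minimal illustration, not a tested production implementation: `byte_ranges`, `download_in_chunks`, the `.partNNNNN` file naming and the 8 MiB default chunk size are all hypothetical choices, and `client` is assumed to be a boto3 S3 client. It downloads chunks sequentially, does no checksum verification, and assumes the object does not change between attempts.

```python
import os
import shutil


def byte_ranges(total_size, chunk_size):
    """Yield (index, start, end) tuples; end is inclusive, as HTTP Range expects."""
    for index, start in enumerate(range(0, total_size, chunk_size)):
        yield index, start, min(start + chunk_size, total_size) - 1


def download_in_chunks(client, bucket, key, dest, chunk_size=8 * 1024 * 1024):
    """Download an object as separate chunk files, skipping chunks already on disk."""
    total_size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    part_paths = []
    for index, start, end in byte_ranges(total_size, chunk_size):
        part_path = f"{dest}.part{index:05d}"  # hypothetical naming scheme
        part_paths.append(part_path)
        expected = end - start + 1
        # A chunk file of the expected size survived an earlier run; keep it.
        if os.path.exists(part_path) and os.path.getsize(part_path) == expected:
            continue
        response = client.get_object(
            Bucket=bucket, Key=key, Range=f"bytes={start}-{end}"
        )
        tmp_path = part_path + ".tmp"
        with open(tmp_path, "wb") as part_file:
            for block in response["Body"].iter_chunks(chunk_size=64 * 1024):
                part_file.write(block)
        # Rename only after the chunk is fully written.
        os.replace(tmp_path, part_path)
    # All chunks are present: concatenate them in order, then clean up.
    with open(dest, "wb") as out_file:
        for part_path in part_paths:
            with open(part_path, "rb") as part_file:
                shutil.copyfileobj(part_file, out_file)
    for part_path in part_paths:
        os.remove(part_path)
    return total_size
```

It would be called as `download_in_chunks(boto3.client('s3'), 'mybucket', 'mykey', 'local.bin')`. If the device reboots mid-download, calling it again only re-fetches the chunk that was in flight plus the ones not yet started, so a smaller chunk size wastes less data per interruption at the cost of more requests.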



Answer 2:


From the terminal, you can use aws s3api for lower-level access to S3.

size=$(stat -c %s myfile.zip)  # GNU stat; on BSD/macOS use: stat -f %z myfile.zip
aws s3api get-object --bucket BUCKETNAME --key myfile.zip --range "bytes=$size-" myfile.part \
  && cat myfile.part >> myfile.zip

You should be able to call this command from Python, for example with the subprocess module. It is not too hard.
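The same "fetch only what is missing and append it" idea can also be expressed with boto3 directly instead of shelling out. The following is a minimal sketch under a few assumptions: `resume_download` is a hypothetical helper name, `client` is a boto3 S3 client, the partial file on disk is a valid prefix of the object, and the object has not been replaced since the first attempt (passing a previously recorded ETag as `IfMatch` to `get_object` would be one way to guard against that).

```python
import os


def resume_download(client, bucket, key, dest):
    """Append the bytes missing from dest; return how many bytes were fetched."""
    total_size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    have = os.path.getsize(dest) if os.path.exists(dest) else 0
    if have > total_size:
        raise ValueError("local file is larger than the remote object")
    if have == total_size:
        # Nothing left to fetch; a range starting at the end is not satisfiable.
        return 0
    # Open-ended range: everything from the first missing byte to the end.
    response = client.get_object(Bucket=bucket, Key=key, Range=f"bytes={have}-")
    fetched = 0
    with open(dest, "ab") as out_file:
        for block in response["Body"].iter_chunks(chunk_size=64 * 1024):
            out_file.write(block)
            fetched += len(block)
    return fetched
```

Calling `resume_download(boto3.client('s3'), 'BUCKETNAME', 'myfile.zip', 'myfile.zip')` in a retry loop until it returns without raising would resume from wherever the previous attempt stopped. Because data is appended as it arrives, an interrupted attempt still keeps the bytes it managed to write.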



Source: https://stackoverflow.com/questions/59125125/can-i-resume-a-download-from-aws-s3