Question
I have about 50 GB of data to upload to an S3 bucket, but s3cmd is unreliable and very slow; sync doesn't seem to work because of a timeout error.
I switched to s4cmd and it works great: multi-threaded and fast.
s4cmd dsync -r -t 1000 --ignore-empty-source forms/ s3://bucket/J/M/
The above uploads a set of files and then throws an error: [Thread Failure] Unable to read data from source: /home/ubuntu/path to file. The source file is an image, so there is nothing wrong with the file itself.
s4cmd has options like --retry to restart the command when it fails, but this also doesn't seem to work. If you have come across a solution that prevents this error, please share.
Answer 1:
I got it working, and my file uploads are super fast now. If you are still using s3cmd, I highly recommend switching to s4cmd.
Download and install s4cmd, then find the read_file_chunk method in s4cmd.py and replace it with the following:
@log_calls
def read_file_chunk(self, source, pos, chunk):
    '''Read local file chunks'''
    data = None
    with open(source, 'rb') as f:
        f.seek(pos)
        data = f.read(chunk)
    # Check the file handle rather than the data, so that a zero-byte
    # chunk read at end-of-file is not treated as a read failure.
    if not f:
        raise Failure('Unable to read data from source: %s' % source)
    return StringIO(data)
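To see why this patch stops the "[Thread Failure] Unable to read data from source" error, here is a standalone sketch of the same logic outside s4cmd. It is an illustration, not s4cmd's actual code: s4cmd's Failure exception is replaced with IOError, and StringIO with BytesIO since the file is opened in binary mode. The key point is that a file object is always truthy, so `if not f` never raises, whereas the stock `if not data` check raises whenever a chunk boundary yields an empty read.

```python
import os
import tempfile
from io import BytesIO

def read_file_chunk(source, pos, chunk):
    """Read `chunk` bytes of `source` starting at offset `pos`."""
    with open(source, 'rb') as f:
        f.seek(pos)
        data = f.read(chunk)
    # Patched check: the file handle is always truthy, so an empty
    # read past end-of-file no longer raises an error.
    if not f:
        raise IOError('Unable to read data from source: %s' % source)
    return BytesIO(data)

# Demonstration on a throwaway 5-byte file.
fd, path = tempfile.mkstemp()
os.write(fd, b'hello')
os.close(fd)

chunk1 = read_file_chunk(path, 0, 4).read()   # a normal chunk
chunk2 = read_file_chunk(path, 10, 4).read()  # past EOF: empty, but no error
print(chunk1)  # b'hell'
print(chunk2)  # b''
os.remove(path)
```

With the original `if not data` check, the second call would raise on the empty read even though nothing is wrong with the source file.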
Then invoke the patched s4cmd.py directly for the upload:
/pathtodir/s4cmd.py dsync -r forms/ s3://bucket/J/M/
Source: https://stackoverflow.com/questions/39519397/issues-with-s4cmd