Issues with s4cmd

Submitted by 时光怂恿深爱的人放手 on 2019-12-12 05:05:11

Question


I have about 50 GB of data to upload to an S3 bucket, but s3cmd is unreliable and very slow. The sync doesn't seem to work because of a timeout error.

I switched to s4cmd, which works great: it is multi-threaded and fast.

     s4cmd dsync -r -t 1000 --ignore-empty-source forms/ s3://bucket/J/M/

The command above uploads some of the files and then throws this error:

    [Thread Failure] Unable to read data from source: /home/ubuntu/path to file

The source file is an image file, so there is nothing wrong with the file itself.

s4cmd has options such as --retry to restart the command if it fails, but this doesn't seem to work either. If you have come across a way to prevent this error, please share.


Answer 1:


I got it working fine. I'm glad my file uploads are super fast. If you are still using s3cmd I highly recommend you switch to s4cmd!

Download and install s4cmd. Find s4cmd.py and replace the read_file_chunk method with the following:

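      @log_calls
      def read_file_chunk(self, source, pos, chunk):
        '''Read local file chunks'''
        data = None
        with open(source, 'rb') as f:
          f.seek(pos)
          data = f.read(chunk)
        # Note: f is a file object, which is always truthy (even after it is
        # closed), so this check never raises. A read that returns no bytes
        # is passed through as an empty StringIO instead of failing.
        if not f:
          raise Failure('Unable to read data from source: %s' % source)
        return StringIO(data)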

Then run the edited s4cmd.py directly in the upload command, like this:

    /pathtodir/s4cmd.py dsync -r forms/ s3://bucket/J/M/
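
To see what the edited check changes, here is a minimal standalone sketch, not s4cmd's own code. It assumes the failure comes from reads that return no bytes (for example a zero-byte file), which the question does not confirm. The strict flag, the demo helper, and the use of IOError and BytesIO in place of s4cmd's Failure and StringIO are all hypothetical stand-ins for illustration.

```python
import os
import tempfile
from io import BytesIO


def read_file_chunk(source, pos, chunk, strict):
    """Read `chunk` bytes from `source`, starting at byte offset `pos`.

    strict=True  -> raise when the read returns no bytes (a data check).
    strict=False -> test the file object instead, as in the edited method;
                    a file object is always truthy, so this never raises.
    """
    with open(source, 'rb') as f:
        f.seek(pos)
        data = f.read(chunk)
    failed = (not data) if strict else (not f)
    if failed:
        # IOError stands in for s4cmd's Failure exception (assumption).
        raise IOError('Unable to read data from source: %s' % source)
    return BytesIO(data)


def demo():
    """Compare both checks on a zero-byte temporary file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        lenient = read_file_chunk(path, 0, 1024, strict=False).getvalue()
        try:
            read_file_chunk(path, 0, 1024, strict=True)
            strict_raised = False
        except IOError:
            strict_raised = True
        return lenient, strict_raised
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(demo())
```

The trade-off is that with the check on the file object, a genuinely unreadable chunk is no longer reported by this method, so it may be worth verifying the uploaded objects afterwards.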


Source: https://stackoverflow.com/questions/39519397/issues-with-s4cmd
