Uploading Large File to S3 with Ruby Fails with Out of Memory Error, How to Read and Upload in Chunks?

Asked by 面向向阳花 · 2021-01-13 09:32

We are uploading various files to S3 via the Ruby AWS SDK (v2) from a Windows machine. We have tested with Ruby 1.9. Our code works fine except when large files are encountered, at which point we get an out-of-memory error. How can we read and upload the file in chunks?

2 Answers

    北恋
    2021-01-13 10:08

    The v2 AWS SDK for Ruby, the aws-sdk gem, supports streaming objects directly over the network without loading them into memory. Your example requires only a small correction to do this:

    File.open(filepath, 'rb') do |file|
      resp = s3.put_object(
        :bucket => bucket,
        :key => s3key,
        :body => file  # pass the open IO object itself, not its contents
      )
    end
    

    This works because it allows the SDK to call #read on the file object, passing in the number of bytes to read each time. Calling #read on a Ruby IO object, such as a file, without a length argument reads the entire file into memory and returns it as a string. This is what caused your out-of-memory errors.
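
    To make the difference concrete, here is a minimal sketch (the file path and chunk size are hypothetical) of how #read behaves with and without a length argument:

    File.open('large_file.bin', 'rb') do |file|
      chunk = file.read(1024 * 1024)  # reads at most 1 MB and returns it (nil at EOF)
      rest  = file.read               # no argument: reads everything remaining into memory at once
    end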

    That said, the aws-sdk gem provides another, more useful interface for uploading files to Amazon S3. This alternative interface automatically:

    • Uses multipart APIs for large objects
    • Can use multiple threads to upload parts in parallel, improving upload speed
    • Computes MD5s of the data client-side for service-side data integrity checks

    A simple example:

    # notice this uses Resource, not Client
    s3 = Aws::S3::Resource.new(
      :access_key_id => accesskeyid,
      :secret_access_key => accesskey,
      :region => region
    )
    
    s3.bucket(bucket).object(s3key).upload_file(filepath)
    

    This is part of the aws-sdk resource interfaces. There are quite a few helpful utilities in here. The Client class only provides basic API functionality.
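
    If you want to control when the multipart API kicks in, upload_file also accepts options. The sketch below assumes the :multipart_threshold option (available in recent versions of the gem; check the documentation for your release):

    obj = s3.bucket(bucket).object(s3key)
    # switch to the multipart API for files larger than ~100 MB
    # (the option name and its default may differ between aws-sdk releases)
    obj.upload_file(filepath, :multipart_threshold => 100 * 1024 * 1024)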
