Can you upload to S3 using a stream rather than a local file?

迷失自我 2020-11-29 23:17

I need to create a CSV and upload it to an S3 bucket. Since I'm creating the file on the fly, it would be better if I could write it directly to the S3 bucket as it is being created, rather than writing the whole file locally and then uploading it.

5 Answers
  • 2020-11-29 23:37

    According to the docs it's possible:

    s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
    

    so we can use a StringIO buffer in the ordinary way.

    Update: the smart_open lib from @inquiring minds' answer is a better solution.

  • 2020-11-29 23:39

    I found a solution to my question, which I'll post here in case anyone else is interested. I decided to do this as a multipart upload: S3 doesn't support appending to an existing object, so you can't stream to one directly, but a multipart upload lets you send the data in pieces as it is produced. There is also a package that turns a streaming write into a multipart upload, which is what I used: Smart Open.

    import csv
    import io
    
    import smart_open
    
    testDict = [
        {"fieldA": "8", "fieldB": None, "fieldC": "888888888888"},
        {"fieldA": "9", "fieldB": None, "fieldC": "99999999999"},
    ]
    
    fieldnames = ['fieldA', 'fieldB', 'fieldC']
    f = io.StringIO()
    # 'wb' mode expects bytes, so encode what the csv writer produces
    with smart_open.smart_open('s3://dev-test/bar/foo.csv', 'wb') as fout:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        fout.write(f.getvalue().encode('utf-8'))
    
        for row in testDict:
            # reuse the in-memory buffer for each row
            f.seek(0)
            f.truncate(0)
            writer.writerow(row)
            fout.write(f.getvalue().encode('utf-8'))
    
    f.close()
    
  • 2020-11-29 23:41

    We were trying to upload file contents to S3 when they came through as an InMemoryUploadedFile object in a Django request. We ended up doing the following because we didn't want to save the file locally. Hope it helps:

    @action(detail=False, methods=['post'])
    def upload_document(self, request):
        # s3 is a boto3 S3 client, e.g. s3 = boto3.client('s3')
        document = request.data.get('image').file
        s3.upload_fileobj(document, BUCKET_NAME,
                          DESIRED_NAME_OF_FILE_IN_S3,
                          ExtraArgs={"ServerSideEncryption": "aws:kms"})
    
  • 2020-11-29 23:41

    There's an interesting code solution mentioned in a GitHub smart_open issue (#82) that I've been meaning to try out. Copy-pasting here for posterity... looks like boto3 is required:

    import csv
    import gzip
    import io
    
    import boto3
    
    # csv writes text, so build the CSV in a StringIO first
    csv_data = io.StringIO()
    writer = csv.writer(csv_data)
    writer.writerows(my_data)
    
    # then gzip the encoded bytes into a second in-memory stream
    gz_stream = io.BytesIO()
    with gzip.GzipFile(fileobj=gz_stream, mode="w") as gz:
        gz.write(csv_data.getvalue().encode('utf-8'))
    gz_stream.seek(0)
    
    s3 = boto3.client('s3')
    s3.upload_fileobj(gz_stream, bucket_name, key)
    

    This specific example streams to a gzip-compressed S3 object, but the general approach, using the boto3 S3 client's upload_fileobj() method with an in-memory stream rather than a file, should work for the uncompressed case too.

  • To write a string to an S3 object, use:

    # put() takes keyword arguments only, so the string goes in as Body=
    s3.Object('my_bucket', 'my_file.txt').put(Body='Hello there')
    

    So convert the stream to a string (for example with StringIO's getvalue()) and you're there.
