Can you upload to S3 using a stream rather than a local file?

I need to create a CSV and upload it to an S3 bucket. Since I'm creating the file on the fly, it would be better if I could write it directly to the S3 bucket as it is being cr

5 Answers

孤街浪徒

2020-11-29 23:37
According to the docs, it's possible:
```
s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
```
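`Body` accepts bytes or a file-like object, so an in-memory buffer (an `io.BytesIO`, or the encoded contents of a `StringIO`) can be passed in the same way.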

Update: the smart_open library from @inquiring minds' answer is a better solution.
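
As a rough illustration of the `put(Body=...)` route for the CSV case in the question, here is a minimal sketch that builds the CSV in memory and hands it over as bytes. The helper names `csv_to_bytes_buffer` and `put_csv` are made up for this example, and `s3_object` is assumed to be something like `boto3.resource('s3').Object(bucket, key)`.

```python
import csv
import io


def csv_to_bytes_buffer(rows, fieldnames):
    """Render rows (a list of dicts) as CSV into an in-memory bytes buffer."""
    text_buf = io.StringIO()
    writer = csv.DictWriter(text_buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    # S3 object bodies are bytes, so encode the text before handing it over.
    return io.BytesIO(text_buf.getvalue().encode("utf-8"))


def put_csv(s3_object, rows, fieldnames):
    """Upload the CSV through Object.put().

    s3_object is assumed to be a boto3 S3 Object resource (hypothetical
    parameter name); nothing is written to local disk.
    """
    return s3_object.put(Body=csv_to_bytes_buffer(rows, fieldnames))
```

Note that the whole CSV sits in memory before the upload starts, so this suits small files better than large ones.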

深忆病人

2020-11-29 23:39

I found a solution to my question, which I'm posting here in case anyone else is interested. You can't stream to S3 directly, so I decided to send the file as parts of a multipart upload. There is also a package that turns your streamed writes into a multipart upload, which is what I used: smart_open.

```
import csv
import io

import smart_open

testDict = [{
    "fieldA": "8",
    "fieldB": None,
    "fieldC": "888888888888"},
    {
    "fieldA": "9",
    "fieldB": None,
    "fieldC": "99999999999"}]

fieldnames = ['fieldA', 'fieldB', 'fieldC']
f = io.StringIO()
# smart_open.open replaces the deprecated smart_open.smart_open
with smart_open.open('s3://dev-test/bar/foo.csv', 'wb') as fout:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    # The S3 stream is opened in binary mode, so encode each chunk to bytes
    fout.write(f.getvalue().encode('utf-8'))

    for row in testDict:
        # Reset the buffer so only the current row is sent
        f.seek(0)
        f.truncate(0)
        writer.writerow(row)
        fout.write(f.getvalue().encode('utf-8'))

f.close()
```
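
For anyone curious what the multipart route looks like without smart_open, here is a minimal sketch using the boto3 client calls `create_multipart_upload`, `upload_part`, `complete_multipart_upload` and `abort_multipart_upload`. The function name `multipart_upload` and its parameters are made up for this example, `client` is assumed to be `boto3.client('s3')`, and it assumes at least one non-empty chunk is supplied. This is not how smart_open is implemented internally, just the same general idea.

```python
import io

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 requires 5 MiB for every part except the last


def multipart_upload(client, bucket, key, chunks, part_size=MIN_PART_SIZE):
    """Upload an iterable of bytes chunks as one S3 object, part by part.

    client is assumed to be a boto3 S3 client. Returns the number of parts sent.
    """
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    buf = io.BytesIO()

    def flush():
        # Send whatever is buffered as the next part, then empty the buffer.
        part_number = len(parts) + 1
        resp = client.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=part_number, Body=buf.getvalue(),
        )
        parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        buf.seek(0)
        buf.truncate(0)

    try:
        for chunk in chunks:
            buf.write(chunk)
            if buf.tell() >= part_size:
                flush()
        if buf.tell() > 0:
            flush()  # final, possibly short, part
        client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so the already-uploaded parts are not left behind in the bucket.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)
```

Only one part's worth of data is held in memory at a time, which is the main advantage over building the whole file first.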

渐次进展

2020-11-29 23:41

We were trying to upload file contents to S3 when they came through as an InMemoryUploadedFile object in a Django request. We ended up doing the following because we didn't want to save the file locally. Hope it helps:

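```
@action(detail=False, methods=['post'])
def upload_document(self, request):
    # .file is the in-memory stream behind the InMemoryUploadedFile
    document = request.data.get('image').file
    # s3 is a boto3 S3 client; the stream is uploaded without touching disk
    s3.upload_fileobj(
        document,
        BUCKET_NAME,
        DESIRED_NAME_OF_FILE_IN_S3,
        ExtraArgs={"ServerSideEncryption": "aws:kms"},
    )
```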

挽巷

2020-11-29 23:41
There's an interesting code solution mentioned in a GitHub smart_open issue (#82) that I've been meaning to try out. Copy-pasting here for posterity... looks like boto3 is required:
```
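import csv
import gzip
import io

import boto3

# Build the CSV in memory; csv.writer needs a text buffer in Python 3
csv_data = io.StringIO()
writer = csv.writer(csv_data)
writer.writerows(my_data)

# Compress the encoded CSV into a second in-memory stream
gz_stream = io.BytesIO()
with gzip.GzipFile(fileobj=gz_stream, mode="w") as gz:
    gz.write(csv_data.getvalue().encode("utf-8"))
gz_stream.seek(0)  # rewind so the upload starts from the first byte

s3 = boto3.client('s3')
s3.upload_fileobj(gz_stream, bucket_name, key)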
```
This specific example is streaming to a compressed S3 key/file, but it seems like the general approach -- using the boto3 S3 client's upload_fileobj() method in conjunction with a target stream, not a file -- should work.
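
A possible variation on the same idea, sketched under a few assumptions: writing the CSV rows straight through `gzip.open` in text mode avoids keeping a separate uncompressed copy in memory. The helper names `gzip_csv_stream` and `upload_gzip_csv` are invented for this example, `client` is assumed to be `boto3.client('s3')`, and the `ExtraArgs` values are just one reasonable choice of object metadata.

```python
import csv
import gzip
import io


def gzip_csv_stream(rows):
    """Write rows as gzip-compressed CSV into an in-memory stream, rewound to 0."""
    gz_stream = io.BytesIO()
    # gzip.open accepts an existing file object; closing the wrapper finishes
    # the gzip trailer but leaves gz_stream itself open.
    with gzip.open(gz_stream, "wt", encoding="utf-8", newline="") as text:
        csv.writer(text).writerows(rows)
    gz_stream.seek(0)  # rewind so a reader starts from the first byte
    return gz_stream


def upload_gzip_csv(client, rows, bucket, key):
    """Upload the compressed CSV with upload_fileobj (client: boto3 S3 client)."""
    client.upload_fileobj(
        gzip_csv_stream(rows),
        bucket,
        key,
        ExtraArgs={"ContentType": "text/csv", "ContentEncoding": "gzip"},
    )
```

The compressed result is still fully buffered before the upload begins, so this reduces memory use rather than making the upload truly incremental.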

不要未来只要你来

2020-11-29 23:43
To write a string to an S3 object, use:
```
s3.Object('my_bucket', 'my_file.txt').put(Body='Hello there')
```
So convert the stream to a string and you're there. Note that this holds the entire content in memory at once.