Is there a way to stream data directly from a Python request to a MinIO bucket?

走了就别回头了 2021-01-14 03:40

I am trying to make a GET request to a server to retrieve a tiff image. I then want to stream it directly to MinIO using the put_object method in the MinIO python SDK.

2 answers
  • 2021-01-14 04:08

    The MinIO documentation for put_object includes examples of adding a new object to the object storage server, but those examples only show how to upload a file.

    This is the definition of the put_object function:

    put_object(bucket_name, object_name, data, length, content_type='application/octet-stream', metadata=None, progress=None, part_size=5*1024*1024)

    We are interested in the data parameter, which the documentation describes as:

    Any python object implementing io.RawIOBase.

    RawIOBase is the base class for raw binary I/O, and it defines a read method.

    If we use the built-in dir() function to list the attributes of r.content, we can check whether read is among them:

    'read' in dir(r.content)  # evaluates to False

    That is why you get AttributeError: 'bytes' object has no attribute 'read'. r.content is a bytes object, and bytes has no read method.


    You can wrap r.content in a file-like object using io.BytesIO. Strictly speaking, BytesIO derives from io.BufferedIOBase rather than RawIOBase, but it provides the read method that put_object needs. To get the size of the data in bytes, use io.BytesIO(r.content).getbuffer().nbytes.
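
    To make the difference concrete, here is a minimal sketch that needs no network access. The payload variable is a hypothetical stand-in for r.content; it is not taken from the question.

```python
import io

# Hypothetical stand-in for r.content: a few bytes resembling a TIFF header.
payload = b"II*\x00" + b"\x00" * 12

has_read_bytes = hasattr(payload, "read")   # bytes objects expose no read()
stream = io.BytesIO(payload)                # wrap the bytes in a file-like object
has_read_stream = hasattr(stream, "read")   # BytesIO does expose read()
size = stream.getbuffer().nbytes            # total size of the buffer in bytes
first_chunk = stream.read(4)                # reading now works

print(has_read_bytes, has_read_stream, size, first_chunk)
```

    The wrapped object is what put_object can consume, and size is the value you would pass as length.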

    So if you want to upload the raw bytes to your bucket, wrap the bytes object in io.BytesIO:

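    import io
    import requests

    # r.content loads the whole response body into memory,
    # so stream=True is not needed here.
    r = requests.get(url_to_download)
    raw_img = io.BytesIO(r.content)
    raw_img_size = raw_img.getbuffer().nbytes

    # Minio_client is assumed to be an already configured minio.Minio instance.
    Minio_client.put_object("bucket_name", "stream_test.tiff", raw_img, raw_img_size)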
    

    NOTE: The documentation examples read binary data from a file and get its size from the st_size attribute of the stat_result returned by os.stat().

    For data held in memory, io.BytesIO(r.content).getbuffer().nbytes plays the same role as st_size.
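
    As a quick check of that equivalence, the sketch below writes some made-up bytes to a temporary file and compares the size reported by os.stat() with the size of the same bytes in a BytesIO buffer. The payload is hypothetical.

```python
import io
import os
import tempfile

payload = b"example image bytes"  # hypothetical data, stands in for r.content

# Write the bytes to a temporary file so os.stat() has something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    size_on_disk = os.stat(tmp_path).st_size
finally:
    os.remove(tmp_path)

size_in_memory = io.BytesIO(payload).getbuffer().nbytes

# Both values match the plain length of the bytes object.
same = size_on_disk == size_in_memory == len(payload)
print(same)
```

    Since all three values agree, len(r.content) is a shorter way to get the length argument when the data is already in memory.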

  • 2021-01-14 04:20

    You can stream your file directly into a MinIO bucket like this:

    import requests
    
    from pathlib import Path
    from urllib.parse import urlparse
    
    from django.conf import settings
    from django.core.files.storage import default_storage
    
    client = default_storage.client
    object_name = Path(urlparse(url_to_download).path).name
    bucket_name = settings.MINIO_STORAGE_MEDIA_BUCKET_NAME
    
    with requests.get(url_to_download, stream=True) as r:
        content_length = int(r.headers["Content-Length"])
        result = client.put_object(bucket_name, object_name, r.raw, content_length)
    

    Or you can use a Django file field directly:

    with requests.get(url_to_download, stream=True) as r:
        # patch the stream to make django-minio-storage believe
        # it's about to read from a legit file
        r.raw.seek = lambda x: 0
        r.raw.size = int(r.headers["Content-Length"])
        model = MyModel()
        model.file.save(object_name, r.raw, save=True)
    

    The RawIOBase hint from Dinko Pehar was really helpful, thanks a lot. But you have to use response.raw, not response.content: response.content downloads the whole file into memory at once, which is really inconvenient when you are trying to store a large video, for example.
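
    One gap in the snippets above: they assume the server sends a Content-Length header. Below is a minimal sketch of a fallback, assuming a recent minio-py release in which put_object accepts length=-1 together with an explicit part_size (check this against the version you have installed). The helper name put_stream and the PART_SIZE value are my own and hypothetical.

```python
PART_SIZE = 10 * 1024 * 1024  # 10 MiB; an assumed value, tune it for your workload


def put_stream(client, bucket_name, object_name, stream, headers):
    """Upload a file-like stream; use length=-1 when the size is unknown.

    client is assumed to be a configured minio.Minio instance, stream a
    file-like object such as r.raw, and headers the response headers.
    """
    raw_length = headers.get("Content-Length")
    length = int(raw_length) if raw_length is not None else -1
    return client.put_object(
        bucket_name,
        object_name,
        stream,
        length,
        part_size=PART_SIZE,
    )


# Intended usage (not executed here, needs requests and a MinIO server):
#
# with requests.get(url_to_download, stream=True) as r:
#     r.raise_for_status()
#     put_stream(client, bucket_name, object_name, r.raw, r.headers)
```

    Taking the client and the stream as parameters keeps the helper independent of requests and Django, so it can be tried out with any object that has a put_object method.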
