How to upload chunks of a string longer than 2147483647 bytes?

终归单人心 2021-01-11 16:17

I am trying to upload a file of about 5 GB as below, but it throws the error "string longer than 2147483647 bytes". It sounds like there is a limit of 2 GB to

1 answer
  • 2021-01-11 16:42

    Your question has been asked on the requests bug tracker; their suggestion is to use streaming upload. If that doesn't work, you might see if a chunk-encoded request works.
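
    For context, 2147483647 is 2**31 - 1, the largest value a signed 32-bit integer can hold, which is why a single in-memory body of about 5 GB trips the limit. Below is a minimal sketch of checking a file against that limit before deciding how to send it. The names INT32_MAX and needs_streaming are hypothetical helpers for illustration, not part of requests.

```python
import os

# Largest signed 32-bit integer; matches the number in the error message.
INT32_MAX = 2**31 - 1


def needs_streaming(path, limit=INT32_MAX):
    """Return True if the file is too large to send as one in-memory body."""
    return os.path.getsize(path) > limit
```

    Streaming is reasonable for smaller files as well, so this check is only illustrative.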

    [edit]

    Example based on the original code:

    # Using `with` here will handle closing the file implicitly
    with open(attachment_path, 'rb') as file_to_upload:
        r = requests.put(
            "{base}problems/{pid}/{atype}/{path}".format(
                base=self._baseurl,
                # It's better to use consistent naming; search PEP-8 for standard Python conventions.
                pid=problem_id,
                atype=attachment_type,
                # urllib.quote is the Python 2 name; on Python 3 use urllib.parse.quote
                path=urllib.quote(os.path.basename(attachment_path)),
            ),
            headers=headers,
            # Note that you're passing the file object, NOT the contents of the file:
            data=file_to_upload,
            # Hard to say whether this is a good idea with a large file upload
            timeout=300,
        )
    

    I can't guarantee this would run as-is, since I can't realistically test it, but it should be close. The bug tracker comments I linked to also mention that sending multiple headers may cause issues, so if the headers you're specifying are actually necessary, this may not work.

    Regarding chunk encoding: This should be your second choice. Your code was not specifying 'rb' as the mode for open(...), so changing that should probably make the code above work. If not, you could try this.

    def read_in_chunks():
        # 30720 * 30720 bytes is 900 MiB per chunk; if you're going to chunk anyway,
        # smaller chunks than this are probably a better idea
        chunk_size = 30720 * 30720
    
        # I don't know how correct this is; if it doesn't work as expected, you'll need to debug
        with open(attachment_path, 'rb') as file_object:
            while True:
                data = file_object.read(chunk_size)
                if not data:
                    break
                yield data
    
    
    # Same request as above, just using the function to chunk explicitly; see the `data` param
    r = requests.put(
        "{base}problems/{pid}/{atype}/{path}".format(
            base=self._baseurl,
            pid=problem_id,
            atype=attachment_type,
            # urllib.quote is the Python 2 name; on Python 3 use urllib.parse.quote
            path=urllib.quote(os.path.basename(attachment_path)),
        ),
        headers=headers,
        # Call the chunk function here and the request will be chunked as you specify
        data=read_in_chunks(),
        timeout=300,
    )
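
    A parametrized variant of the generator above, as a sketch: it takes the path and chunk size as arguments instead of reading attachment_path from the enclosing scope. The 1 MiB default and the name iter_file_chunks are assumptions for illustration, not something the original code or requests prescribes.

```python
def iter_file_chunks(path, chunk_size=1024 * 1024):
    """Yield successive chunks of at most chunk_size bytes from the file."""
    # Binary mode so the bytes are sent exactly as stored on disk.
    with open(path, "rb") as file_object:
        while True:
            data = file_object.read(chunk_size)
            if not data:
                # An empty read means end of file.
                break
            yield data
```

    Passing iter_file_chunks(attachment_path) as the data argument would take the place of read_in_chunks() in the request above.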
    