An exception “The Content-MD5 you specified did not match what we received”

谎友^ 2021-01-17 17:14

I got an exception I had never seen before while testing my application, which uploads a file from EC2 to S3. The content is:

Exception in thread "Thread-1" com.a


        
5 Answers
  • 2021-01-17 17:33

    I also ran into this error when I was doing something like this:

    InputStream productInputStream = convertImageFileToInputStream(file);
    
    InputStream thumbnailInputStream = generateThumbnail(productInputStream);
    
    String uploadedFileUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productFilename, productInputStream);
    
    String uploadedThumbnailUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productThumbnailFilename, thumbnailInputStream);
    

    The generateThumbnail method was manipulating productInputStream through a third-party library, so the stream was no longer in its original state when it was uploaded. Because I couldn't modify the third-party library, I simply performed the upload first:

    InputStream productInputStream = convertImageFileToInputStream(file);
    
    // do this first... 
    String uploadedFileUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productFilename, productInputStream);
    
    // ...and then this
    InputStream thumbnailInputStream = generateThumbnail(productInputStream);
    
    String uploadedThumbnailUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productThumbnailFilename, thumbnailInputStream);
    

    ... and added this line inside my generateThumbnail method to rewind the stream before reading it again (this relies on the stream supporting mark/reset):

    productInputStream.reset();
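
The same failure mode can be reproduced without S3. Below is a minimal Python sketch, assuming the root cause is that one step consumes a stream that another step still needs. `fake_upload` and `fake_thumbnail` are hypothetical stand-ins, not part of any SDK. It shows two ways out: rewinding the shared stream, or giving each consumer its own stream over the same buffered bytes.

```python
import hashlib
import io


def fake_upload(stream):
    """Hypothetical stand-in for an upload: returns the MD5 of what it reads."""
    return hashlib.md5(stream.read()).hexdigest()


def fake_thumbnail(stream):
    """Hypothetical stand-in for a library call that consumes the stream."""
    data = stream.read()
    return io.BytesIO(data[:4])  # pretend the first bytes are the thumbnail


image_bytes = b"not really an image"
expected_md5 = hashlib.md5(image_bytes).hexdigest()

# Problem: one shared stream, consumed by the thumbnail step first.
shared = io.BytesIO(image_bytes)
fake_thumbnail(shared)
md5_after_consume = fake_upload(shared)  # reads zero bytes

# Option A: rewind the shared stream before reusing it.
shared.seek(0)
md5_after_rewind = fake_upload(shared)

# Option B: give every consumer its own stream over the same bytes.
thumbnail = fake_thumbnail(io.BytesIO(image_bytes))
md5_independent = fake_upload(io.BytesIO(image_bytes))
```

Option B costs one in-memory copy of the file, but it does not depend on the order of the calls or on the stream being rewindable.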
    
  • 2021-01-17 17:43

    Another way to trigger this issue is to run code like this (Python):

    # open in binary mode so the raw bytes are uploaded
    with open(filename, 'rb') as fd:
        self._bucket1.put_object(Key=key, Body=fd)
        self._bucket2.put_object(Key=key, Body=fd)
    

    In this case the file object (fd) is already at the end of the file when the second put_object call runs, so that upload fails with the "Content-MD5" error. To avoid it, move the file position back to the start of the file before the second call:

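    # open in binary mode so the raw bytes are uploaded
    with open(filename, 'rb') as fd:
        self._bucket1.put_object(Key=key, Body=fd)
        fd.seek(0)  # rewind before reusing the same file object
        self._bucket2.put_object(Key=key, Body=fd)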
    

    This way we won't get the aforementioned Boto error.
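
When the same file object goes to more than two destinations, the rewind is easy to forget. Here is a minimal sketch of a helper that rewinds before every upload. `put_to_all` and the `put` callables are hypothetical names. In real code each callable would wrap something like `bucket.put_object(Key=key, Body=fd)`.

```python
import io


def put_to_all(fd, puts):
    """Call each upload function with fd, rewinding to the start first.

    `puts` is a list of callables taking a file-like object; these are
    hypothetical stand-ins for calls such as bucket.put_object(...).
    """
    results = []
    for put in puts:
        fd.seek(0)  # always start from the beginning of the file
        results.append(put(fd))
    return results


# Stand-in uploader that just reports how many bytes it received.
def count_bytes(fd):
    return len(fd.read())


sizes = put_to_all(io.BytesIO(b"hello world"), [count_bytes, count_bytes])
```

Both stand-in uploads see the full 11 bytes, because the helper rewinds before each call rather than relying on the caller to do it.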

  • 2021-01-17 17:43

    FWIW, I've managed to find a completely different way of triggering this problem, which requires a different solution.

    It turns out that if you decide to assign ObjectMetadata to a PutObjectRequest explicitly, for example to specify a cacheControl setting, or a contentType, then the AWS SDK mutates the ObjectMetadata instance to stash the MD5 that it computes for the put request. This means that if you are putting multiple objects, all of which you think should have the same metadata assigned to them, you still need to create a new ObjectMetadata instance for each and every PutObjectRequest. If you don't do this, then it reuses the MD5 computed from the previous put request and you get the MD5 mismatch error on the second object you try to put.

    So, to be explicit, doing something like this will fail on the second iteration:

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentType("text/html");
    for(Put obj: thingsToPut)
    {
        PutObjectRequest por = 
            new PutObjectRequest(bucketName, obj.s3Key, obj.file);
        por = por.withMetadata(metadata);
        PutObjectResult res = s3.putObject(por);
    }
    

    You need to do it like this:

    for(Put obj: thingsToPut)
    {
        ObjectMetadata metadata = new ObjectMetadata(); // <<-- New ObjectMetadata every time!
        metadata.setContentType("text/html");
        PutObjectRequest por =
            new PutObjectRequest(bucketName, obj.s3Key, obj.file);
        por = por.withMetadata(metadata);
        PutObjectResult res = s3.putObject(por);
    }
    
  • 2021-01-17 17:44

    I think I have solved my problem. I finally found that some of my files were actually changing during the upload. The files are generated by another thread, so generation and upload ran at the same time. A file takes a while to generate, so it could be picked up for upload while it was still being written.

    AmazonS3Client computes the MD5 of the file at the start of the upload and then uploads the whole file to S3. By then the file differs from what it was when the MD5 was computed, so the MD5 no longer matches. I changed my program to be single-threaded, and the problem never came back.
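
Going single-threaded is one fix. Another option keeps the generator thread: write each file under a temporary name and rename it to its final name only once it is complete. The Python sketch below assumes each file is written once, that the uploader only ever opens the final name, and that both paths are on the same filesystem. `write_atomically` is a hypothetical helper, not something from the original program.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write data to a temporary file, then rename it to path.

    A reader (for example an uploader thread) that only opens `path`
    never sees a half-written file under that name.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic when both paths are on one filesystem
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "report.bin")
write_atomically(target, b"complete contents")
with open(target, "rb") as f:
    published = f.read()
leftovers = [n for n in os.listdir(workdir) if n.endswith(".tmp")]
```

The uploader can then treat the appearance of the final file name as the signal that the file is ready.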

  • 2021-01-17 17:57

    I ran into this problem too. Here is how I solved it:

    I have a microservice that processes AWS SQS Messages. Each message would create multiple temporary files that would have to be uploaded to S3.

    The issue was that the temporary files were named with fixed names without any salt added to them.

    So while one message was being processed, another message could overwrite a file that was still waiting to be uploaded.

    I fixed it by adding a random salt (this can be a UUID or the current time in millis, depending on what you want) to the file names. After that the files were no longer overwritten and were uploaded to S3 successfully.
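
The salting idea can be sketched in a few lines of Python. `unique_temp_path` is a hypothetical helper name, and the directory and base name are placeholders.

```python
import os
import tempfile
import uuid


def unique_temp_path(directory, base_name):
    """Return a path in directory whose name includes a random UUID."""
    stem, ext = os.path.splitext(base_name)
    return os.path.join(directory, f"{stem}-{uuid.uuid4().hex}{ext}")


workdir = tempfile.mkdtemp()
path_a = unique_temp_path(workdir, "export.csv")
path_b = unique_temp_path(workdir, "export.csv")
```

A UUID is the safer choice of salt: two messages handled within the same millisecond would still collide on a timestamp. The standard library's `tempfile.mkstemp` is another way to get a unique file name.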
