Delete all versions of an object in S3 using Python?

既然无缘 2021-02-05 17:28

I have a versioned bucket and would like to delete the object (and all of its versions) from the bucket. However, when I try to delete the object from the console, S3 simply add

9 Answers
  • 2021-02-05 17:42

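    The other answers delete objects one at a time. It is more efficient to batch the deletes with the boto3 delete_objects call. The code below collects every version and delete marker in the bucket and deletes them in batches of 1000, the maximum that delete_objects accepts per request. Note that, as written, it empties the whole bucket rather than a single object: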

    import boto3
    
    bucket = 'bucket-name'
    s3_client = boto3.client('s3')
    object_response_paginator = s3_client.get_paginator('list_object_versions')
    
    delete_marker_list = []
    version_list = []
    
    for object_response_itr in object_response_paginator.paginate(Bucket=bucket):
        if 'DeleteMarkers' in object_response_itr:
            for delete_marker in object_response_itr['DeleteMarkers']:
                delete_marker_list.append({'Key': delete_marker['Key'], 'VersionId': delete_marker['VersionId']})
    
        if 'Versions' in object_response_itr:
            for version in object_response_itr['Versions']:
                version_list.append({'Key': version['Key'], 'VersionId': version['VersionId']})
    
    for i in range(0, len(delete_marker_list), 1000):
        response = s3_client.delete_objects(
            Bucket=bucket,
            Delete={
                'Objects': delete_marker_list[i:i+1000],
                'Quiet': True
            }
        )
        print(response)
    
    for i in range(0, len(version_list), 1000):
        response = s3_client.delete_objects(
            Bucket=bucket,
            Delete={
                'Objects': version_list[i:i+1000],
                'Quiet': True
            }
        )
        print(response)
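
The question asks about a single object, while the code above clears the whole bucket. A minimal sketch of the same batching idea narrowed to one key might look like the following. The function name delete_key_completely and its parameters are hypothetical, and s3_client is assumed to be an ordinary boto3.client('s3'). Because Prefix also matches longer keys that start with the same characters, the sketch compares the full key before deleting.

```python
def delete_key_completely(s3_client, bucket, key, batch_size=1000):
    """Delete every version and delete marker of exactly one key.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the number of entries submitted for deletion.
    """
    paginator = s3_client.get_paginator("list_object_versions")
    targets = []
    for page in paginator.paginate(Bucket=bucket, Prefix=key):
        # Either field can be absent from a page, so default to an empty list.
        for field in ("Versions", "DeleteMarkers"):
            for entry in page.get(field, []):
                # Prefix also matches longer keys, so compare the full key.
                if entry["Key"] == key:
                    targets.append({"Key": key, "VersionId": entry["VersionId"]})

    # delete_objects accepts at most 1000 entries per request.
    for start in range(0, len(targets), batch_size):
        response = s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": targets[start:start + batch_size], "Quiet": True},
        )
        # In quiet mode the response only reports failures.
        if response.get("Errors"):
            raise RuntimeError(f"Some deletions failed: {response['Errors']}")
    return len(targets)
```

Raising on a non-empty Errors list is one possible design choice; printing the response, as the answer does, works as well if partial failures are acceptable.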
    
  • 2021-02-05 17:42

    This script deletes all versions of every object whose key matches a prefix:

    import boto3
    
    bucket_name = "bucket-name"  # placeholder: replace with your bucket
    s3 = boto3.resource("s3")
    client = boto3.client("s3")
    s3_bucket = s3.Bucket(bucket_name)
    for obj in s3_bucket.objects.filter(Prefix=""):
    
        response = client.list_object_versions(Bucket=bucket_name, Prefix=obj.key)
    
        while "Versions" in response:
            to_delete = [
                {"Key": ver["Key"], "VersionId": ver["VersionId"]}
                for ver in response["Versions"]
            ]
    
            delete = {"Objects": to_delete}
    
            client.delete_objects(Bucket=bucket_name, Delete=delete)
            response = client.list_object_versions(Bucket=bucket_name, Prefix=obj.key)
    
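        # Caution: in a versioned bucket, delete_object without a VersionId
        # adds a delete marker instead of removing data.
        client.delete_object(Bucket=bucket_name, Key=obj.key)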
    
  • 2021-02-05 17:44

    As a supplement to @jarmod's answer, here is a workaround I developed for "hard deleting" an object (including its delete markers):

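    import boto3
    
    # Module-level client, so the delete loop below can use it as well.
    s3 = boto3.client('s3')
    
    def get_all_versions(bucket, filename):
        """Return the version IDs of every version and delete marker of filename."""
        results = []
        # One call returns at most 1000 entries; Prefix narrows the listing.
        response = s3.list_object_versions(Bucket=bucket, Prefix=filename)
        for k in ["Versions", "DeleteMarkers"]:
            # Either field may be absent, so default to an empty list.
            to_delete = [r["VersionId"] for r in response.get(k, []) if r["Key"] == filename]
            # Extend inside the loop so both fields are collected.
            results.extend(to_delete)
        return results
    
    bucket = "YOUR BUCKET NAME"
    file = "YOUR FILE"
    
    for version in get_all_versions(bucket, file):
        s3.delete_object(Bucket=bucket, Key=file, VersionId=version)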
    
  • 2021-02-05 17:45

    You can use object_versions.

    import boto3
    from typing import Optional
    
    def delete_all_versions(bucket_name: str, prefix: Optional[str]):
        s3 = boto3.resource('s3')
        bucket = s3.Bucket(bucket_name)
        if prefix is None:
            bucket.object_versions.delete()
        else:
            bucket.object_versions.filter(Prefix=prefix).delete()
    
    delete_all_versions("my_bucket", None) # empties the entire bucket
    delete_all_versions("my_bucket", "my_prefix/") # deletes all objects matching the prefix (can be only one if only one matches)
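
Since object_versions.delete() is irreversible, it can help to look at what the same collection contains before deleting. The following is a rough sketch under that assumption: preview_versions is a hypothetical helper, and bucket is assumed to be a boto3 s3.Bucket resource like the one created in the answer. It relies on each item in the collection exposing object_key and id, the identifiers of the boto3 ObjectVersion resource.

```python
def preview_versions(bucket, prefix=None):
    """Return (key, version_id) pairs from the bucket's object_versions collection.

    bucket is assumed to be a boto3 s3.Bucket resource. Nothing is deleted here.
    """
    if prefix is None:
        collection = bucket.object_versions.all()
    else:
        collection = bucket.object_versions.filter(Prefix=prefix)
    # Each item is expected to expose object_key and id.
    return [(version.object_key, version.id) for version in collection]
```

For example, printing preview_versions(boto3.resource('s3').Bucket('my_bucket'), 'my_prefix/') before calling delete_all_versions shows whether the prefix matches more keys than intended.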
    
  • 2021-02-05 18:00

    You can delete an object with all of its versions using the following code:

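    import boto3
    
    # aws_access_key_id and aws_secret_access_key are assumed to be defined elsewhere.
    session = boto3.Session(aws_access_key_id=aws_access_key_id,
                            aws_secret_access_key=aws_secret_access_key)
    
    bucket_name = 'bucket_name'
    object_name = 'object_name'
    
    s3 = session.client('s3')
    
    versions = s3.list_object_versions(Bucket=bucket_name, Prefix=object_name)
    # Either field is absent when nothing matches, so default to an empty list.
    # Delete markers are included so the key disappears completely.
    version_list = versions.get('Versions', []) + versions.get('DeleteMarkers', [])
    for version in version_list:
        # Prefix can also match longer keys; only touch the exact key.
        if version.get('Key') != object_name:
            continue
        versionId = version.get('VersionId')
        s3.delete_object(Bucket=bucket_name, Key=object_name, VersionId=versionId)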
    
  • 2021-02-05 18:03

    This post was super helpful. Without it we would have spent a tremendous amount of time cleaning up our S3 folders.

    We had a requirement to clean up specific folders only, so I tried the following code and it worked like a charm. Note that list_object_versions returns at most 1000 entries per call, so I repeat the list-and-delete step 10 times to handle more than that. Feel free to change the number of iterations as you wish.

    import boto3
    session = boto3.Session(aws_access_key_id='<YOUR ACCESS KEY>', aws_secret_access_key='<YOUR SECRET KEY>')
    
    bucket_name = '<BUCKET NAME>'
    object_name = '<KEY NAME>'
    
    s3 = session.client('s3')
    
    # Each listing returns at most 1000 entries, so repeat list-and-delete several times.
    for i in range(10):
        versions = s3.list_object_versions(Bucket=bucket_name, Prefix=object_name)
        # print(versions)
        # 'Versions' is absent once nothing is left, so default to an empty list.
        version_list = versions.get('Versions', [])
        for version in version_list:
            keyName = version.get('Key')
            versionId = version.get('VersionId')
            print(keyName + ':' + versionId)
            s3.delete_object(Bucket=bucket_name, Key=keyName, VersionId=versionId)
        # 'DeleteMarkers' can be absent as well.
        marker_list = versions.get('DeleteMarkers', [])
        # print(marker_list)
        for marker in marker_list:
            keyName1 = marker.get('Key')
            versionId1 = marker.get('VersionId')
            print(keyName1 + ':' + versionId1)
            s3.delete_object(Bucket=bucket_name, Key=keyName1, VersionId=versionId1)
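
The fixed count of 10 iterations is a guess at how many pages exist. One possible alternative, sketched here under the assumption that s3_client is a boto3 S3 client, is to follow the IsTruncated, NextKeyMarker and NextVersionIdMarker fields of the list_object_versions response until the listing is exhausted. The generator name iter_version_entries is hypothetical.

```python
def iter_version_entries(s3_client, bucket, prefix):
    """Yield (key, version_id) for every version and delete marker under prefix.

    s3_client is assumed to be a boto3 S3 client, e.g. session.client("s3").
    """
    marker = {}
    while True:
        page = s3_client.list_object_versions(Bucket=bucket, Prefix=prefix, **marker)
        # Either field can be absent from a page, so default to an empty list.
        for field in ("Versions", "DeleteMarkers"):
            for entry in page.get(field, []):
                yield entry["Key"], entry["VersionId"]
        if not page.get("IsTruncated"):
            break
        # Resume the next request where this page stopped.
        marker = {"KeyMarker": page["NextKeyMarker"]}
        if "NextVersionIdMarker" in page:
            marker["VersionIdMarker"] = page["NextVersionIdMarker"]
```

Each yielded pair can then be passed to s3.delete_object(Bucket=bucket_name, Key=key, VersionId=version_id), as in the loop above.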
    