How do I delete a versioned bucket in AWS S3 using the CLI?

淺唱寂寞╮ submitted on 2019-12-02 16:36:19

One way to do it is to iterate through the versions and delete them. That is a bit tricky on the CLI, but since you mentioned Java, that would be more straightforward:

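import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.iterable.S3Versions;
import com.amazonaws.services.s3.model.BucketVersioningConfiguration;
import com.amazonaws.services.s3.model.S3VersionSummary;
import com.amazonaws.services.s3.model.SetBucketVersioningConfigurationRequest;
import java.io.ByteArrayInputStream;
import java.util.UUID;

AmazonS3Client s3 = new AmazonS3Client();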
String bucketName = "deleteversions-"+UUID.randomUUID();

//Creates Bucket
s3.createBucket(bucketName);

//Enable Versioning
BucketVersioningConfiguration configuration = new BucketVersioningConfiguration(BucketVersioningConfiguration.ENABLED);
s3.setBucketVersioningConfiguration(new SetBucketVersioningConfigurationRequest(bucketName, configuration ));

//Puts versions
s3.putObject(bucketName, "some-key",new ByteArrayInputStream("some-bytes".getBytes()), null);
s3.putObject(bucketName, "some-key",new ByteArrayInputStream("other-bytes".getBytes()), null);

//Removes all versions
for ( S3VersionSummary version : S3Versions.inBucket(s3, bucketName) ) {
    String key = version.getKey();
    String versionId = version.getVersionId();          
    s3.deleteVersion(bucketName, key, versionId);
}

//Removes the bucket
s3.deleteBucket(bucketName);
System.out.println("Done!");

You can also batch delete calls for efficiency if needed.
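
As a rough illustration of that batching idea, here is a minimal Python sketch. It assumes the versions have already been collected as a list of Key/VersionId dictionaries; the function name delete_batches and the sample keys are made up for this example. It only builds the request payloads, 1000 entries at a time, since a multi-object delete request accepts at most 1000 keys.

```python
from typing import Dict, Iterator, List

# S3's multi-object delete accepts at most 1000 keys per request.
MAX_KEYS_PER_REQUEST = 1000


def delete_batches(
    versions: List[Dict[str, str]], size: int = MAX_KEYS_PER_REQUEST
) -> Iterator[Dict[str, object]]:
    """Yield Delete payloads covering `versions` in chunks of at most `size`."""
    for start in range(0, len(versions), size):
        chunk = versions[start:start + size]
        yield {"Objects": chunk, "Quiet": True}


if __name__ == "__main__":
    # Hypothetical data: 2500 versions of made-up keys.
    sample = [{"Key": f"key-{i}", "VersionId": f"v{i}"} for i in range(2500)]
    print([len(batch["Objects"]) for batch in delete_batches(sample)])
```

With boto3, each payload could then be passed as the Delete argument of the S3 client's delete_objects call.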

Abe Voelker

I ran into the same limitation of the AWS CLI. I found the easiest solution to be to use Python and boto3:

#!/usr/bin/env python

BUCKET = 'your-bucket-here'

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET)
bucket.object_versions.delete()

# if you want to delete the now-empty bucket as well, uncomment this line:
#bucket.delete()

A previous version of this answer used boto but that solution had performance issues with large numbers of keys as Chuckles pointed out.

Using boto3 it's even easier than with the proposed boto solution to delete all object versions in an S3 bucket:

#!/usr/bin/env python
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('your-bucket-name')
bucket.object_versions.all().delete()

This also works fine for very large numbers of object versions, although it might take some time in that case.

Cheers

You can delete all the objects in the versioned s3 bucket. But I don't know how to delete specific objects.

$ aws s3api delete-objects \
      --bucket <value> \
      --delete "$(aws s3api list-object-versions \
      --bucket <value> | \
      jq '{Objects: [.Versions[] | {Key:.Key, VersionId : .VersionId}], Quiet: false}')"

Alternatively without jq:

$ aws s3api delete-objects \
    --bucket ${bucket_name} \
    --delete "$(aws s3api list-object-versions \
    --bucket "${bucket_name}" \
    --output=json \
    --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}')"
Nitin

Here is a one-liner you can cut and paste into the command line to delete all versions and delete markers (it requires the AWS CLI tools; set the BUCKET_TO_PERGE variable to your bucket name first):

echo '#!/bin/bash' > deleteBucketScript.sh \
&& aws --output text s3api list-object-versions --bucket $BUCKET_TO_PERGE \
| grep -E "^VERSIONS" |\
awk '{print "aws s3api delete-object --bucket $BUCKET_TO_PERGE --key "$4" --version-id "$8";"}' >> \
deleteBucketScript.sh && . deleteBucketScript.sh; rm -f deleteBucketScript.sh; echo '#!/bin/bash' > \
deleteBucketScript.sh && aws --output text s3api list-object-versions --bucket $BUCKET_TO_PERGE \
| grep -E "^DELETEMARKERS" | grep -v "null" \
| awk '{print "aws s3api delete-object --bucket $BUCKET_TO_PERGE --key "$3" --version-id "$5";"}' >> \
deleteBucketScript.sh && . deleteBucketScript.sh; rm -f deleteBucketScript.sh;

then you could use:

aws s3 rb s3://bucket-name --force

chuckwired

I ran into issues with Abe's solution because the list_buckets generator is used to build a massive list called all_keys; I waited an hour and it never completed. This tweak seems to work better for me. I had close to a million objects in my bucket and counting!

import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket("your-bucket-name-here")

chunk_counter = 0  # purely informational progress counter
keys = []
for key in bucket.list_versions():
    keys.append(key)
    if len(keys) >= 1000:  # flush in chunks of 1000
        bucket.delete_keys(keys)
        chunk_counter += 1
        keys = []
        print("Another 1000 done.... {n} chunks so far".format(n=chunk_counter))

# delete whatever is left over from the last partial chunk
if keys:
    bucket.delete_keys(keys)

#bucket.delete() #as per usual uncomment if you're sure!

Hopefully this helps anyone else encountering this S3 nightmare!

Tiger peng
  1. To delete specific object(s), use a jq filter.
  2. You may need to clean up the 'DeleteMarkers', not just the 'Versions'.
  3. By using $() instead of ``, you can embed variables for bucket-name and key-value.
aws s3api delete-objects --bucket bucket-name --delete "$(aws s3api list-object-versions --bucket bucket-name | jq -M '{Objects: [.["Versions","DeleteMarkers"][]|select(.Key == "key-value")| {Key:.Key, VersionId : .VersionId}], Quiet: false}')"
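
For anyone who prefers not to depend on jq, the same filtering could be sketched in Python. This is only an illustration: build_delete_payload is a made-up helper name, and the listing below is invented sample data in the shape of the list-object-versions JSON output. It gathers entries from both Versions and DeleteMarkers and optionally keeps only one key.

```python
import json
from typing import Any, Dict, Optional


def build_delete_payload(listing: Dict[str, Any], key: Optional[str] = None) -> Dict[str, Any]:
    """Collect Versions and DeleteMarkers from a list-object-versions result."""
    objects = []
    for section in ("Versions", "DeleteMarkers"):
        # A section may be missing or null when it has no entries.
        for entry in listing.get(section) or []:
            if key is None or entry["Key"] == key:
                objects.append({"Key": entry["Key"], "VersionId": entry["VersionId"]})
    return {"Objects": objects, "Quiet": False}


if __name__ == "__main__":
    # Made-up listing in the shape of `aws s3api list-object-versions` output.
    listing = {
        "Versions": [
            {"Key": "a.txt", "VersionId": "v1"},
            {"Key": "b.txt", "VersionId": "v2"},
        ],
        "DeleteMarkers": [{"Key": "a.txt", "VersionId": "v3"}],
    }
    print(json.dumps(build_delete_payload(listing, key="a.txt")))
```

The printed JSON has the same structure as the --delete argument used in the command above.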

By far the easiest method I've found is to use this CLI tool, s3wipe. It's provided as a docker container so you can use it like so:

$ docker run -it --rm slmingol/s3wipe --help
usage: s3wipe [-h] --path PATH [--id ID] [--key KEY] [--dryrun] [--quiet]
              [--batchsize BATCHSIZE] [--maxqueue MAXQUEUE]
              [--maxthreads MAXTHREADS] [--delbucket] [--region REGION]

Recursively delete all keys in an S3 path

optional arguments:
  -h, --help               show this help message and exit
  --path PATH              S3 path to delete (e.g. s3://bucket/path)
  --id ID                  Your AWS access key ID
  --key KEY                Your AWS secret access key
  --dryrun                 Don't delete. Print what we would have deleted
  --quiet                  Suprress all non-error output
  --batchsize BATCHSIZE    # of keys to batch delete (default 100)
  --maxqueue MAXQUEUE      Max size of deletion queue (default 10k)
  --maxthreads MAXTHREADS  Max number of threads (default 100)
  --delbucket              If S3 path is a bucket path, delete the bucket also
  --region REGION          Region of target S3 bucket. Default vaue `us-
                           east-1`

Example

Here's an example where I'm deleting all the versioned objects in a bucket and then deleting the bucket:

$ docker run -it --rm slmingol/s3wipe \
   --id $(aws configure get default.aws_access_key_id) \
   --key $(aws configure get default.aws_secret_access_key) \
   --path s3://bw-tf-backends-aws-example-logs \
   --delbucket
[2019-02-20@03:39:16] INFO: Deleting from bucket: bw-tf-backends-aws-example-logs, path: None
[2019-02-20@03:39:16] INFO: Getting subdirs to feed to list threads
[2019-02-20@03:39:18] INFO: Done deleting keys
[2019-02-20@03:39:18] INFO: Bucket is empty.  Attempting to remove bucket

How it works

There's a bit to unpack here but the above is doing the following:

  • docker run -it --rm slmingol/s3wipe - runs the s3wipe container interactively and deletes it after each execution
  • --id & --key - pass in our access key ID and secret access key
  • aws configure get default.aws_access_key_id - retrieves our key id
  • aws configure get default.aws_secret_access_key - retrieves our key secret
  • --path s3://bw-tf-backends-aws-example-logs - bucket that we want to delete
  • --delbucket - deletes bucket once emptied

https://gist.github.com/wknapik/191619bfa650b8572115cd07197f3baf

#!/usr/bin/env bash

set -eEo pipefail
shopt -s inherit_errexit >/dev/null 2>&1 || true

if [[ ! "$#" -eq 2 || "$1" != --bucket ]]; then
    echo -e "USAGE: $(basename "$0") --bucket <bucket>"
    exit 2
fi

# $@ := bucket_name
empty_bucket() {
    local -r bucket="${1:?}"
    for object_type in Versions DeleteMarkers; do
        local opt=() next_token=""
        while [[ "$next_token" != null ]]; do
            page="$(aws s3api list-object-versions --bucket "$bucket" --output json --max-items 1000 "${opt[@]}" \
                        --query="[{Objects: ${object_type}[].{Key:Key, VersionId:VersionId}}, NextToken]")"
            objects="$(jq -r '.[0]' <<<"$page")"
            next_token="$(jq -r '.[1]' <<<"$page")"
            case "$(jq -r .Objects <<<"$objects")" in
                '[]'|null) break;;
                *) opt=(--starting-token "$next_token")
                   aws s3api delete-objects --bucket "$bucket" --delete "$objects";;
            esac
        done
    done
}

empty_bucket "${2#s3://}"

E.g. empty_bucket.sh --bucket foo

This will delete all object versions and delete markers in a bucket in batches of 1000. Afterwards, the bucket can be deleted with aws s3 rb s3://foo.

Requires bash, awscli and jq.

The bash script found at https://gist.github.com/weavenet/f40b09847ac17dd99d16 worked as-is for me.

I saved the script as delete_all_versions.sh and then simply ran:

./delete_all_versions.sh my_foobar_bucket

and that worked without a flaw.

Did not need python or boto or anything.

This works for me, possibly because I am running later versions of the tools, and it copes with more than 1000 items. It has been running against a couple of million files now. However, it's still not finished after half a day, and there is no means to validate progress in the AWS GUI =/

# Set bucket name to clearout
BUCKET = 'bucket-to-clear'

import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET)

max_len         = 1000      # max 1000 items at one req
chunk_counter   = 0         # just to keep track
deleted         = 0         # number of items submitted for deletion
keys            = []        # collect to delete

# clear files
def clearout():
    global chunk_counter
    global deleted
    global keys
    result = bucket.delete_objects(Delete=dict(Objects=keys))

    if result["ResponseMetadata"]["HTTPStatusCode"] != 200:
        print("Issue with response")
        print(result)

    chunk_counter += 1
    deleted += len(keys)
    keys = []
    print(". {n} chunks so far".format(n=chunk_counter))

# start
for key in bucket.object_versions.all():
    item = {'Key': key.object_key, 'VersionId': key.id}
    keys.append(item)
    if len(keys) >= max_len:
        clearout()

# make sure last files are cleared as well
if len(keys) > 0:
    clearout()

print("")
# report the real total; the last chunk is usually smaller than max_len
print("Done, {n} items deleted".format(n=deleted))
#bucket.delete() #as per usual uncomment if you're sure!
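
Since progress is hard to check in the console, one possible way to see how much is left is to count the remaining versions and delete markers from a second terminal. The sketch below is only an illustration: count_remaining is a made-up helper name, and it assumes it is handed a boto3 S3 client (for example boto3.client('s3')) with working credentials.

```python
from typing import Any, Dict


def count_remaining(client: Any, bucket: str) -> Dict[str, int]:
    """Count object versions and delete markers still present in `bucket`."""
    counts = {"Versions": 0, "DeleteMarkers": 0}
    # The paginator follows the listing markers across pages for us.
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        for section in counts:
            # A section is simply absent from a page that has no such entries.
            counts[section] += len(page.get(section, []))
    return counts
```

Calling count_remaining(boto3.client('s3'), BUCKET) returns a dictionary with the two counts; both reach zero once the bucket is ready to be removed. On a bucket with millions of versions the count itself takes a while, because it has to list everything.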

For those using multiple profiles via ~/.aws/config:

import boto3

PROFILE = "my_profile"
BUCKET = "my_bucket"

session = boto3.Session(profile_name = PROFILE)
s3 = session.resource('s3')
bucket = s3.Bucket(BUCKET)
bucket.object_versions.delete()

I found the other answers either incomplete or requiring external dependencies to be installed (like boto), so here is one that is inspired by those but goes a little deeper.

As documented in Working with Delete Markers, before a versioned bucket can be removed, all of its object versions and delete markers must be permanently deleted, which is a 2-step process:

  1. delete every object version by its version ID (a plain delete without a version ID only adds a delete marker and does not actually remove anything)
  2. complete the cleanup by deleting all the delete marker objects

Here is the pure CLI solution that worked for me (inspired by the other answers):

#!/usr/bin/env bash

bucket_name=...

del_s3_bucket_obj()
{
    local bucket_name=$1
    local obj_type=$2
    local query="{Objects: $obj_type[].{Key:Key,VersionId:VersionId}}"
    local s3_objects=$(aws s3api list-object-versions --bucket ${bucket_name} --output=json --query="$query")
    if ! (echo $s3_objects | grep -q '"Objects": null'); then
        aws s3api delete-objects --bucket "${bucket_name}" --delete "$s3_objects"
    fi
}

del_s3_bucket_obj ${bucket_name} 'Versions'
del_s3_bucket_obj ${bucket_name} 'DeleteMarkers'

Once this is done, the following will work:

aws s3 rb "s3://${bucket_name}"

Not sure how it will fare with 1000+ objects though, if anyone can report that would be awesome.
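
For buckets with more than 1000 versions, one approach that should avoid the per-request key limit is to work through the listing page by page. Below is a minimal Python sketch of that idea, not a tested replacement for the script above: purge_versions and MAX_KEYS are names invented for this example, and the function assumes it is given a boto3 S3 client.

```python
from typing import Any

MAX_KEYS = 1000  # upper bound on keys in one multi-object delete request


def purge_versions(client: Any, bucket: str) -> int:
    """Submit every object version and delete marker in `bucket` for deletion."""
    submitted = 0
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        objects = [
            {"Key": entry["Key"], "VersionId": entry["VersionId"]}
            for section in ("Versions", "DeleteMarkers")
            for entry in page.get(section, [])
        ]
        # Slice defensively so no single request exceeds the key limit.
        for start in range(0, len(objects), MAX_KEYS):
            chunk = objects[start:start + MAX_KEYS]
            client.delete_objects(Bucket=bucket, Delete={"Objects": chunk, "Quiet": True})
            submitted += len(chunk)
    return submitted
```

It could be called as purge_versions(boto3.client('s3'), 'my-bucket'), where my-bucket is a placeholder, followed by the aws s3 rb command shown above. The return value is the number of entries submitted for deletion; the sketch does not inspect per-key errors in the responses.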
