I have an Amazon S3 bucket that has tens of thousands of filenames in it. What's the easiest way to get a text file that lists all the filenames in the bucket?
The command below will get all the file names from your AWS S3 bucket and write them to a text file in your current directory:
aws s3 ls s3://Bucketdirectory/Subdirectory/ | cat >> FileNames.txt
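If you'd rather drive that same command from Python instead of the shell, here's a minimal sketch using the standard library; the bucket path and output filename are just the ones from the command above:

import subprocess

# Run the same "aws s3 ls" command and append its output to FileNames.txt
with open("FileNames.txt", "a") as out:
    subprocess.run(
        ["aws", "s3", "ls", "s3://Bucketdirectory/Subdirectory/"],
        stdout=out,
        check=True,  # raise if the AWS CLI exits with an error
    )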
Use plumbum to wrap the CLI and you will get a clear syntax:
import plumbum as pb
folders = pb.local['aws']('s3', 'ls')
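Building on that, a minimal sketch (the bucket name and output filename are just examples) that lists a whole bucket recursively and lets plumbum handle the redirection into a text file:

import plumbum as pb

aws = pb.local['aws']

# Capture the recursive listing as a string...
listing = aws('s3', 'ls', 's3://my-bucket', '--recursive')

# ...or redirect it straight into a file, shell-style
(aws['s3', 'ls', 's3://my-bucket', '--recursive'] > 'FileNames.txt')()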
I know it's an old topic, but I'd like to contribute too.
With a newer version of boto3 and Python, you can get the files as follows:
import boto3

client = boto3.client('s3')
response = client.list_objects(Bucket=BUCKET_NAME)
for content in response["Contents"]:
    key = content["Key"]  # note the capital "K" in the response
Keep in mind that this solution does not handle pagination.
For more information: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects
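If the bucket really holds tens of thousands of keys, list_objects stops at 1,000 per call, so the usual workaround is a paginator. A minimal sketch, where the bucket name and output filename are placeholders:

import boto3

client = boto3.client('s3')
paginator = client.get_paginator('list_objects_v2')

with open('FileNames.txt', 'w') as f:
    for page in paginator.paginate(Bucket='my-bucket'):
        # "Contents" is absent for empty pages, hence the default
        for content in page.get('Contents', []):
            f.write(content['Key'] + '\n')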
First make sure you are on an instance terminal and that the IAM identity you are using has full access to S3. For example, I used an EC2 instance.
Install the AWS CLI:
pip3 install awscli
Then configure the AWS CLI:
aws configure
Then fill in your credentials, for example:
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: json (or just press enter)
Now, see all buckets:
aws s3 ls
Store all bucket names:
aws s3 ls > output.txt
See the full file structure in a bucket:
aws s3 ls bucket-name --recursive
Store the file structure of a bucket:
aws s3 ls bucket-name --recursive > file_Structure.txt
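The --recursive listing also contains date, time and size columns. If you only want the key names, here is a small post-processing sketch; the filenames are just examples, and it assumes the usual four-column layout of the output:

# Strip the date/time/size columns from the "aws s3 ls --recursive" output,
# keeping only the object keys (assumes "date time size key" per line).
with open('file_Structure.txt') as src, open('FileNames.txt', 'w') as dst:
    for line in src:
        parts = line.split(None, 3)
        if len(parts) == 4:
            dst.write(parts[3].rstrip('\n') + '\n')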
Hope this helps.
Code in Python using the awesome "boto" lib. The code returns a list of files in a bucket and also handles exceptions for missing buckets.
import boto

def list_bucket_filenames():  # illustrative wrapper, so the return below is valid
    conn = boto.connect_s3(<ACCESS_KEY>, <SECRET_KEY>)
    try:
        bucket = conn.get_bucket(<BUCKET_NAME>, validate=True)
    except boto.exception.S3ResponseError as e:
        do_something()  # The bucket does not exist, choose how to deal with it or raise the exception
    return [key.name.encode("utf-8") for key in bucket.list()]
Don't forget to replace the < PLACE_HOLDERS > with your values.
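A quick usage sketch, assuming the wrapper above is called list_bucket_filenames() and the placeholders have been filled in; the output filename is just an example:

# Hypothetical usage of the function above, dumping the result to a text file.
names = list_bucket_filenames()
with open("FileNames.txt", "w") as out:
    for name in names:
        out.write(name.decode("utf-8") + "\n")  # the keys were utf-8 encoded above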
There are a couple of ways you can go about it. Using Python:
import boto3
session = boto3.Session(aws_access_key_id, aws_secret_access_key)
s3 = session.resource('s3')
bucketName = 'testbucket133'
bucket = s3.Bucket(bucketName)
for obj in bucket.objects.all():
print(obj.key)
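If you only need a sub-folder, or want the names in a text file rather than printed, the same resource API works; the prefix and output filename below are illustrative, and the credential variables are placeholders exactly as in the snippet above. objects.filter()/objects.all() page through the whole listing for you:

import boto3

session = boto3.Session(aws_access_key_id, aws_secret_access_key)
s3 = session.resource('s3')
bucket = s3.Bucket('testbucket133')

with open('filenames.txt', 'w') as f:
    # filter by prefix, or use bucket.objects.all() for everything
    for obj in bucket.objects.filter(Prefix='Subdirectory/'):
        f.write(obj.key + '\n')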
Another way is to use the AWS CLI:
aws s3 ls s3://{bucketname}
Example: aws s3 ls s3://testbucket133