Python boto for AWS S3: how to get a sorted and limited list of files in a bucket?

Submitted by 孤人 on 2019-12-22 01:15:11

Question


If there are too many files in a bucket and I want only the 100 newest files, how can I get just that list?

s3.bucket.list does not seem to offer this. Does anybody know how to do it?

Please let me know. Thanks.


Answer 1:


There is no way to do this type of filtering on the service side. The S3 API does not support it. You might be able to accomplish something like this by using prefixes in your object names. For example, if you named all of your objects using a pattern like this:

YYYYMMDD/<objectname>
20140618/foobar (as an example)

you could use the prefix parameter of the ListBucket request in S3 to return only the objects that were stored today. In boto, this would look like:

import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket')
for key in bucket.list(prefix='20140618'):
    # do something with the key object, e.g. print its name and timestamp
    print('%s %s' % (key.name, key.last_modified))

You would still have to retrieve all of the objects with that prefix and then sort them locally by their last_modified attribute, but that would be much cheaper than listing every object in the bucket and then sorting.
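The local sort described above can be sketched as follows. This is only an illustration under assumptions: date_prefix, newest_keys, newest_across_days and FakeKey are hypothetical names, not part of boto; the timestamps are assumed to be the fixed-width ISO 8601 strings that a bucket listing returns (which sort correctly as plain strings); and objects are assumed not to be modified after the day encoded in their prefix.

```python
import heapq
from collections import namedtuple
from datetime import date, timedelta

# Stand-in for a boto Key, used only to try the helpers without S3 access.
FakeKey = namedtuple("FakeKey", ["name", "last_modified"])


def date_prefix(day):
    """Return the YYYYMMDD prefix used in the naming scheme above."""
    return day.strftime("%Y%m%d")


def newest_keys(keys, limit=100):
    """Return the `limit` most recently modified keys, newest first.

    `keys` is any iterable of objects with a `last_modified` string,
    for example the result of bucket.list(prefix=...).
    """
    return heapq.nlargest(limit, keys, key=lambda k: k.last_modified)


def newest_across_days(list_prefix, start_day, limit=100, max_days=30):
    """Walk back one day at a time until `limit` keys have been collected.

    `list_prefix` is a hypothetical callable that lists keys for a prefix,
    e.g. lambda p: bucket.list(prefix=p).
    """
    found = []
    for offset in range(max_days):
        day = start_day - timedelta(days=offset)
        found.extend(list_prefix(date_prefix(day)))
        if len(found) >= limit:
            break
    return newest_keys(found, limit)
```

With a real bucket, the call might look like newest_across_days(lambda p: bucket.list(prefix=p), date.today()). heapq.nlargest avoids sorting the whole listing when only a small number of keys is wanted.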

The other option would be to store metadata about the S3 objects in a database like DynamoDB and then query that database to find the objects to retrieve from S3.
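To make the metadata-index idea concrete, here is a minimal sketch. It deliberately uses an in-memory SQLite database as a stand-in for DynamoDB so that it runs locally without AWS credentials; DynamoDB's data model and query API are different, so treat this only as an outline of the pattern. The table name, column names and function names are all hypothetical.

```python
import sqlite3


def create_index(conn):
    """Create a table that records one row per S3 object (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS s3_objects ("
        " key_name TEXT PRIMARY KEY,"
        " last_modified TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_last_modified"
        " ON s3_objects (last_modified)"
    )


def record_upload(conn, key_name, last_modified):
    """Call this whenever an object is written to S3."""
    conn.execute(
        "INSERT OR REPLACE INTO s3_objects (key_name, last_modified)"
        " VALUES (?, ?)",
        (key_name, last_modified),
    )


def newest_key_names(conn, limit=100):
    """Return the names of the `limit` most recently modified objects."""
    rows = conn.execute(
        "SELECT key_name FROM s3_objects"
        " ORDER BY last_modified DESC LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]
```

The names returned by newest_key_names could then be fetched individually from S3, for example with bucket.get_key(name), instead of listing the whole bucket.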

You can find out more about hierarchical listing in the Amazon S3 documentation.




Answer 2:


Can you try this code? It worked for me.

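import time

import boto

con = boto.connect_s3()
bucket = con.get_bucket('<your bucket name>')

# bucket.list() pages through every key in the bucket;
# get_all_keys() returns only a single page (at most 1000 keys).
key_repo = []
for obj in bucket.list():
    # last_modified looks like 2014-06-18T12:34:56.000Z; parse the first 19 characters
    modified = time.strptime(obj.last_modified[:19], "%Y-%m-%dT%H:%M:%S")
    key_repo.append((obj.key, modified))

# Newest first
key_repo.sort(key=lambda item: item[1], reverse=True)

for name, modified in key_repo[:10]:  # top 10 items in the list
    print('%s    %s' % (name, time.strftime("%Y-%m-%d %H:%M:%S", modified)))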

PS: I am a beginner at Python, so the code might not be optimized. Feel free to edit the answer to improve the code.



Source: https://stackoverflow.com/questions/24282214/python-boto-for-aws-s3-how-to-get-sorted-and-limited-files-list-in-bucket
