Using S3cmd, how do I get the first and last file in a folder?

Submitted by 北慕城南 on 2021-01-28 10:01:06

Question


I'm doing some processing on Hive. Usually, the result of this process is a folder (on S3), with multiple files (named with some random letters and numbers, in order) that I can just 'cat' together.

But for reports, I only need the first and the last file in the folder. If the files number in the hundreds, I can simply download them via the web GUI.

But if they number in the thousands, scrolling down is a pain. Not to mention, the Amazon console loads entries on the fly as needed, rather than showing them all at once.

I tried s3cmd get but my experience with that is basic at best. I end up downloading the contents of the entire folder.

As far as I know one can pipe in extra commands, but I'm not sure how to do that.

So, how do I use s3cmd get to download only the last file in a specific folder?

Thanks.


Answer 1:


This command should work for you:

s3cmd get $(s3cmd ls s3://bucket_name/folder_name/ | tail -1 | awk '{ print $4 }')

tail -1 picks the last line of the folder listing, and awk '{ print $4 }' picks the fourth field of that line, which is the file's full S3 URI.

For the first file, just replace tail -1 with head -1.
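
If both files are needed, the two commands above can be folded into a single listing pass. The following is only a sketch: first_and_last is a hypothetical helper name, and it assumes that s3cmd ls prints each object as four whitespace-separated fields (date, time, size, URI) and that the file names contain no spaces.

```shell
# Hypothetical helper (not part of s3cmd): reads `s3cmd ls` output on stdin
# and prints the URI of the first and of the last object listed.
# Assumption: object lines have at least four fields, with the URI in field 4;
# shorter lines (such as "DIR" entries for sub-folders) are skipped.
first_and_last() {
  awk 'NF >= 4 {
         if (first == "") first = $4   # remember the first object seen
         last = $4                     # keep overwriting until the final line
       }
       END {
         if (first != "") print first
         if (last != first) print last # avoid a duplicate when only one file exists
       }'
}

# Example use (bucket and folder names are placeholders):
# s3cmd ls s3://bucket_name/folder_name/ | first_and_last |
#   while read -r uri; do s3cmd get "$uri"; done
```

This lists the folder once instead of twice, which may matter when the folder holds thousands of files.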



Source: https://stackoverflow.com/questions/24544577/using-s3cmd-how-do-i-get-the-first-and-last-file-in-a-folder
