S3: how to get a fast line count of a file? wc -l is too slow


Does anyone have a quick way of getting the line count of a file hosted in S3? Preferably using the CLI (s3api), but I am open to python/boto as well. Note: solution must run

2 Answers
  •  悲&欢浪女
    2021-01-18 15:53

    Here are two methods that might work for you...

    Amazon S3 has a new feature called S3 Select that allows you to query files stored on S3.

    You can count the number of records (lines) in a file, and it even works on GZIP files. Results may vary depending upon your file format.

    S3 Select
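As a rough illustration of the S3 Select approach, here is a minimal sketch using boto3's `select_object_content`. The function name `count_lines_s3_select` and the `gzipped` parameter are made up for this example, and it assumes the object can be read as CSV with no header row, so that each line is one record. A quoted field containing a newline would make the record count differ from the line count.

```python
def count_lines_s3_select(client, bucket, key, gzipped=False):
    """Return the record count that S3 Select reports for one object.

    `client` is expected to be a boto3 S3 client; the helper itself is
    hypothetical and only wraps the select_object_content call.
    """
    response = client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression="SELECT COUNT(*) FROM s3object s",
        InputSerialization={
            # Assumption: plain CSV-like text with no header line to skip.
            "CSV": {"FileHeaderInfo": "NONE"},
            "CompressionType": "GZIP" if gzipped else "NONE",
        },
        OutputSerialization={"CSV": {}},
    )

    # The payload is an event stream; the count arrives in Records events.
    chunks = []
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])

    return int(b"".join(chunks).decode("utf-8").strip())


# Usage (bucket and key are placeholders):
#   import boto3
#   client = boto3.client("s3")
#   print(count_lines_s3_select(client, "my-bucket", "path/to/file.csv.gz", gzipped=True))
```

Only the single count value comes back over the network, which is why this can beat downloading the file and piping it through `wc -l`.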

    Amazon Athena is a similar option that might be suitable. It can also query files stored in Amazon S3.
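For the Athena route, a sketch along the following lines might work. It assumes an Athena table has already been defined over the S3 file, so that one table row corresponds to one line, and that `output_location` is an S3 path where Athena may write query results. The helper name `count_rows_athena` and its parameters are hypothetical.

```python
import time


def count_rows_athena(client, database, table, output_location, poll_seconds=1.0):
    """Run SELECT COUNT(*) on an existing Athena table and return the count.

    `client` is expected to be a boto3 Athena client.
    """
    started = client.start_query_execution(
        QueryString=f'SELECT COUNT(*) FROM "{table}"',
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena queries run asynchronously, so poll until a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    results = client.get_query_results(QueryExecutionId=query_id)
    rows = results["ResultSet"]["Rows"]
    # For SELECT queries the first row holds the column headers.
    return int(rows[1]["Data"][0]["VarCharValue"])


# Usage (all names are placeholders):
#   import boto3
#   athena = boto3.client("athena")
#   print(count_rows_athena(athena, "my_db", "my_table", "s3://my-bucket/athena-results/"))
```

Compared with S3 Select, this needs more setup (a database, a table definition, and a results location), so it is probably only worth it if the data is already catalogued in Athena.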
