Does anyone have a quick way of getting the line count of a file hosted in S3? Preferably using the CLI, s3api but I am open to python/boto as well. Note: solution must run
Here's two methods that might work for you...
Amazon S3 has a new feature called S3 Select that allows you to query files stored on S3.
You can perform a count of the number of records (lines) in a file and it can even work on GZIP files. Results may vary depending upon your file format.
Amazon Athena is also a similar option that might be suitable. It can query files stored in Amazon S3.