On Linux, we generally use the head/tail commands to preview the contents of a file. It helps in viewing a part of the file (to inspect the format, for instance) rather than opening the whole file.
If you don't want to download the whole file, you can download a portion of it with the --range option of the aws s3api get-object command, and then run head on the downloaded portion.
Example:
aws s3api get-object --bucket my_s3_bucket --key s3_folder/file.txt --range bytes=0-1000000 tmp_file.txt && head tmp_file.txt
Explanation:
The aws s3api get-object command downloads a portion of the S3 object from the specified bucket and key, limited to the byte range given in --range, to the specified output file.
The && executes the second command only if the first one succeeded.
The second command prints the first 10 lines of the previously created output file.
You can use the --range switch with the lower-level s3api get-object command to bring back the first bytes of an S3 object. (AFAICT the higher-level aws s3 commands don't support the switch.)
The pipe /dev/stdout can be passed as the target filename if you simply want to view the S3 object by piping to head. Here's an example:
aws s3api get-object --bucket mybucket_name --key path/to/the/file.log --range bytes=0-10000 /dev/stdout | head
Finally, if like me you're dealing with compressed .gz files, the above technique also works with zless, enabling you to view the head of the decompressed file:
aws s3api get-object --bucket mybucket_name --key path/to/the/file.log.gz --range bytes=0-10000 /dev/stdout | zless
One tip with zless: if it isn't working, try increasing the size of the range.
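To see why a too-small range can trip up zless: gzip decompresses as a stream, so a truncated download still yields the leading lines, but only if enough compressed bytes were fetched to cover them. You can simulate this locally without S3 (the file names here are just placeholders):

```shell
# Simulate a partial --range download of a .gz object, locally.
seq 1 100000 > full.txt                  # plain-text "object"
gzip -c full.txt > full.txt.gz           # the compressed object as stored in S3
head -c 10000 full.txt.gz > partial.gz   # what --range bytes=0-9999 would fetch
# zcat decompresses the available prefix, then complains about the
# unexpected end of file; the leading lines are still readable:
zcat partial.gz 2>/dev/null | head -5    # prints 1 through 5, one per line
rm -f full.txt full.txt.gz partial.gz
```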
One easy way to do it:
aws s3api get-object --bucket bucket_name --key path/to/file.txt --range bytes=0-10000 /path/to/local/t3.txt && head -100 /path/to/local/t3.txt
(get-object writes the data to the output file, not to stdout, so head is run on the file after the download succeeds.)
For a .gz file, you can do:
aws s3api get-object --bucket bucket_name --key path/to/file.gz --range bytes=0-10000 /path/to/local/t3.gz && zcat /path/to/local/t3.gz | head -100
If the output comes up short, increase the number of bytes in the range.
There is no such capability. You can only retrieve the entire object. You can perform an HTTP HEAD request to view object metadata, but that isn't what you're looking for.
As others have answered, assuming the file is large, use the get-object command with --range bytes=0-1000 to download only part of the file.
Example:
aws s3api get-object --profile opsrep --region eu-west-1 --bucket <MY-BUCKET> --key <DIR/MY-FILE.CSV> --range bytes=0-10000 "OUTPUT.csv"
docs
As of 2018 you can now run SELECT Queries in AWS CLI. Use LIMIT 10 to preview the "head" of your file.
Example:
aws s3api select-object-content --bucket <MY-BUCKET> --key <DIR/MY-FILE.CSV> --expression "select * from s3object limit 10" --expression-type "SQL" --input-serialization "CSV={}" --output-serialization "CSV={}" "OUTPUT.csv"
docs
Now you can quickly run head OUTPUT.csv on the small local file.
You can specify a byte range when retrieving data from S3 to get the first N bytes, the last N bytes or anything in between. (This is also helpful since it allows you to download files in parallel – just start multiple threads or processes, each of which retrieves part of the total file.)
I don't know which of the various CLI tools support this directly but a range retrieval does what you want.
The AWS CLI tools ("aws s3 cp" to be precise) do not allow you to do a range retrieval, but s3curl (http://aws.amazon.com/code/128) should do the trick. (So does plain curl, e.g., using the --range parameter, but then you would have to do the request signing on your own.)
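To sketch the parallel-download idea: once you know the object size (e.g. from aws s3api head-object), you can compute one inclusive Range value per part and launch a get-object per range. This is a minimal sketch; the bucket, key, object size, and part count below are placeholders:

```shell
#!/bin/sh
# Print inclusive HTTP Range values that split SIZE bytes into PARTS chunks.
byte_ranges() {
  size=$1; parts=$2
  chunk=$(( (size + parts - 1) / parts ))   # ceiling division
  i=0
  while [ "$i" -lt "$parts" ]; do
    start=$(( i * chunk ))
    end=$(( start + chunk - 1 ))
    [ "$end" -ge "$size" ] && end=$(( size - 1 ))   # clamp the last chunk
    echo "bytes=${start}-${end}"
    i=$(( i + 1 ))
  done
}

# For a 1 MiB object split four ways:
byte_ranges 1048576 4
# Each printed range can then drive its own download, e.g.:
#   aws s3api get-object --bucket my-bucket --key big.bin \
#     --range "bytes=0-262143" part.0 &
# ...then `wait` for the background jobs and concatenate the parts in order.
```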