“head” command for aws s3 to view file contents

Asked by 不知归路 on 2021-02-06 22:44

On Linux, we generally use the head/tail commands to preview the contents of a file. They help in viewing a part of the file (to inspect the format, for instance), rather than opening the entire file.
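For reference, the local commands in question look like this (file.txt is just a placeholder name):

    head -n 10 file.txt    # print the first 10 lines of the file
    tail -n 10 file.txt    # print the last 10 lines of the file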

8 Answers
  • 2021-02-06 23:17

    If you don't want to download the whole file, you can download a portion of it with the --range option of the aws s3api get-object command, and once that portion is downloaded, run head on the resulting file.

    Example:

    aws s3api get-object --bucket my_s3_bucket --key s3_folder/file.txt --range bytes=0-1000000 tmp_file.txt && head tmp_file.txt
    

    Explanation:

    The aws s3api get-object command downloads a portion of the S3 object from the specified bucket and key, limited to the size given in --range, into the specified output file. The && executes the second command only if the first one succeeded. The second command prints the first 10 lines of the previously created output file.

  • 2021-02-06 23:18

    You can use the --range switch of the lower-level s3api get-object command to bring back the first bytes of an S3 object. (AFAICT the higher-level aws s3 commands don't support the switch.)

    The special file /dev/stdout can be passed as the target filename if you simply want to view the S3 object by piping to head. Here's an example:

    aws s3api get-object --bucket mybucket_name --key path/to/the/file.log --range bytes=0-10000 /dev/stdout | head

    Finally, if like me you're dealing with compressed .gz files, the above technique also works with zless, enabling you to view the head of the decompressed file:

    aws s3api get-object --bucket mybucket_name --key path/to/the/file.log.gz --range bytes=0-10000 /dev/stdout | zless

    One tip with zless: if it isn't working, try increasing the size of the range, since a truncated gzip stream can only be decompressed up to the point where the downloaded bytes cut off.
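    If you use this pattern often, it can be wrapped in a small shell function. This is only a sketch: s3head is a made-up name, and the 10000-byte default range is an arbitrary choice.

    # Sketch: a "head" for S3 objects (s3head is a hypothetical helper).
    # Usage: s3head <bucket> <key> [lines]
    s3head() {
        local bucket="$1" key="$2" lines="${3:-10}"
        # Fetch only the first 10000 bytes of the object. get-object also
        # prints response metadata to stdout after the body; head usually
        # truncates that away.
        aws s3api get-object --bucket "$bucket" --key "$key" \
            --range bytes=0-10000 /dev/stdout 2>/dev/null | head -n "$lines"
    }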

  • 2021-02-06 23:18

    One easy way to do it:

    aws s3api get-object --bucket bucket_name --key path/to/file.txt --range bytes=0-10000 /path/to/local/t3.txt && head -100 /path/to/local/t3.txt
    

    For a gz file, you can use zcat to decompress before previewing:

    aws s3api get-object --bucket bucket_name --key path/to/file.gz --range bytes=0-10000 /path/to/local/t3.gz && zcat /path/to/local/t3.gz | head -100
    

    If that returns less data than you need, increase the byte range.
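    That adjustment can be automated. A rough sketch (the 100-line target, the 10 MB cap, and the /tmp/t3.txt path are arbitrary assumptions):

    # Sketch: double the byte range until the local file holds at least
    # 100 lines, or give up at a 10 MB cap in case the object is short.
    range=10000; want=100
    while :; do
        aws s3api get-object --bucket bucket_name --key path/to/file.txt \
            --range "bytes=0-$range" /tmp/t3.txt > /dev/null
        [ "$(wc -l < /tmp/t3.txt)" -ge "$want" ] && break
        [ "$range" -ge 10000000 ] && break
        range=$((range * 2))
    done
    head -100 /tmp/t3.txt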

  • 2021-02-06 23:19

    There is no such capability. You can only retrieve the entire object. You can perform an HTTP HEAD request to view object metadata, but that isn't what you're looking for.
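    For completeness, the metadata request mentioned above looks like this (reusing the placeholder bucket/key names from the first answer); it returns fields such as ContentLength and ContentType, but no object data:

    aws s3api head-object --bucket my_s3_bucket --key s3_folder/file.txt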

  • 2021-02-06 23:22

    As others have answered, assuming the file is large, use the get-object command with --range bytes=0-10000 to download only part of the file.

    example:

    aws s3api get-object --profile opsrep --region eu-west-1 --bucket <MY-BUCKET> --key <DIR/MY-FILE.CSV> --range bytes=0-10000 "OUTPUT.csv"

    As of 2018 you can also run S3 Select queries from the AWS CLI. Use LIMIT 10 to preview the "head" of your file.

    example:

    aws s3api select-object-content --bucket <MY-BUCKET> --key <DIR/MY-FILE.CSV> --expression "select * from s3object limit 10" --expression-type "SQL" --input-serialization "CSV={}" --output-serialization "CSV={}" "OUTPUT.csv"

    Now you can quickly run head OUTPUT.csv on the small local file.

  • 2021-02-06 23:39

    You can specify a byte range when retrieving data from S3 to get the first N bytes, the last N bytes or anything in between. (This is also helpful since it allows you to download files in parallel – just start multiple threads or processes, each of which retrieves part of the total file.)
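    As an illustration of the parallel idea, here is a sketch using three background get-object calls; the bucket, key, and the ~30 MB total size are made up:

    # Sketch: fetch three 10 MB chunks of a ~30 MB object in parallel,
    # then stitch them back together in order.
    aws s3api get-object --bucket my_bucket --key big/file.bin --range bytes=0-10485759        part0 > /dev/null &
    aws s3api get-object --bucket my_bucket --key big/file.bin --range bytes=10485760-20971519 part1 > /dev/null &
    aws s3api get-object --bucket my_bucket --key big/file.bin --range bytes=20971520-31457279 part2 > /dev/null &
    wait
    cat part0 part1 part2 > file.bin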

    I don't know which of the various CLI tools support this directly but a range retrieval does what you want.

    The AWS CLI tools ("aws s3 cp" to be precise) do not allow you to do a range retrieval, but s3curl (http://aws.amazon.com/code/128) should do the trick. (So does plain curl, e.g., using the --range parameter, but then you would have to do the request signing on your own.)
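    One way to avoid hand-rolling the signing with plain curl is to let the CLI generate a presigned URL first. A sketch (the bucket/key reuse an earlier answer's placeholders, and the 60-second expiry is arbitrary):

    # Generate a short-lived presigned URL, then do a ranged GET with curl.
    url=$(aws s3 presign s3://mybucket_name/path/to/the/file.log --expires-in 60)
    curl --silent --range 0-10000 "$url" | head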
