Using Java code to count the number of lines in a file on S3


Using Java code, is it possible to count the number of lines in a file on AWS S3 without downloading it to the local machine?


1 answer

忘掉有多难

2021-01-29 14:26
It depends on what you mean by "download".

There is no remote processing in S3 - you can't upload code that will execute in the S3 service. Possible alternatives:
- If the issue is that the file is too big to hold in memory or on your local disk, you can still stream it and process it in chunks. Open the object as a Java InputStream (or whatever other API you are using), read a chunk, say 4 KB, scan it for line endings, and continue with the next chunk without storing anything to disk. The downside is that the entire file still travels over the network from S3 to your machine.
- Use AWS Lambda: create a Lambda function that does the processing for you. This code runs in the Amazon cloud, so the file data never travels to your machine, only within the cloud. The function would contain the same logic as the previous option; it just runs remotely.
- Use EC2: if you need more control over your code, a custom operating system, etc., you can run a dedicated VM on EC2 that handles this.
Given the information in your question, I would say that a Lambda function is probably the best option.
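
For the chunked approach, here is a minimal sketch of the counting logic. It uses only the standard library, so the same method could run locally, in a Lambda function, or on EC2. The class name `LineCounter` and method name `countLines` are made up for illustration, and the in-memory stream in `main` is only a stand-in for the S3 object stream. The code assumes `\n` line endings and a byte-compatible encoding such as ASCII or UTF-8, and it counts a final line even if it has no trailing newline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any AWS library.
final class LineCounter {

    // Counts lines by scanning for '\n' bytes, one 4 KB buffer at a time.
    // Nothing is kept in memory beyond the current buffer.
    static long countLines(InputStream in) throws IOException {
        byte[] buffer = new byte[4096];
        long lines = 0;
        boolean endsWithNewline = true; // an empty stream has zero lines
        int read;
        while ((read = in.read(buffer)) != -1) {
            for (int i = 0; i < read; i++) {
                if (buffer[i] == '\n') {
                    lines++;
                }
            }
            if (read > 0) {
                endsWithNewline = buffer[read - 1] == '\n';
            }
        }
        // Count a last line that is not terminated by '\n'.
        if (!endsWithNewline) {
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the S3 stream. With the AWS SDK for Java 2.x the stream
        // would come from something like (bucket and key are placeholders):
        //   s3.getObject(GetObjectRequest.builder().bucket("my-bucket").key("my-key").build())
        byte[] sample = "a\nb\nc".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(sample)) {
            System.out.println(countLines(in)); // prints 3
        }
    }
}
```

Because the method only needs an `InputStream`, it does not care where the bytes come from; only the code that opens the stream changes between the three options.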