Partitioning AWS Kinesis Firehose data to s3 by payload [duplicate]

Submitted by 随声附和 on 2019-12-01 16:22:14

Question


I am using AWS Kinesis Firehose to ingest data into S3, and then consume it with Athena.

I am trying to analyze events from different games. To avoid Athena scanning too much data, I would like to partition the S3 data using an identifier for each game. So far I have not found a solution, since Firehose receives data from different games in the same stream.

Does anyone know how to do this?

Thank you, Javi.


Answer 1:


You could possibly use Amazon Kinesis Analytics to split incoming Firehose streams into separate output streams based upon some logic, such as Game ID.

It can accept a KinesisFirehoseInput and send data to a KinesisFirehoseOutput.

However, the limits documentation suggests that there can only be 3 output destinations per application, so this would not be sufficient for many games.




Answer 2:


You could send your traffic to a main stream, then use a Lambda function to split the data across multiple Firehose streams, one per game, each of which saves its data in a separate folder/bucket.
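A minimal sketch of that fan-out Lambda, assuming the main stream is a Kinesis data stream that triggers the function, that each event carries a `game_id` field in its JSON payload, and that the per-game delivery stream names below are placeholders (none of these names come from the original answer):

```python
# Hypothetical fan-out Lambda: route records from a main Kinesis stream
# to one Firehose delivery stream per game. Stream names and the
# "game_id" field are assumptions for illustration.
import base64
import json
from collections import defaultdict

# Hypothetical mapping from game ID to its dedicated delivery stream.
STREAMS = {
    "game-a": "firehose-game-a",
    "game-b": "firehose-game-b",
}


def group_by_game(records):
    """Group base64-encoded Kinesis records by their 'game_id' field."""
    grouped = defaultdict(list)
    for record in records:
        payload = base64.b64decode(record["kinesis"]["data"])
        event = json.loads(payload)
        grouped[event["game_id"]].append(payload)
    return grouped


def handler(event, context):
    """Lambda entry point: forward each group to its own Firehose stream."""
    import boto3  # imported lazily so group_by_game can be tested without AWS

    firehose = boto3.client("firehose")
    for game_id, payloads in group_by_game(event["Records"]).items():
        stream = STREAMS.get(game_id)
        if stream is None:
            continue  # unknown game; could route to a catch-all stream instead
        # PutRecordBatch accepts at most 500 records per call; a real
        # implementation would chunk larger batches.
        firehose.put_record_batch(
            DeliveryStreamName=stream,
            Records=[{"Data": p} for p in payloads],
        )
```

Each delivery stream can then write to its own S3 prefix (e.g. one per game), which keeps Athena scans limited to the game being queried.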



Source: https://stackoverflow.com/questions/45432265/partitioning-aws-kinesis-firehose-data-to-s3-by-payload
