Can I customize partitioning in Kinesis Firehose before delivering to S3?

挽巷 2021-01-02 15:53

I have a Firehose stream that is intended to ingest millions of events from different sources and of different event-types. The stream should deliver all data to one S3 buck

1 answer
  •  借酒劲吻你
    2021-01-02 16:40

    No. You cannot 'partition' based upon event content.

    Some options are:

    • Send to separate Firehose streams
    • Send to a Kinesis Data Stream (instead of Firehose) and write your own custom Lambda function to process and save the data (See: AWS Developer Forums: Athena and Kinesis Firehose)
    • Use Kinesis Analytics to process the message and 'direct' it to different Firehose streams
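To make the second option more concrete, here is a minimal sketch of a Lambda function attached to a Kinesis Data Stream that groups each incoming batch by event type and writes one S3 object per type. It is an illustration under stated assumptions, not a definitive implementation: the `event_type` field, the `events/event_type=.../dt=...` key layout, and the bucket name `my-example-bucket` are all hypothetical, and the S3 client is injectable so the grouping logic can be exercised without AWS access.

```python
import base64
import json
from collections import defaultdict
from datetime import datetime, timezone


def group_by_event_type(event):
    """Decode Kinesis records and group payloads by a hypothetical 'event_type' field."""
    groups = defaultdict(list)
    for record in event["Records"]:
        # Lambda delivers Kinesis record data base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        groups[payload.get("event_type", "unknown")].append(payload)
    return dict(groups)


def build_key(event_type, now, batch_id):
    """Build a Hive-style S3 key (assumed layout) that Athena can treat as partitions."""
    return f"events/event_type={event_type}/dt={now:%Y-%m-%d}/{batch_id}.json"


def handler(event, context, s3_client=None, bucket="my-example-bucket"):
    """Lambda entry point: write one newline-delimited JSON object per event type."""
    if s3_client is None:
        import boto3  # imported lazily so the helpers above work without boto3
        s3_client = boto3.client("s3")

    now = datetime.now(timezone.utc)
    # Use the first sequence number in the batch as a simple object name.
    batch_id = event["Records"][0]["kinesis"]["sequenceNumber"]

    keys = []
    for event_type, payloads in group_by_event_type(event).items():
        body = "\n".join(json.dumps(p) for p in payloads)
        key = build_key(event_type, now, batch_id)
        s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
        keys.append(key)
    return keys
```

Each invocation writes small objects, so with millions of events a later compaction step would probably be needed; that trade-off is the cost of partitioning by content yourself.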

    If you are going to use the output with Amazon Athena or Amazon EMR, you could also consider converting it into Parquet format. Parquet is columnar, so queries that touch only a few columns read less data and usually run faster than over raw JSON or CSV. This would require post-processing of the data in S3 as a batch rather than converting the data as it arrives in a stream.
