Reading AWS DynamoDB Stream

寵の児 submitted on 2019-12-25 07:37:11

Question


I want to make incremental DynamoDB backups to S3 using DynamoDB Streams. I have a Lambda function that reads the DynamoDB stream and writes files to S3. To mark shards that have already been read, I log the ExclusiveStartShardId in a configuration file.

What I do is:

  1. Describe the stream (using the logged ExclusiveStartShardId)
  2. Get the stream's shards
  3. For every shard that is CLOSED (has an EndingSequenceNumber), I do the following:
    • Get a shard iterator for that shard (shardIteratorType: 'TRIM_HORIZON')
    • Iterate through the shard and fetch records until NextShardIterator becomes null
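
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's actual code: `client` is assumed to behave like `boto3.client("dynamodbstreams")`, the function name `read_closed_shards` is made up for the example, and pagination of the shard list (`LastEvaluatedShardId`) is left out for brevity.

```python
def read_closed_shards(client, stream_arn, exclusive_start_shard_id=None):
    """Return a list of (shard_id, records) for each CLOSED shard.

    `client` is assumed to expose describe_stream, get_shard_iterator and
    get_records like the boto3 "dynamodbstreams" client.
    """
    kwargs = {"StreamArn": stream_arn}
    if exclusive_start_shard_id:
        # Skip shards up to and including the one logged by a previous run.
        kwargs["ExclusiveStartShardId"] = exclusive_start_shard_id
    description = client.describe_stream(**kwargs)["StreamDescription"]

    result = []
    for shard in description["Shards"]:
        # A shard is closed once its sequence range has an end.
        if "EndingSequenceNumber" not in shard["SequenceNumberRange"]:
            continue
        iterator = client.get_shard_iterator(
            StreamArn=stream_arn,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",
        )["ShardIterator"]

        records = []
        # On a closed shard NextShardIterator is eventually missing or None,
        # so this loop terminates. Empty pages in between are tolerated.
        while iterator:
            page = client.get_records(ShardIterator=iterator)
            records.extend(page["Records"])
            iterator = page.get("NextShardIterator")
        result.append((shard["ShardId"], records))
    return result
```

The caller would then write each shard's records to S3 and log the last processed shard ID for the next run.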

The problem here is that I read only closed shards, so to get new records I must wait an undetermined amount of time for the current shard to be closed.

It seems that the last shard is usually in the OPEN state (it has no EndingSequenceNumber). If I remove the check for EndingSequenceNumber from the pseudocode above, I end up in an infinite loop, because on the last shard NextShardIterator is always present. I also cannot stop when zero items are fetched, because there could be "gaps" in the shard.

In this tutorial, numChanges is used to stop the infinite loop: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.LowLevel.Walkthrough.html#Streams.LowLevel.Walkthrough.Step5
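
One possible way to bound the loop on an open shard, shown here only as a sketch: stop after a number of consecutive empty reads and remember the last sequence number, so the next run can resume with an `AFTER_SEQUENCE_NUMBER` iterator. This differs from the tutorial's numChanges counter. The function name `read_open_shard` and the `max_empty_reads` threshold are assumptions made for this example, and `client` is again assumed to behave like `boto3.client("dynamodbstreams")`.

```python
def read_open_shard(client, stream_arn, shard_id,
                    after_sequence_number=None, max_empty_reads=3):
    """Read an open shard until `max_empty_reads` consecutive empty pages.

    Returns (records, last_sequence_number) so a later run can resume.
    """
    if after_sequence_number:
        # Resume right after the last record seen by a previous run.
        iterator_args = {"ShardIteratorType": "AFTER_SEQUENCE_NUMBER",
                         "SequenceNumber": after_sequence_number}
    else:
        iterator_args = {"ShardIteratorType": "TRIM_HORIZON"}
    iterator = client.get_shard_iterator(
        StreamArn=stream_arn, ShardId=shard_id, **iterator_args
    )["ShardIterator"]

    records, last_seq, empty_reads = [], after_sequence_number, 0
    while iterator and empty_reads < max_empty_reads:
        page = client.get_records(ShardIterator=iterator)
        if page["Records"]:
            empty_reads = 0  # a gap ended, reset the counter
            records.extend(page["Records"])
            last_seq = page["Records"][-1]["dynamodb"]["SequenceNumber"]
        else:
            empty_reads += 1
        iterator = page.get("NextShardIterator")
    return records, last_seq
```

Because progress is tracked by sequence number, stopping too early on a long gap should only postpone the remaining records to the next run rather than lose them. The threshold itself is a heuristic.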

What is the best approach in this situation?

I also found a similar question: Reading data from dynamodb streams. Unfortunately, it did not answer my question.


Answer 1:


Why not attach the DynamoDB stream as an event source for your Lambda function? Then Lambda will take care of polling the stream and calling your function when necessary. See this for details.
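
A minimal sketch of what such a function might look like once the stream is attached as an event source. Lambda then passes batches of stream records in `event["Records"]`, and no shard bookkeeping is needed in your code. The `backup_batch` helper, the `backup/` key prefix and the `BACKUP_BUCKET` environment variable are hypothetical names chosen for this example.

```python
import json
import os


def backup_batch(event, s3_client, bucket):
    """Write one S3 object per stream record and return the keys written."""
    keys = []
    for record in event.get("Records", []):
        seq = record["dynamodb"]["SequenceNumber"]
        # Hypothetical key layout: sequence number plus event ID.
        key = f"backup/{seq}-{record['eventID']}.json"
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=json.dumps(record, default=str).encode("utf-8"),
        )
        keys.append(key)
    return keys


def handler(event, context):
    # boto3 ships with the Lambda Python runtime; it is imported here so
    # that backup_batch can be exercised locally without it.
    import boto3

    bucket = os.environ["BACKUP_BUCKET"]  # hypothetical variable name
    keys = backup_batch(event, boto3.client("s3"), bucket)
    return {"written": len(keys)}
```

Writing one object per record keeps the example short. A real backup would probably group each batch into a single object to reduce the number of S3 requests.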



Source: https://stackoverflow.com/questions/37814516/reading-aws-dynamodb-stream
