DynamoDB provisioned Write Capacity Units exceeded too often and unexpectedly

Submitted by 混江龙づ霸主 on 2021-01-28 08:33:34

Question


I believe I understand DynamoDB Write/Read Capacity Units: how they work and how they are calculated. As evidence, I have read this article thoroughly, as well as the AWS documentation. That said, I'm seeing unexpected behavior when writing items to my table.

I have a DynamoDB table with the following settings, most notably 5 Write/Read Capacity Units:

[Image: DynamoDB table settings overview]

I'm storing in this table readings from sensors connected to a Raspberry Pi, which my Python 2.7 script collects and sends to DynamoDB.

These items are certainly under 1 KB each. They look like this:

{
    "reading_id": "<current_time>",
    "sensor_id": "<SENSORS_IDS[i]>",
    "humidity": "<humidity>",
    "temperature": "<temperature>"
}

My script iterates over the sensors, reads each one, and submits the reading to DynamoDB with table.put_item every 5 seconds, provided the sensor read succeeded; otherwise it waits an arbitrary 30 seconds before retrying.
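
For reference, the core of the loop looks roughly like this (a simplified sketch; read_sensor, the sensor IDs, and the table name are placeholders standing in for my actual code):

import time
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("sensor_readings")  # placeholder table name

SENSORS_IDS = ["sensor-1", "sensor-2"]  # placeholder sensor IDs

def read_sensor(sensor_id):
    # Stand-in for the actual Raspberry Pi sensor read; returns None on failure.
    return {"humidity": "42.0", "temperature": "21.5"}

while True:
    for sensor_id in SENSORS_IDS:
        reading = read_sensor(sensor_id)
        if reading is None:
            time.sleep(30)  # arbitrary wait after a failed read
            continue
        table.put_item(Item={
            "reading_id": str(int(time.time())),
            "sensor_id": sensor_id,
            "humidity": reading["humidity"],
            "temperature": reading["temperature"],
        })
        time.sleep(5)  # one write every ~5 seconds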

Now, according to my calculations, I'm writing one 1 KB item to DynamoDB every 5 seconds, which should be fine since my table is set up with 5 WCU = (5 items × 1 KB)/second of write throughput.
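
To double-check that arithmetic (DynamoDB rounds each write up to the next 1 KB):

writes_per_second = 1 / 5.0   # one put_item every 5 seconds
wcu_per_write = 1             # items under 1 KB round up to 1 WCU
print(writes_per_second * wcu_per_write)  # 0.2 WCU on average, versus 5 provisioned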

So my questions are:

1. How is it that this small load (if it is happening the way I believe it is) exceeds my 5 WCU, as seen here?

[Image: DynamoDB table write capacity units metric]

2. I have been running this setup without changes for about a year (my free tier ends September 30, 2018). How is it that the billing began to change a few months ago, even before the free tier ended, as seen here?

[Image: DynamoDB billing, year to date]

My only suspect so far is time.sleep(), since the documentation says:

time.sleep(secs)

Suspend execution of the current thread for the given number of seconds. The argument may be a floating point number to indicate a more precise sleep time. The actual suspension time may be less than that requested because any caught signal will terminate the sleep() following execution of that signal’s catching routine. Also, the suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
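
For what it's worth, if imprecise sleeps were the problem, a loop that sleeps until a computed deadline instead of for a flat 5 seconds would rule that out (a sketch; do_reading_and_put_item is a hypothetical stand-in for my actual read-and-write step):

import time

INTERVAL = 5.0
next_deadline = time.time()

while True:
    do_reading_and_put_item()  # hypothetical stand-in for the real work
    next_deadline += INTERVAL
    remaining = next_deadline - time.time()
    if remaining > 0:
        time.sleep(remaining)  # sleep only the time left until the deadline
    else:
        next_deadline = time.time()  # resynchronize after a long overrun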

I am not very familiar with Python, which makes me think it could be something in my code. That doesn't explain why I was not having this issue earlier in the year, though.

Does anyone have any idea about the answers to the questions above, or where I should investigate this issue further?

Note: I searched Google and other related questions here. None seemed to apply to my case.

Thank you.


Answer 1:


Perhaps your table has been partitioned unevenly. You might want to read about DynamoDB Partitions and Data Distribution.
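
For illustration only (a generic sketch, not necessarily your schema), a key layout that spreads writes across partitions would use sensor_id as the partition key and reading_id as the sort key:

import boto3

dynamodb = boto3.client("dynamodb")
dynamodb.create_table(
    TableName="sensor_readings",  # placeholder name
    KeySchema=[
        {"AttributeName": "sensor_id", "KeyType": "HASH"},    # partition key
        {"AttributeName": "reading_id", "KeyType": "RANGE"},  # sort key
    ],
    AttributeDefinitions=[
        {"AttributeName": "sensor_id", "AttributeType": "S"},
        {"AttributeName": "reading_id", "AttributeType": "S"},
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)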




Answer 2:


The graph you are sharing shows consumption aggregated over one minute: each data point is the sum of all capacity consumed during that 60-second window.

When you provision a table with 5 WCU, you can write up to five 1 KB items each second. Effectively, that gives you up to 300 WCU available in each minute.

So, as long as you see data points of around 6 per minute, that is totally fine.

One thing to notice: the sum of provisioned write throughput (the orange line) is actually not a sum. That seems to be a bug in CloudWatch; it is instead the per-second provisioned throughput.
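
You can sanity-check this yourself by pulling the raw metric and dividing each per-minute Sum by 60 to get an average per-second rate (the table name below is a placeholder):

from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch")
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedWriteCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "sensor_readings"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=60,               # same 60-second aggregation as the console graph
    Statistics=["Sum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"] / 60.0, "average WCU/s")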

A minor observation: you are showing 5-6 units each minute, which means you are actually sleeping closer to 10 seconds between writes, not 5.

Lastly, with DynamoDB you pay for the capacity you reserve, not what you consume. So, as long as your table is not being throttled, you will not be charged extra, even when you go slightly over the provisioned capacity (which DynamoDB allows in certain cases).
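
If you want to confirm whether you are actually being throttled, boto3 surfaces throttling as a ClientError with the code ProvisionedThroughputExceededException (the SDK also retries a few times internally before raising it); a minimal sketch with a placeholder table name:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("sensor_readings")  # placeholder name

def safe_put(item):
    try:
        table.put_item(Item=item)
    except ClientError as err:
        if err.response["Error"]["Code"] == "ProvisionedThroughputExceededException":
            print("Write was throttled; only now would extra capacity matter")
        else:
            raise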



Source: https://stackoverflow.com/questions/52573768/dynamodb-provisioned-write-capacity-units-exceeded-too-often-and-unexpectedly
