How is cosmosDB RU throughput enforced

前端 未结 2 1603
隐瞒了意图╮
隐瞒了意图╮ 2020-12-22 01:45

I have a cosmosGB gremlin API set up with 400 RU/s. If I have to run a query that needs 800 RUs, does it mean that this query takes 2 sec to execute? If i increase the throu

相关标签:
2条回答
  • 2020-12-22 02:03

    As I explained in a different, but somewhat related answer here, Request Units are allocated on a per-second basis. In the event a given query will cost more than the number of Request Units available in that one-second window:

    • The query will be executed
    • You will now be in "debt" by the overage in Request Units
    • You will be throttled until your "debt" is paid off

    Let's say you had 400 RU/sec, and you executed a query that cost 800 RU. It would complete, but then you'd be in debt for around 2 seconds (400 RU per second, times two seconds). At this point, you wouldn't be throttled anymore.

    The speed in which a query executes does not depend on the number of RU allocated. Whether you had 1,000 RU/second OR 100,000 RU/second, a query would run in the same amount of time (aside from any throttle time preventing the query from running initially). So, aside from throttling, your 800 RU query would run consistently, regardless of RU count.

    0 讨论(0)
  • 2020-12-22 02:13

    A single query is charged a given amount of request units, so it's not quite accurate to say "query needs 800 RU/s". A 1KB doc read is 1 RU, and writing is more expensive starting around 10 RU each. Generally you should avoid any requests that would individually be more than say 50, and that is probably high. In my experience, I try to keep the individual charge for each operation as low as possible, usually under 20-30 for large list queries.

    The upshot is that 400/s is more than enough to at least complete 1 query. It's when you have multiple attempts that combine for overage in the timespan that Cosmos tells you to wait some time before being allowed to succeed again. This is dynamic and based on a more or less black box formula. It's not necessarily a simple division of allowance by charge, and no individual request would be faster or slower based on the limit.

    You can see if you're getting throttled by inspecting the response, or monitor by checking the Azure dashboard metrics.

    0 讨论(0)
提交回复
热议问题