Reliability of atomic counters in DynamoDB

concurrency counter atomic increment amazon-dynamodb


Solution 1

DynamoDB gets its scaling properties by splitting the keys across multiple servers. This is similar to how other distributed databases like Cassandra and HBase scale. Increasing the provisioned throughput on DynamoDB just spreads your data across more servers, so each server handles roughly its share of the load (total concurrent connections / number of servers). Take a look at their FAQ for an explanation of how to achieve max throughput:

Q: Will I always be able to achieve my level of provisioned throughput?

Amazon DynamoDB assumes a relatively random access pattern across all primary keys. You should set up your data model so that your requests result in a fairly even distribution of traffic across primary keys. If you have a highly uneven or skewed access pattern, you may not be able to achieve your level of provisioned throughput.

When storing data, Amazon DynamoDB divides a table into multiple partitions and distributes the data based on the hash key element of the primary key. The provisioned throughput associated with a table is also divided among the partitions; each partition's throughput is managed independently based on the quota allotted to it. There is no sharing of provisioned throughput across partitions. Consequently, a table in Amazon DynamoDB is best able to meet the provisioned throughput levels if the workload is spread fairly uniformly across the hash key values. Distributing requests across hash key values distributes the requests across partitions, which helps achieve your full provisioned throughput level.

If you have an uneven workload pattern across primary keys and are unable to achieve your provisioned throughput level, you may be able to meet your throughput needs by increasing your provisioned throughput level further, which will give more throughput to each partition. However, it is recommended that you consider modifying your request pattern or your data model in order to achieve a relatively random access pattern across primary keys.

This means that a single key that is incremented directly will not scale, since that key must live on one server. There are other ways to handle this problem, for example in-memory aggregation with a periodic flush of the accumulated increment to DynamoDB (though this can have reliability issues), or a sharded counter where the increments are spread over multiple keys and the total is read back by pulling all keys in the sharded counter (http://whynosql.com/scaling-distributed-counters/).
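A minimal sketch of the sharded counter idea, assuming a table whose partition key is a string attribute named counter_id and a numeric attribute named hits. Those names, NUM_SHARDS, and the helper functions are made up for illustration. The table argument is expected to behave like a boto3 Table resource, e.g. boto3.resource("dynamodb").Table("counters").

```python
import random

NUM_SHARDS = 10  # assumption: tune to the write rate you need


def shard_key(counter_name, shard):
    # Each shard is its own item, so writes land on different partition keys.
    return f"{counter_name}#{shard}"


def increment(table, counter_name, delta=1, num_shards=NUM_SHARDS):
    """Atomically add delta to one randomly chosen shard of the counter."""
    shard = random.randrange(num_shards)
    table.update_item(
        Key={"counter_id": shard_key(counter_name, shard)},
        UpdateExpression="ADD #h :d",
        ExpressionAttributeNames={"#h": "hits"},
        ExpressionAttributeValues={":d": delta},
    )


def read_total(table, counter_name, num_shards=NUM_SHARDS):
    """Read every shard and sum the values; missing shards count as 0."""
    total = 0
    for shard in range(num_shards):
        resp = table.get_item(
            Key={"counter_id": shard_key(counter_name, shard)},
            ConsistentRead=True,
        )
        item = resp.get("Item")
        if item is not None:
            total += int(item["hits"])
    return total
```

Writes stay cheap (one UpdateItem each), while a read costs one GetItem per shard. The total is a sum of separate reads rather than a single snapshot, so it can lag slightly while increments are in flight.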

Solution 2

In addition to gigq's answer about scalability, DynamoDB's atomic increments are not idempotent and therefore are not reliable: if the connection drops after issuing an UpdateItem ADD request, you have no way of knowing whether the add was committed, so you don't know whether you should retry.

DynamoDB conditional updates fix this, at the cost of making the system even less scalable, because you have to retry every time two changes to the attribute are attempted simultaneously, even in the absence of an error.
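To illustrate the retry cost, here is a rough sketch of a conditional (compare-and-set) increment. It assumes the same made-up schema as a boto3 Table resource with a counter_id partition key and a numeric hits attribute. The function name and max_attempts are hypothetical. It only shows the read, conditional write, and retry-on-conflict loop. Working out whether a timed-out write actually landed when there are several writers would need extra bookkeeping (for example a per-request token stored with the item), which is not shown.

```python
def increment_with_condition(table, counter_id, delta=1, max_attempts=10):
    """Read the counter, then write back only if nobody changed it meanwhile."""
    # boto3 exposes the modeled DynamoDB errors on the underlying client.
    conflict = table.meta.client.exceptions.ConditionalCheckFailedException
    for _ in range(max_attempts):
        resp = table.get_item(Key={"counter_id": counter_id}, ConsistentRead=True)
        item = resp.get("Item")
        if item is None:
            # First write: succeed only if the attribute still does not exist.
            new_value = delta
            condition = "attribute_not_exists(#h)"
            values = {":new": new_value}
        else:
            current = int(item["hits"])
            new_value = current + delta
            condition = "#h = :old"
            values = {":new": new_value, ":old": current}
        try:
            table.update_item(
                Key={"counter_id": counter_id},
                UpdateExpression="SET #h = :new",
                ConditionExpression=condition,
                ExpressionAttributeNames={"#h": "hits"},
                ExpressionAttributeValues=values,
            )
            return new_value
        except conflict:
            # Another writer got in first: re-read and try again.
            continue
    raise RuntimeError(f"gave up after {max_attempts} conflicting updates")
```

Every conflict costs an extra read and write, which is why this approach gets slower as more clients hit the same key at once.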

Solution 3

If you are going to write to a single DynamoDB key, you will suffer from the hot partition issue. The hot partition issue starts at around 300 TPS per index. So, if you have 5 indexes on the table, you may see the hot partition issue at around 300/5 = 60 TPS.

Otherwise, DynamoDB is scalable to about 10-40K TPS, depending on your use case.

Author by

Mark

Updated on July 09, 2020

Comments

Mark almost 4 years

I am considering using Amazon DynamoDB in my application, and I have a question regarding the reliability of its atomic counters.

I'm building a distributed application that needs to concurrently, and consistently, increment/decrement a counter stored in a DynamoDB attribute. I was wondering how reliable DynamoDB's atomic counter is in a heavily concurrent environment, where the concurrency level is extremely high (let's say, for example, an average rate of 20k concurrent hits per second; to give an idea, that would be almost 52 billion increments/decrements per month).

The counter should be super-reliable and never miss a hit. Has anybody tested DynamoDB in such critical environments?

Thanks