Published on

Amazon Dynamo DB Paper

Authors

Introduction

Amazon Dynamo DB is one of the most popular NoSQL database out there.

Amazon Dynamo DB is known for providing consistent performance at any scale

In 2021, during 66 hour prime day scale

  • Trillions of calls were made to DynamoDB
  • It peaked at 89.2M RPS - request per seconds
  • High availability with Single Digit Millisecond performance i.e less than or equal to 9ms

Almost all major internal services @ amazon uses Dynamo DB along with external customers

Goals of Dynamo DB Design

DynamoDB was conceptualized with the following in mind

  • provide consistent performance at any scale
  • low single-digit millisecond latency

Workload Pattern

  • Multi-tenant - Load of one customer should not affect another
  • High resource utilization
  • Boundless scale of tables
  • Predictable performance(be it MB or TB)
  • Highly available(replication and recovery)
  • flexible usecase support (schemaless)

Architecture

Dynamo DB is a collection of items

Each item is uniquely identified by its primary key

Primary key needs to be specified during table creation.Primary key is mad up of

  • Partition Key(required) - Hash Key. A unique identifier for each item in the table. DynamoDB uses this key to distribute data across partitions.
  • Sort key(optionoal)- Range Key: If you use a composite primary key, you define a sort key along with the partition key. This allows multiple items with the same partition key but different sort keys to be stored.

It is okay to not have a Sort key. In this case Partition key becomes the Primary Key

Dynamo DB also supports secondary indexes.

Secondary Indexes

  • Global Secondary Index (GSI): Adds another way to query data across the table, using different keys from the primary key.
  • Local Secondary Index (LSI): Allows querying by sort key in combination with the partition key.

DynamoDB Tables are divided into partitions. Each partition is disjoint, susbset and holds contiguous key-range

Each partition has multiple replicas, distributed across multiple availability zones - High availability. dynamo-replicas

The replicas of a partition forms a replication group. One of the replica is a leader and others are pure replica's or follower in that sense. Multi-paxos algorithm is used for consensus and leader election among the replicas

Any replica can trigger the election. When a leader is elected, it can continue to extend its leadership lease. The leader election is at replica level and not at data node level. You can find leader election diagram

dynamo-leader-election

Responsibility of the Replica Leader

  • serves writes
  • serves strongly consistent reads, you have the option to specify the consistency type during read operations. The Leader will handle this

Dynamo DB supports strongly and eventually consistent reads

Strongly consistent reads goes to the replica leader, while the eventually consistent reads goes to any replica

Also reads can be scaled when consistency is relaxed

When Write Comes to the leader

dynamo-leader-write

When the write request comes to the leader, it

  • Makes an entry to its WAL file
  • Sends the update to its peers(handling the same partition)
  • Peer writes to their WAL file
  • Once majority ACK, respond ok. Basically when a Quorum of N is reached

The WAL file contains the list of updates and delete operations