ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/projects/mini-dynamo
advanced
3 tracks | 11 tasks | 20h estimated

Build Mini-Dynamo

Implement Amazon Dynamo from the 2007 paper. Build a fully functional distributed key-value store featuring consistent hashing, quorum replication, sloppy quorums, hinted handoff, vector clock conflict detection, Merkle tree anti-entropy, and read repair.

storage · replication · consistent-hashing · quorum · vector-clocks · anti-entropy · dynamo

Amazon Dynamo

Dynamo: decentralized key-value store with eventual consistency. [Diagram: a client PUT reaches a coordinator node on a consistent hash ring of five nodes (A @100 through E @850), each annotated with a vector clock, e.g. A VC:[A:2,B:1]; the coordinator replicates to N=3 preferred replicas. Blue dashed = write path (W=2 of 3). Green dashed = Merkle tree anti-entropy. VC = vector clock.]

Published in 2007 at SOSP, Amazon's Dynamo paper defined a new category of distributed storage: highly available, eventually consistent, and always writable — even during network partitions. It powers Amazon's shopping cart and influenced every NoSQL database that followed, from Cassandra to Riak.

What You Will Build

You will implement every major subsystem of Dynamo from scratch, in five progressive stages:

  1. Consistent Hash Ring — position nodes and keys on a circular ring; add virtual nodes for uniform distribution; compute minimal key migrations when the topology changes.
  2. Quorum Replication — implement the NWR model; write to N replicas, succeed on W acknowledgements; read from N, return the highest-versioned value from R responses.
  3. Fault Tolerance — extend writes beyond preferred replicas using sloppy quorums; track hinted handoffs and deliver them on recovery.
  4. Vector Clocks — version every write with a vector clock; detect whether two versions are causally ordered or concurrently written (a conflict).
  5. Anti-Entropy — build Merkle tree bucket hashing to efficiently find which key ranges diverged between replicas; implement read-repair for on-demand healing.
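Stage 3's sloppy quorum and hinted handoff can be hard to picture before you build them. Here is a minimal Python sketch, under the assumption of in-process "nodes" rather than real RPC; the `Node` class, `sloppy_put`, and `deliver_hints` names are illustrative, not part of the project's API:

```python
class Node:
    """Toy node that can hold 'hinted' writes destined for a downed peer."""
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.hints = []   # (intended_node, key, value) kept until recovery
        self.up = True

def sloppy_put(preferred, fallback, key, value):
    """Write to the preferred replicas; if one is down, hand the write
    (plus a hint recording its intended owner) to a fallback node."""
    for node in preferred:
        if node.up:
            node.store[key] = value
        else:
            fallback.store[key] = value
            fallback.hints.append((node, key, value))

def deliver_hints(holder):
    """On recovery, forward each hinted write to its intended owner."""
    remaining = []
    for target, key, value in holder.hints:
        if target.up:
            target.store[key] = value
        else:
            remaining.append((target, key, value))
    holder.hints = remaining
```

The key property to preserve in your implementation: the write succeeds immediately even with a replica down, and the hint guarantees the intended owner eventually receives it.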

The Core Trade-off

Dynamo sits firmly in the AP quadrant of the CAP theorem: it sacrifices strong consistency in exchange for availability and partition tolerance. Understanding how it does this — and what "eventually consistent" means in practice — is the central lesson of this project.

Prerequisites

You should be comfortable with Python or Go, basic data structures (sorted lists, dictionaries), and the concept of hashing. Prior knowledge of distributed systems is not required — this project builds it.

Tracks

Consistent Hash Ring

0/4 completed

Build the core data structure that distributes keys across nodes. Understand why consistent hashing outperforms modular hashing for dynamic clusters, and how virtual nodes eliminate hot spots.
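The ring itself fits in a few dozen lines. A minimal Python sketch, assuming MD5 for ring positions and a sorted list plus `bisect` for clockwise lookup (the `HashRing` class and `vnodes` parameter are illustrative, not the project's required interface):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Map a string to a position on the ring [0, 2**32)
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class HashRing:
    """Consistent hash ring with virtual nodes."""
    def __init__(self, vnodes: int = 8):
        self.vnodes = vnodes
        self._positions = []   # sorted ring positions, for bisect
        self._ring = []        # parallel list of (position, node) pairs

    def add_node(self, node: str) -> None:
        # Each physical node gets `vnodes` positions to smooth distribution
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            idx = bisect.bisect(self._positions, pos)
            self._positions.insert(idx, pos)
            self._ring.insert(idx, (pos, node))

    def lookup(self, key: str) -> str:
        # First node clockwise from the key's position, wrapping around
        idx = bisect.bisect(self._positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

Because only the keys between a new node's positions and its predecessors move, adding or removing a node migrates roughly 1/n of the keys, versus nearly all of them under mod-N hashing.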

Ring Fundamentals

Node Membership Changes

Quorum Replication

0/3 completed

Implement the NWR quorum model that lets Dynamo tune its availability and consistency trade-off. Write to N replicas, require W acknowledgements. Read from N, return the highest-versioned answer from R responses.
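The NWR protocol above can be sketched in Python, modeling replicas as in-memory dicts that store `(version, value)` pairs in place of real RPC; the function names and the assumption that every replica is reachable are mine, not the project's:

```python
def quorum_write(replicas, key, value, version, W):
    """Send the write to all N replicas; succeed once W have acked.
    Here every replica is assumed reachable, so each store counts as an ack."""
    acks = 0
    for store in replicas:
        store[key] = (version, value)
        acks += 1
    return acks >= W

def quorum_read(replicas, key, R):
    """Collect responses from R replicas and return the highest-versioned value."""
    responses = []
    for store in replicas[:R]:
        if key in store:
            responses.append(store[key])
    if not responses:
        return None
    return max(responses)   # (version, value) tuples compare by version first
```

Tuning note from the paper: W + R > N makes read and write quorums overlap, so a read always sees at least one replica with the latest write; W + R <= N trades that away for lower latency.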

Quorum Protocol

Fault Tolerance

Conflict Resolution

0/4 completed

Build the versioning and reconciliation layer. Vector clocks track causal history across nodes. Merkle trees find diverged ranges efficiently. Read repair heals stale replicas without a dedicated background process.
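The causal-comparison rule at the heart of vector clocks is short enough to show in full. A Python sketch, representing each clock as a `{node: counter}` dict (the `compare` name and string return values are illustrative choices):

```python
def compare(vc_a: dict, vc_b: dict) -> str:
    """Classify two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
    b_le_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"       # a causally precedes b; b supersedes it
    if b_le_a:
        return "after"        # a causally follows b
    return "concurrent"       # neither dominates: a genuine conflict
```

Using the clocks from the diagram above: `[A:2,B:1]` is `before` `[A:2,B:2]` (the latter supersedes it), while `[A:2,B:1]` and `[A:1,C:3]` are `concurrent` and must be surfaced to the client for reconciliation.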

Vector Clocks

Anti-Entropy
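The anti-entropy idea reduces to comparing digests per key range. A simplified Python sketch of the bucket-hashing step, assuming a flat list of bucket digests rather than a full Merkle tree, and a toy MD5-based bucket assignment in place of Dynamo's ring-range partitioning (both are my simplifications):

```python
import hashlib

def _bucket(key: str, buckets: int) -> int:
    # Stable bucket assignment (Python's built-in hash() is seed-randomized)
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % buckets

def bucket_hashes(store: dict, buckets: int = 16) -> list:
    """Digest each key range so two replicas can compare bucket hashes
    and exchange only the keys in buckets that differ."""
    digests = [hashlib.sha256() for _ in range(buckets)]
    for key in sorted(store):           # sorted => order-independent digests
        digests[_bucket(key, buckets)].update(f"{key}={store[key]}".encode())
    return [d.hexdigest() for d in digests]

def diverged_buckets(local: list, remote: list) -> list:
    """Bucket indices whose digests disagree between two replicas."""
    return [i for i, (a, b) in enumerate(zip(local, remote)) if a != b]
```

A real Merkle tree layers this idea: interior nodes hash their children, so two in-sync replicas confirm agreement by comparing a single root hash, and divergence is located by descending only the mismatched subtrees.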