ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/projects/mini-dynamo/tasks/dynamo-t1-s1-basic-ring

Build the Consistent Hash Ring

Mini-Dynamo / Ring Fundamentals
intermediate

Concept

Consistent Hashing

In a traditional hash table, the assignment rule is node = hash(key) % N. This is simple but catastrophic when the cluster size N changes: adding one node remaps about N/(N+1) of the keyspace (nearly every key once the cluster is large), causing a massive cache invalidation storm or data migration wave across the entire fleet.
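
To see why, you can count how many keys change buckets when a modulo-hashed cluster grows from 4 to 5 nodes. A quick sketch (the key format and the MD5 hash are arbitrary illustrative choices, not part of the task):

import hashlib

def bucket(key, n):
    # modulo assignment: hash the key, then take the remainder by cluster size
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(100_000)]
moved = sum(1 for k in keys if bucket(k, 4) != bucket(k, 5))
print(f"{moved / len(keys):.0%} of keys change owner")  # roughly 80% when going from 4 to 5 nodes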

Consistent hashing solves this by mapping both nodes and keys onto a single circular number line — the ring. A key is owned by the first node encountered walking clockwise from the key's position.

[Diagram: a ring numbered 0 to 1000, traversed clockwise. Node n1 sits at 200 and owns (600..200] (wrapping past 0); node n2 sits at 600 and owns (200..600]. Key k1@350 maps clockwise to n2@600; key k2@700 wraps around to n1@200. Legend marks node positions and key positions.]

A key at position 350 walks clockwise and finds node n2 at 600 — that is its owner. Key k2 at position 700 walks past 1000, wraps to 0, and reaches n1 at 200. When you add a third node at position 400, only keys in the range (200, 400] need to migrate — that is one node's worth of data, roughly K/N keys total, regardless of cluster size.
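
The migration claim can be checked with a small sketch that reuses the positions from the diagram. The owner helper below is a simplified linear scan of the node positions, not the binary search shown in the next section:

before = {200: "n1", 600: "n2"}
after = {200: "n1", 400: "n3", 600: "n2"}  # n3 joins at position 400

def owner(pos, nodes):
    # clockwise successor: the first node position at or past pos,
    # wrapping to the lowest position when pos lies beyond the last node
    for p in sorted(nodes):
        if p >= pos:
            return nodes[p]
    return nodes[min(nodes)]

for key_pos in (150, 350, 450, 700):
    old, new = owner(key_pos, before), owner(key_pos, after)
    print(f"key@{key_pos}: {old} -> {new} ({'moves' if old != new else 'stays'})")
# only key@350, which sits in (200, 400], changes owner; the rest stay put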

The Clockwise Successor Rule

The ring is stored as a sorted list of node positions. Finding the owner of a position is a standard binary search for the first value greater than or equal to the query, with a wrap-around for positions beyond the last node:

import bisect

positions = [200, 600]  # sorted ascending
ring = {200: 'n1', 600: 'n2'}

def lookup(p):
    idx = bisect.bisect_left(positions, p)
    if idx == len(positions):
        idx = 0          # wrap around to first node
    return ring[positions[idx]]

This gives O(log N) lookup time where N is the number of nodes. The same logic applies with hundreds of nodes and millions of keys.
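
The snippet above takes a ring position directly; in practice both node names and keys are hashed onto the ring first. A minimal sketch of that step, assuming an MD5-based position function and a 2**32-slot ring (both are illustrative choices, not requirements of the task):

import bisect
import hashlib

RING_SIZE = 2**32  # assumed ring size for this sketch

def position(name):
    # map a node name or a key onto the ring with a stable hash
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % RING_SIZE

nodes = ["n1", "n2", "n3"]
positions = sorted(position(n) for n in nodes)
ring = {position(n): n for n in nodes}

def lookup_key(key):
    p = position(key)
    idx = bisect.bisect_left(positions, p)  # first node position >= p
    if idx == len(positions):
        idx = 0                             # wrap past the top of the ring
    return ring[positions[idx]]

print(lookup_key("user:42"), lookup_key("cart:7"))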

Why Consistent Hashing Matters in Dynamo

Amazon Dynamo stores hundreds of millions of objects across thousands of nodes in multiple datacenters. The consistent hash ring enables several critical operational properties:

  • Minimal migration on join: when a new node joins at position P, only the keys in the range (predecessor(P), P] need to move. Every other key stays exactly where it was.
  • Minimal migration on departure: when a node leaves, its keys shift to its clockwise successor. Only one neighbor is affected.
  • Predictable key ownership: any node can independently compute which node owns any key without a central directory. This eliminates a single point of failure.
  • Preference list derivation: Dynamo derives the N-node replication list for a key by walking N steps clockwise from the key's position. No coordination required (a sketch follows this list).
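
One way to picture the last point is a short sketch that walks the sorted position list clockwise from the key and collects the next n distinct nodes. The helper name and the n=2 replication factor are illustrative, not Dynamo's actual code:

import bisect

def preference_list(key_pos, positions, ring, n):
    # collect the next n distinct nodes clockwise from key_pos,
    # wrapping around the end of the sorted position list
    owners = []
    idx = bisect.bisect_left(positions, key_pos)
    while len(owners) < min(n, len(positions)):
        node = ring[positions[idx % len(positions)]]
        if node not in owners:
            owners.append(node)
        idx += 1
    return owners

positions = [200, 400, 600]
ring = {200: "n1", 400: "n3", 600: "n2"}
print(preference_list(350, positions, ring, n=2))  # ['n3', 'n2']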

Why It Matters in Practice

Before consistent hashing, adding a server to a distributed cache typically required a maintenance window, because most of the cache would suddenly miss. With consistent hashing, you can add or remove nodes continuously with zero downtime and minimal data movement. This is why distributed databases such as Cassandra, Riak, and ScyllaDB use consistent hashing as their partitioning scheme (Redis Cluster gets a similar effect from a fixed set of hash slots).

Common Pitfalls

  • Off-by-one on exact positions: when a key's position exactly matches a node's position, it belongs to that node (inclusive lower bound). Use bisect_left, not bisect_right.
  • Forgetting wraparound: if bisect_left returns an index equal to the list length, the position is beyond all nodes and must wrap to index 0 (this and the exact-match case are both exercised in the snippet after this list).
  • Hot spots with few nodes: with only one position per node, load distribution is random and uneven. The fix is virtual nodes — multiple positions per physical node — covered in the next task.
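
A few assertions against the two-node ring from the lookup snippet make the first two pitfalls concrete. This is a sanity-check sketch, not part of the required solution:

import bisect

positions = [200, 600]
ring = {200: "n1", 600: "n2"}

def lookup(p):
    idx = bisect.bisect_left(positions, p)  # inclusive: an exact match belongs to that node
    if idx == len(positions):
        idx = 0                             # past the last node: wrap to the first
    return ring[positions[idx]]

assert lookup(600) == "n2"  # exact match on a node position (bisect_right would wrongly give n1)
assert lookup(601) == "n1"  # beyond the last node: wraps around the ring
assert lookup(200) == "n1"  # exact match at the first node
print("pitfall checks pass")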