ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/projects/mini-dynamo/tasks/dynamo-t1-s2-node-join
DS

Compute Key Migration on Node Join

Mini-Dynamo / Node Membership Changes
intermediate

Concept

Minimal Key Migration on Node Join

One of the most important properties of consistent hashing is that adding a node to the ring causes only O(K/N) keys to move, where K is the total key count and N is the current node count. Modular hashing requires O(K) moves because changing N scrambles the entire mapping. Consistent hashing confines disruption to a single neighborhood.

BEFORE n1 @200 n2 @600 c@300 d@450 c@300, d@450 owned by n2 JOIN n3@400 AFTER n3 joins @400 n1 @200 n3 @400 n2 @600 c@300 → n3 d@450 stays n2 MOVE c n3 claims range (200..400] only c@300 is in that range

The Claimed Range

When a new node joins at position P, it claims responsibility for the key range:

(predecessor(P), P]

The predecessor is the largest occupied ring position strictly less than P (with wraparound if P is the smallest position). Every key whose position falls in this range was previously owned by the node just clockwise of P — the new node's immediate successor. Those keys, and only those keys, need to migrate.

In the diagram above, n3 joins at 400. Its predecessor is n1 at 200, so its claimed range is (200, 400]. Key c at position 300 falls in this range and moves from n2 (its previous owner) to n3. Key d at position 450 is outside the range and stays with n2. No other nodes are affected at all.

Why Only One Neighbor Is Affected

Keys outside the new node's claimed range had their clockwise successor unchanged by the join. The new node only intercepts traffic that was already heading to its immediate successor. This is the minimal disruption property: the cost of a join is proportional to the size of one node's range — approximately K/N keys — not the full dataset K.

Implementation Steps

  1. Compute the new node's predecessor: the largest ring position strictly less than new_pos. If new_pos is smaller than all existing positions, wrap to the last position in the ring.
  2. Identify the previous owner: perform a lookup of new_pos before inserting the new node. That owner is the node losing keys.
  3. Iterate over all stored keys. If a key's position falls in the range (predecessor_pos, new_pos], it belongs to the new node.
  4. Insert the new node into the ring.
  5. Output sorted MOVE key FROM old TO new lines.
def in_range(key_pos, prev_pos, new_pos):
    if new_pos > prev_pos:
        # normal non-wrapping range
        return prev_pos < key_pos <= new_pos
    else:
        # wraparound: range spans the 0/1000 boundary
        return key_pos > prev_pos or key_pos <= new_pos

Why It Matters in Production

In Amazon's Dynamo, nodes join the ring frequently: capacity is added ahead of traffic spikes, nodes are replaced after hardware failures, and software updates require rolling restarts. Each join must complete without stalling reads or writes to the rest of the cluster. The O(K/N) migration bound means joining a 100-node cluster with a billion keys only moves 10 million keys — and that work can be done incrementally while the new node begins serving requests.

Common Pitfalls

  • Computing predecessor after insertion: always find the predecessor before inserting the new node, or the ring state will be wrong.
  • Wraparound range: when the new node's position is smaller than all existing positions, the predecessor is the last node in the sorted list and the range wraps across the ring boundary.
  • Deterministic output: sort migrations alphabetically by key name so the output is deterministic regardless of dictionary iteration order.
main.py
python

Sign in to run and submit code.