ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/sharder/tasks/task-18-2-4-node-removal
TASK

Implementation

When a node leaves the ring (graceful shutdown or crash), its key range must be taken over by its successor. The two scenarios require different handling.

Graceful shutdown:

  1. Node announces it is leaving
  2. Node transfers all its keys to its clockwise successor(s)
  3. Ring topology is updated
  4. No data loss, minimal disruption
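
The graceful steps above can be sketched as follows. This is a minimal illustration, not the task's reference solution: the `Ring` class, the `ring_hash` helper, and the choice of MD5 are all assumptions made for the example, and each node has a single ring position, so there is exactly one clockwise successor.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Position on a 32-bit ring; MD5 is an arbitrary choice for this sketch."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


class Ring:
    """Hypothetical in-memory ring with one position per node."""

    def __init__(self, node_ids):
        self.positions = sorted((ring_hash(n), n) for n in node_ids)
        self.store = {n: {} for n in node_ids}  # node id -> {key: value}

    def owner(self, key):
        """First node clockwise from the key's position, wrapping around."""
        hashes = [h for h, _ in self.positions]
        i = bisect.bisect_left(hashes, ring_hash(key)) % len(self.positions)
        return self.positions[i][1]

    def put(self, key, value):
        self.store[self.owner(key)][key] = value

    def remove_graceful(self, node):
        """Transfer the node's keys to its successor, then update the ring."""
        if len(self.positions) < 2:
            raise ValueError("cannot remove the last node")
        names = [n for _, n in self.positions]
        idx = names.index(node)
        successor = names[(idx + 1) % len(names)]
        moved = self.store.pop(node)         # step 2: hand over every key
        self.store[successor].update(moved)
        del self.positions[idx]              # step 3: update the topology
        return len(moved), [successor]
```

The returned pair could feed the `keys_migrated` and `target_nodes` fields of the reply. Because the keys are moved before the position is deleted, every key stays readable on some node throughout the removal.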

Crash recovery:

  1. Other nodes detect the failure (missed heartbeats)
  2. The successor takes over the key range
  3. Data is recovered from replica copies
  4. New replicas are created to restore the replication factor
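
The crash path can be sketched in the same spirit. The example below assumes a replication scheme the task does not spell out: each key lives on its owner plus the next `replication_factor - 1` nodes clockwise. `ReplicatedRing`, `failed_nodes`, and the timeout-based detection rule are hypothetical names and choices for illustration only.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Position on a 32-bit ring; MD5 is an arbitrary choice for this sketch."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def failed_nodes(last_heartbeat, now, timeout):
    """Step 1: nodes whose last heartbeat is older than `timeout` seconds."""
    return sorted(n for n, seen in last_heartbeat.items() if now - seen > timeout)


class ReplicatedRing:
    """Hypothetical ring where each key is stored on several nodes."""

    def __init__(self, node_ids, replication_factor=2):
        self.rf = replication_factor
        self.positions = sorted((ring_hash(n), n) for n in node_ids)
        self.store = {n: {} for n in node_ids}  # node id -> {key: value}

    def replicas_for(self, key):
        """Owner plus the next rf-1 nodes clockwise (assumed placement rule)."""
        hashes = [h for h, _ in self.positions]
        start = bisect.bisect_left(hashes, ring_hash(key))
        count = min(self.rf, len(self.positions))
        size = len(self.positions)
        return [self.positions[(start + i) % size][1] for i in range(count)]

    def put(self, key, value):
        for node in self.replicas_for(key):
            self.store[node][key] = value

    def handle_crash(self, node):
        """Steps 2-4: drop the node, recover from replicas, re-replicate."""
        self.store.pop(node)  # the crashed node's local data is gone
        self.positions = [(h, n) for h, n in self.positions if n != node]
        surviving = {}
        for data in self.store.values():
            surviving.update(data)  # step 3: read the surviving copies
        recreated = 0
        for key, value in surviving.items():
            for target in self.replicas_for(key):
                if key not in self.store[target]:
                    self.store[target][key] = value  # step 4: restore rf
                    recreated += 1
        return recreated
```

Under this placement rule a single crash never loses a key when the replication factor is at least 2, since one other node always holds a copy.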

Request:  {"type": "ring_remove_node", "msg_id": 1, "node": "n2", "mode": "graceful"}
Response: {"type": "ring_remove_node_ok", "in_reply_to": 1, "keys_migrated": 333, "target_nodes": ["n1", "n3"], "mode": "graceful"}
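
As a rough sketch of how the reply body could be assembled from a request, the helper below mirrors the fields shown in the example pair above. The function name `build_remove_reply` is hypothetical, and rejecting modes other than the two shown is an assumption; the task does not say how invalid requests should be answered.

```python
VALID_MODES = {"graceful", "crash"}  # the only two modes the task shows


def build_remove_reply(request, keys_migrated, target_nodes):
    """Build a ring_remove_node_ok body for a ring_remove_node request body."""
    if request["mode"] not in VALID_MODES:
        # Assumption: the task does not define an error reply format.
        raise ValueError(f"unknown mode: {request['mode']!r}")
    return {
        "type": "ring_remove_node_ok",
        "in_reply_to": request["msg_id"],     # echo the request's msg_id
        "keys_migrated": keys_migrated,
        "target_nodes": sorted(target_nodes),  # stable order for comparison
        "mode": request["mode"],
    }
```

Sorting `target_nodes` is a design choice for deterministic output; the task does not state whether order matters.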

Sample Test Cases

Graceful removal migrates keys
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1","n2","n3"]}}
{"src":"c1","dest":"n1","body":{"type":"ring_remove_node","msg_id":2,"node":"n2","mode":"graceful"}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}

Crash recovery takes over key range
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1","n2","n3"]}}
{"src":"c1","dest":"n1","body":{"type":"ring_remove_node","msg_id":2,"node":"n3","mode":"crash"}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
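
The test inputs wrap each body in a `src`/`dest` envelope. A minimal sketch of the envelope handling that would produce the `init_ok` line shown in the expected output is given below. The `Node` class and its method names are hypothetical, and the `ring_remove_node` handler is deliberately left out because the expected output for it is not shown here.

```python
import json


class Node:
    """Hypothetical message handler covering only the init exchange."""

    def __init__(self):
        self.node_id = None
        self.node_ids = []
        self.next_msg_id = 0  # the expected init_ok carries msg_id 0

    def reply(self, request, body):
        """Wrap a reply body in an envelope addressed to the sender."""
        body = dict(body, in_reply_to=request["body"]["msg_id"],
                    msg_id=self.next_msg_id)
        self.next_msg_id += 1
        return {"src": self.node_id, "dest": request["src"], "body": body}

    def handle(self, line):
        """Parse one JSON line and return the reply envelope, or None."""
        request = json.loads(line)
        body = request["body"]
        if body["type"] == "init":
            self.node_id = body["node_id"]
            self.node_ids = list(body["node_ids"])  # initial ring members
            return self.reply(request, {"type": "init_ok"})
        return None  # other message types are omitted from this sketch
```

A full solution would read lines from standard input, call `handle` on each, and print every non-`None` reply with `json.dumps`.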

Hints

Hint 1
On graceful shutdown: node transfers its keys to its clockwise successor before leaving
Hint 2
On crash: the successor detects the failure and takes over the key range
Hint 3
Graceful removal is faster because keys are transferred before the node leaves; crash removal must recover keys from replicas
Hint 4
With virtual nodes, keys from the removed vnodes distribute to multiple successors
Hint 5
Replica copies prevent data loss on a crash, provided at least one replica of each key survives
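
Hint 4 can be illustrated with a small experiment that counts where a removed node's keys land. This is only a sketch: the `name#index` labelling of virtual nodes, the MD5-based `ring_hash`, and the function names are assumptions, not part of the task.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(value: str) -> int:
    """Position on a 32-bit ring; MD5 is an arbitrary choice for this sketch."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def build_ring(node_ids, vnodes):
    """Place `vnodes` positions per node, labelled "node#i" (assumed scheme)."""
    return sorted((ring_hash(f"{n}#{i}"), n)
                  for n in node_ids for i in range(vnodes))


def owner(ring, key):
    """Physical node owning the first position clockwise from the key."""
    hashes = [h for h, _ in ring]
    return ring[bisect.bisect_left(hashes, ring_hash(key)) % len(ring)][1]


def takeover_counts(node_ids, removed, keys, vnodes):
    """Count how many of the removed node's keys each remaining node receives."""
    before = build_ring(node_ids, vnodes)
    after = [(h, n) for h, n in before if n != removed]
    moved = [k for k in keys if owner(before, k) == removed]
    return Counter(owner(after, k) for k in moved)
```

With `vnodes=1` every moved key goes to the single clockwise successor, while with many virtual nodes per physical node the removed node's keys are typically split across several of the remaining nodes. Keys that did not belong to the removed node keep their owner in both cases.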
OVERVIEW

Key Concepts

node removal, graceful shutdown, crash recovery, key takeover, successor promotion
Handle Node Removal with Graceful and Crash Recovery - The Sharder | Build Distributed Systems