Subtracks & Tasks
Range Sharding
Implement Shard Controller
Build a shard controller that manages shard assignment: 1. Maintain configuration: which replica group owns which shards 2. Support operations: Join ...
Implement Consistent Hashing for Sharding
Use consistent hashing for shard assignment: 1. Place shards on a hash ring 2. Hash each key to a ring position 3. Find the next shard clockwise from...
Handle Configuration Changes
Handle shard configuration changes: 1. Replica groups poll controller for configuration updates 2. Detect when shard assignment changes 3. Start migr...
Implement Data Migration
Implement data migration between replica groups: 1. Source group: stop accepting writes for migrating shard 2. Create snapshot of shard data + client...
Build Complete Sharded Key-Value Store
Build a complete sharded KV store: 1. Client determines shard for key 2. Client routes to replica group owning shard 3. If wrong group, get new confi...
Consistent Hashing
Implement a Consistent Hash Ring
Consistent hashing places nodes and keys on a circular ring, minimizing key redistribution when nodes join or leave. Unlike modulo hashing (`hash(key)...
Add Virtual Nodes for Even Distribution
With few physical nodes, a consistent hash ring has uneven key distribution. Virtual nodes fix this by giving each physical node V positions on the ri...
Handle Node Addition with Minimal Key Migration
When a node joins, it takes over a portion of the key space from its clockwise neighbor. Only the keys that now fall in the new node's range need to m...
Handle Node Removal with Graceful and Crash Recovery
When a node leaves the ring (graceful shutdown or crash), its key range must be taken over by its successor. The two scenarios require different handl...
Implement Rendezvous Hashing (Highest Random Weight)
Rendezvous hashing (Highest Random Weight) is an alternative to consistent hashing. For each key, compute a score for every node, and the highest scor...
Cross-Shard Queries
Implement Scatter-Gather Query Execution
Scatter-gather is a fundamental distributed query execution pattern. The coordinator "scatters" a query to all shards, each shard processes its local ...
Implement Cross-Shard Aggregations
Distributed aggregations require computing partial aggregates on each shard, then merging partial results at the coordinator. Different aggregation fu...
Implement Cross-Shard JOINs
Cross-shard JOINs are expensive because they may require moving data across the network. The optimal strategy depends on how tables are partitioned. ...
Implement Secondary Indexes on Sharded Data
When data is sharded by a primary key (e.g., `user_id`), querying by a secondary key (e.g., `email`) requires a secondary index. There are two main st...
Implement Distributed ORDER BY with LIMIT
Implementing `ORDER BY score DESC LIMIT 10` in a distributed system requires each shard to return its local top results, then the coordinator merges t...
Interview Prep
Common interview questions for Distributed Systems / Backend Engineer roles that map directly to what you build in this track.
Model Answer
Modulo hashing (key % N) remaps ~(N-1)/N of all keys when a node is added/removed — most keys move. Consistent hashing maps both keys and nodes to a ring; each key goes to the next clockwise node. When a node is added/removed, only ~1/N of keys move (just those between the new node and its predecessor). This minimizes disruption to caches and stateful backends during cluster changes. Virtual nodes (multiple ring positions per physical node) improve balance and allow heterogeneous capacity.
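A quick way to see the difference is to assign the same keys under both schemes and count how many change owner when a tenth node joins. This is only a sketch; the MD5 ring positions and the `key-i`/`node-i` names are made up for illustration:

```python
import hashlib


def h(s: str) -> int:
    # Map a string to a position on a 128-bit ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)


def modulo_owner(key: str, n: int) -> int:
    return h(key) % n


def ring_owner(key: str, nodes) -> str:
    # Next node clockwise from the key's position (wrap to the smallest position).
    pos = h(key)
    ring = sorted((h(node), node) for node in nodes)
    for node_pos, node in ring:
        if node_pos >= pos:
            return node
    return ring[0][1]


keys = [f"key-{i}" for i in range(10_000)]
nodes = [f"node-{i}" for i in range(9)]

moved_mod = sum(modulo_owner(k, 9) != modulo_owner(k, 10) for k in keys)
moved_ring = sum(ring_owner(k, nodes) != ring_owner(k, nodes + ["node-9"]) for k in keys)
# Modulo moves most keys; the ring moves only the keys on node-9's new arc.
print(moved_mod / len(keys), moved_ring / len(keys))
```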
Model Answer
Detection: monitor per-shard write throughput and CPU. A hotspot shows significantly higher rates on one shard. Causes: celebrity problem (one key is extremely popular), time-based keys (all writes go to the "today" shard), poor shard key choice. Solutions: (1) Add random suffix to the hot key and scatter writes across N virtual shards, then aggregate on read, (2) Cache the hot key (read cache, write buffer), (3) Use a different shard key that distributes better, (4) Pre-shard by anticipating known hotspots (Twitter does this for celebrity accounts during major events).
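A sketch of strategy (1), salting a known hot key across sub-shards. `SUB_SHARDS` and the `kv` client (with `incr`/`get`) are assumptions for illustration, not part of the track:

```python
import random

SUB_SHARDS = 8  # number of sub-keys for one known hot key (tuning assumption)


def salted_key(hot_key: str) -> str:
    # Each write picks one of N sub-keys at random, so writes spread across shards.
    return f"{hot_key}#{random.randrange(SUB_SHARDS)}"


def incr_counter(kv, hot_key: str, delta: int = 1) -> None:
    # `kv` is a hypothetical sharded client exposing incr/get.
    kv.incr(salted_key(hot_key), delta)


def read_counter(kv, hot_key: str) -> int:
    # Reads pay the price: they must aggregate over every sub-key.
    return sum(kv.get(f"{hot_key}#{i}", 0) for i in range(SUB_SHARDS))
```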
Model Answer
Cross-shard transactions require distributed coordination. Options: (1) Two-phase commit (2PC): a coordinator sends PREPARE to both shards, collects votes, then sends COMMIT or ABORT. Problem: coordinator failure blocks participants. (2) Saga pattern: sequence of local transactions with compensating transactions on failure. No atomicity guarantees during execution, but eventual consistency. (3) Google Spanner / CockroachDB: 2PC over Paxos/Raft groups per shard, using TrueTime/HLC for external consistency. Cross-shard transactions are inherently more expensive than single-shard ones — a common design goal is to colocate frequently transacted data on the same shard.
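A minimal coordinator-side sketch of option (1), two-phase commit. The `participants` stubs, `txn` object, and `coordinator_log` are hypothetical placeholders:

```python
def two_phase_commit(coordinator_log, participants, txn) -> bool:
    # Phase 1: every participant durably stages the transaction and votes.
    votes = [p.prepare(txn) for p in participants]   # parallel RPCs in practice

    # The decision is logged before it is announced, so a restarted coordinator
    # can finish the protocol instead of leaving participants blocked.
    if all(votes):
        coordinator_log.append(("commit", txn.id))
        for p in participants:
            p.commit(txn.id)
        return True
    coordinator_log.append(("abort", txn.id))
    for p in participants:
        p.abort(txn.id)
    return False
```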
Model Answer
Cassandra uses a consistent hash ring with a configurable partitioner (Murmur3 by default). The partition key is hashed to a position on the ring. With replication factor N, the row is stored on the N consecutive nodes clockwise from that position. When a node fails, writes use hinted handoff (the write is temporarily stored on another node and replayed when the failed node recovers) and reads are served from the remaining replicas. With RF=3, Cassandra can tolerate 1 node failure for all operations and 2 node failures for reads (depending on consistency level).
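A sketch of that clockwise replica placement, assuming the ring is a sorted list of (position, node) pairs; with virtual nodes, positions belonging to an already-chosen node are skipped so replicas land on distinct physical nodes. Names and the `rf` default are illustrative:

```python
import bisect
import hashlib


def replicas_for(ring, key: str, rf: int = 3):
    # `ring` is a sorted list of (position, node) pairs.
    pos = int(hashlib.md5(key.encode()).hexdigest(), 16)
    idx = bisect.bisect_left(ring, (pos,))
    chosen = []
    for step in range(len(ring)):
        node = ring[(idx + step) % len(ring)][1]
        if node not in chosen:          # skip extra virtual nodes of chosen nodes
            chosen.append(node)
        if len(chosen) == rf:
            break
    return chosen
```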
Model Answer
A shard controller (or metadata service) maintains the mapping of key ranges to physical shard locations. Clients query the controller to find which shard to contact. It handles shard splits (when a shard is too large), merges, and rebalancing when nodes are added/removed. In MongoDB, this is mongos + config servers. In Vitess, it is vtgate + the topology service. The shard controller is the critical single source of truth and must itself be highly available (replicated via Raft/Paxos) and consistent.
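A toy version of the controller's Join/Leave bookkeeping over a fixed number of logical shards. `NSHARDS` and the round-robin rebalance are simplifying assumptions; a real controller would also minimize how many shards change owner per configuration change:

```python
NSHARDS = 16  # fixed number of logical shards (assumption for illustration)


class ShardController:
    def __init__(self):
        self.groups = []                      # replica-group ids, in join order
        self.assignment = [None] * NSHARDS    # shard index -> owning group

    def join(self, gid) -> None:
        self.groups.append(gid)
        self._rebalance()

    def leave(self, gid) -> None:
        self.groups.remove(gid)
        self._rebalance()

    def owner(self, shard: int):
        return self.assignment[shard]

    def _rebalance(self) -> None:
        # Spread shards evenly across the current groups.
        if not self.groups:
            self.assignment = [None] * NSHARDS
            return
        self.assignment = [self.groups[s % len(self.groups)] for s in range(NSHARDS)]
```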
Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.
Common Mistakes
The top 5 mistakes builders make in this track — and exactly how to fix them.
Why it happens
`key % n` changes for nearly all keys when n changes by 1. In a cluster of 10 nodes, removing one node remaps ~90% of keys.
The fix
Use consistent hashing with a virtual node ring. Adding/removing a physical node only remaps the keys on the adjacent arc of the ring, proportional to 1/n of the keyspace.
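A sketch of a virtual-node ring; `VNODES` per physical node is a tuning assumption:

```python
import bisect
import hashlib

VNODES = 64  # ring positions per physical node


def build_ring(nodes):
    # Each physical node appears VNODES times, which evens out arc lengths.
    ring = []
    for node in nodes:
        for i in range(VNODES):
            pos = int(hashlib.md5(f"{node}#vn{i}".encode()).hexdigest(), 16)
            ring.append((pos, node))
    ring.sort()
    return ring


def owner(ring, key: str) -> str:
    # First ring position at or after the key's position, wrapping past the end.
    pos = int(hashlib.md5(key.encode()).hexdigest(), 16)
    idx = bisect.bisect_left(ring, (pos,))
    return ring[idx % len(ring)][1]
```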
Why it happens
Pure sharding without replication means each shard is owned by exactly one node. That node going down takes its entire keyspace offline.
The fix
Replicate each shard across R nodes (typically R=3). Reads and writes go to a quorum (2 of the 3 replicas), so losing a single node keeps the shard available. This is the approach used by Dynamo and Cassandra.
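A sketch of quorum reads and writes with N=3, W=2, R=2; the `replica` stubs (`put`, and `get` returning a `(value, version)` pair) are hypothetical:

```python
N, W, R = 3, 2, 2      # replication factor, write quorum, read quorum
assert W + R > N       # quorums overlap, so a read sees the latest committed write


def quorum_write(replicas, key, value, version) -> bool:
    acks = 0
    for replica in replicas:                 # parallel RPCs in practice
        try:
            replica.put(key, value, version)
            acks += 1
        except ConnectionError:
            pass                             # a down replica just costs one ack
    return acks >= W


def quorum_read(replicas, key):
    responses = []
    for replica in replicas:
        try:
            responses.append(replica.get(key))   # (value, version)
        except ConnectionError:
            pass
    if len(responses) < R:
        raise RuntimeError("read quorum not reached")
    return max(responses, key=lambda r: r[1])[0]  # highest version wins
```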
Why it happens
If the key distribution is skewed (e.g., one key gets 90% of writes), consistent hashing still maps all that traffic to a single virtual node.
The fix
For write-heavy keys, shard at a sub-key level (e.g., by user + random suffix). For read-heavy keys, replicate aggressively and spread reads across replicas.
Why it happens
During a rebalance, the node that receives a request may no longer own the key by the time it processes it.
The fix
When a node receives a request for a key it no longer owns, it should forward to the correct owner (or return a redirect hint) and the client library should retry once.
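A sketch of that redirect-and-retry path. The node, cluster, and shard-map objects are hypothetical, and both sides are collapsed into one process for brevity; in a real system the hint travels back over RPC:

```python
class WrongOwner(Exception):
    def __init__(self, owner_id):
        self.owner_id = owner_id


def serve_get(node, key):
    # Server side: refuse keys this node no longer owns and hint at the owner.
    owner_id = node.shard_map.owner(key)
    if owner_id != node.id:
        raise WrongOwner(owner_id)
    return node.store[key]


def client_get(cluster, shard_map, key):
    # Client side: route with the local (possibly stale) map, retry once
    # against the owner the server hints at.
    try:
        return serve_get(cluster[shard_map.owner(key)], key)
    except WrongOwner as redirect:
        return serve_get(cluster[redirect.owner_id], key)
```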
Why it happens
An in-memory shard map is lost on restart. If nodes independently re-derive the map they may produce different results.
The fix
Store the shard map in a consistent, highly available metadata service (etcd, ZooKeeper, or the Maelstrom `lin-kv` service). All nodes read the same authoritative map.
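A sketch of updating a versioned shard map with compare-and-swap; the `kv` client interface here is an assumption standing in for etcd, ZooKeeper, or lin-kv, not a real library API:

```python
def update_shard_map(kv, mutate):
    # `kv` is a hypothetical client for a linearizable store exposing
    # get(key) -> (value, version) and cas(key, expected_version, new_value) -> bool.
    while True:
        shard_map, version = kv.get("shard-map")
        new_map = mutate(shard_map)
        if kv.cas("shard-map", version, new_map):
            return new_map
        # Lost the race: another node updated the map first; retry against the
        # fresh version so everyone converges on the same authoritative map.
```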
Comparison Mode
Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.
| Dimension | Range Sharding | Hash Sharding | Consistent Hashing |
|---|---|---|---|
| Key mapping | Key falls within a shard's range (e.g., A-M, N-Z) | hash(key) % n determines the shard | Key maps to position on a ring; nearest clockwise shard owns it |
| Range query support | Excellent — contiguous keys are co-located | None — related keys scatter across shards | None by default |
| Rebalancing on add/remove | Split or merge affected ranges only | Remaps nearly all keys (catastrophic) | Remaps only ~1/n of keys |
| Hotspot risk | High — sequential writes concentrate on one shard | Low — hash distributes uniformly | Low with virtual nodes |
| Metadata overhead | Shard map needed (small) | Just the formula | Ring with virtual nodes (larger) |
| Used in | HBase, Bigtable, Spanner | Naive setups, memcached (old) | DynamoDB, Cassandra, Riak |
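To make the range-sharding column concrete, a small lookup sketch with made-up split points; contiguous scans touch only the shards covering the range, which is the property hash-based schemes give up:

```python
import bisect

# Sorted split points define which shard owns which contiguous key range.
SPLIT_POINTS = ["g", "n", "t"]     # shard 0: keys < "g", shard 1: keys < "n", ...
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str) -> str:
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key)]


def shards_for_range(start: str, end: str):
    # Only the shards whose ranges overlap [start, end] are contacted.
    first = bisect.bisect_right(SPLIT_POINTS, start)
    last = bisect.bisect_right(SPLIT_POINTS, end)
    return SHARDS[first:last + 1]
```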
Concepts Covered
Prerequisites
It is recommended to complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.