ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/sharder

The Sharder

Advanced | 15 tasks

Implement horizontal scaling through sharding. Build shard controllers, consistent hashing, configuration changes, and data migration to create a scalable distributed storage system.

Subtracks & Tasks

Interview Prep

Common interview questions for Distributed Systems / Backend Engineer roles that map directly to what you build in this track.

How does consistent hashing differ from modulo hashing, and why does it matter when nodes are added or removed?

Model Answer

Modulo hashing (key % N) remaps ~(N-1)/N of all keys when a node is added/removed — most keys move. Consistent hashing maps both keys and nodes to a ring; each key goes to the next clockwise node. When a node is added/removed, only ~1/N of keys move (just those between the new node and its predecessor). This minimizes disruption to caches and stateful backends during cluster changes. Virtual nodes (multiple ring positions per physical node) improve balance and allow heterogeneous capacity.
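A minimal sketch of such a ring in Go, assuming SHA-1 for ring positions and a fixed number of virtual nodes per physical node; the names (`Ring`, `vnodesPerNode`, `Owner`) are illustrative, not part of the track's API:

```go
package ring

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"sort"
)

// Ring maps virtual-node positions to physical nodes.
type Ring struct {
	points []uint64          // sorted vnode positions on the ring
	owner  map[uint64]string // vnode position -> physical node
}

// hashPos derives a 64-bit ring position from a string.
func hashPos(s string) uint64 {
	sum := sha1.Sum([]byte(s))
	return binary.BigEndian.Uint64(sum[:8])
}

// New places vnodesPerNode virtual nodes on the ring for each physical node.
func New(nodes []string, vnodesPerNode int) *Ring {
	r := &Ring{owner: make(map[uint64]string)}
	for _, n := range nodes {
		for i := 0; i < vnodesPerNode; i++ {
			p := hashPos(fmt.Sprintf("%s#%d", n, i))
			r.points = append(r.points, p)
			r.owner[p] = n
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Owner returns the node whose vnode is first at or after the key's
// position, wrapping around the ring (the "next clockwise node").
func (r *Ring) Owner(key string) string {
	p := hashPos(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= p })
	if i == len(r.points) {
		i = 0 // wrap around
	}
	return r.owner[r.points[i]]
}
```

Adding or removing a physical node only touches the vnode positions that node contributes; keys hashing elsewhere on the ring keep their owner, which is the ~1/N property described above.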

How do you detect and mitigate a hot shard (hotspot) in a sharded system?

Model Answer

Detection: monitor per-shard write throughput and CPU. A hotspot shows significantly higher rates on one shard. Causes: celebrity problem (one key is extremely popular), time-based keys (all writes go to the "today" shard), poor shard key choice. Solutions: (1) Add random suffix to the hot key and scatter writes across N virtual shards, then aggregate on read, (2) Cache the hot key (read cache, write buffer), (3) Use a different shard key that distributes better, (4) Pre-shard by anticipating known hotspots (Twitter does this for celebrity accounts during major events).
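A sketch of approach (1), salting a write-heavy key across N sub-keys and fanning reads out over all of them; `shardCount` and the key format are arbitrary choices for illustration, not values from the track:

```go
package main

import (
	"fmt"
	"math/rand"
)

const shardCount = 8 // number of sub-keys the hot key is spread over

// SaltedKey picks one of shardCount sub-keys at random for each write,
// scattering the hot key's write load across several shards.
func SaltedKey(hotKey string) string {
	return fmt.Sprintf("%s#%d", hotKey, rand.Intn(shardCount))
}

// ReadAll lists every sub-key; the reader fetches each one and aggregates
// (e.g., sums counters), trading a write hotspot for a fan-out read.
func ReadAll(hotKey string) []string {
	keys := make([]string, shardCount)
	for i := range keys {
		keys[i] = fmt.Sprintf("%s#%d", hotKey, i)
	}
	return keys
}

func main() {
	fmt.Println("write goes to:", SaltedKey("celebrity:123:likes"))
	fmt.Println("read fans out to:", ReadAll("celebrity:123:likes"))
}
```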

How would you handle a transaction that spans two shards?

Model Answer

Cross-shard transactions require distributed coordination. Options: (1) Two-phase commit (2PC): a coordinator sends PREPARE to both shards, collects votes, then sends COMMIT or ABORT. Problem: coordinator failure blocks participants. (2) Saga pattern: sequence of local transactions with compensating transactions on failure. No atomicity guarantees during execution, but eventual consistency. (3) Google Spanner / CockroachDB: 2PC over Paxos/Raft groups per shard, using TrueTime/HLC for external consistency. Cross-shard transactions are inherently more expensive than single-shard ones — a common design goal is to colocate frequently transacted data on the same shard.
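A bare-bones 2PC coordinator over an abstract participant interface, just to make the prepare/commit flow concrete; `Participant` is a placeholder, and a real coordinator would also persist its decision to a durable log so it can recover after a crash:

```go
package twopc

import "fmt"

// Participant is one shard taking part in the distributed transaction.
type Participant interface {
	Prepare(txID string) error // vote yes by returning nil
	Commit(txID string) error
	Abort(txID string) error
}

// RunTransaction drives the two phases across all shards.
func RunTransaction(txID string, shards []Participant) error {
	// Phase 1: collect votes. Any error or "no" vote aborts everyone.
	for _, p := range shards {
		if err := p.Prepare(txID); err != nil {
			for _, q := range shards {
				q.Abort(txID)
			}
			return fmt.Errorf("transaction %s aborted: %w", txID, err)
		}
	}
	// Phase 2: all voted yes, so commit is now the only allowed outcome.
	// If the coordinator dies here, participants stay blocked until it
	// recovers: the classic 2PC weakness mentioned above.
	for _, p := range shards {
		p.Commit(txID)
	}
	return nil
}
```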

How does Cassandra distribute data across nodes, and what happens when a node fails?

Model Answer

Cassandra uses a consistent hash ring with a configurable partitioner (Murmur3 by default). The partition key is hashed to a position on the ring. With replication factor N, the row is stored on the N consecutive nodes clockwise from that position. When a node fails, reads/writes use hinted handoff (temporarily written to another node, replayed when the failed node recovers) or serve from the remaining replicas. With RF=3, Cassandra can tolerate 1 node failure for all operations and 2 node failures for reads (depending on consistency level).
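A simplified sketch of that placement rule: walk the ring clockwise from the key's position and collect the first rf distinct physical nodes. It ignores Cassandra's rack and datacenter awareness, and the type names are illustrative:

```go
package replicas

import "sort"

// VNode is one virtual-node entry on the ring.
type VNode struct {
	Pos  uint64 // position on the ring
	Node string // physical node owning this position
}

// ReplicasFor returns rf distinct nodes, starting at the first vnode at or
// after keyPos and wrapping around the ring. The ring slice must be sorted
// by Pos.
func ReplicasFor(ring []VNode, keyPos uint64, rf int) []string {
	start := sort.Search(len(ring), func(i int) bool { return ring[i].Pos >= keyPos })
	seen := make(map[string]bool)
	var out []string
	for i := 0; i < len(ring) && len(out) < rf; i++ {
		n := ring[(start+i)%len(ring)].Node
		if !seen[n] {
			seen[n] = true
			out = append(out, n)
		}
	}
	return out
}
```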

What is a shard controller, and what role does it play in a sharded storage system?

Model Answer

A shard controller (or metadata service) maintains the mapping of key ranges to physical shard locations. Clients query the controller to find which shard to contact. It handles shard splits (when a shard is too large), merges, and rebalancing when nodes are added/removed. In MongoDB, this is mongos + config servers. In Vitess, it is the vtgate + topology service. The shard controller is a critical single-source-of-truth and must itself be highly available (replicated via Raft/Paxos) and consistent.
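A toy version of the controller's core lookup, mapping a key to the shard that owns its range; field names are illustrative, and real controllers add versioning, client-side caching, and split/merge logic on top:

```go
package controller

import "sort"

// RangeShard assigns one contiguous key range to a shard.
type RangeShard struct {
	StartKey string // inclusive lower bound of the range
	Shard    string // shard owning [StartKey, next StartKey)
}

// ShardFor assumes ranges is sorted by StartKey and that the first StartKey
// is "" (so every key is covered). It returns the last range whose
// StartKey <= key.
func ShardFor(ranges []RangeShard, key string) string {
	i := sort.Search(len(ranges), func(i int) bool { return ranges[i].StartKey > key })
	return ranges[i-1].Shard
}
```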

Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.

Common Mistakes

The top 5 mistakes builders make in this track — and exactly how to fix them.

Assigning keys to nodes with modulo hashing (key % n)

Why it happens

`key % n` changes for nearly all keys when n changes by 1. In a cluster of 10 nodes, removing one node remaps ~90% of keys.

The fix

Use consistent hashing with a virtual node ring. Adding/removing a physical node only remaps the keys on the adjacent arc of the ring, proportional to 1/n of the keyspace.
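A quick way to convince yourself of the numbers: hash 10,000 synthetic keys and count how many change slots when a 10-node cluster shrinks to 9 under modulo placement. FNV-1a is just an arbitrary stand-in hash for this demo:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// slot computes the modulo placement of a key across n nodes.
func slot(key string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % n
}

func main() {
	const total = 10000
	moved := 0
	for i := 0; i < total; i++ {
		k := fmt.Sprintf("key-%d", i)
		if slot(k, 10) != slot(k, 9) {
			moved++
		}
	}
	// Expect roughly 90% of keys to land on a different node.
	fmt.Printf("%d of %d keys remapped (~%.0f%%)\n", moved, total, 100*float64(moved)/total)
}
```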

Sharding without replication

Why it happens

Pure sharding without replication means each shard is owned by exactly one node. That node going down takes its entire keyspace offline.

The fix

Replicate each shard across R nodes (typically R=3). Reads and writes use a quorum (2 of 3). This is the approach used by Dynamo and Cassandra.
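A sketch of the quorum write path under that scheme: fan the write out to all replicas and succeed as soon as a majority acknowledge. The `Replica` interface is an abstract placeholder, not a real client API:

```go
package quorum

import "context"

// Replica is one copy of a shard that can accept writes.
type Replica interface {
	Put(ctx context.Context, key, value string) error
}

// QuorumWrite sends the write to every replica concurrently and returns
// true once a majority (e.g. 2 of 3) have acknowledged it.
func QuorumWrite(ctx context.Context, replicas []Replica, key, value string) bool {
	need := len(replicas)/2 + 1
	acks := make(chan bool, len(replicas)) // buffered so late repliers never block
	for _, r := range replicas {
		go func(r Replica) {
			acks <- r.Put(ctx, key, value) == nil
		}(r)
	}
	ok := 0
	for i := 0; i < len(replicas); i++ {
		if <-acks {
			ok++
			if ok >= need {
				return true
			}
		}
	}
	return false
}
```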

Assuming consistent hashing alone eliminates hotspots

Why it happens

If the key distribution is skewed (e.g., one key gets 90% of writes), consistent hashing still maps all that traffic to a single virtual node.

The fix

For write-heavy keys, shard at a sub-key level (e.g., by user + random suffix). For read-heavy keys, replicate aggressively and spread reads across replicas.

Not handling requests that land on the wrong node during a rebalance

Why it happens

During a rebalance, the node that receives a request may no longer own the key by the time it processes it.

The fix

When a node receives a request for a key it no longer owns, it should forward to the correct owner (or return a redirect hint) and the client library should retry once.
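A sketch of that handler logic, assuming each node keeps a current view of the shard map; the `Node` type, the `Owner` callback, and the redirect response shape are all illustrative:

```go
package router

import "fmt"

// Node is a storage node with a local view of key ownership.
type Node struct {
	ID    string
	Owner func(key string) string // current view of the shard map
	Store map[string]string
}

// Response either carries the value or tells the client where to retry.
type Response struct {
	Value      string
	RedirectTo string // set when this node no longer owns the key
}

func (n *Node) HandleGet(key string) (Response, error) {
	if owner := n.Owner(key); owner != n.ID {
		// Option A: proxy to the owner. Option B (shown): return a
		// redirect hint and let the client library retry once.
		return Response{RedirectTo: owner}, nil
	}
	v, ok := n.Store[key]
	if !ok {
		return Response{}, fmt.Errorf("key %q not found", key)
	}
	return Response{Value: v}, nil
}
```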

Keeping the shard map only in each node's memory

Why it happens

An in-memory shard map is lost on restart. If nodes independently re-derive the map, they may produce different results.

The fix

Store the shard map in a consistent, highly available metadata service (etcd, ZooKeeper, or the Maelstrom `lin-kv` service). All nodes read the same authoritative map.
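A sketch of what "read the same authoritative map" looks like, written against an abstract linearizable KV interface rather than a specific etcd/ZooKeeper/lin-kv client; the `KV` interface, the map's JSON layout, and the key name passed in are assumptions for illustration:

```go
package metadata

import (
	"context"
	"encoding/json"
)

// KV is whatever linearizable store holds the shard map.
type KV interface {
	Get(ctx context.Context, key string) (string, error)
	// CompareAndSwap lets the controller publish a new map atomically,
	// so concurrent updates cannot fork the mapping.
	CompareAndSwap(ctx context.Context, key, old, new string) error
}

// ShardMap is the single source of truth every node reads.
type ShardMap struct {
	Version int               `json:"version"`
	Owners  map[string]string `json:"owners"` // shard id -> node address
}

// LoadShardMap fetches and decodes the authoritative map; every node that
// calls this sees the same mapping instead of re-deriving its own.
func LoadShardMap(ctx context.Context, kv KV, key string) (ShardMap, error) {
	var m ShardMap
	raw, err := kv.Get(ctx, key)
	if err != nil {
		return m, err
	}
	err = json.Unmarshal([]byte(raw), &m)
	return m, err
}
```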

Comparison Mode

Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.

| Dimension | Range Sharding | Hash Sharding | Consistent Hashing |
| --- | --- | --- | --- |
| Key mapping | Key falls within a shard's range (e.g., A-M, N-Z) | hash(key) % n determines the shard | Key maps to a position on a ring; nearest clockwise shard owns it |
| Range query support | Excellent — contiguous keys are co-located | None — related keys scatter across shards | None by default |
| Rebalancing on add/remove | Split or merge affected ranges only | Remaps ~all keys (catastrophic) | Remaps only ~1/n of keys |
| Hotspot risk | High — sequential writes concentrate on one shard | Low — hash distributes uniformly | Low with virtual nodes |
| Metadata overhead | Shard map needed (small) | Just the formula | Ring with virtual nodes (larger) |
| Used in | HBase, Bigtable, Spanner | Naive setups, memcached (old) | DynamoDB, Cassandra, Riak |

Verdict: Use range sharding when range queries are critical. Consistent hashing for key-value stores that need smooth scaling. Avoid plain modulo hashing in production.

Concepts Covered

sharding, configuration, coordination, consistent hashing, key distribution, virtual nodes, atomic transition, migration, data transfer, consistency, sharded storage, routing, end-to-end, hash ring, key ownership, clockwise lookup, minimal disruption, vnodes, even distribution, load balancing, hash collision, node addition, key migration, predecessor takeover, node removal, graceful shutdown, crash recovery, key takeover, successor promotion, rendezvous hashing, highest random weight, HRW, consistent hashing alternative, weighted nodes, scatter-gather, query coordinator, partial results, timeout handling, fault tolerance, distributed aggregations, partial aggregates, COUNT, SUM, AVG, merge functions, algebraic properties, distributed joins, hash partitioning, co-located joins, shuffle joins, join reordering, network overhead, secondary indexes, global indexes, local indexes, index sharding, write amplification, distributed sorting, top-N query, merge sort, tie handling, pagination, consistent ordering

Prerequisites

It is recommended to complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.