Subtracks & Tasks
Raft Leader Election
Implement Node States (Leader, Follower, Candidate)
Implement the three states from Raft: Leader, Follower, and Candidate. Each node starts as a Follower. Candidates request votes. Leaders coordinate the cluster.
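A minimal sketch of the three roles and the transitions between them. The class and method names here are illustrative, not from any particular library:

```python
from enum import Enum

class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"

class Node:
    """Minimal Raft node skeleton (names are illustrative)."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.role = Role.FOLLOWER   # every node starts as a follower
        self.current_term = 0
        self.voted_for = None

    def become_candidate(self):
        # A follower whose election timer fires starts a new election.
        self.role = Role.CANDIDATE
        self.current_term += 1
        self.voted_for = self.node_id  # vote for itself

    def become_leader(self):
        self.role = Role.LEADER

    def step_down(self, term):
        # Seeing a higher term always demotes the node to follower.
        self.role = Role.FOLLOWER
        self.current_term = term
        self.voted_for = None
```

Note that every transition back to Follower clears `voted_for`: a vote belongs to a term, and a new term means a fresh ballot.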
Add Heartbeat Mechanism
Implement heartbeats from the leader to followers. The leader periodically sends AppendEntries (empty for now) to maintain authority. Followers that do not hear a heartbeat within their election timeout start a new election.
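One way to sketch the leader's heartbeat loop. The 50 ms interval and the `send_append_entries(peer, term, entries)` transport hook are assumptions for illustration:

```python
import threading

HEARTBEAT_INTERVAL = 0.05  # 50 ms, an assumed value; must be well below the election timeout

def start_heartbeats(node, peers, send_append_entries, stop_event):
    """Leader loop: periodically send empty AppendEntries to every follower.

    `send_append_entries(peer, term, entries)` is a hypothetical transport hook.
    """
    def loop():
        while not stop_event.wait(HEARTBEAT_INTERVAL):
            for peer in peers:
                # An empty entries list is a pure heartbeat; it still resets
                # the follower's election timer and asserts leadership.
                send_append_entries(peer, node.current_term, entries=[])
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

The key design constraint is that the heartbeat interval must be comfortably shorter than the minimum election timeout, or followers will start spurious elections.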
Implement Randomized Election Timeout
Add randomized election timeouts. When a follower does not hear from a leader within its timeout, it becomes a candidate. Randomization helps prevent split votes by making it unlikely that multiple nodes time out at the same moment.
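A small sketch of a restartable election timer, assuming the 150-300 ms range from the Raft paper. The `ElectionTimer` class is illustrative:

```python
import random
import threading

def new_election_timeout():
    # Re-sampled for every election round, so repeated split votes stay unlikely.
    return random.uniform(0.150, 0.300)  # 150-300 ms, as in the Raft paper

class ElectionTimer:
    """Restartable timer: call reset() on every valid message from the leader."""
    def __init__(self, on_timeout):
        self.on_timeout = on_timeout
        self._timer = None

    def reset(self):
        # Cancel the pending timer and pick a fresh random timeout.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(new_election_timeout(), self.on_timeout)
        self._timer.daemon = True
        self._timer.start()
```

As long as heartbeats arrive faster than 150 ms, `on_timeout` never fires; the moment they stop, exactly one node (usually) times out first and becomes a candidate.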
Handle RequestVote RPC
Implement the RequestVote RPC. Candidates request votes from other nodes. Nodes grant their vote if the candidate's term is current and they have not already voted for a different candidate in that term.
Prevent Split Votes Through Term Management
Handle split votes where no candidate receives a majority. Candidates increment their term and retry. Proper term management ensures the cluster eventually elects a leader.
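The retry loop can be sketched as follows. The `request_vote(peer, term)` RPC hook and the round structure are assumptions; a real node would also re-sample its election timeout between rounds:

```python
def run_election_round(node_id, term, request_vote, peers):
    """One candidate round: vote for self, ask peers, check for a majority.
    `request_vote(peer, term)` is a hypothetical RPC returning True/False."""
    votes = 1  # the candidate votes for itself
    for peer in peers:
        if request_vote(peer, term):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2  # strict majority

def elect(node_id, peers, request_vote, start_term, max_rounds=10):
    """Retry with a fresh term each round until some round yields a majority."""
    term = start_term
    for _ in range(max_rounds):
        term += 1  # each new candidacy increments the term
        if run_election_round(node_id, term, request_vote, peers):
            return term  # won: the higher term invalidates older split rounds
    return None
```

Because every retry carries a strictly higher term, any votes granted in an earlier split round are automatically void: a node that sees the new term clears its old vote.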
Interview Prep
Common interview questions for Distributed Systems / Infrastructure Engineer roles that map directly to what you build in this track. Click any question to reveal the model answer.
Model Answer
When a follower does not hear from a leader within its election timeout (randomized 150-300ms), it increments its term, becomes a candidate, and requests votes. A node grants a vote if: (1) the candidate's term is at least its current term, (2) it has not yet voted for a different candidate in this term, (3) the candidate's log is at least as up-to-date as its own. A candidate wins by getting votes from a majority (>N/2). Two leaders are prevented because majorities overlap — any two majorities of an N-node cluster share at least one node, which can only vote once per term.
Model Answer
The minority partition cannot make progress — it cannot form a quorum for any write. The majority partition elects a new leader if needed and continues serving writes. When the partition heals, the old leader (on the minority side) sees messages with the new, higher term and immediately steps down. Its uncommitted entries are overwritten by the new leader's log. Committed entries (replicated to majority) are safe — they will be in the new leader's log because the new leader must have gotten votes from a majority, which overlaps with the quorum that committed those entries.
Model Answer
Kubernetes becomes read-only. Existing workloads keep running (kubelet operates independently), but no new pods can be scheduled, no config changes can be made, and API-server writes fail with server errors. etcd requires a majority of nodes to be available for writes. With 3 etcd nodes, losing 2 means loss of quorum; with 5, you can lose 2. This is why many production clusters run 5 etcd nodes. Recovery requires restoring from a backup or rebuilding the etcd cluster, followed by reconciliation of the API server state.
Model Answer
Randomized timeouts break symmetry to prevent split votes. If all nodes had the same timeout, they would all start elections simultaneously, split votes, and no one would win — indefinite livelock. With randomized timeouts (e.g., 150-300ms), the first node to time out starts an election. If it sends vote requests quickly, it wins before others time out. Split votes are possible but increasingly unlikely per round because retries also use new random timeouts.
Model Answer
A log entry at index i is committed when the leader has replicated it to a majority of nodes (including itself). The leader sends AppendEntries RPCs to all followers. When a majority respond with success, the leader advances its commitIndex to i. On the next heartbeat (or the next AppendEntries), followers learn about the new commitIndex and apply the entry to their state machine. Note: a leader cannot commit entries from previous terms by counting replicas — it can only commit them indirectly by committing a new entry from the current term that comes after them (the term commitment rule).
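The leader-side commit rule, including the current-term restriction, can be sketched like this. The `match_index` map and 1-indexed log convention are assumptions for illustration:

```python
def advance_commit_index(match_index, current_term, log, commit_index):
    """Leader-side commit rule. `match_index` maps follower id -> highest
    index known to be replicated on that follower; `log` is conceptually
    1-indexed, so log[i - 1] is the entry at index i."""
    cluster_size = len(match_index) + 1  # followers plus the leader itself
    for i in range(len(log), commit_index, -1):      # try the highest index first
        replicas = 1 + sum(1 for m in match_index.values() if m >= i)
        if replicas > cluster_size // 2 and log[i - 1][0] == current_term:
            # Only entries from the current term may be committed by counting
            # replicas; older entries commit indirectly when a newer one does.
            return i
    return commit_index
```

Committing index `i` implicitly commits everything before it, which is why scanning from the top and returning the first qualifying index is enough.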
Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.
Common Mistakes
The top 5 mistakes builders make in this track — and exactly how to fix them. Click any mistake to see the root cause and the correct approach.
Why it happens
If all nodes use the same timeout, they all become candidates simultaneously, each voting for themselves, and nobody gets a majority.
The fix
Randomize the election timeout per node per election (e.g. 150-300 ms). The first node to time out becomes a candidate before others and usually wins.
Why it happens
The purpose of the heartbeat is to suppress follower elections. If the timer is not reset on each valid AppendEntries (even an empty one), the follower will time out.
The fix
Reset the election timer whenever you receive a valid RPC from the current leader — AppendEntries, heartbeats, or InstallSnapshot.
Why it happens
Raft's safety guarantee depends on the leader having all committed entries. A follower must refuse to vote for a candidate whose log is behind.
The fix
Before granting a vote, check that `candidate.lastLogTerm > voter.lastLogTerm`, or `candidate.lastLogTerm == voter.lastLogTerm && candidate.lastLogIndex >= voter.lastLogIndex`.
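The comparison reads more naturally as "terms first, indexes on a tie." A tiny sketch of that check, with illustrative parameter names:

```python
def candidate_log_up_to_date(cand_last_term, cand_last_index,
                             voter_last_term, voter_last_index):
    """Raft's up-to-date rule: compare the last entries' terms first,
    then fall back to log length (last index) when the terms tie."""
    if cand_last_term != voter_last_term:
        return cand_last_term > voter_last_term
    return cand_last_index >= voter_last_index
```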
Why it happens
Raft requires these two fields to survive crashes. An in-memory node that restarts forgets it already voted.
The fix
Write `currentTerm` and `votedFor` to durable storage (the Maelstrom `lin-kv` service works as a stand-in) before sending any response that depends on them.
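If you are persisting to a local file instead of `lin-kv`, a crash-safe write needs fsync plus an atomic rename. This sketch and its file format are assumptions, not part of any Maelstrom API:

```python
import json
import os
import tempfile

def persist_term_and_vote(path, current_term, voted_for):
    """Durably record currentTerm and votedFor before replying to any RPC.
    Write-to-temp-then-rename keeps the old file intact if we crash mid-write."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"current_term": current_term, "voted_for": voted_for}, f)
        f.flush()
        os.fsync(f.fileno())  # force the bytes to disk, not just the page cache
    os.replace(tmp, path)     # atomic replacement on POSIX

def load_term_and_vote(path):
    if not os.path.exists(path):
        return 0, None  # fresh node: term 0, no vote cast
    with open(path) as f:
        data = json.load(f)
    return data["current_term"], data["voted_for"]
```

The ordering matters: persist first, reply second. A reply sent before the fsync can promise a vote the node forgets after a crash.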
Why it happens
Any RPC — request or response — can carry a term T > currentTerm. A node that processes the message before updating currentTerm to T and clearing votedFor can reply with stale state, or grant a second vote once the new term arrives.
The fix
Make "update term and step down" the very first thing you do when receiving any RPC with a higher term, before processing the rest of the message.
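One way to enforce this is a wrapper that every handler passes through. The dict-based `state` and the `handler` callback are illustrative:

```python
def on_rpc(state, rpc_term, handler):
    """Run before ANY message-specific logic: adopt higher terms first.
    `state` is a dict holding current_term, voted_for, and role;
    `handler(state)` does the message-specific work afterwards."""
    if rpc_term > state["current_term"]:
        # Step down before touching the rest of the message.
        state["current_term"] = rpc_term
        state["voted_for"] = None
        state["role"] = "follower"
    return handler(state)
```

Routing every incoming message through one choke point like this makes the rule impossible to forget in an individual handler.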
Comparison Mode
Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track. Expand any comparison to see a detailed breakdown.
| Dimension | Raft | Paxos | Bully |
|---|---|---|---|
| Core idea | Randomized timeouts + term-based voting | Two-phase prepare/accept with ballot numbers | Highest-ID node always wins; lower nodes yield |
| Understandability | Designed for understandability | Notoriously hard to implement correctly | Simple to understand |
| Split-vote handling | Randomized timeouts avoid splits | Higher ballot preempts lower | Deterministic — no split possible |
| Election speed | One round-trip per candidate (usually fast) | Two round-trips minimum | O(n²) messages in worst case |
| Stable leader | Yes — leader sends heartbeats | Multi-Paxos adds leader leases | No — new election on every failure |
| Used in | etcd, CockroachDB, TiKV | Google Chubby; ZooKeeper's ZAB is a close relative | MongoDB (older versions), academic examples |
Concepts Covered
Prerequisites
It is recommended to complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.