ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/advanced
Advanced

5 tasks

Explore advanced distributed systems topics: MapReduce for batch processing, distributed hash tables, Byzantine fault tolerance, and real-time stream processing.

Subtracks & Tasks

Interview Prep

Common interview questions for Distributed Systems Engineer roles that map directly to what you build in this track.

Can you rely on physical clock timestamps to order events across machines in a distributed system?

Model Answer

Not reliably. Physical clocks drift and NTP corrections can cause non-monotonic updates. Clock skew of tens of milliseconds is normal. For events within the skew window, timestamps do not establish ordering. Use Lamport clocks (logical clocks) if you need causal ordering: if A causally precedes B, then L(A) < L(B). For tracking concurrent events, use vector clocks. For a global total order close to real time, use Hybrid Logical Clocks (max of wall clock and received HLC) as CockroachDB does, or TrueTime as Google Spanner does.

What is a Lamport clock, and what ordering guarantee does it provide?

Model Answer

A Lamport clock is a logical counter. Rule: on send, increment and attach. On receive, set to max(local, received) + 1. Guarantee: if A happens-before B, then L(A) < L(B). Does NOT guarantee the converse: if L(A) < L(B), A may or may not have happened before B (they might be concurrent). Lamport clocks establish a total order consistent with causality, but cannot distinguish concurrent events from causally ordered ones.
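The send/receive rules above fit in a few lines. A minimal Python sketch (class and method names are illustrative):

```python
class LamportClock:
    """Scalar logical clock: if A happens-before B, then L(A) < L(B)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event or send: increment, then attach the result to the message.
        self.time += 1
        return self.time

    def receive(self, received):
        # On receive: jump past both the local clock and the sender's timestamp.
        self.time = max(self.time, received) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.tick()           # A's send event: L = 1
t_recv = b.receive(t_send)  # B's receive: max(0, 1) + 1 = 2
assert t_send < t_recv      # causality is preserved in the timestamps
```

Note the converse still fails: two independent clocks can produce equal or ordered timestamps for concurrent events.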

How would you track causality between events across multiple services?

Model Answer

Options: (1) Vector clocks: each service maintains a vector of counters; causality is fully tracked but vectors are O(N) in size. (2) Hybrid Logical Clocks (HLC): each service tracks max(wallClock, lastSeenHLC). Clock values are bounded to wall time within the system's max clock skew. Events are ordered by HLC; concurrent events have indeterminate order. (3) Distributed tracing with causal metadata (propagate trace IDs and parent span IDs through RPC calls — this is what OpenTelemetry does). Option 2 is the practical choice for most systems; option 1 for systems requiring full causality tracking; option 3 for debugging.
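The HLC update rule from option 2 can be sketched in Python. This is a simplified illustration, not a production implementation; the `HLC` class and the injectable `now` parameter are assumptions made for determinism:

```python
import time

class HLC:
    """Hybrid Logical Clock sketch: timestamps are (wall, counter) pairs
    compared lexicographically; the wall component tracks the max physical
    time seen, so values stay within the system's clock-skew bound."""

    def __init__(self, now=time.time):
        self.now = now   # injectable physical clock (assumption, for testing)
        self.l = 0.0     # highest wall-clock component seen so far
        self.c = 0       # tie-breaking counter

    def send(self):
        pt = self.now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def recv(self, remote):
        rl, rc = remote
        new_l = max(self.l, rl, self.now())
        if new_l == self.l == rl:
            self.c = max(self.c, rc) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == rl:
            self.c = rc + 1
        else:            # physical clock is ahead of both: reset counter
            self.c = 0
        self.l = new_l
        return (self.l, self.c)

clk = HLC(now=lambda: 100.0)   # frozen clock for a deterministic example
t1 = clk.send()                # (100.0, 0)
t2 = clk.send()                # (100.0, 1): same wall time, counter breaks the tie
t3 = clk.recv((150.0, 3))     # remote is ahead: adopt its wall time, bump counter
assert t1 < t2 < t3            # lexicographic order respects causality
```

Because the counter only grows while physical time stands still, HLC values stay close to wall time while remaining monotonic across messages.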

What is TrueTime, and how does Spanner use it to provide external consistency?

Model Answer

TrueTime is Google's API for the current time, implemented with GPS receivers and atomic clocks in every datacenter. It returns an interval [earliest, latest] where the true current time is guaranteed to be. The uncertainty is typically 1-7ms. Spanner uses it for commit wait: after the leader decides a commit timestamp T, it waits until TrueTime.now().earliest > T before releasing the transaction. This ensures that any subsequent transaction starting anywhere will have a TrueTime range that begins after T, providing external consistency. No other production system has this — CockroachDB approximates it with bounded clock synchronization and conservative uncertainty estimates.

Logs aggregated from multiple services appear out of causal order. What is the root cause, and how would you fix it?

Model Answer

Root cause: physical timestamps from different machines are used to sort log entries, but the machines have clock skew. An event on a fast-clock machine at T may appear before an event on a slow-clock machine at T-5ms even though the slow-clock event causally preceded it. Fix: (1) Use logical timestamps (Lamport/vector clocks) instead of physical time for ordering events in logs. (2) Propagate a monotonic request ID through all service calls and sort by causal chain rather than time. (3) Use a centralized log aggregation system (e.g., Kafka) where producers attach their HLC timestamp, and the aggregator uses the highest received timestamp as a lower bound for ordering.

Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.

Common Mistakes

The top 5 mistakes builders make in this track — and exactly how to fix them.

Mistake: ordering distributed events by wall-clock timestamps

Why it happens

NTP-synchronised clocks still drift by tens of milliseconds. Two events that occur "at the same time" on different nodes have no reliable wall-clock ordering.

The fix

Use Lamport timestamps or vector clocks for causal ordering. Only use wall clocks for human-readable logs, not for ordering distributed events.

Mistake: simulating TrueTime with a zero-width uncertainty interval

Why it happens

Spanner's commit-wait depends on the TrueTime uncertainty interval (ε). If you simulate TrueTime with `time.now()` and zero uncertainty, you break the invariant that commit timestamps are truly in the past.

The fix

Model the uncertainty interval explicitly. A transaction's commit timestamp must be greater than `TT.now().latest` — commit, then wait until `TT.now().earliest > commit_ts` before releasing.
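One way to model this in a simulation, as a Python sketch (the `EPSILON` bound, `tt_now`, and `commit` helpers are illustrative, not Spanner's actual API):

```python
import random
import time

EPSILON = 0.007  # assumed max clock uncertainty (7 ms); illustrative value

def tt_now():
    """Simulated TrueTime: returns (earliest, latest), an interval
    guaranteed to contain the true current time."""
    err = random.uniform(0, EPSILON)
    t = time.time()
    return (t - err, t + err)

def commit(apply_txn):
    """Commit-wait sketch: pick a timestamp past the uncertainty window,
    then wait until that timestamp is definitely in the past."""
    _, latest = tt_now()
    commit_ts = latest               # a timestamp not yet in anyone's past
    apply_txn(commit_ts)
    while tt_now()[0] <= commit_ts:  # wait until earliest > commit_ts
        time.sleep(0.001)
    return commit_ts                 # only now release locks / reply to client
```

With `EPSILON = 0`, the wait loop never runs and the invariant silently disappears, which is exactly the mistake described above.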

Mistake: treating a Prepare ACK as a promise that outlives its ballot

Why it happens

A Prepare ACK is a promise only for the specific ballot number used in that Prepare. A higher-ballot Prepare from another proposer supersedes it.

The fix

Acceptors must track the highest ballot they have promised (max_prepared). A Promise is only valid for the exact ballot of the Prepare that triggered it. Any higher ballot invalidates it.
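A minimal single-slot acceptor illustrating the rule (`max_prepared` follows the text; the message tuples are illustrative):

```python
class Acceptor:
    """Single-slot Paxos acceptor: a Promise binds only the exact ballot
    it was issued for, and any higher ballot supersedes it."""

    def __init__(self):
        self.max_prepared = -1   # highest ballot this acceptor has promised
        self.accepted = None     # (ballot, value) of the last accepted proposal

    def on_prepare(self, ballot):
        if ballot > self.max_prepared:
            self.max_prepared = ballot
            return ("promise", ballot, self.accepted)
        return ("nack", self.max_prepared, None)

    def on_accept(self, ballot, value):
        # Accept only if no higher ballot has been promised in the meantime.
        if ballot >= self.max_prepared:
            self.max_prepared = ballot
            self.accepted = (ballot, value)
            return ("accepted", ballot)
        return ("nack", self.max_prepared)

a = Acceptor()
assert a.on_prepare(1)[0] == "promise"
assert a.on_prepare(5)[0] == "promise"   # higher ballot supersedes ballot 1
assert a.on_accept(1, "x")[0] == "nack"  # the promise for ballot 1 is now void
assert a.on_accept(5, "y")[0] == "accepted"
```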

Mistake: keeping accepted Paxos state only in memory

Why it happens

If an acceptor forgets what it accepted in a slot, a new leader may choose a different value for that slot, breaking the "chosen value never changes" invariant.

The fix

Acceptors must persist (ballot, accepted_value) per slot to stable storage before sending the Accepted message. Recovering nodes must replay from persistent state.
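A sketch of persisting before acknowledging (the `DurableAcceptor` name and JSON file format are illustrative; a production acceptor would use a write-ahead log rather than rewriting a snapshot file):

```python
import json
import os

class DurableAcceptor:
    """Sketch: fsync (ballot, value) per slot before sending Accepted."""

    def __init__(self, path):
        self.path = path
        self.state = {}  # slot -> {"ballot": b, "value": v}
        if os.path.exists(path):   # crash recovery: replay persisted state
            with open(path) as f:
                self.state = json.load(f)

    def accept(self, slot, ballot, value):
        self.state[str(slot)] = {"ballot": ballot, "value": value}
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())   # durable before we acknowledge
        os.replace(tmp, self.path)  # atomic rename: no torn state file
        return ("accepted", slot, ballot)
```

The ordering is the point: the `Accepted` reply is returned only after `fsync` succeeds, so a crash-and-restart acceptor can never deny a value it already told the proposer it accepted.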

Mistake: letting a new Raft leader commit prior-term entries by counting replicas

Why it happens

Raft requires a leader to commit at least one entry from its own term before committing entries from prior terms. Without committing a no-op, the leader cannot safely advance `commitIndex` for old entries.

The fix

Immediately after election, append and replicate a no-op log entry in the new term. Once it commits, all previous entries are also safe to apply.
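The commit rule this fixes can be sketched as follows (the function shape is illustrative, not the Raft paper's exact pseudocode; log indices are 1-based as in Raft):

```python
def advance_commit_index(log, match_index, current_term, commit_index, quorum):
    """Raft commit rule sketch: a leader may only commit an entry by
    counting replicas if that entry is from its own term; earlier
    entries then commit transitively."""
    for n in range(len(log), commit_index, -1):   # try the highest index first
        replicated = sum(1 for m in match_index if m >= n)
        if replicated >= quorum and log[n - 1]["term"] == current_term:
            return n   # committing n also commits everything before it
    return commit_index

# A new term-3 leader whose only replicated entry is from term 2 cannot
# advance commitIndex, even though a majority stores the entry:
log = [{"term": 2, "cmd": "x"}]
assert advance_commit_index(log, [1, 1, 0], 3, 0, 2) == 0

# Appending and replicating a term-3 no-op commits it AND the old entry:
log.append({"term": 3, "cmd": None})   # the no-op
assert advance_commit_index(log, [2, 2, 0], 3, 0, 2) == 2
```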

Comparison Mode

Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.

| Dimension | Lamport Timestamps | Vector Clocks | TrueTime |
| --- | --- | --- | --- |
| What it captures | Happens-before partial order (scalar) | Happens-before partial order (full causality) | Real-time bounds with uncertainty interval |
| Size | O(1) — one integer | O(n) — one integer per node | O(1) — earliest/latest interval |
| Detects concurrency | No — concurrent events may appear ordered | Yes — concurrent if neither dominates | No — uses real-time intervals |
| Requires special hardware | No | No | Yes — GPS + atomic clocks (Google-specific) |
| Commit-wait overhead | None | None | Yes — must wait out uncertainty interval |
| Used in | Multi-master DBs, event logs | Dynamo, Riak, CRDTs | Google Spanner only |

Verdict: Lamport clocks for simple total ordering. Vector clocks when you need to detect true concurrency. TrueTime is Google-specific infrastructure — simulate it with bounded uncertainty in academic implementations.
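To see why vector clocks detect concurrency where Lamport scalars cannot, a small comparison helper (illustrative; assumes both vectors have the same length, one slot per node):

```python
def vc_compare(a, b):
    """Compare two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    less = any(x < y for x, y in zip(a, b))   # b dominates somewhere
    more = any(x > y for x, y in zip(a, b))   # a dominates somewhere
    if less and more:
        return "concurrent"   # neither dominates: causally unrelated
    if less:
        return "before"
    if more:
        return "after"
    return "equal"

assert vc_compare([1, 0, 0], [1, 1, 0]) == "before"
# A Lamport scalar would collapse these to 2 vs 1 and impose a false order:
assert vc_compare([2, 0, 0], [0, 1, 0]) == "concurrent"
```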
| Dimension | Paxos | Raft | EPaxos |
| --- | --- | --- | --- |
| Leader required | No (any node can propose) | Yes (strict leader) | No (any replica can commit) |
| Throughput bottleneck | Single distinguished proposer in Multi-Paxos | Leader is the bottleneck for all writes | Commutative commands bypass coordination |
| Latency for WAN | 2 round-trips | 2 round-trips to leader | 1 round-trip for non-conflicting commands |
| Complexity | High (many edge cases) | Medium (designed for understandability) | Very high (dependency tracking) |
| Conflict handling | N/A (total order) | N/A (total order via leader) | Explicit — conflicting commands ordered; commutative ones run in parallel |
| Used in | Google Chubby, older ZooKeeper | etcd, CockroachDB, TiKV | Research; not yet widely productionised |

Verdict: Raft is the pragmatic choice. EPaxos is theoretically superior for geo-distributed writes but far harder to implement and debug.

Concepts Covered

MapReduce, batch processing, word count, DHT, Chord, finger table, Byzantine, PBFT, f faults, streaming, windowing, exactly-once, CRDT, eventual consistency, conflict-free

Prerequisites

It is recommended to complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.