ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/coordinator
09

The Coordinator

Advanced|15 tasks

Implement distributed transactions with two-phase commit, three-phase commit, and saga patterns.

Subtracks & Tasks

Interview Prep

Common interview questions for Distributed Systems Engineer roles that map directly to what you build in this track.

Model Answer

In 2PC, a coordinator sends PREPARE to all participants; each votes YES (after durably logging the tentative commit) or NO. If all vote YES, the coordinator sends COMMIT; otherwise it sends ABORT. Blocking problem: if the coordinator crashes after collecting YES votes but before sending COMMIT, participants are stuck holding locks indefinitely. They cannot unilaterally abort, because the coordinator may already have decided to commit, and they cannot unilaterally commit, because another participant may have voted NO. Production solutions: (1) Replicate the coordinator using Raft so another node takes over, (2) Store coordinator decisions in a shared store participants can read (Percolator/Spanner approach), (3) Use timeouts with careful retry logic and human intervention for stuck transactions.
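The message flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production coordinator: the `Participant` class, its `will_vote_yes` flag, and the `two_phase_commit` function are hypothetical names, and nothing is logged durably or sent over a network.

```python
from enum import Enum


class Vote(Enum):
    YES = "YES"
    NO = "NO"


class Participant:
    """In-memory stand-in for a resource manager (hypothetical)."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "INIT"

    def prepare(self, txn_id):
        # A real participant would durably log the tentative commit here.
        if self.will_vote_yes:
            self.state = "PREPARED"
            return Vote.YES
        self.state = "ABORTED"
        return Vote.NO

    def commit(self, txn_id):
        self.state = "COMMITTED"

    def abort(self, txn_id):
        self.state = "ABORTED"


def two_phase_commit(txn_id, participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(txn_id) for p in participants]
    decision = "COMMIT" if all(v is Vote.YES for v in votes) else "ABORT"
    # A coordinator crash at this point is the blocking window:
    # prepared participants cannot decide on their own.
    # Phase 2: broadcast the decision.
    for p in participants:
        if decision == "COMMIT":
            p.commit(txn_id)
        else:
            p.abort(txn_id)
    return decision
```

Calling `two_phase_commit("t1", [Participant("a"), Participant("b", will_vote_yes=False)])` returns `"ABORT"` and leaves both participants in the `ABORTED` state.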

Model Answer

A distributed transaction (2PC-based) provides atomicity: all participants commit or none do. Intermediate states are not visible. It requires coordination and can block on failures. A saga is a sequence of local transactions, each with a compensating transaction. If step k fails, the compensating transactions for steps k-1 down to 1 are executed in reverse order. Intermediate states are visible. There is no coordinator blocking, because each step commits locally and independently. Sagas are preferred in microservices architectures because they avoid distributed locking and work across services with different storage systems. Tradeoff: sagas do not provide isolation, so concurrent sagas can interfere with each other.
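A saga's failure handling can be illustrated with a small runner. This is a sketch under simplifying assumptions: `run_saga` is a hypothetical helper, steps are plain Python callables, and compensations are assumed never to fail. It undoes completed steps in reverse order, which is the usual convention.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of the already-completed
    steps run in reverse order and the function returns False.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


if __name__ == "__main__":
    log = []

    def ship():
        raise RuntimeError("carrier unavailable")

    ok = run_saga([
        (lambda: log.append("reserve"), lambda: log.append("release")),
        (lambda: log.append("charge"), lambda: log.append("refund")),
        (ship, lambda: log.append("cancel shipment")),
    ])
    print(ok, log)
```

Note that between `charge` and `refund` the charge is visible to other readers, which is the isolation gap mentioned above.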

Model Answer

Options by consistency requirement: (1) 2PC with a coordinator (strong atomicity, blocking on coordinator failure), (2) Saga: debit A, then credit B; if the credit fails, execute a compensating transaction to re-credit A (eventual consistency, intermediate state visible), (3) Outbox pattern: write the debit and a "pending credit" event in the same DB transaction on A's side; a separate process reads the outbox and applies the credit to B, retrying until success (at-least-once delivery). For financial systems, option 3 with idempotency keys is common because it avoids distributed coordination while ensuring eventual consistency, and the keys make retried credits harmless.
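The outbox option can be sketched with two SQLite databases standing in for the two services. The table layout, the column name `idem_key`, and the function names are assumptions made for illustration; a real relay would run as a background process and handle delivery errors.

```python
import sqlite3


def open_bank_a():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    db.execute("CREATE TABLE outbox (idem_key TEXT PRIMARY KEY, dest TEXT, "
               "amount INTEGER, delivered INTEGER DEFAULT 0)")
    return db


def open_bank_b():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    db.execute("CREATE TABLE applied (idem_key TEXT PRIMARY KEY)")
    return db


def debit_with_outbox(db_a, src, dest, amount, idem_key):
    # The debit and the outbox row commit atomically in one local transaction.
    with db_a:
        db_a.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        db_a.execute("INSERT INTO outbox (idem_key, dest, amount) VALUES (?, ?, ?)",
                     (idem_key, dest, amount))


def apply_credit(db_b, dest, amount, idem_key):
    # Idempotent: a replayed key is ignored, so at-least-once delivery is safe.
    with db_b:
        cur = db_b.execute("INSERT OR IGNORE INTO applied (idem_key) VALUES (?)",
                           (idem_key,))
        if cur.rowcount == 1:
            db_b.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dest))


def relay_once(db_a, db_b):
    # Deliver every undelivered outbox row, then mark it delivered.
    rows = db_a.execute(
        "SELECT idem_key, dest, amount FROM outbox WHERE delivered = 0").fetchall()
    for idem_key, dest, amount in rows:
        apply_credit(db_b, dest, amount, idem_key)
        with db_a:
            db_a.execute("UPDATE outbox SET delivered = 1 WHERE idem_key = ?",
                         (idem_key,))
```

If the relay crashes after `apply_credit` but before marking the row delivered, the next run resends the same key and B ignores it, which is why the idempotency key matters.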

Model Answer

XA is a standard interface for distributed transactions, supported by most relational databases (Postgres, MySQL, Oracle) and message brokers. An XA transaction manager acts as the coordinator and runs 2PC across multiple XA-capable resources. It is still used in enterprise Java applications (JTA/JTS) and mainframe-style architectures. Modern distributed systems generally avoid it for three reasons: high latency (multiple round-trips per transaction), the coordinator as a single point of failure, and poor performance at scale. Cloud-native systems prefer sagas, or application-level 2PC driven by a Raft-replicated coordinator.

Model Answer

Spanner runs 2PC across shards, where each shard is a Paxos group rather than a single node. The 2PC coordinator is itself a Paxos leader, so it is highly available and the failure of one node does not leave participants blocked. TrueTime exposes bounded clock uncertainty. Spanner assigns each commit a timestamp and then waits out the TrueTime uncertainty interval (commit wait) before making the commit visible, so the timestamp is guaranteed to be in the past on every clock. Any transaction that starts after transaction T commits therefore sees T's writes, which provides external consistency without a global sequencer. CockroachDB follows a similar design, with Raft per range and Hybrid Logical Clocks instead of TrueTime; lacking TrueTime's tight bounds, it relies on a configured maximum clock offset.
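Commit wait can be illustrated with a simulated clock. `FakeTrueTime` and `commit_with_wait` are hypothetical names, time is counted in integer milliseconds, and the timestamp choice is simplified compared with what Spanner actually does; the sketch only shows why the wait is roughly twice the uncertainty bound.

```python
from dataclasses import dataclass


@dataclass
class TTInterval:
    earliest: int
    latest: int


class FakeTrueTime:
    """Simulated clock with a fixed uncertainty bound in ms (hypothetical API)."""

    def __init__(self, epsilon_ms):
        self.epsilon_ms = epsilon_ms
        self.true_ms = 0

    def now(self):
        return TTInterval(self.true_ms - self.epsilon_ms,
                          self.true_ms + self.epsilon_ms)

    def advance(self, ms):
        self.true_ms += ms


def commit_with_wait(tt, tick_ms=1):
    # Pick a timestamp no earlier than any clock could currently read.
    commit_ts = tt.now().latest
    # Commit wait: hold the result until commit_ts is definitely in the past.
    waited_ms = 0
    while tt.now().earliest <= commit_ts:
        tt.advance(tick_ms)  # a real system would sleep here
        waited_ms += tick_ms
    return commit_ts, waited_ms
```

With an uncertainty bound of 4 ms, `commit_with_wait(FakeTrueTime(4))` waits 9 ms in this simulation, just over twice the bound.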

Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.

Common Mistakes

The top 5 mistakes builders make in this track, and exactly how to fix them.

Why it happens

If the coordinator crashes after sending some Prepare messages but before collecting all votes, and it kept no durable record of the transaction, no one can learn the global outcome. Participants that already voted YES are blocked.

The fix

Log the transaction and its state (PREPARED, COMMITTED, ABORTED) to durable storage before sending Phase 2 messages. On recovery, replay from the log.
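One way to picture this fix is an append-only log that is forced to disk before any Phase 2 message goes out. The JSON-lines format and the names `DecisionLog` and `recover` are assumptions for illustration; the state names come from the text above.

```python
import json
import os


class DecisionLog:
    """Append-only, fsynced log of transaction states (hypothetical format)."""

    def __init__(self, path):
        self.path = path

    def append(self, txn_id, state):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"txn": txn_id, "state": state}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force to disk before Phase 2 is sent

    def replay(self):
        # The last record for each transaction wins.
        states = {}
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    rec = json.loads(line)
                    states[rec["txn"]] = rec["state"]
        return states


def recover(log):
    """Return the action to take for each logged transaction after a restart."""
    actions = {}
    for txn_id, state in log.replay().items():
        if state == "COMMITTED":
            actions[txn_id] = "resend COMMIT"
        else:
            # PREPARED without a logged decision, or ABORTED: no participant
            # can have been told to commit, so aborting is safe.
            actions[txn_id] = "resend ABORT"
    return actions
```

Aborting a transaction found in the `PREPARED` state is only safe because the commit decision is always logged before the first COMMIT message is sent.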

Why it happens

2PC requires locks to be held from Phase 1 until Phase 2 completes. Network latency turns lock hold time from microseconds to hundreds of milliseconds.

The fix

Minimise the Prepare phase — do all local validation quickly and send the Phase 2 abort/commit message as fast as possible. Use read-committed isolation when strict serializability is not required.

Why it happens

Without a timeout, the coordinator waits forever for a Prepare or Commit ACK that will never arrive.

The fix

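Set a deadline for each RPC. If a Prepare times out, assume the participant failed and abort the transaction. Log the abort so a recovered participant can query the outcome. If a Commit ACK times out, the decision is already final, so keep resending COMMIT until the participant acknowledges it rather than aborting.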
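A deadline on the Prepare phase might look like the following sketch. The function `collect_votes` and the shape of `prepare_calls` are hypothetical stand-ins for real RPCs, and a real coordinator would also log the resulting decision durably.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def collect_votes(prepare_calls, deadline_s):
    """Run each prepare call under a shared deadline; abort on timeout or NO.

    prepare_calls maps a participant name to a zero-argument callable that
    returns True for YES. Both are hypothetical stand-ins for real RPCs.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(prepare_calls)))
    futures = {name: pool.submit(call) for name, call in prepare_calls.items()}
    expires = time.monotonic() + deadline_s
    decision = "COMMIT"
    for name, fut in futures.items():
        remaining = max(0.0, expires - time.monotonic())
        try:
            if not fut.result(timeout=remaining):
                decision = "ABORT"  # explicit NO vote
        except FutureTimeout:
            decision = "ABORT"  # silence is treated as a failed participant
        except Exception:
            decision = "ABORT"  # the call itself raised
    pool.shutdown(wait=False)
    return decision
```

The deadline is shared across all participants, so one slow participant cannot stretch the total wait beyond `deadline_s`.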

Why it happens

If a participant has no record of the transaction (it was never delivered the Prepare), it must abort — not commit.

The fix

Participants must only respond PREPARED if they successfully executed and persisted the Prepare phase locally. An unknown transaction ID always yields ABORT.
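The participant-side rule can be sketched as follows. `ParticipantStore` and its methods are hypothetical, and the dictionaries stand in for a durable log; the point is that PREPARED is returned only after the staged writes are recorded, and that an unknown transaction ID yields ABORT.

```python
class ParticipantStore:
    """Tracks which transactions this participant has prepared."""

    def __init__(self):
        self.prepared = {}   # txn_id -> staged writes (stands in for a durable log)
        self.outcomes = {}   # txn_id -> "COMMITTED" or "ABORT"

    def on_prepare(self, txn_id, writes, validate):
        if not validate(writes):
            self.outcomes[txn_id] = "ABORT"
            return "ABORT"
        # Persist first, then answer: PREPARED is a promise to commit later.
        self.prepared[txn_id] = writes
        return "PREPARED"

    def vote_status(self, txn_id):
        # Answer a coordinator query, for example after it recovers.
        if txn_id in self.outcomes:
            return self.outcomes[txn_id]
        return "PREPARED" if txn_id in self.prepared else "ABORT"

    def on_commit(self, txn_id):
        if txn_id in self.outcomes:
            return self.outcomes[txn_id]   # duplicate message: same answer
        if txn_id not in self.prepared:
            self.outcomes[txn_id] = "ABORT"  # never prepared here: refuse
            return "ABORT"
        del self.prepared[txn_id]
        self.outcomes[txn_id] = "COMMITTED"
        return "COMMITTED"
```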

Why it happens

Transaction IDs generated with a simple counter can collide if the coordinator restarts and reuses IDs from before the crash.

The fix

Generate globally unique transaction IDs (UUIDs or a coordinator-epoch + sequence). Include the coordinator epoch so post-crash IDs are always fresh.
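The epoch-plus-sequence scheme can be sketched like this. The file format and the `TxnIdGenerator` name are assumptions; what matters is that the epoch is incremented and persisted before any ID is handed out, so a restarted coordinator never reuses an old ID.

```python
import json
import os


class TxnIdGenerator:
    """IDs of the form '<epoch>-<seq>'; the epoch is bumped on every start."""

    def __init__(self, epoch_path):
        epoch = 0
        if os.path.exists(epoch_path):
            with open(epoch_path, encoding="utf-8") as f:
                epoch = json.load(f)["epoch"]
        self.epoch = epoch + 1
        # Persist the new epoch before handing out any ID.
        with open(epoch_path, "w", encoding="utf-8") as f:
            json.dump({"epoch": self.epoch}, f)
            f.flush()
            os.fsync(f.fileno())
        self.seq = 0

    def next_id(self):
        self.seq += 1
        return f"{self.epoch}-{self.seq}"
```

The in-memory sequence counter may restart from zero after a crash, because the new epoch already makes every post-crash ID distinct from earlier ones.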

Comparison Mode

Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.

Dimension | 2PC | Saga | TCC
Atomicity | Atomic: all commit or all abort | Not atomic: uses compensating transactions | Atomic at the application level
Blocking on failure | Yes: blocks if coordinator crashes in Phase 2 | No: each step is independent | Partial: Try phase reserves resources
Latency | High: two network round-trips minimum | Low: each step fires and forgets | Medium: three phases but async-friendly
Isolation | Full isolation (locks held across phases) | None: intermediate states are visible | Partial: reserved but uncommitted state visible
Rollback complexity | Automatic abort if any participant votes NO | Must write explicit compensating transactions | Cancel phase must be idempotent
Used in | Relational databases (XA), some NoSQL | Microservices (Kafka choreography, Conductor) | Payment systems, inventory reservation

Verdict: 2PC for strict atomicity within a trust boundary. Saga for long-running workflows across microservices. TCC when you need business-level atomicity with explicit reservation semantics.
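The reservation semantics of TCC (Try, Confirm, Cancel) can be illustrated with an inventory example. The `Inventory` class and its method names are hypothetical; the sketch shows the Try phase setting stock aside and a Cancel phase that is safe to repeat.

```python
class Inventory:
    """Single-resource TCC participant (hypothetical, in memory)."""

    def __init__(self, stock):
        self.available = stock
        self.reserved = {}  # reservation_id -> quantity

    def try_reserve(self, rid, qty):
        if rid in self.reserved:
            return True  # retry of the same Try
        if qty > self.available:
            return False
        self.available -= qty
        self.reserved[rid] = qty
        return True

    def confirm(self, rid):
        # Reserved stock is consumed; repeating the call is a no-op.
        self.reserved.pop(rid, None)

    def cancel(self, rid):
        # Idempotent: stock is returned only the first time.
        self.available += self.reserved.pop(rid, 0)
```

Between `try_reserve` and `confirm`, other callers see the reduced `available` count, which is the partial isolation noted in the comparison.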

Concepts Covered

2PC, atomic commit, prepare-commit, failure recovery, blocking, write-ahead log, 3PC, non-blocking, pre-commit, saga, compensation, eventual consistency, transactions, ACID, isolation, three-phase commit, CanCommit, PreCommit, DoCommit, non-blocking commit, coordinator recovery, blocking vs non-blocking, recovery procedures, timeout handling, coordinator failure, participant uncertainty, network partition, blocking scenarios, split brain, safety vs liveness, CAP theorem, protocol comparison, message complexity, real-world usage, performance trade-offs, Paxos commit, consensus-based commit, no single point of failure, acceptors, proposers, learners, saga pattern, compensating transactions, local transactions, rollback, long-running transactions, choreography, event-driven architecture, service coordination, no central orchestrator, event publishing, orchestration, saga orchestrator, central coordinator, state machine, command patterns, idempotency, deduplication, exactly-once semantics, message retries, saga_id + step_id, e-commerce saga, inventory reservation, payment processing, shipment creation, real-world saga

Prerequisites

It is recommended to complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.