ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/projects/mini-spanner
Level: expert · 4 tracks · 7 tasks · ~22 hours estimated

Build Mini-Spanner

Build a globally consistent distributed database from scratch. Implement Paxos consensus, TrueTime external consistency, two-phase commit across shards, MVCC snapshot reads, and a SQL query router.

Tags: consensus, paxos, transactions, mvcc, sql, truetime, spanner
[Architecture diagram: Google Spanner across three zones (Zone A: US-East, Zone B: EU-West, Zone C: Asia-Pacific). Each zone runs two spanservers with local MVCC storage and GPS/atomic clocks; the zones form a Paxos group with a leader in Zone A and followers elsewhere, connected by Paxos replication. A TrueTime service backed by GPS and atomic clocks keeps the global uncertainty epsilon below 7 ms. An F1 SQL client sends queries to a 2PC coordinator in its zone, which issues PREPARE/COMMIT across shards; external consistency comes from commit wait.]

Google Spanner

Spanner is Google's globally distributed relational database, deployed across dozens of datacenters on every continent. It was the first production system to provide both external consistency (a guarantee stronger than serializability) and full SQL semantics at global scale. Introduced in the OSDI 2012 paper by Corbett et al., Spanner became publicly available as Cloud Spanner in 2017 and now underpins Google Ads, Google Play, YouTube, and Google's financial infrastructure.

What You Will Build

You will implement the core layers of Spanner from first principles (a sketch of how the layers fit together follows this list):

  1. Paxos Consensus — the replication backbone. Each shard is a Paxos group of 5 replicas spread across datacenters. Implement Phase 1 (Prepare/Promise) and Phase 2 (Accept/Accepted) to reach agreement on a value even when some replicas fail.
  2. TrueTime — Spanner's bounded-uncertainty clock API. Rather than pretending clocks are synchronized, TrueTime exposes the uncertainty window [earliest, latest] and derives external consistency through commit wait.
  3. Distributed Transactions — two-phase commit coordinated across Paxos groups, combined with MVCC storage. Read-write transactions acquire locks and commit atomically; snapshot reads are lock-free, served from any replica.
  4. SQL Engine — a query router that maps SQL INSERT and SELECT statements to the correct shard by key range, and executes cross-shard range scans by merging ordered results.
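
To make the layering concrete, here is an illustrative Python sketch of how the four pieces compose. It is not the project's actual scaffold; names like Spanserver, PaxosGroup, and route_key are assumptions made for the example.

    from dataclasses import dataclass, field

    @dataclass
    class PaxosGroup:
        """Five replicas per shard; a write commits once a majority accepts."""
        replicas: list
        leader: int = 0

    @dataclass
    class Spanserver:
        """One shard replica: a Paxos participant plus local MVCC storage."""
        shard_id: int
        paxos: PaxosGroup                            # replicates this shard's writes
        storage: dict = field(default_factory=dict)  # key -> [(timestamp, value)]

    def route_key(key, shard_starts):
        """SQL-layer routing: given a sorted list of (start_key, shard_id)
        pairs, return the shard whose key range contains `key`."""
        owner = shard_starts[0][1]
        for start, shard_id in shard_starts:
            if key >= start:
                owner = shard_id
            else:
                break
        return owner

Each track below fills in one of these stubs.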

The Central Idea

Spanner's key insight is that clock uncertainty can be made explicit and then reasoned about precisely. By forcing transactions to wait out their own uncertainty window before committing (commit wait), Spanner guarantees that commit timestamps reflect real-world causal order — without any global clock synchronization. This makes distributed snapshot reads straightforward: pick a timestamp, wait for replicas to catch up, and read without locks.
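
Commit wait itself is only a few lines once the TrueTime interval is exposed. The sketch below assumes a `tt` object whose now() returns an (earliest, latest) pair in seconds; the protocol, picking s = now().latest and holding the acknowledgment until now().earliest has passed s, is the one described in the Spanner paper.

    import time

    def commit(tt, replicate_fn):
        """Commit a transaction with commit wait.
        Assumes tt.now() -> (earliest, latest); replicate_fn applies
        the write through Paxos at the chosen timestamp."""
        _, s = tt.now()            # commit timestamp: the latest bound of now
        replicate_fn(s)            # replicate the write at timestamp s
        while tt.now()[0] <= s:    # commit wait: until s is definitely past
            time.sleep(0.001)      # i.e. until TT.after(s) holds
        return s                   # only now acknowledge the client

Since s sits a full epsilon above the true time and the wait ends only when the earliest bound clears s, the wait costs roughly twice epsilon, which is why Spanner invests so heavily in keeping epsilon small.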

Prerequisites

Comfort with Python or Go. Familiarity with basic consensus ideas (voting, quorums) is helpful. You do not need prior knowledge of Paxos or distributed transactions — this project builds up from first principles.

Tracks

Paxos Consensus

2 tasks

Implement the Paxos consensus protocol that replicates each Spanner shard. Paxos Phase 1 collects promises and recovers any previously accepted value; Phase 2 broadcasts the value and commits once a majority accepts.
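
For reference, a minimal single-decree acceptor; the proposer side, networking, and durable persistence are the actual work of the track, and the field and message names here are illustrative.

    class Acceptor:
        """Single-decree Paxos acceptor: durable state is three fields."""
        def __init__(self):
            self.promised = -1         # highest ballot we promised to honor
            self.accepted_ballot = -1  # ballot of the last value we accepted
            self.accepted_value = None

        def prepare(self, ballot):
            """Phase 1: promise to ignore smaller ballots; report any
            previously accepted value so the proposer can recover it."""
            if ballot > self.promised:
                self.promised = ballot
                return ("promise", self.accepted_ballot, self.accepted_value)
            return ("nack", self.promised, None)

        def accept(self, ballot, value):
            """Phase 2: accept the value unless we promised a higher ballot."""
            if ballot >= self.promised:
                self.promised = ballot
                self.accepted_ballot = ballot
                self.accepted_value = value
                return ("accepted", ballot)
            return ("nack", self.promised)

The invariant to preserve: once a majority has accepted a value at some ballot, every higher-ballot proposer learns that value in Phase 1 and must re-propose it.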


TrueTime

2 tasks

Build Spanner's TrueTime API: return bounded uncertainty intervals, implement TT.after and TT.before predicates, enforce commit wait to guarantee external consistency, and track safe time for linearizable reads.
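
One way to shape the API, assuming a fixed uncertainty bound rather than real clock hardware (the production system derives epsilon from GPS and atomic-clock references and the time since the last sync):

    import time

    class TrueTime:
        """Bounded-uncertainty clock with a fixed, assumed epsilon."""
        def __init__(self, epsilon=0.007):   # 7 ms, as in the diagram above
            self.epsilon = epsilon

        def now(self):
            """Return the uncertainty interval (earliest, latest)."""
            t = time.time()
            return (t - self.epsilon, t + self.epsilon)

        def after(self, t):
            """True only if t is definitely in the past on every clock."""
            return self.now()[0] > t

        def before(self, t):
            """True only if t is definitely in the future on every clock."""
            return self.now()[1] < t

With this in hand, commit wait from the earlier sketch is just spinning on after(s). Safe time extends the idea to reads: a replica may serve a snapshot read at timestamp t only once it knows no in-flight write can still commit at or below t.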


Distributed Transactions

2 tasks

Build the transaction layer: two-phase commit that atomically applies writes across multiple shards, and MVCC storage that maintains versioned values for lock-free snapshot reads at any past timestamp.
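
The MVCC half is a good place to start. A minimal versioned store follows; timestamps are assumed unique per key, as they would be when a single Paxos leader assigns them, and the 2PC coordinator and locking are left to the track itself.

    import bisect

    class MVCCStore:
        """Versioned storage: key -> list of (timestamp, value),
        kept sorted by timestamp."""
        def __init__(self):
            self.versions = {}

        def write(self, key, ts, value):
            """Install a new version when a transaction commits at ts."""
            bisect.insort(self.versions.setdefault(key, []), (ts, value))

        def snapshot_read(self, key, ts):
            """Lock-free read: newest version with timestamp <= ts."""
            chain = self.versions.get(key, [])
            i = bisect.bisect_right([t for t, _ in chain], ts)
            return chain[i - 1][1] if i > 0 else None

For example, store.write('x', ts=10, value='a') followed by store.snapshot_read('x', ts=12) returns 'a', while snapshot_read('x', ts=9) returns None: the read sees exactly the versions committed at or before its timestamp.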


SQL Engine

1 task

Layer SQL on top of distributed storage. Route INSERT and SELECT queries to the correct shard by key range, execute cross-shard range scans, and merge results in sorted order.
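
Routing can be a bisect over sorted split points, and cross-shard scans can lean on the fact that each shard already returns its rows in key order. A sketch, where the shards' scan method is an assumed interface yielding (key, row) pairs:

    import bisect
    import heapq

    # Hypothetical shard map: sorted split points; shard i owns the i-th
    # range, with open ends at the edges.
    SPLITS = ["g", "p"]   # 3 shards: (-inf, "g"), ["g", "p"), ["p", +inf)

    def shard_for(key):
        """Route a single-key INSERT/SELECT to its owning shard."""
        return bisect.bisect_right(SPLITS, key)

    def range_scan(shards, lo, hi):
        """Cross-shard scan: ask each shard for its ordered slice of
        [lo, hi), then merge the already-sorted streams."""
        streams = (s.scan(lo, hi) for s in shards)   # each yields (key, row)
        return heapq.merge(*streams)                 # global key order

Only shards whose ranges overlap [lo, hi) actually need to be contacted; pruning them with the same split-point bisect is a natural refinement.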
