ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/logger/tasks/task-10-1-1-wal-impl
TASK

Implementation

The Write-Ahead Log (WAL) is the most fundamental durability primitive in distributed systems. Every database (PostgreSQL, MySQL, SQLite), every consensus algorithm (Raft, Paxos), and every message broker (Kafka) uses a WAL at its core.

The key insight: log the change BEFORE applying it to state. If the system crashes at any point, the WAL can be replayed to recover the exact pre-crash state.

Implement a WAL with these properties:

  1. Append-only: entries are never modified, only appended
  2. Checksummed: each entry includes a CRC32 checksum to detect corruption
  3. Durable: entries are fsynced to disk before acknowledging

Entry format: [4-byte length][4-byte CRC32 checksum][payload bytes]

Request:  {"type": "wal_append", "msg_id": 1, "payload": "set x=42"}
Response: {"type": "wal_append_ok", "in_reply_to": 1, "offset": 0, "length": 8, "checksum": "abc123"}

Request:  {"type": "wal_read", "msg_id": 2, "offset": 0}
Response: {"type": "wal_read_ok", "in_reply_to": 2, "payload": "set x=42", "checksum_valid": true}

Request:  {"type": "wal_info", "msg_id": 3}
Response: {"type": "wal_info_ok", "in_reply_to": 3, "entries": 1, "total_bytes": 16, "fsynced": true}

Sample Test Cases

Append a single entry and read it backTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"wal_append","msg_id":2,"payload":"set x=42"}}
{"src":"c1","dest":"n1","body":{"type":"wal_read","msg_id":3,"offset":0}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
{"src": "n1", "dest": "c1", "body": {"type": "wal_append_ok", "in_reply_to": 2, "offset": 0, "msg_id": 1}}
{"src": "n1", "dest": "c1", "body": {"type": "wal_read_ok", "in_reply_to": 3, "payload": "set x=42", "checksum_valid": true, "msg_id": 2}}
Multiple appends increment offsetTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"wal_append","msg_id":2,"payload":"op1"}}
{"src":"c1","dest":"n1","body":{"type":"wal_append","msg_id":3,"payload":"op2"}}
{"src":"c1","dest":"n1","body":{"type":"wal_append","msg_id":4,"payload":"op3"}}
{"src":"c1","dest":"n1","body":{"type":"wal_info","msg_id":5}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
{"src": "n1", "dest": "c1", "body": {"type": "wal_append_ok", "in_reply_to": 2, "offset": 0, "msg_id": 1}}
{"src": "n1", "dest": "c1", "body": {"type": "wal_append_ok", "in_reply_to": 3, "offset": 1, "msg_id": 2}}
{"src": "n1", "dest": "c1", "body": {"type": "wal_append_ok", "in_reply_to": 4, "offset": 2, "msg_id": 3}}
{"src": "n1", "dest": "c1", "body": {"type": "wal_info_ok", "in_reply_to": 5, "entries": 3, "msg_id": 4}}

Hints

Hint 1
Each entry format: [4-byte length][4-byte CRC32 checksum][payload bytes]
Hint 2
Write to disk and fsync BEFORE acknowledging the caller — this is the "write-ahead" guarantee
Hint 3
The checksum detects torn writes: if a crash happens mid-write, the partial entry will have an invalid checksum
Hint 4
Use CRC32 for checksums — fast and sufficient for detecting corruption
Hint 5
The WAL is append-only: never modify existing entries, only add new ones at the end
OVERVIEW

Theoretical Hub

Concept overview coming soon

Key Concepts

write-ahead logappend-onlychecksumdurabilityfsync
main.py
python
Implement a Write-Ahead Log - The Logger | Build Distributed Systems