Benchmark WAL fsync Strategies - The Logger

                      TASK
                    

Implementation

The fsync system call forces the OS to flush data from kernel buffers to the physical disk. Without it, data that appears "written" may only exist in volatile RAM buffers and will be lost on power failure.

The fundamental tradeoff: durability vs. throughput.

Three strategies, from safest to fastest:

Always fsync: call fsync after every write. Every acknowledged entry is durable. Throughput limited by disk IOPS.
Batch fsync: buffer writes and fsync every 10ms. Up to 10ms of writes can be lost on crash. 10-30x higher throughput.
No fsync: let the OS decide when to flush. Crashes can lose seconds of data. 100x+ higher throughput.

Benchmark all three and measure ops/sec, then plot the durability vs. throughput curve.

Request:  {"type": "fsync_benchmark", "msg_id": 1, "entries": 10000, "strategies": ["always", "batch_10ms", "none"]}
Response: {"type": "fsync_benchmark_ok", "in_reply_to": 1, "results": [
    {"strategy": "always", "ops_per_sec": 500, "durability": "every_write", "data_loss_window": "0ms"},
    {"strategy": "batch_10ms", "ops_per_sec": 15000, "durability": "every_10ms", "data_loss_window": "10ms"},
    {"strategy": "none", "ops_per_sec": 100000, "durability": "os_dependent", "data_loss_window": "seconds"}
]}

Sample Test Cases

Benchmark all three strategiesTimeout: 5000ms

Input

{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"fsync_benchmark","msg_id":2,"entries":100,"strategies":["always","batch_10ms","none"]}}

Expected Output

{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}

Single strategy benchmarkTimeout: 5000ms

Input

{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"fsync_benchmark","msg_id":2,"entries":50,"strategies":["always"]}}

Expected Output

{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}

Hints

Hint 1▾

Always fsync: every write is durable on disk. Throughput is limited by disk IOPS (~500 ops/sec on HDD)

Hint 2▾

Batch fsync every 10ms: group writes and sync once per batch. Good balance — can lose up to 10ms of data

Hint 3▾

No fsync: let the OS buffer and flush when it wants. Highest throughput, but crashes can lose seconds of data

Hint 4▾

SSDs have much higher fsync throughput than HDDs (~10,000+ ops/sec)

Hint 5▾

Production systems like PostgreSQL offer wal_sync_method config to choose the strategy

Resources

PostgreSQL WAL Reliability

PostgreSQL documentation on WAL reliability, fsync, and data integrity

                      OVERVIEW
                    

Theoretical Hub

Concept overview coming soon

Key Concepts

fsyncdurabilitythroughput tradeoffbatch syncOS buffering

main.py

python