ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/timekeeper/tasks/task-4-4-4-time-oracle
TASK

Implementation

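Build a centralized time oracle service that nodes query for globally consistent HLC (hybrid logical clock) timestamps. Because every timestamp comes from a single clock, ordering no longer depends on the skew between node-local clocks, which is otherwise unbounded.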

Architecture:

  • Primary oracle: maintains an HLC, issues timestamps on request
  • Backup oracle: monitors the primary, takes over on failure
  • Nodes: query the oracle instead of using local clocks
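
As a rough illustration of the primary oracle's core, the sketch below keeps an HLC as a `(pt, c)` pair and hands out strictly increasing timestamps. The class name `HybridLogicalClock`, the `wall_clock_ms` parameter, and the `-1` sentinel for the counter are assumptions made for this example, not part of the task statement.

```python
import time


class HybridLogicalClock:
    """Issues (pt, c) pairs that strictly increase in lexicographic order."""

    def __init__(self, wall_clock_ms=None, start_pt=0):
        # wall_clock_ms is an injectable time source (hypothetical parameter).
        self._wall = wall_clock_ms or (lambda: int(time.time() * 1000))
        self.pt = start_pt
        self.c = -1  # -1 means nothing has been issued yet at self.pt

    def next_timestamp(self):
        wall = self._wall()
        if wall > self.pt:
            # Physical time moved forward: adopt it and reset the counter.
            self.pt, self.c = wall, 0
        else:
            # Clock stalled or went backwards: keep pt, bump the counter.
            self.c += 1
        return self.pt, self.c


# Usage with a fake clock that stalls and then jumps backwards.
readings = iter([1000, 1000, 990, 1005])
hlc = HybridLogicalClock(wall_clock_ms=lambda: next(readings))
issued = [hlc.next_timestamp() for _ in range(4)]
print(issued)  # [(1000, 0), (1000, 1), (1000, 2), (1005, 0)]
```

Injecting the clock keeps the example deterministic; a real node would presumably fall back to the system clock as in the default above.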

Failure mode: if the primary crashes after issuing timestamp T but before the backup learns of it, the backup must start at T + safety_margin so that it never issues a duplicate timestamp.
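
One way to read the failover rule is as a floor on the backup's first physical time. The helper below is a minimal sketch under the assumption that the backup knows only the last timestamp it observed from the primary, and that the safety margin is larger than anything the primary could have issued beyond that. The name `backup_start_pt` and its parameters are hypothetical.

```python
def backup_start_pt(last_known_pt, safety_margin_ms, wall_ms):
    """Return the physical-time floor for the backup's first timestamp."""
    # Never start below the last observed timestamp plus the margin,
    # and never start behind the backup's own wall clock.
    return max(last_known_pt + safety_margin_ms, wall_ms)


# Primary issued pt=1000 and the backup saw it (the case in the task's example).
start_seen = backup_start_pt(last_known_pt=1000, safety_margin_ms=100, wall_ms=1000)

# Primary issued pt=1000 but the backup only heard about pt=960.
start_stale = backup_start_pt(last_known_pt=960, safety_margin_ms=100, wall_ms=1000)

print(start_seen, start_stale)  # 1100 1060
```

In the second case the margin (100 ms) exceeds the unseen gap (40 ms), so the backup still starts above everything the primary issued. If the gap could exceed the margin, this rule alone would not be enough.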

Implement handlers:

Request:  {"type": "oracle_get_time", "msg_id": 1}
Response: {"type": "oracle_get_time_ok", "in_reply_to": 1, "pt": 1000, "c": 0, "oracle": "primary"}

Request:  {"type": "oracle_fail_primary", "msg_id": 2}
Response: {"type": "oracle_fail_primary_ok", "in_reply_to": 2, "new_oracle": "backup", "safety_margin_ms": 100}

Request:  {"type": "oracle_get_time", "msg_id": 3}
Response: {"type": "oracle_get_time_ok", "in_reply_to": 3, "pt": 1100, "c": 0, "oracle": "backup"}

Request:  {"type": "oracle_status", "msg_id": 4}
Response: {"type": "oracle_status_ok", "in_reply_to": 4, "primary_alive": false, "active_oracle": "backup", "timestamps_issued": 2}
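
The four request/response pairs above can be reproduced with a small state machine. This is a sketch only: the class `OracleNode`, its `handle` method, and the frozen clock are assumptions for illustration, and a single process stands in for both the primary and the backup.

```python
class OracleNode:
    SAFETY_MARGIN_MS = 100  # value taken from the example response above

    def __init__(self, wall_clock_ms):
        self._wall = wall_clock_ms  # injectable clock (hypothetical)
        self.active = "primary"
        self.primary_alive = True
        self.pt, self.c = 0, -1     # c == -1: nothing issued yet at this pt
        self.issued = 0

    def _next_timestamp(self):
        wall = self._wall()
        if wall > self.pt:
            self.pt, self.c = wall, 0
        else:
            self.c += 1
        self.issued += 1
        return self.pt, self.c

    def handle(self, body):
        kind = body["type"]
        reply = {"type": kind + "_ok", "in_reply_to": body["msg_id"]}
        if kind == "oracle_get_time":
            pt, c = self._next_timestamp()
            reply.update(pt=pt, c=c, oracle=self.active)
        elif kind == "oracle_fail_primary":
            # Simulated crash: the backup starts one safety margin ahead.
            self.primary_alive = False
            self.active = "backup"
            self.pt += self.SAFETY_MARGIN_MS
            self.c = -1
            reply.update(new_oracle=self.active,
                         safety_margin_ms=self.SAFETY_MARGIN_MS)
        elif kind == "oracle_status":
            reply.update(primary_alive=self.primary_alive,
                         active_oracle=self.active,
                         timestamps_issued=self.issued)
        else:
            raise ValueError("unknown message type: " + kind)
        return reply


node = OracleNode(wall_clock_ms=lambda: 1000)  # frozen clock for the demo
r1 = node.handle({"type": "oracle_get_time", "msg_id": 1})
r2 = node.handle({"type": "oracle_fail_primary", "msg_id": 2})
r3 = node.handle({"type": "oracle_get_time", "msg_id": 3})
r4 = node.handle({"type": "oracle_status", "msg_id": 4})
```

With the clock frozen at 1000, the replies match the example sequence: `pt` is 1000 from the primary, then 1100 from the backup, and the status reports two timestamps issued.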

Sample Test Cases

Primary oracle issues timestamps (Timeout: 5000ms)
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"oracle_get_time","msg_id":2}}
{"src":"c1","dest":"n1","body":{"type":"oracle_get_time","msg_id":3}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
Failover to backup oracle (Timeout: 5000ms)
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"oracle_get_time","msg_id":2}}
{"src":"c1","dest":"n1","body":{"type":"oracle_fail_primary","msg_id":3}}
{"src":"c1","dest":"n1","body":{"type":"oracle_get_time","msg_id":4}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
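
The test cases use one JSON message per line, wrapped in a `src`/`dest`/`body` envelope. The sketch below shows one possible way to read such lines and answer `init`. The function names `handle_line` and `run` and the `state` dictionary are assumptions for this example. It covers only the `init` message, since that is the only reply shown in the expected output.

```python
import io
import json


def handle_line(line, state):
    """Parse one envelope and return the reply line, or None if unhandled."""
    msg = json.loads(line)
    body = msg["body"]
    if body["type"] == "init":
        state["node_id"] = body["node_id"]
        reply_body = {
            "type": "init_ok",
            "in_reply_to": body["msg_id"],
            "msg_id": state["next_msg_id"],
        }
        state["next_msg_id"] += 1
        # Replies swap direction: our node id becomes src, the sender dest.
        return json.dumps(
            {"src": state["node_id"], "dest": msg["src"], "body": reply_body}
        )
    return None  # oracle_* messages would be dispatched here


def run(stream_in, stream_out):
    state = {"node_id": None, "next_msg_id": 0}
    for line in stream_in:
        if not line.strip():
            continue
        reply = handle_line(line, state)
        if reply is not None:
            stream_out.write(reply + "\n")


# Demo on an in-memory stream; a real node would pass sys.stdin and sys.stdout.
demo_in = io.StringIO(
    '{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,'
    '"node_id":"n1","node_ids":["n1"]}}\n'
)
demo_out = io.StringIO()
run(demo_in, demo_out)
```

After the demo runs, `demo_out.getvalue()` holds the `init_ok` line from the expected output above.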

Hints

Hint 1
The oracle maintains an HLC and issues globally consistent timestamps
Hint 2
Nodes query the oracle instead of using their own clocks for ordering
Hint 3
If the primary oracle crashes, the backup takes over with a higher timestamp (in the example exchange, pt advances by the safety margin while c restarts at 0)
Hint 4
The backup oracle must start with a timestamp guaranteed to be higher than any issued by the primary
Hint 5
Use a lease mechanism: the oracle is valid only while its lease is active
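
Hint 5 can be sketched as a lease that names the active oracle and expires unless renewed. Everything here is a hypothetical illustration: the `Lease` class, `may_issue`, and `try_takeover` are assumed names. Times are passed in explicitly so the example is deterministic, and it assumes both oracles agree on time closely enough for the expiry check to be meaningful.

```python
class Lease:
    def __init__(self, holder, duration_ms, now_ms):
        self.holder = holder
        self.duration_ms = duration_ms
        self.expires_at = now_ms + duration_ms

    def is_valid(self, now_ms):
        return now_ms < self.expires_at

    def renew(self, now_ms):
        # A lease that has already expired cannot be renewed.
        if not self.is_valid(now_ms):
            return False
        self.expires_at = now_ms + self.duration_ms
        return True


def may_issue(lease, oracle, now_ms):
    """An oracle may issue timestamps only while it holds a valid lease."""
    return lease.holder == oracle and lease.is_valid(now_ms)


def try_takeover(lease, backup, now_ms):
    """The backup takes a fresh lease only once the old one has expired."""
    if lease.is_valid(now_ms):
        return lease
    return Lease(backup, lease.duration_ms, now_ms)


lease = Lease("primary", duration_ms=500, now_ms=1000)  # expires at 1500
early = try_takeover(lease, "backup", now_ms=1300)      # too soon, unchanged
late = try_takeover(lease, "backup", now_ms=1500)       # expired, backup wins
print(early.holder, late.holder)  # primary backup
```

Waiting for expiry means the primary and backup are never both allowed to issue timestamps at once, at the cost of a gap of up to one lease duration after a crash.
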
OVERVIEW

Key Concepts

time oracle, centralized clock, failover, backup oracle, single point of failure
Build a Time Oracle Service with Failover - The Timekeeper | Build Distributed Systems