TASK
Implementation
A scheduler that crashes loses all in-flight job assignments. A fault-tolerant scheduler writes every decision to a WAL before acting, so it can replay the log on restart and recover exactly where it left off without re-assigning jobs twice.
Implement a node that survives crashes and prevents duplicate assignments:
// After crash and restart, recover from WAL
{ "type": "scheduler_restart" }
-> { "type": "recovery_complete",
"job_assignments": {"job1": "w1"},
"pending_notifications": ["w1"] }
// Primary crashes -> elect new leader from replicas
{ "type": "leader_crash", "leader_id": "s1" }
-> { "type": "new_leader_elected",
"old_leader": "s1", "new_leader": "s2", "term": 2 }
// Generation number prevents duplicate assignment
assign job1->w1 (gen=1) // accepted
assign job1->w2 (gen=1) // duplicate same generation -> rejected
-> { "type": "assignment_rejected",
"reason": "Job already assigned in generation 1",
"existing_assignment": {"job_id": "job1", "worker": "w1"} }Sample Test Cases
Scheduler crash recoveryTimeout: 5000ms
Input
{"src":"client","dest":"scheduler","body":{"type":"init","msg_id":1,"replicas":["s1","s2","s3"]}}
{"type":"assign_job","job_id":"job1","worker_id":"w1"}
{"type":"scheduler_crash"}
{"type":"scheduler_restart"}
Expected Output
{"type": "recovery_complete", "job_assignments": {"job1": "w1"}, "pending_notifications": ["w1"]}Leader election after crashTimeout: 5000ms
Input
{"src":"client","dest":"scheduler_cluster","body":{"type":"init","msg_id":1,"replicas":["s1","s2","s3"]}}
{"type":"leader_crash","leader_id":"s1"}
Expected Output
{"type": "new_leader_elected", "old_leader": "s1", "new_leader": "s2", "term": 2}Hints
Hint 1▾
WAL: write every assignment to the log before applying it in memory
Hint 2▾
On restart, replay the WAL sequentially to rebuild the in-memory assignment map
Hint 3▾
Generation numbers (epochs): reject an assignment message if the same job is already assigned in that generation
Hint 4▾
After recovery, resend pending assignment notifications to all affected workers
Hint 5▾
Leader election: highest-ID surviving replica becomes the new leader
OVERVIEW
Theoretical Hub
Concept overview coming soon
Key Concepts
WALcrash recoveryleader electiongeneration numbersduplicate prevention
main.py
python
1
2
3
4
5
6
7
8
9
10
11
12
13
#!/usr/bin/env python3
import sys
import json
def main():
# Your implementation here
for line in sys.stdin:
msg = json.loads(line)
print(json.dumps(msg), flush=True)
if __name__ == "__main__":
main()