ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/scheduler/tasks/task-22-1-2-deadlock-prevention
TASK

Implementation

Deadlock happens when two jobs each hold a resource the other needs, so neither can proceed. Prevention is better than detection: refuse any allocation that would leave the system in an unsafe state.

Implement a node that manages resource allocation with deadlock prevention:

// Initialize resource pool
{ "type": "init", "msg_id": 1,
  "resources": {"total_cpu": 16, "total_memory": 64} }
-> { "type": "init_ok", "in_reply_to": 1 }

// Safe allocation: system remains in safe state
{ "type": "allocate_resources", "msg_id": 2,
  "job_id": "job1", "resources": {"cpu": 4, "memory": 16} }
-> { "type": "allocation_ok", "in_reply_to": 2,
    "job_id": "job1", "safe_state": true }

// Would exhaust resources -> unsafe -> deny
{ "type": "allocate_resources", "msg_id": 3,
  "job_id": "job2", "resources": {"cpu": 8, "memory": 32} }
-> denied

// Waiting too long -> preempt
{ "type": "allocate_resources", "msg_id": 4,
  "job_id": "job3", "resources": {"cpu": 16, "memory": 64}, "timeout_ms": 5000 }
-> { "type": "allocation_timeout", "in_reply_to": 4,
    "job_id": "job3", "action": "preempted" }

// Inspect wait-for graph
{ "type": "get_wait_graph", "msg_id": 5 }
-> { "type": "wait_graph_ok", "in_reply_to": 5,
    "graph": {"job1":["job3"],"job2":["job1"],"job3":["job2"]},
    "has_cycle": true }

Sample Test Cases

Detect deadlock cycleTimeout: 5000ms
Input
{"src":"client","dest":"scheduler","body":{"type":"init","msg_id":1,"resources":{"total_cpu":16,"total_memory":64}}}
{"src":"client","dest":"scheduler","body":{"type":"allocate_resources","msg_id":2,"job_id":"job1","resources":{"cpu":8,"memory":32}}}
{"src":"client","dest":"scheduler","body":{"type":"allocate_resources","msg_id":3,"job_id":"job2","resources":{"cpu":8,"memory":32}}}
Expected Output
{"src": "scheduler", "dest": "client", "body": {"type": "init_ok", "in_reply_to": 1}}
Safe state allocationTimeout: 5000ms
Input
{
  "src": "client",
  "dest": "scheduler",
  "body": {
    "type": "allocate_resources",
    "msg_id": 1,
    "job_id": "job1",
    "resources": {
      "cpu": 4,
      "memory": 16
    }
  }
}
Expected Output
{"src": "scheduler", "dest": "client", "body": {"type": "allocation_ok", "in_reply_to": 1, "job_id": "job1", "safe_state": true}}

Hints

Hint 1
A system is in a safe state if there is an ordering in which all jobs can eventually complete
Hint 2
Reject an allocation that would make the system unsafe (Banker's algorithm)
Hint 3
Wait-for graph: edge A->B means A is waiting for a resource currently held by B
Hint 4
A cycle in the wait-for graph means deadlock exists
Hint 5
Preempt a job that has been waiting beyond timeout_ms and free its held resources
OVERVIEW

Theoretical Hub

Concept overview coming soon

Key Concepts

Banker's algorithmsafe statewait-for graphpreemptioncycle detection
main.py
python
Implement Deadlock Prevention in Scheduling - The Scheduler | Build Distributed Systems