ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/proxies/tasks/task-12-5-health-routing
TASK

Implementation

Add health-based routing with circuit breaker pattern:

  1. Maintain list of backend servers
  2. Periodically health-check each backend
  3. Track consecutive failures per backend
  4. Open circuit after N failures (stop sending traffic)
  5. Periodically test with single request (half-open)
  6. Close circuit on success (resume normal traffic)

This improves reliability by routing away from failing backends.

Sample Test Cases

Route to healthyTimeout: 5000ms
Input
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
Open circuit on failuresTimeout: 5000ms
Input
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}

Hints

Hint 1
Periodically check backend health
Hint 2
Track failure counts per backend
Hint 3
Implement circuit breaker pattern
OVERVIEW

Theoretical Hub

Health Checks

Active health checks periodically probe backends with test requests. Passive checks observe real request failures. Both inform routing decisions for fast failure detection.

Circuit Breaker

The circuit breaker prevents cascading failures. After enough failures, the circuit "opens" and requests fail immediately without trying the backend. After a timeout, one test request checks recovery (half-open state).

Key Concepts

health checkscircuit breakerfailover
main.py
python
Add Health-Based Routing - Proxies | Build Distributed Systems