TASK
Implementation
Add health-based routing with circuit breaker pattern:
- Maintain list of backend servers
- Periodically health-check each backend
- Track consecutive failures per backend
- Open circuit after N failures (stop sending traffic)
- Periodically test with single request (half-open)
- Close circuit on success (resume normal traffic)
This improves reliability by routing away from failing backends.
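The routing step in the list above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: `BackendRouter`, `circuit_open`, and `pick` are hypothetical names, and round-robin selection is an assumption, since the task does not say how to choose among healthy backends.

```python
class BackendRouter:
    """Round-robin router that skips backends whose circuit is open."""

    def __init__(self, backends):
        self.backends = list(backends)
        # Hypothetical per-backend flag: True means the circuit is open.
        self.circuit_open = {b: False for b in self.backends}
        self._next = 0  # index of the backend to try first on the next pick

    def pick(self):
        """Return the next backend with a closed circuit, or None if all are open."""
        for offset in range(len(self.backends)):
            idx = (self._next + offset) % len(self.backends)
            backend = self.backends[idx]
            if not self.circuit_open[backend]:
                self._next = (idx + 1) % len(self.backends)
                return backend
        return None


router = BackendRouter(["b1", "b2", "b3"])
router.circuit_open["b2"] = True  # pretend b2 has failed N times
picks = [router.pick() for _ in range(4)]
```

With `b2` marked open, traffic alternates between `b1` and `b3`. Returning `None` when every circuit is open leaves the caller to decide how to fail the request.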
Sample Test Cases
Route to healthy
Timeout: 5000ms
Input
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
Open circuit on failures
Timeout: 5000ms
Input
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
Hints
Hint 1: Periodically check backend health
Hint 2: Track failure counts per backend
Hint 3: Implement circuit breaker pattern
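Both sample test cases send an `init` message and expect an `init_ok` reply. The sketch below shows one way to build that reply. The helper name `reply_to_init` is hypothetical, and starting the node's own `msg_id` at 0 is inferred from the expected output rather than stated by the task.

```python
import json


def reply_to_init(msg, msg_id=0):
    """Build the init_ok reply for an init message (hypothetical helper)."""
    body = msg["body"]
    return {
        "src": body["node_id"],   # the node replies under its assigned id
        "dest": msg["src"],       # and addresses the original sender
        "body": {
            "type": "init_ok",
            "in_reply_to": body["msg_id"],
            "msg_id": msg_id,
        },
    }


request = {
    "src": "c0",
    "dest": "n1",
    "body": {"type": "init", "msg_id": 1, "node_id": "n1", "node_ids": ["n1"]},
}
# Compact separators reproduce the whitespace-free form of the expected output.
line = json.dumps(reply_to_init(request), separators=(",", ":"))
print(line)
```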
OVERVIEW
Theoretical Hub
Health Checks
Active health checks periodically probe each backend with a test request. Passive checks observe the failures of real requests. Both feed the routing decision, so a failing backend is detected quickly.
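A minimal sketch of combining both kinds of check follows. Everything here is an assumption made for illustration: the `HealthMonitor` class, the `probe` callable (backend to bool), and the `unhealthy_after` and `interval` parameters do not come from the starter code.

```python
import threading


class HealthMonitor:
    """Tracks consecutive failures per backend from active and passive checks."""

    def __init__(self, backends, probe, unhealthy_after=3, interval=1.0):
        self.probe = probe  # hypothetical callable: backend -> bool
        self.unhealthy_after = unhealthy_after
        self.interval = interval
        self.failures = {b: 0 for b in backends}
        self.lock = threading.Lock()
        self._stop = threading.Event()

    def record(self, backend, ok):
        """Passive check: call this with the outcome of every real request."""
        with self.lock:
            self.failures[backend] = 0 if ok else self.failures[backend] + 1

    def check_once(self):
        """Active check: probe every backend once."""
        for backend in list(self.failures):
            try:
                ok = bool(self.probe(backend))
            except Exception:
                ok = False  # a probe that raises counts as a failure
            self.record(backend, ok)

    def healthy(self):
        """Backends with fewer than `unhealthy_after` consecutive failures."""
        with self.lock:
            return [b for b, n in self.failures.items() if n < self.unhealthy_after]

    def start(self):
        """Run check_once every `interval` seconds on a daemon thread."""
        def loop():
            # Event.wait returns False on timeout, True once stop() is called.
            while not self._stop.wait(self.interval):
                self.check_once()
        threading.Thread(target=loop, daemon=True).start()

    def stop(self):
        self._stop.set()
```

Active probes and passive observations share one counter per backend, so a success from either source resets it and a failure from either source advances it.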
Circuit Breaker
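The circuit breaker prevents cascading failures. After enough consecutive failures, the circuit "opens" and requests fail immediately without trying the backend. After a timeout, the circuit becomes half-open and a single test request checks recovery: if it succeeds, the circuit closes and normal traffic resumes; if it fails, the circuit opens again.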
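The state transitions described above can be sketched as a small state machine. This is one possible shape, not the exercise's reference solution: the class name `BreakerSketch`, the string states, and the injectable `clock` parameter are assumptions. Allowing only one trial request while half-open is a design choice the text implies but does not spell out.

```python
import threading
import time


class BreakerSketch:
    """Closed -> open after N failures -> half-open after a timeout."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock  # injectable so the timeout can be tested without sleeping
        self.state = "closed"
        self.failure_count = 0
        self.opened_at = None
        self.lock = threading.Lock()

    def can_execute(self):
        with self.lock:
            if self.state == "closed":
                return True
            if self.state == "open":
                if self.clock() - self.opened_at >= self.recovery_timeout:
                    self.state = "half_open"  # let exactly one trial request through
                    return True
                return False
            return False  # half_open: a trial request is already in flight

    def record_success(self):
        with self.lock:
            self.state = "closed"
            self.failure_count = 0

    def record_failure(self):
        with self.lock:
            self.failure_count += 1
            # A failed trial reopens immediately; otherwise wait for the threshold.
            if self.state == "half_open" or self.failure_count >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
```

A caller would check `can_execute()` before each request and then report the outcome through `record_success()` or `record_failure()`.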
Key Concepts
health checks, circuit breaker, failover
main.py
python
#!/usr/bin/env python3
import sys
import json
import threading
import time
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"        # Normal operation
    OPEN = "open"            # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing recovery


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.state = CircuitState.CLOSED
        self.last_failure_time = None
        self.lock = threading.Lock()

    def record_success(self):
        # TODO: Reset failure count, close circuit
        pass

    def record_failure(self):
        # TODO: Increment failures, maybe open circuit
        pass

    def can_execute(self):
        # TODO: Check if request should be allowed
        pass


class HealthyProxy:
    def __init__(self, backends):
        self.backends = backends