TASK
Implementation
Over time, chunk distribution becomes uneven: new servers start empty, old servers fill up, and some receive more writes. Load balancing moves chunks to equalize utilization.
Rebalancing algorithm:
- Periodically compute the average disk utilization across all servers
- If any server exceeds average + 20%, it is overloaded; if below average - 20%, it is underloaded
- For each overloaded server, select chunks to migrate to underloaded servers
- Migration: copy chunk from source to target, update master metadata, delete from source
- Run as a background process with rate limiting to avoid saturating the network
Request: {"type": "rebalance_check", "msg_id": 1}
Response: {"type": "rebalance_check_ok", "in_reply_to": 1, "average_utilization_pct": 50, "overloaded": [{"server": "cs1", "utilization_pct": 78}], "underloaded": [{"server": "cs4", "utilization_pct": 15}]}
Request: {"type": "rebalance_execute", "msg_id": 2, "moves": [{"chunk": "ch_010", "from": "cs1", "to": "cs4"}]}
Response: {"type": "rebalance_execute_ok", "in_reply_to": 2, "moved": 1, "bytes_transferred": 67108864}Sample Test Cases
Check identifies overloaded and underloaded serversTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"rebalance_check","msg_id":2}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
Execute rebalance moves chunksTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"rebalance_execute","msg_id":2,"moves":[{"chunk":"ch_010","from":"cs1","to":"cs4"}]}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
Hints
Hint 1▾
Monitor each server disk usage; if imbalance exceeds 20%, trigger rebalancing
Hint 2▾
Move chunks from overloaded servers to underloaded ones
Hint 3▾
Rebalancing is a background operation — do not disrupt active reads/writes
Hint 4▾
After moving a chunk, update the master metadata and notify affected servers
Hint 5▾
Prefer moving chunks that are infrequently accessed to minimize disruption
OVERVIEW
Theoretical Hub
Concept overview coming soon
Key Concepts
load balancingchunk migrationdisk utilizationrebalancing threshold
main.py
python
1
2
3
4
5
6
7
8
9
10
11
12
13
#!/usr/bin/env python3
import sys
import json
def main():
# Your implementation here
for line in sys.stdin:
msg = json.loads(line)
print(json.dumps(msg), flush=True)
if __name__ == "__main__":
main()