ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/filesystem/tasks/task-12-2-3-load-balancing
TASK

Implementation

Over time, chunk distribution becomes uneven: new servers start empty, old servers fill up, and some receive more writes. Load balancing moves chunks to equalize utilization.

Rebalancing algorithm:

  1. Periodically compute the average disk utilization across all servers
  2. If any server exceeds average + 20%, it is overloaded; if below average - 20%, it is underloaded
  3. For each overloaded server, select chunks to migrate to underloaded servers
  4. Migration: copy chunk from source to target, update master metadata, delete from source
  5. Run as a background process with rate limiting to avoid saturating the network
Request:  {"type": "rebalance_check", "msg_id": 1}
Response: {"type": "rebalance_check_ok", "in_reply_to": 1, "average_utilization_pct": 50, "overloaded": [{"server": "cs1", "utilization_pct": 78}], "underloaded": [{"server": "cs4", "utilization_pct": 15}]}

Request:  {"type": "rebalance_execute", "msg_id": 2, "moves": [{"chunk": "ch_010", "from": "cs1", "to": "cs4"}]}
Response: {"type": "rebalance_execute_ok", "in_reply_to": 2, "moved": 1, "bytes_transferred": 67108864}

Sample Test Cases

Check identifies overloaded and underloaded serversTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"rebalance_check","msg_id":2}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
Execute rebalance moves chunksTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"rebalance_execute","msg_id":2,"moves":[{"chunk":"ch_010","from":"cs1","to":"cs4"}]}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}

Hints

Hint 1
Monitor each server disk usage; if imbalance exceeds 20%, trigger rebalancing
Hint 2
Move chunks from overloaded servers to underloaded ones
Hint 3
Rebalancing is a background operation — do not disrupt active reads/writes
Hint 4
After moving a chunk, update the master metadata and notify affected servers
Hint 5
Prefer moving chunks that are infrequently accessed to minimize disruption
OVERVIEW

Theoretical Hub

Concept overview coming soon

Key Concepts

load balancingchunk migrationdisk utilizationrebalancing threshold
main.py
python
Implement Chunk Server Load Balancing - The Filesystem | Build Distributed Systems