TASK
Implementation
Use consistent hashing for shard assignment:
- Place shards on a hash ring
- Hash each key to a ring position
- Find the next shard clockwise from key position
- Use virtual nodes for better distribution
- Minimize key movement when shards join/leave
Sample Test Cases
Hash key to ringTimeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"hash_key","msg_id":2,"key":"mykey"}}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
{"src":"n1","dest":"c1","body":{"type":"hash_key_ok","in_reply_to":2,"msg_id":1,"key":"mykey"}}
Hints
Hint 1▾
Hash keys to ring positions
Hint 2▾
Use virtual nodes for balance
Hint 3▾
Minimal movement on changes
OVERVIEW
Theoretical Hub
Consistent Hashing for Sharding
Traditional modulo hashing (key % N) redistributes most keys when N changes. Consistent hashing only moves keys between affected neighbors, minimizing data migration during rebalancing.
Virtual Nodes
With few shards, the ring may be imbalanced. Virtual nodes give each shard multiple ring positions, smoothing distribution. Typically 100-200 virtual nodes per shard.
Key Concepts
consistent hashingkey distributionvirtual nodes
main.py
python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/usr/bin/env python3
import sys
import json
import hashlib
import bisect
class ConsistentHashSharding:
def __init__(self, replicas=100):
self.replicas = replicas
self.ring = {}
self.sorted_keys = []
self.shards = set()
def _hash(self, key):
return int(hashlib.sha256(key.encode()).hexdigest(), 16)
def add_shard(self, shard_id):
# TODO: Add shard with virtual nodes
pass
def remove_shard(self, shard_id):
# TODO: Remove shard from ring
pass
def get_shard(self, key):
# TODO: Find shard for key
pass
def get_keys_to_migrate(self, old_shards, new_shards):
# TODO: Calculate which keys move
pass