TASK
Implementation
Networks can duplicate messages. If a sender retries because it did not receive an acknowledgment (but the original was actually delivered), the receiver sees the same request twice. Without deduplication, the receiver processes it twice, which can cause incorrect state.
Your task is to implement message deduplication:
- Maintain a bounded LRU cache of recently seen message IDs (capacity: 1000)
- The deduplication key is (src, msg_id); msg_id alone is not globally unique
- When a duplicate is detected, skip processing but still reply (so the sender's retry gets acknowledged)
- Track and report deduplication statistics
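As a minimal sketch of the keying rule (dedup_key and the plain set are illustrative stand-ins, not part of the required API; the real node uses the bounded LRU cache):

```python
# Sketch of the dedup key rule: msg_id is only unique per sender,
# so the key pairs it with the source node id.
def dedup_key(message):
    return (message["src"], message["body"]["msg_id"])

seen = set()  # stand-in for the bounded LRU cache
msg = {"src": "c1", "dest": "n1",
       "body": {"type": "dedup_echo", "msg_id": 10, "echo": "hello"}}

first_seen = dedup_key(msg) not in seen   # first delivery: not a duplicate
seen.add(dedup_key(msg))
retry_seen = dedup_key(msg) in seen       # a retry maps to the same key
```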
Implement a dedup_echo message type that behaves like echo but applies deduplication:
Request: {"type": "dedup_echo", "msg_id": 5, "echo": "hello"}
Response: {"type": "dedup_echo_ok", "in_reply_to": 5, "echo": "hello", "was_duplicate": false}
If the same (src, msg_id) pair is seen again:
Response: {"type": "dedup_echo_ok", "in_reply_to": 5, "echo": "hello", "was_duplicate": true}
Implement a dedup_stats message to report statistics:
Response: {"type": "dedup_stats_ok", "total": 10, "duplicates": 2, "cache_size": 8}
Sample Test Cases
First dedup_echo is not a duplicate
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"dedup_echo","msg_id":10,"echo":"first"}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
{"src": "n1", "dest": "c1", "body": {"type": "dedup_echo_ok", "echo": "first", "was_duplicate": false, "in_reply_to": 10, "msg_id": 1}}
Duplicate msg_id from same source detected
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"dedup_echo","msg_id":10,"echo":"hello"}}
{"src":"c1","dest":"n1","body":{"type":"dedup_echo","msg_id":10,"echo":"hello"}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
{"src": "n1", "dest": "c1", "body": {"type": "dedup_echo_ok", "echo": "hello", "was_duplicate": false, "in_reply_to": 10, "msg_id": 1}}
{"src": "n1", "dest": "c1", "body": {"type": "dedup_echo_ok", "echo": "hello", "was_duplicate": true, "in_reply_to": 10, "msg_id": 2}}
Hints
Hint 1
Use (src, msg_id) as the deduplication key since msg_id is only unique per sender
Hint 2
An OrderedDict works well as a bounded LRU cache in Python
Hint 3
When the cache exceeds its capacity, remove the oldest entry
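Hints 2 and 3 can be combined into one possible cache shape. This is a sketch assuming the method names used in the main.py skeleton (contains, add, size), not the only valid design:

```python
from collections import OrderedDict

class LRUCache:
    """Bounded LRU set of keys, backed by an OrderedDict."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.cache = OrderedDict()

    def contains(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return True
        return False

    def add(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)  # evict the oldest entry
        self.cache[key] = True

    def size(self):
        return len(self.cache)
```

With capacity 3, adding a fourth key evicts whichever entry was touched least recently, so a hot key survives while stale ones age out.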
Hint 4
Skip processing for duplicate messages but still acknowledge receipt
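One way hint 4 might look in a handler. The function name handle_dedup_echo is an assumption; the node fields (seen, total_messages, duplicate_count, reply) mirror the skeleton below:

```python
import sys

def handle_dedup_echo(node, request):
    # Key on (src, msg_id): msg_id alone is not globally unique.
    key = (request["src"], request["body"]["msg_id"])
    node.total_messages += 1
    was_duplicate = node.seen.contains(key)
    if was_duplicate:
        node.duplicate_count += 1
        # Log to stderr only; stdout is reserved for protocol messages.
        print(f"duplicate from {key[0]}: msg_id {key[1]}", file=sys.stderr)
    else:
        node.seen.add(key)  # first sighting; real processing would go here
    # Reply either way so the sender's retry is acknowledged.
    node.reply(request, {
        "type": "dedup_echo_ok",
        "echo": request["body"]["echo"],
        "was_duplicate": was_duplicate,
    })
```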
Hint 5
Log duplicates to stderr for debugging
OVERVIEW
Key Concepts
idempotency, deduplication, LRU cache, at-most-once delivery
main.py
#!/usr/bin/env python3
import sys
import json
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.cache = OrderedDict()

    def contains(self, key):
        # TODO: Check if key exists (and move to end for LRU)
        pass

    def add(self, key):
        # TODO: Add key to cache, evicting oldest if at capacity
        pass

    def size(self):
        return len(self.cache)


class Node:
    def __init__(self):
        self.node_id = None
        self.node_ids = []
        self.next_msg_id = 0
        self.seen = LRUCache(capacity=1000)
        self.total_messages = 0
        self.duplicate_count = 0

    def send(self, dest, body):
        body["msg_id"] = self.next_msg_id
        self.next_msg_id += 1
        message = {"src": self.node_id, "dest": dest, "body": body}
        print(json.dumps(message), flush=True)

    def reply(self, request, body):
        body["in_reply_to"] = request["body"]["msg_id"]
        self.send(request["src"], body)