Add Request Deduplication - Proxies | Build Distributed Systems

TASK

Implementation

Deduplicate identical concurrent requests to reduce backend load:

1. Compute a request key (e.g., a hash of method + URL + body).
2. If a request with the same key is already in flight, wait for it instead of calling the backend again.
3. When the original request completes, return the same response to all waiters.
4. After the response is delivered, remove the key from the in-flight set.

This is especially valuable for hot endpoints with many identical requests.
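The four steps above can be sketched with a lock-protected dictionary of in-flight calls. This is a minimal illustration, not the reference solution: `Coalescer`, `_Call`, `handle`, and `demo` are hypothetical names, and the backend is assumed to be a blocking callable that takes a request and returns a response. Each waiter keeps a reference to the shared `_Call` object, so the response is still reachable after the key has been removed from the in-flight set.

```python
import threading
import time


class _Call:
    """One in-flight backend call plus the result shared with its waiters."""

    def __init__(self):
        self.done = threading.Event()
        self.response = None
        self.error = None


class Coalescer:
    def __init__(self, backend):
        self.backend = backend        # assumed: callable(request) -> response
        self.in_flight = {}           # key -> _Call
        self.lock = threading.Lock()  # guards in_flight

    def handle(self, key, request):
        # Steps 1-2: look the key up and decide, atomically, who calls the backend.
        with self.lock:
            call = self.in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self.in_flight[key] = call

        if leader:
            try:
                call.response = self.backend(request)
            except Exception as exc:  # share failures with waiters too
                call.error = exc
            finally:
                # Step 4: forget the key, then wake the waiters (step 3).
                with self.lock:
                    del self.in_flight[key]
                call.done.set()
        else:
            call.done.wait()          # wait outside the lock

        if call.error is not None:
            raise call.error
        return call.response


def demo(n_clients=5):
    """Fire n_clients identical requests; return (backend_calls, responses)."""
    gate = threading.Event()
    calls, results = [], []

    def slow_backend(request):
        calls.append(request)
        gate.wait(timeout=2)          # hold the call open while clients pile up
        return {"echo": request}

    proxy = Coalescer(slow_backend)
    threads = [
        threading.Thread(target=lambda: results.append(proxy.handle("GET /hot", "GET /hot")))
        for _ in range(n_clients)
    ]
    for t in threads:
        t.start()
    time.sleep(0.2)                   # let every client join the in-flight call
    gate.set()
    for t in threads:
        t.join()
    return len(calls), results
```

Waiting happens after the lock is released; waiting while holding it would block the leader from removing the key and stall every other request. Sharing the exception as well as the response is a design choice made here, not something the task statement requires.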

Sample Test Cases

Deduplicate concurrent
Timeout: 5000ms

Input

{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}

Expected Output

{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
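The sample exchange can be reproduced with a small handler. This is a sketch under assumptions: the names `handle_init`, `encode`, and `serve` are hypothetical, messages are assumed to arrive as one JSON object per line, and the reply counter is assumed to start at 0 because the expected output shows `"msg_id":0`.

```python
import json


def handle_init(message, state):
    """Record this node's identity and build the init_ok reply."""
    body = message["body"]
    state["node_id"] = body["node_id"]
    state["node_ids"] = list(body["node_ids"])
    reply = {
        "src": state["node_id"],
        "dest": message["src"],
        "body": {
            "type": "init_ok",
            "in_reply_to": body["msg_id"],
            "msg_id": state["next_msg_id"],
        },
    }
    state["next_msg_id"] += 1
    return reply


def encode(message):
    # Compact separators reproduce the whitespace-free form of the expected output.
    return json.dumps(message, separators=(",", ":"))


def serve(lines, write):
    """Answer each init message found in `lines` by passing the reply to `write`."""
    state = {"node_id": None, "node_ids": [], "next_msg_id": 0}
    for line in lines:
        if not line.strip():
            continue
        message = json.loads(line)
        if message["body"]["type"] == "init":
            write(encode(handle_init(message, state)))
```

If the harness talks over standard input and output, which the page does not state, the loop could be started with `serve(sys.stdin, print)` after importing `sys`.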

Hints

Hint 1

Track in-flight requests by key

Hint 2

Have duplicates wait for original

Hint 3

Return same response to all waiters

OVERVIEW

Theoretical Hub

Request Deduplication

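When many clients request the same resource simultaneously, forwarding every request to the backend repeats the same work. Deduplication (also called request coalescing) sends one request and shares its response with every waiting client, so N identical concurrent requests cost the backend one call instead of N.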

Request Fingerprinting

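The dedup key must be the same for functionally equivalent requests and different for all others. For GET requests, the URL is often sufficient. For POST requests, you may also need to hash the body. Be careful with headers that affect the response (for example Accept or Authorization): if they are left out of the key, a client can receive a response that was generated for a different request.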
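One way to build such a key is to hash a canonical encoding of the method, the URL, the body, and only those headers known to change the response. The sketch below is illustrative: `fingerprint` and `VARY_HEADERS` are hypothetical names, and the header list is an assumption that would need to match what the real backend actually varies on.

```python
import hashlib
import json

# Assumed list of response-affecting headers; adjust for the real backend.
VARY_HEADERS = ("accept", "accept-language")


def fingerprint(method, url, body=b"", headers=None, vary=VARY_HEADERS):
    """Return a hex key that is equal for functionally equivalent requests."""
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in (headers or {}).items()}
    selected = [[name, lowered.get(name, "")] for name in sorted(vary)]
    # A JSON list keeps field boundaries unambiguous, unlike plain concatenation.
    canonical = json.dumps(
        [method.upper(), url, selected, hashlib.sha256(body).hexdigest()],
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Headers outside `vary`, such as a per-client request ID, are ignored on purpose so that they do not prevent otherwise identical requests from sharing a key.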

Key Concepts

deduplication, idempotency, request coalescing

main.py

python

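#!/usr/bin/env python3
import sys
import json
import threading
import hashlib


class DedupProxy:
    def __init__(self, backend):
        self.backend = backend
        self.in_flight = {}  # key -> threading.Event, set once the response is ready
        self.responses = {}  # key -> response shared with waiters
        self.lock = threading.Lock()  # guards in_flight and responses

    def compute_key(self, request):
        # TODO: Create unique key for request
        # sort_keys=True keeps the key stable when equivalent requests
        # serialize their fields in a different order.
        return hashlib.md5(json.dumps(request, sort_keys=True).encode()).hexdigest()

    def handle_request(self, request):
        key = self.compute_key(request)

        with self.lock:
            if key in self.in_flight:
                # TODO: Wait for existing request. Fetch the Event here, but
                # call wait() only after the lock is released so that other
                # requests are not blocked.
                pass
            else:
                # TODO: Start new request, others will wait
                pass

        # TODO: Return response


if __name__ == "__main__":
    pass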