ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/loadbalancers/tasks/task-20-2-5-rate-limiting
TASK

Implementation

Rate limiting protects backend services from being overwhelmed by too many requests. It prevents abuse, ensures fair usage, and maintains service availability during traffic spikes.

Token bucket algorithm:

bucket: {
    tokens: number,           // Current tokens (max: capacity)
    capacity: number,         // Max tokens (burst allowance)
    refillRate: number,       // Tokens added per second
    lastRefill: timestamp     // Last refill time
}

function allowRequest(): boolean {
    now = currentTime();
    elapsed = now - bucket.lastRefill;
    bucket.tokens += elapsed * bucket.refillRate;
    bucket.tokens = min(bucket.tokens, bucket.capacity);
    bucket.lastRefill = now;

    if (bucket.tokens >= 1) {
        bucket.tokens -= 1;
        return true;  // Allow request
    }
    return false;  // Rate limited
}
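The pseudocode above maps directly to Python. This is a minimal sketch; the injectable `clock` parameter is an addition (not in the original pseudocode) so the refill logic can be tested without real waiting:

```python
import time

class TokenBucket:
    """Token bucket: refills continuously, spends one token per request."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # max tokens (burst allowance)
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start full
        self.clock = clock              # injectable clock (assumption, for testing)
        self.last_refill = clock()

    def allow_request(self):
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # allow request
        return False      # rate limited
```

Because refill is computed lazily from the elapsed time, the bucket needs no background timer: each call to `allow_request` catches the bucket up before deciding.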

Rate limit configuration:

{
  "rate_limits": {
    "per_ip": {
      "requests_per_second": 10,
      "burst": 20
    },
    "per_api_key": {
      "free_tier": {"requests_per_second": 1, "burst": 5},
      "paid_tier": {"requests_per_second": 100, "burst": 200}
    }
  }
}
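Given that configuration, the proxy has to decide which limit applies to each request. A sketch of that lookup, assuming per-API-key limits take precedence over per-IP limits and that the tier is inferred from the key's prefix (the prefix convention is a simplification based on the example key names, not something the task specifies):

```python
def limits_for(headers, config):
    """Return (requests_per_second, burst) for a request.

    Per-API-key limits take precedence; otherwise fall back to per-IP.
    Tier detection by key prefix is an assumption for this sketch.
    """
    api_key = headers.get("X-API-Key")
    if api_key:
        tiers = config["rate_limits"]["per_api_key"]
        tier = "paid_tier" if api_key.startswith("key_paid") else "free_tier"
        limit = tiers[tier]
    else:
        limit = config["rate_limits"]["per_ip"]
    return limit["requests_per_second"], limit["burst"]
```

In a real deployment the key-to-tier mapping would come from a database or key registry rather than the key string itself.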

Example rate limiting:

// First 10 requests succeed:
Request:  {"type": "http_request", "msg_id": 1, "method": "GET", "path": "/api/users", "client_ip": "1.2.3.4"}
Response: {"type": "http_response", "in_reply_to": 1, "status": 200, "headers": {"X-RateLimit-Remaining": 9, "X-RateLimit-Limit": 10, "X-RateLimit-Reset": 1680123460}}

// 11th request is rate limited:
Request:  {"type": "http_request", "msg_id": 11, "method": "GET", "path": "/api/users", "client_ip": "1.2.3.4"}
Response: {"type": "http_response", "in_reply_to": 11, "status": 429, "error": "Rate limit exceeded", "headers": {"X-RateLimit-Remaining": 0, "Retry-After": 1}}
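The `Retry-After: 1` value in the 429 response follows from the bucket state: with 0 tokens and a refill rate of 10 tokens/second, the next token arrives in 0.1 s, which rounds up to 1 second. A small helper for that computation (the function name is illustrative):

```python
import math

def retry_after_seconds(tokens, refill_rate):
    """Seconds until the bucket holds at least one token, rounded up
    so the advertised wait is never shorter than the actual wait."""
    if tokens >= 1:
        return 0
    return math.ceil((1 - tokens) / refill_rate)
```

Rounding up matters: advertising a too-short wait would invite clients to retry while the bucket is still empty.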

Per-API-key rate limiting:

Request:  {"type": "http_request", "msg_id": 1, "method": "GET", "path": "/api/users", "headers": {"X-API-Key": "key_free_tier"}, "client_ip": "1.2.3.4"}
Response: {"type": "http_response", "in_reply_to": 1, "status": 200, "headers": {"X-RateLimit-Remaining": 0, "X-RateLimit-Limit": 1}}
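To enforce both per-IP and per-API-key limits, the proxy keeps one bucket per client identity. A sketch, assuming the API key (when present) takes precedence over the IP as the identity; the bucket class repeats the algorithm from the pseudocode above, and all names here are illustrative:

```python
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity, self.refill_rate = capacity, refill_rate
        self.tokens, self.clock = float(capacity), clock
        self.last = clock()

    def allow_request(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class RateLimiter:
    """One bucket per client identity: the API key if present, else the IP."""

    def __init__(self, rps, burst):
        # Lazily create a fresh bucket the first time an identity is seen.
        self.buckets = defaultdict(lambda: TokenBucket(burst, rps))

    def allow(self, client_ip, api_key=None):
        identity = api_key or client_ip
        return self.buckets[identity].allow_request()
```

Keying by identity means one abusive client exhausts only its own bucket; other IPs and API keys are unaffected.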

Sample Test Cases

Enforce per-IP rate limit
Timeout: 5000ms
Input
{"src":"client","dest":"l7_proxy","body":{"type":"init","msg_id":1,"rate_limits":{"per_ip":{"requests_per_second":10,"burst":20}}}}
{"src":"client","dest":"l7_proxy","body":{"type":"http_request","msg_id":2,"method":"GET","path":"/api/data","client_ip":"1.2.3.4"},"send_times":11}
Expected Output
{"src": "l7_proxy", "dest": "client", "body": {"type": "init_ok", "in_reply_to": 1}}
Rate limit headers included
Timeout: 5000ms
Input
{
  "src": "client",
  "dest": "l7_proxy",
  "body": {
    "type": "http_request",
    "msg_id": 1,
    "method": "GET",
    "path": "/api/data",
    "client_ip": "1.2.3.4"
  }
}
Expected Output
{"src": "l7_proxy", "dest": "client", "body": {"type": "http_response", "in_reply_to": 1, "status": 200, "headers": {"X-RateLimit-Remaining": .*, "X-RateLimit-Limit": 10}}}

Hints

Hint 1
Use token bucket algorithm: refill tokens at a fixed rate, consume tokens per request
Hint 2
Track rate limits per IP address and per API key
Hint 3
Return 429 Too Many Requests when bucket is empty
Hint 4
Include rate limit headers: X-RateLimit-Remaining, X-RateLimit-Reset
Hint 5
Burst allowance: allow short bursts above sustained rate
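Hints 3 and 4 together describe the shape of the response body. A sketch of assembling it, using the field names from the examples above; the helper name and the approximate `X-RateLimit-Reset` calculation are assumptions of this sketch:

```python
import time

def build_response(msg_id, allowed, remaining, limit, refill_rate):
    """Build the response body: 200 with rate-limit headers on success,
    429 plus Retry-After when the bucket is empty."""
    headers = {
        "X-RateLimit-Remaining": remaining,
        "X-RateLimit-Limit": limit,
        # Approximate: next token arrives within ~1 refill interval.
        "X-RateLimit-Reset": int(time.time()) + 1,
    }
    if allowed:
        return {"type": "http_response", "in_reply_to": msg_id,
                "status": 200, "headers": headers}
    headers["Retry-After"] = max(1, int(1 / refill_rate))
    return {"type": "http_response", "in_reply_to": msg_id, "status": 429,
            "error": "Rate limit exceeded", "headers": headers}
```

Including the headers on successful responses too (not just 429s) lets well-behaved clients pace themselves before they ever hit the limit.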
OVERVIEW

Theoretical Hub

Concept overview coming soon

Key Concepts

rate limiting, token bucket, per-IP limits, per-API-key limits, DDoS protection