ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/queues/tasks/task-15-5-dlq
TASK

Implementation

Implement a dead letter queue (DLQ) for failed messages:

  1. Track retry count for each message
  2. On processing failure, increment retry count
  3. After N failures, move to dead letter queue
  4. Preserve: original message, error details, timestamps
  5. Provide interface to inspect and replay DLQ messages

DLQs prevent poison messages from blocking the queue.
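
Steps 1 to 4 above can be sketched with an in-memory queue. This is a minimal illustration, not the reference solution: the names `Envelope`, `QueueWithDLQ`, and `process_one`, the value `MAX_RETRIES = 3` for N, and the choice to requeue a failed message at the back of the queue are all assumptions made for the example.

```python
import time
from collections import deque
from dataclasses import dataclass, field

MAX_RETRIES = 3  # assumed value for N; the task does not fix it


@dataclass
class Envelope:
    """Wraps the original message with delivery bookkeeping (hypothetical)."""
    payload: dict
    retries: int = 0
    errors: list = field(default_factory=list)  # one record per failed attempt
    enqueued_at: float = field(default_factory=time.time)


class QueueWithDLQ:
    def __init__(self, max_retries=MAX_RETRIES):
        self.max_retries = max_retries
        self.main = deque()
        self.dlq = []

    def publish(self, payload):
        self.main.append(Envelope(payload))

    def process_one(self, handler):
        """Pop one message and run handler on it; return True on success."""
        if not self.main:
            return False
        env = self.main.popleft()
        try:
            handler(env.payload)
            return True
        except Exception as exc:
            env.retries += 1
            env.errors.append({"error": repr(exc), "failed_at": time.time()})
            if env.retries >= self.max_retries:
                self.dlq.append(env)   # give up: park it for inspection
            else:
                self.main.append(env)  # requeue at the back for another attempt
            return False
```

Because the envelope keeps the untouched payload, every error, and the timestamps together, a DLQ entry carries everything step 4 asks to preserve.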

Sample Test Cases

Move to DLQ after retries
Timeout: 5000ms
Input
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
Replay from DLQ
Timeout: 5000ms
Input
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": [
      "n1"
    ]
  }
}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}

Hints

Hint 1
Track retry count per message
Hint 2
Move to DLQ after max retries
Hint 3
Preserve error information
OVERVIEW

Theoretical Hub

Dead Letter Queues

Some messages may never succeed because of an invalid format, missing data, or a bug in the processing code. Instead of retrying them forever or dropping them, move them to a DLQ for investigation. Operators can then fix the underlying issue and replay the messages.
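
The inspect-and-replay side might look like the sketch below. The class `DeadLetterQueue`, its methods, and the optional `fix` callback are hypothetical names chosen for illustration; the task only requires that some interface for inspecting and replaying exists.

```python
from collections import deque


class DeadLetterQueue:
    """Holds failed messages keyed by message id (hypothetical interface)."""

    def __init__(self):
        self._entries = {}

    def add(self, msg_id, payload, error):
        self._entries[msg_id] = {"payload": payload, "error": error}

    def inspect(self):
        # Shallow copies, so callers cannot add or drop fields on stored entries.
        return {mid: dict(entry) for mid, entry in self._entries.items()}

    def replay(self, msg_id, target, fix=None):
        """Remove one entry and append its payload to target.

        fix is an optional function that repairs the payload first.
        Raises KeyError if msg_id is not in the DLQ.
        """
        entry = self._entries.pop(msg_id)
        payload = fix(entry["payload"]) if fix else entry["payload"]
        target.append(payload)
        return payload
```

An operator would call `inspect()` to see what failed and why, then `replay(msg_id, main_queue)` once the cause is fixed. Replay removes the entry from the DLQ, so a message that fails again re-enters through the normal retry path instead of being stored twice.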

Poison Messages

A poison message is one that consistently fails processing. Without a DLQ, it blocks the queue (if delivery is ordered) or wastes resources (if it is retried forever). A DLQ isolates the poison message so that healthy messages keep flowing.
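
The blocking effect on an ordered queue can be shown with a small simulation. The function `drain_in_order` and its parameters are invented for this example; `attempt_budget` only exists to stop the no-DLQ case from looping forever.

```python
def drain_in_order(messages, handler, max_retries=None, attempt_budget=20):
    """Process messages strictly in order; return (done, dead, pending).

    max_retries=None models a queue without a DLQ: the head message is
    retried until the attempt budget runs out.
    """
    done, dead = [], []
    pending = list(messages)
    failures = 0  # consecutive failures of the current head message
    for _ in range(attempt_budget):
        if not pending:
            break
        head = pending[0]
        try:
            handler(head)
        except Exception:
            failures += 1
            if max_retries is None or failures < max_retries:
                continue       # retry the same head on the next attempt
            dead.append(head)  # retries exhausted: isolate the poison
        else:
            done.append(head)
        pending.pop(0)
        failures = 0
    return done, dead, pending


def handler(msg):
    if msg == "poison":
        raise ValueError("cannot process")


blocked = drain_in_order(["a", "poison", "b"], handler)
isolated = drain_in_order(["a", "poison", "b"], handler, max_retries=3)
```

Without a retry limit, `blocked` ends with `"b"` still pending behind the poison message; with `max_retries=3`, `isolated` shows `"b"` processed and the poison message set aside.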

Key Concepts

DLQ, poison message, error handling
Add Dead Letter Queues - Queues | Build Distributed Systems