ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/sharder/tasks/task-18-3-1-scatter-gather
TASK

Implementation

Scatter-gather is a fundamental distributed query execution pattern. The coordinator "scatters" a query to all shards, each shard processes its local data, and the coordinator "gathers" partial results into a final response.

Query execution flow:

  1. Client sends a query to the coordinator
  2. Coordinator forwards the query to all known shards
  3. Each shard executes the query on its local data
  4. Each shard returns partial results to the coordinator
  5. Coordinator merges all partial results into a complete response
  6. Coordinator returns the merged response to the client

Handling partial failures:

  • Set a timeout for each shard response (e.g., 1000ms)
  • If a shard times out, exclude its results but continue with other shards
  • Track which shards responded successfully
  • Return a "shards_responded" count so the client knows if results are complete

Example query:

Request:  {"type": "scatter_query", "msg_id": 1, "query": "SELECT * FROM users WHERE age > 25"}
Response: {"type": "scatter_query_ok", "in_reply_to": 1, "results": [...], "shards_total": 3, "shards_responded": 3}

If shard 2 is down:

Response: {"type": "scatter_query_ok", "in_reply_to": 1, "results": [...], "shards_total": 3, "shards_responded": 2}

Sample Test Cases

All shards respond successfullyTimeout: 5000ms
Input
{"src":"c0","dest":"coord","body":{"type":"init","msg_id":1,"shards":["s1","s2","s3"]}}
{"src":"c1","dest":"coord","body":{"type":"scatter_query","msg_id":2,"query":"SELECT * FROM users"}}
Expected Output
{"src": "coord", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
One shard times outTimeout: 5000ms
Input
{"src":"c0","dest":"coord","body":{"type":"init","msg_id":1,"shards":["s1","s2","s3"]}}
{"src":"c1","dest":"coord","body":{"type":"scatter_query","msg_id":2,"query":"SELECT * FROM users","timeout_ms":500}}
Expected Output
{"src": "coord", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}

Hints

Hint 1
The coordinator sends the query to all shards in parallel
Hint 2
Each shard executes the query locally and returns partial results
Hint 3
The coordinator merges partial results into a final response
Hint 4
Use timeouts: if a shard doesn't respond within T ms, proceed without it
Hint 5
Track which shards responded: include a "shards_responded" field in the response
OVERVIEW

Theoretical Hub

Concept overview coming soon

Key Concepts

scatter-gatherquery coordinatorpartial resultstimeout handlingfault tolerance
main.py
python
Implement Scatter-Gather Query Execution - The Sharder | Build Distributed Systems