ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/advanced/tasks/task-10-1-mapreduce
TASK

Implementation

Implement MapReduce in three phases: Map emits (key, value) pairs, Shuffle groups the emitted values by key, and Reduce aggregates the values for each key. Build word count as the example job.

Sample Test Cases

Map emits key-value pairs
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1"]}}
{"src":"c1","dest":"n1","body":{"type":"mapreduce_map","msg_id":2,"data":["hello world","hello"],"mapper":"word_count"}}
Expected Output
{"src":"n1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
{"src":"n1","dest":"c1","body":{"type":"mapreduce_map_ok","in_reply_to":2,"msg_id":1,"mapped":[["hello",1],["world",1],["hello",1]]}}
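One way to satisfy the sample above is sketched here. It assumes the node receives one JSON message per line on stdin and writes one JSON reply per line to stdout, which the test case suggests but does not state. The names `Node`, `reply`, `handle`, `run`, and `map_word_count` are illustrative, not part of the task. The reply counter starts at 0 because the expected `init_ok` carries `"msg_id":0`.

```python
import json
import sys


def map_word_count(lines):
    """Emit a [word, 1] pair for every whitespace-separated word, in input order."""
    return [[word, 1] for line in lines for word in line.split()]


class Node:
    def __init__(self):
        self.node_id = None
        self.next_msg_id = 0  # assumption: replies are numbered from 0, as in the sample

    def reply(self, request, reply_type, **extra):
        """Build a reply envelope addressed back to the sender of `request`."""
        body = {
            "type": reply_type,
            "in_reply_to": request["body"]["msg_id"],
            "msg_id": self.next_msg_id,
            **extra,
        }
        self.next_msg_id += 1
        return {"src": self.node_id, "dest": request["src"], "body": body}

    def handle(self, request):
        body = request["body"]
        if body["type"] == "init":
            self.node_id = body["node_id"]
            return self.reply(request, "init_ok")
        if body["type"] == "mapreduce_map" and body.get("mapper") == "word_count":
            return self.reply(request, "mapreduce_map_ok",
                              mapped=map_word_count(body["data"]))
        # The task does not show an error reply format, so this sketch just raises.
        raise ValueError(f"unsupported message: {body!r}")


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON message per line and write one JSON reply per line."""
    node = Node()
    for line in stdin:
        if line.strip():
            stdout.write(json.dumps(node.handle(json.loads(line))) + "\n")
            stdout.flush()
```

Ending the file with a call to `run()` would wire the handler to stdin and stdout. Splitting on whitespace is a guess at what `word_count` means; the sample contains no punctuation or mixed case to settle it.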

Hints

Hint 1
Map phase: emit key-value pairs
Hint 2
Shuffle: group by key
Hint 3
Reduce phase: aggregate values
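Hints 2 and 3 can be sketched as two small functions. The example starts from the `mapped` list in the sample's expected output; `shuffle` and `reduce_word_count` are illustrative names, and the task does not show message types for these two phases, so only the in-memory logic appears here.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group values by key, keeping keys in first-seen order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_word_count(groups):
    """Aggregate each key's values; for word count that means summing the 1s."""
    return {word: sum(counts) for word, counts in groups.items()}


# Output of the Map phase for ["hello world", "hello"], taken from the sample test.
mapped = [["hello", 1], ["world", 1], ["hello", 1]]

groups = shuffle(mapped)
counts = reduce_word_count(groups)

print(groups)  # {'hello': [1, 1], 'world': [1]}
print(counts)  # {'hello': 2, 'world': 1}
```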
OVERVIEW

Theoretical Hub

MapReduce

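MapReduce splits a batch job into a map phase and a reduce phase, each of which can run in parallel. Map transforms every input record into (key, value) pairs, and Reduce aggregates the values for each key. Shuffle moves data between the two phases, routing all values that share a key to the same reducer.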
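To illustrate where the parallelism comes from, here is a minimal single-process sketch: input lines are split into chunks that are mapped independently, keys are partitioned across reducers with a CRC32 hash so that every value for a key lands in the same partition, and each partition is reduced independently. The worker counts, the choice of `zlib.crc32` for partitioning, and all function names are assumptions made for the example. Threads are used only to show the structure; a real deployment would run these tasks on separate processes or nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib


def map_chunk(lines):
    """Map task: emit (word, 1) for every word in this chunk of lines."""
    return [(word, 1) for line in lines for word in line.split()]


def partition_for(key, num_reducers):
    """Pick a reducer for a key; the same key always maps to the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(mapped_chunks, num_reducers):
    """Route every (key, value) pair to its reducer's partition, grouped by key."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for chunk in mapped_chunks:
        for key, value in chunk:
            partitions[partition_for(key, num_reducers)][key].append(value)
    return partitions


def reduce_partition(groups):
    """Reduce task: sum the values of every key in one partition."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines, num_mappers=2, num_reducers=2):
    # Round-robin split of the input into one chunk per map task.
    chunks = [lines[i::num_mappers] for i in range(num_mappers)]
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_chunk, chunks))
        reduced = list(pool.map(reduce_partition, shuffle(mapped, num_reducers)))
    # Partitions hold disjoint key sets, so merging never overwrites a count.
    result = {}
    for part in reduced:
        result.update(part)
    return result


print(word_count(["hello world", "hello", "world of maps"]))
```

Because each key belongs to exactly one partition, the reducers need no coordination with each other, which is what makes the reduce phase parallelizable.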

Key Concepts

MapReduce, batch processing, word count