TASK
Implementation
When a client writes data, the primary chunk server coordinates replication to all secondaries. GFS uses a pipeline design where data flows in a chain to maximize network throughput.
Write replication flow:
- Client sends data to the closest chunk server (not necessarily the primary)
- That server forwards the data to the next closest server in the chain
- Data flows as a pipeline: server A -> server B -> server C
- Once all servers have the data cached, the client sends a write request to the primary
- The primary assigns a serial number to the write (for ordering)
- The primary applies the write locally, then forwards the write request, tagged with its serial number, to the secondaries
- Secondaries apply the write in the same order
- All servers acknowledge -> primary replies to client
This separates data flow (pipeline for throughput) from control flow (primary for ordering).
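The split between the two flows can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `ChunkServer`, `push`, `apply`, and `commit` are hypothetical names, each server holds a single chunk, and a returned `True` stands in for a network acknowledgement.

```python
class ChunkServer:
    """Hypothetical in-memory stand-in for a chunk server (not a real GFS API)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}           # data_id -> bytes pushed but not yet applied
        self.chunk = bytearray()  # contents of the single chunk this sketch models
        self.applied = []         # serial numbers, in the order they were applied
        self.next_serial = 1      # only meaningful on the primary

    def push(self, data_id, data, rest_of_chain):
        """Data flow: cache the bytes, then forward them to the next server."""
        self.cache[data_id] = data
        if rest_of_chain:
            rest_of_chain[0].push(data_id, data, rest_of_chain[1:])

    def apply(self, data_id, offset, serial):
        """Apply cached bytes at `offset`; returning True stands in for an ack."""
        data = self.cache.pop(data_id)
        end = offset + len(data)
        if len(self.chunk) < end:
            # Pad with zero bytes so the write can land at the requested offset.
            self.chunk.extend(b"\x00" * (end - len(self.chunk)))
        self.chunk[offset:end] = data
        self.applied.append(serial)
        return True

    def commit(self, data_id, offset, secondaries):
        """Control flow, primary only: choose the serial number and fan it out."""
        serial = self.next_serial
        self.next_serial += 1
        acks = [self.apply(data_id, offset, serial)]
        acks += [s.apply(data_id, offset, serial) for s in secondaries]
        return serial, sum(acks)


# Server B is assumed to be closest to the client, so the data chain is
# B -> A -> C, while A is the primary that decides the ordering.
a, b, c = ChunkServer("A"), ChunkServer("B"), ChunkServer("C")
b.push("d1", b"hello", [a, c])
serial, acked = a.commit("d1", 0, [b, c])
```

Note that the order of the data chain (B, A, C) is independent of who the primary is (A): the chain is chosen for network distance, the primary for ordering.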
Request: {"type": "chunk_write", "msg_id": 1, "chunk_handle": "ch_001", "offset": 0, "data": "hello world", "primary": "cs1", "secondaries": ["cs2", "cs3"]}
Response: {"type": "chunk_write_ok", "in_reply_to": 1, "bytes_written": 11, "replicas_acked": 3, "serial_number": 1}
Sample Test Cases
Write replicates to all servers
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1","n2","n3"]}}
{"src":"c1","dest":"n1","body":{"type":"chunk_write","msg_id":2,"chunk_handle":"ch_001","offset":0,"data":"hello","primary":"n1","secondaries":["n2","n3"]}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
Sequential writes get increasing serial numbers
Timeout: 5000ms
Input
{"src":"c0","dest":"n1","body":{"type":"init","msg_id":1,"node_id":"n1","node_ids":["n1","n2","n3"]}}
{"src":"c1","dest":"n1","body":{"type":"chunk_write","msg_id":2,"chunk_handle":"ch_001","offset":0,"data":"a","primary":"n1","secondaries":["n2","n3"]}}
{"src":"c1","dest":"n1","body":{"type":"chunk_write","msg_id":3,"chunk_handle":"ch_001","offset":1,"data":"b","primary":"n1","secondaries":["n2","n3"]}}
Expected Output
{"src": "n1", "dest": "c0", "body": {"type": "init_ok", "in_reply_to": 1, "msg_id": 0}}
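One way a node might answer the sample inputs above is sketched below. Several things here are assumptions rather than part of the task statement: the `Node` class and `run` helper are hypothetical, serial numbers are counted per chunk handle starting at 1, `bytes_written` is the UTF-8 length of `data`, and `replicas_acked` is computed as the primary plus every listed secondary without sending any messages to them (that is, every secondary is assumed to acknowledge).

```python
import json


class Node:
    """Hypothetical primary that answers init and chunk_write messages."""

    def __init__(self):
        self.node_id = None
        self.next_msg_id = 0  # the sample init_ok reply uses msg_id 0
        self.serials = {}     # chunk_handle -> last serial number assigned
        self.chunks = {}      # chunk_handle -> bytearray with the chunk contents

    def reply(self, request, reply_type, **fields):
        body = {"type": reply_type, "in_reply_to": request["body"]["msg_id"],
                **fields, "msg_id": self.next_msg_id}
        self.next_msg_id += 1
        return {"src": self.node_id, "dest": request["src"], "body": body}

    def handle(self, request):
        body = request["body"]
        if body["type"] == "init":
            self.node_id = body["node_id"]
            return self.reply(request, "init_ok")
        if body["type"] == "chunk_write":
            handle = body["chunk_handle"]
            # Assumption: serial numbers are per chunk and start at 1.
            serial = self.serials.get(handle, 0) + 1
            self.serials[handle] = serial
            data = body["data"].encode("utf-8")
            chunk = self.chunks.setdefault(handle, bytearray())
            end = body["offset"] + len(data)
            chunk.extend(b"\x00" * max(0, end - len(chunk)))
            chunk[body["offset"]:end] = data
            # Assumption: every secondary acknowledges; nothing is sent to them here.
            acked = 1 + len(body["secondaries"])
            return self.reply(request, "chunk_write_ok", bytes_written=len(data),
                              replicas_acked=acked, serial_number=serial)
        return None  # unknown message types are ignored in this sketch


def run(lines, node=None):
    """Yield one JSON reply line per handled input line."""
    node = node or Node()
    for line in lines:
        if line.strip():
            response = node.handle(json.loads(line))
            if response is not None:
                yield json.dumps(response)
```

To use it as a program, iterate `run(sys.stdin)` and print each line with `flush=True`. A fuller solution would forward the write to `n2` and `n3` and count their actual acknowledgements instead of assuming them.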
Hints
Hint 1
The primary receives the write and forwards it to the secondaries in a pipeline
Hint 2
Pipeline: client -> primary -> secondary1 -> secondary2 (data flows in a chain)
Hint 3
All three must acknowledge before the write is considered successful
Hint 4
If any replica fails, the write fails and the client retries
Hint 5
GFS separates data flow (pipeline) from control flow (primary commits order)
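The all-or-retry rule from Hints 3 and 4 could look roughly like this on the client side. This is a minimal sketch: `write_with_retry`, `WriteFailed`, and the `attempt_write` callback (which returns how many replicas acknowledged) are hypothetical names, and the limit of three attempts is an arbitrary choice.

```python
class WriteFailed(Exception):
    """Raised when no attempt was acknowledged by every replica."""


def write_with_retry(attempt_write, replica_count, max_attempts=3):
    """Call attempt_write() until all replicas acknowledge.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        acked = attempt_write()
        if acked == replica_count:
            return attempt
        # A partial acknowledgement counts as a failed write, so try again.
    raise WriteFailed(f"write not fully acknowledged after {max_attempts} attempts")


# Simulated outcomes: the first attempt reaches 2 of 3 replicas, the second all 3.
outcomes = iter([2, 3])
attempts_used = write_with_retry(lambda: next(outcomes), replica_count=3)
```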
OVERVIEW
Key Concepts
chunk replication, pipeline writes, primary-secondary, write acknowledgement, data flow
main.py
python
#!/usr/bin/env python3
import sys
import json


def main():
    # Your implementation here
    # Starter behaviour: echo each incoming JSON message back unchanged.
    for line in sys.stdin:
        msg = json.loads(line)
        print(json.dumps(msg), flush=True)


if __name__ == "__main__":
    main()