ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/sharder/tasks/task-8-4-data-migration
TASK

Implementation

Implement data migration between replica groups:

  1. Source group: stop accepting writes for migrating shard
  2. Create snapshot of shard data + client sessions
  3. Send to destination group
  4. Destination: install snapshot, start serving shard
  5. Source: delete shard data after confirmation

Handle failures: retry, idempotency, rollback.
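
The five steps and the failure handling above can be sketched as follows. This is a minimal in-memory illustration, not the reference solution: the `ReplicaGroup` class, the `migrate` function, the `migration_id` parameter, and the `send` callable are hypothetical names introduced here, and a real replica group would drive each step through its replicated log.

```python
import copy


class ReplicaGroup:
    """Hypothetical in-memory stand-in for one replica group."""

    def __init__(self, gid):
        self.gid = gid
        self.shards = {}        # shard id -> {key: value}
        self.sessions = {}      # shard id -> {client id: last applied seq}
        self.frozen = set()     # shards that currently reject writes
        self.installed = set()  # migration ids already installed

    def put(self, shard, key, value):
        # Writes are refused for shards we do not own or that are migrating.
        if shard not in self.shards or shard in self.frozen:
            return False
        self.shards[shard][key] = value
        return True

    def prepare(self, shard):
        # Steps 1-2: stop accepting writes, then snapshot data and sessions.
        self.frozen.add(shard)
        return {"data": copy.deepcopy(self.shards[shard]),
                "sessions": copy.deepcopy(self.sessions.get(shard, {}))}

    def install(self, migration_id, shard, snapshot):
        # Step 4: idempotent, so a re-delivered snapshot is a no-op.
        if migration_id not in self.installed:
            self.shards[shard] = copy.deepcopy(snapshot["data"])
            self.sessions[shard] = copy.deepcopy(snapshot["sessions"])
            self.installed.add(migration_id)
        return True

    def release(self, shard):
        # Step 5: delete the shard once the destination has confirmed.
        self.shards.pop(shard, None)
        self.sessions.pop(shard, None)
        self.frozen.discard(shard)

    def abort(self, shard):
        # Rollback: keep the data and resume accepting writes.
        self.frozen.discard(shard)


def migrate(src, dst, shard, migration_id, send, max_attempts=3):
    """Run steps 1-5; `send` models step 3 and may raise ConnectionError."""
    snapshot = src.prepare(shard)
    for _ in range(max_attempts):
        try:
            if send(dst, migration_id, shard, snapshot):
                src.release(shard)
                return True
        except ConnectionError:
            continue  # retry; install() is idempotent, so this is safe
    src.abort(shard)
    return False
```

A direct call would look like `migrate(g1, g2, 3, "m1", lambda d, m, s, snap: d.install(m, s, snap))`. Note one gap this sketch leaves open: if the destination installs the snapshot but every confirmation is lost, the rollback leaves both groups holding the shard, so a real implementation needs some way to resolve ownership before the source resumes writes.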

Sample Test Cases

Prepare shard for migration
Timeout: 5000ms
Input
{"src":"c0","dest":"g1","body":{"type":"init","msg_id":1,"node_id":"g1","node_ids":["g1","g2"]}}
{"src":"c0","dest":"g1","body":{"type":"seed_shard","msg_id":2,"shard":3,"data":{"x":1,"y":2}}}
{"src":"c0","dest":"g1","body":{"type":"prepare_migration","msg_id":3,"shard":3,"target_gid":"g2"}}
Expected Output
{"src":"g1","dest":"c0","body":{"type":"init_ok","in_reply_to":1,"msg_id":0}}
{"src":"g1","dest":"c0","body":{"type":"seed_shard_ok","in_reply_to":2,"msg_id":1}}
{"src":"g1","dest":"c0","body":{"type":"prepare_migration_ok","in_reply_to":3,"msg_id":2,"shard":3,"target_gid":"g2","snapshot":{"data":{"x":1,"y":2}}}}
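
One way to produce the expected output above is a small message handler like the sketch below. It covers only the three message types in this sample; the `Node` class and the `run` helper are hypothetical names, and the snapshot carries only `data` because that is all the sample output shows (client sessions, per the hints, would presumably travel alongside it).

```python
import json


class Node:
    """Hypothetical handler for the three message types in the sample."""

    def __init__(self):
        self.node_id = None
        self.next_msg_id = 0
        self.shards = {}     # shard id -> {key: value}
        self.migrating = {}  # shard id -> target group id

    def reply(self, req, msg_type, **extra):
        body = {"type": msg_type,
                "in_reply_to": req["body"]["msg_id"],
                "msg_id": self.next_msg_id}
        body.update(extra)
        self.next_msg_id += 1
        return {"src": self.node_id, "dest": req["src"], "body": body}

    def handle(self, req):
        body = req["body"]
        kind = body["type"]
        if kind == "init":
            self.node_id = body["node_id"]
            return self.reply(req, "init_ok")
        if kind == "seed_shard":
            self.shards[body["shard"]] = dict(body["data"])
            return self.reply(req, "seed_shard_ok")
        if kind == "prepare_migration":
            shard = body["shard"]
            # Remember the shard is leaving so later writes can be refused.
            self.migrating[shard] = body["target_gid"]
            snapshot = {"data": dict(self.shards.get(shard, {}))}
            return self.reply(req, "prepare_migration_ok", shard=shard,
                              target_gid=body["target_gid"],
                              snapshot=snapshot)
        return None  # other message types are outside this sketch


def run(lines):
    """Feed JSON lines to a fresh node and yield one JSON reply per request."""
    node = Node()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        response = node.handle(json.loads(line))
        if response is not None:
            yield json.dumps(response, separators=(",", ":"))
```

Passing `sys.stdin` to `run` and printing each yielded line would turn this into a line-oriented program, assuming the test harness talks to `main.py` over standard input and output.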

Hints

Hint 1
Stop serving shard during migration
Hint 2
Transfer all key-value pairs
Hint 3
Include client session state
OVERVIEW

Theoretical Hub

Data Migration

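Moving a shard between replica groups means moving its data. The transfer must be atomic per shard (the destination installs the whole shard or none of it) and consistent. While a migration is in progress, the shard is either unavailable or still served by the source group; in the latter case stale reads are acceptable until the transfer completes.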
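
The availability rule can be illustrated with a per-shard phase on the source group. This is only a sketch under assumptions: the `Phase` enum, the `Shard` class, and the `ShardUnavailable` exception are hypothetical names, and it picks the "served by source" option, where reads continue during migration and writes are refused.

```python
from enum import Enum


class Phase(Enum):
    SERVING = "serving"      # normal operation: reads and writes allowed
    MIGRATING = "migrating"  # writes refused, reads answered from frozen data
    GONE = "gone"            # shard handed over; nothing is served here


class ShardUnavailable(Exception):
    """Raised when this group cannot serve the request for the shard."""


class Shard:
    def __init__(self, data):
        self.phase = Phase.SERVING
        self.data = dict(data)

    def begin_migration(self):
        self.phase = Phase.MIGRATING

    def finish_migration(self):
        # Drop the data only after the destination has taken over.
        self.phase = Phase.GONE
        self.data = {}

    def read(self, key):
        if self.phase is Phase.GONE:
            raise ShardUnavailable("shard has moved to another group")
        return self.data.get(key)

    def write(self, key, value):
        if self.phase is not Phase.SERVING:
            raise ShardUnavailable("shard is not accepting writes")
        self.data[key] = value
```

Refusing reads in the `MIGRATING` phase as well would give the stricter "unavailable during migration" behavior.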

Client Session Transfer

Don't forget client deduplication state. If the session table isn't migrated, the destination group cannot recognize a retried request and may execute it a second time. Transfer the client session table together with the shard data.
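
A small sketch shows why the session table matters. It assumes each client tags requests with a sequence number and that the store remembers the last applied sequence number and its result per client; the `ShardStore` class and its `append` operation are hypothetical and chosen because a non-idempotent operation makes duplicate execution visible.

```python
class ShardStore:
    """Hypothetical shard store with per-client deduplication."""

    def __init__(self, data=None, sessions=None):
        self.data = dict(data or {})
        self.sessions = dict(sessions or {})  # client id -> (seq, result)

    def append(self, client_id, seq, key, suffix):
        last = self.sessions.get(client_id)
        if last is not None and seq <= last[0]:
            return last[1]  # duplicate request: reply from the session table
        self.data[key] = self.data.get(key, "") + suffix
        result = self.data[key]
        self.sessions[client_id] = (seq, result)
        return result

    def snapshot(self, include_sessions=True):
        sessions = dict(self.sessions) if include_sessions else {}
        return {"data": dict(self.data), "sessions": sessions}


source = ShardStore()
source.append("c1", 1, "k", "a")  # applied on the source before migration

# Migration that carries the session table: the retry is recognized.
good = ShardStore(**source.snapshot())
good.append("c1", 1, "k", "a")

# Migration that drops the session table: the retry executes again.
bad = ShardStore(**source.snapshot(include_sessions=False))
bad.append("c1", 1, "k", "a")
```

After the retried request, `good.data["k"]` is still `"a"` while `bad.data["k"]` has become `"aa"`, which is the duplicate execution the text warns about.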

Key Concepts

migration, data transfer, consistency
Implement Data Migration - The Sharder | Build Distributed Systems