How Distributed Nodes Actually Talk to Each Other
Before consensus algorithms and distributed transactions, there is a simpler question: how do two programs on different machines pass a message? The answer shapes everything else.
Most distributed systems courses start with CAP theorem or Paxos. That is the wrong place to start. Those are answers to hard problems. Before you can appreciate why those problems are hard, you need to understand the one thing that makes distributed systems different from everything else you have built: nodes cannot share memory.
That sentence sounds obvious, but engineers underestimate it constantly. In a single process, two threads can read and write the same variable. They share state. In a distributed system, there is no shared variable. There is only the network, and the network is a liar.
The Network Is Not a Pipe
When engineers think about network communication for the first time, they tend to imagine it like a pipe. You put a message in one end, it comes out the other. The bytes get there.
This model is wrong in at least three ways.
Messages can be lost. A UDP packet that never arrives produces no error. A TCP connection can be severed mid-transmission. A router can drop packets under load. Your program sends a message into the void and hears nothing back. Did the other node receive it? You do not know.
Messages can be delayed. A message you sent 200 milliseconds ago might arrive right now, or in three seconds, or during a garbage collection pause on the receiving end. The network has no guarantees about when a message arrives, only that it eventually might.
Messages can arrive out of order. You send A, then B. The receiver gets B, then A. A single TCP connection delivers its byte stream in order, but reordering happens at the application layer all the time when you have multiple connections or retry logic.
Leslie Lamport described this in his 1978 paper on clocks and ordering: in a distributed system, there is no global notion of "now." There is only the order of events within a single node, and the causal relationships between events across nodes established by message passing.
That paper is the first one listed in this track for a reason. Every other concept in distributed systems flows from that observation.
Why Messages?
If shared memory is off the table, you need a different abstraction. The one the entire industry converged on is the message: a self-contained unit of data that travels from one node to another.
"Self-contained" is the key word. A message cannot say "look at the value of X." There is no shared X. A message has to carry everything the receiver needs to process it: what kind of request this is, what data it operates on, and enough context to form a response.
This sounds limiting. It is actually clarifying. When every piece of state must be explicitly passed around, the implicit dependencies in your system become visible. The bugs that plague shared-memory concurrent code, the data races and lock contention, simply do not exist in a message-passing system. Different bugs appear, but they are usually easier to reason about.
Erlang understood this almost 40 years ago. Joe Armstrong and his colleagues at Ericsson built the OTP framework on the principle that processes communicate only through messages. When a process crashes, it does not corrupt shared memory. It just stops sending messages. The rest of the system can detect this and respond.
The Maelstrom Protocol
In this track, you will build nodes that communicate using the Maelstrom protocol, which Kyle Kingsbury designed for his distributed systems testing framework. The protocol is deliberately simple: JSON messages over stdin and stdout.
Each message looks like this:
{
  "src": "c1",
  "dest": "n1",
  "body": {
    "type": "echo",
    "msg_id": 1,
    "echo": "hello there"
  }
}
Three fields at the top level. src tells you who sent it. dest tells you who it is for. body contains the actual payload, and the type field inside body tells the receiver what kind of message this is and how to handle it.
The choice of stdin/stdout instead of actual network sockets is interesting. It means you can test your distributed node logic without any networking infrastructure. You pipe JSON in, you read JSON out. Maelstrom injects network simulation, delays, and message drops at the test harness level, not in your code. This separation is clean.
The real insight is that the protocol is language-agnostic. You can write your node in Go, Python, Rust, or anything that can read from stdin. The protocol does not care.
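To make that concrete, here is a minimal sketch in Python, using only the standard library; any language with a JSON parser looks essentially the same:

import json
import sys

# Read one message from stdin and pull the three top-level fields apart.
line = sys.stdin.readline()
message = json.loads(line)

src = message["src"]       # who sent it, e.g. "c1"
dest = message["dest"]     # who it is for, e.g. "n1"
body = message["body"]     # the payload
msg_type = body["type"]    # tells the receiver how to handle it, e.g. "echo"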
Message IDs and Request-Response
A pure message-passing system, where you fire messages into the void, is fine for notifications but awkward for requests. When you ask another node for the current value of something, you want a response. How do you match that response to the original request?
The answer is message IDs. You assign each outgoing request a unique identifier and include it in the message. When the receiver responds, it echoes that identifier back. Now you can match responses to requests even if they arrive out of order or after a delay.
{
  "src": "n1",
  "dest": "c1",
  "body": {
    "type": "echo_ok",
    "in_reply_to": 1,
    "echo": "hello there"
  }
}
The in_reply_to field contains the msg_id of the original request. This is the simplest form of correlation. HTTP/1.1 does the same thing implicitly: responses come back on the connection in the order the requests were sent. In a message-passing system where requests and responses are decoupled, you have to be explicit.
This pattern appears at every layer of distributed systems. Raft uses it for log replication. Kafka uses it for offset commits. gRPC uses it for streaming. The details differ but the concept is the same: tag your requests, match your responses.
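One way to implement the correlation, sketched in Python: keep a table of pending requests keyed by msg_id, and look each reply up by its in_reply_to. The variable names and the hard-coded node ID are illustrative, not part of the protocol.

import itertools
import json

pending = {}                      # msg_id -> callback to run when the reply arrives
next_msg_id = itertools.count(1)  # monotonically increasing request IDs
node_id = "n1"                    # placeholder for this node's own ID

def send_request(dest, body, on_reply):
    body["msg_id"] = next(next_msg_id)
    pending[body["msg_id"]] = on_reply   # remember how to handle the reply
    print(json.dumps({"src": node_id, "dest": dest, "body": body}), flush=True)

def handle_reply(message):
    callback = pending.pop(message["body"].get("in_reply_to"), None)  # match reply to request
    if callback is not None:
        callback(message["body"])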
Initialization: The First Message
When a Maelstrom node starts, the first thing it receives is an init message:
{
  "src": "c0",
  "dest": "n1",
  "body": {
    "type": "init",
    "msg_id": 1,
    "node_id": "n1",
    "node_ids": ["n1", "n2", "n3"]
  }
}
This tells your node its own ID and the IDs of all other nodes in the cluster. Your node must respond with init_ok before the workload begins.
This initialization step exists because node IDs are not static. In a real system, nodes come and go. The cluster membership changes. A new node does not know who its peers are until it is told. Maelstrom simulates this by injecting cluster membership information at startup.
The node_ids list is your first view of the cluster. In later tracks, you will use it to send messages to specific peers, to replicate state, to elect a leader. For now, you just need to store it.
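Handling init is just a matter of remembering those two fields and acknowledging. A sketch in Python; the surrounding loop is assumed to wrap the returned body in a full message with src and dest:

node_id = None   # this node's own ID, e.g. "n1"
node_ids = []    # every node in the cluster, e.g. ["n1", "n2", "n3"]

def handle_init(message):
    global node_id, node_ids
    body = message["body"]
    node_id = body["node_id"]
    node_ids = body["node_ids"]
    # Acknowledge so the workload can begin.
    return {"type": "init_ok", "in_reply_to": body["msg_id"]}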
Error Handling: The Part Nobody Reads
Every distributed system needs a way to signal errors back to callers. Maelstrom defines a standard error response:
{
  "type": "error",
  "in_reply_to": 1,
  "code": 11,
  "text": "Node n2 is not available"
}
The code field is important. Maelstrom defines a set of standard error codes, analogous to HTTP status codes, that communicate what kind of failure occurred. Code 1 means "node not found." Code 10 means "not supported." Code 11 means "temporarily unavailable."
Why structured error codes instead of just a text message? Because text messages are for humans. Error codes are for programs. When your retry logic decides whether to retry a request, it needs to know if the failure was transient (try again) or permanent (do not bother). A text message cannot make that distinction reliably. An error code can.
This is the same reason HTTP has 4xx versus 5xx codes. A 404 means the resource does not exist and retrying will not help. A 503 means the server is overloaded and you should try again later. Your application code can make decisions based on these codes. It cannot make decisions based on parsing "Node n2 is not available" as a string.
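In code, that decision becomes a lookup rather than string parsing. The exact set of retryable codes below is an assumption; the right policy depends on your workload:

# Codes a client might treat as transient: 0 (timeout) and 11 (temporarily unavailable).
RETRYABLE_CODES = {0, 11}

def should_retry(body):
    return body.get("type") == "error" and body.get("code") in RETRYABLE_CODES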
Building Your First Node
The first task in this track asks you to read JSON messages from stdin and parse them. That is intentionally simple. The complexity comes in the next steps: responding correctly, handling initialization, and building the event loop that keeps your node running.
An event loop is a program that runs forever, waiting for input, processing it, and producing output. Every server is an event loop at some level of abstraction. Your Maelstrom node is an event loop that reads from stdin and writes to stdout.
The pattern looks like this in pseudocode:
while true:
    line = read_line(stdin)
    if line is empty:
        break
    message = parse_json(line)
    response = handle(message)
    if response is not null:
        write_line(stdout, format_json(response))
This is the skeleton of every distributed systems node you will ever write. The handle function is where all the interesting logic lives. In this track, it does almost nothing. In later tracks, it will implement consensus protocols, distributed key-value stores, and gossip algorithms.
But it is always this loop.
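Here is the same skeleton as runnable Python rather than pseudocode. The handle stub is a placeholder; the one practical detail worth calling out is flushing stdout after every reply, because a buffered reply never reaches the other side:

import json
import sys

def handle(message):
    # Placeholder: later steps dispatch on message["body"]["type"] here.
    return None

def main():
    for line in sys.stdin:
        if not line.strip():
            continue                 # ignore blank lines
        message = json.loads(line)
        response = handle(message)
        if response is not None:
            print(json.dumps(response), flush=True)  # flush so the reply leaves the process now

if __name__ == "__main__":
    main()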
The Implications of Async
One thing the synchronous pseudocode above hides is that message-passing systems are fundamentally asynchronous. When node n1 sends a message to n2 and waits for a response, n1 is blocked. While n1 is blocked, it cannot process other incoming messages.
In production systems, this is unacceptable. A node that blocks on every outgoing request quickly becomes a bottleneck. Real nodes handle requests concurrently. They send requests and register callbacks to be invoked when responses arrive.
The Maelstrom framework handles this for you in the early tracks. But as you progress, you will need to think carefully about concurrency in your node's message handler. When you send a request to another node, should you block and wait? Should you fire and forget? Should you use a timeout?
These are not just performance questions. They are correctness questions. A node that blocks indefinitely on a request from a crashed peer will never process another message. A node that uses timeouts and retries might process the same request twice if the original response was just delayed. The tradeoffs here go all the way up to the FLP impossibility result, which proves that no deterministic consensus algorithm can guarantee both safety and termination in an asynchronous system where even a single process may crash.
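As one illustration of that tradeoff, here is a hedged sketch of a blocking RPC helper built on the send_request correlation table from earlier. It assumes replies are delivered by a separate thread running the read loop, waits on an event with a timeout, and retries a bounded number of times; the retry is exactly what creates the duplicate-delivery risk, so the receiver must tolerate seeing the same request twice.

import threading

def rpc(send_request, dest, body, timeout=1.0, attempts=3):
    # send_request is the correlation helper sketched earlier: it assigns a
    # msg_id, records a callback, and ships the message.
    done = threading.Event()
    result = {}

    def on_reply(reply_body):
        result["body"] = reply_body
        done.set()

    for _ in range(attempts):
        send_request(dest, dict(body), on_reply)  # a fresh msg_id per attempt
        if done.wait(timeout):
            return result["body"]
    return None  # no answer within the budget; the caller decides what that means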
You do not need to understand FLP to complete this track. But you should know it is there, waiting.
What You Are Actually Building
By the end of this track, you will have a working Maelstrom node that can receive messages, parse them, and respond correctly. It sounds modest.
What you are actually building is the foundation that every other track in this curriculum sits on. Consensus algorithms need message passing. Gossip protocols need message passing. Distributed key-value stores need message passing. If you build this foundation correctly, the rest follows. If you skip it or misunderstand it, everything that comes after will be shaky.
The engineers who built systems like Kafka and etcd did not start there. They started with the same question you are starting with: how do I get two programs to talk to each other reliably? The answer is more subtle than it looks.
Start simple. Understand the protocol. Get the message loop right.
Everything else builds on this.
Ready to build it? The Messenger track starts with parsing a single JSON message from stdin. By the end, your node will handle initialization, echo requests, and respond with correctly formatted replies. The concepts here underpin every other track in the curriculum.
Build it yourself
Reading about distributed systems is useful. Building them is how you actually learn.
Start The Messenger track