The Counter: Why Incrementing a Number Is Hard Across Machines
Distributed counters reveal a fundamental tension in distributed systems: the lost update problem, why naive replication fails, and how CRDTs let nodes converge to the same value without ever coordinating.
The Increment That Disappears
Imagine two nodes, each holding a copy of a counter initialized at zero. A client sends an increment request to node A. Simultaneously, another client sends an increment request to node B. Both nodes read the current value (zero), add one, and write back the new value (one). The counter should now be two. Both nodes show one. An increment was lost.
This is the lost update problem, and it is one of the oldest and most fundamental challenges in distributed systems. It is not a bug in either node. Both nodes did exactly what they were asked to do. The problem is that the increment operation is not atomic across the distributed system. It is composed of a read followed by a write, and another write can interleave between them.
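The interleaving described above can be reproduced in a few lines. This is a toy, single-process simulation: the `NaiveReplica` class and the shared `store` dict are illustrative stand-ins for two replicas and their replicated value, not a real replication protocol.

```python
# Toy simulation of the lost update: both replicas read the shared
# value before either writes back, so one increment vanishes.

class NaiveReplica:
    def __init__(self, store):
        self.store = store  # shared dict standing in for the replicated value

    def read(self):
        return self.store["counter"]

    def write(self, value):
        self.store["counter"] = value

store = {"counter": 0}
a, b = NaiveReplica(store), NaiveReplica(store)

# Both nodes read before either writes back.
seen_by_a = a.read()    # 0
seen_by_b = b.read()    # 0

a.write(seen_by_a + 1)  # writes 1
b.write(seen_by_b + 1)  # also writes 1, clobbering A's increment

print(store["counter"])  # 1, not 2: one increment was lost
```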
In a single-process system, you avoid this with a mutex or an atomic CPU instruction. Both serialize the increment. In a distributed system, you cannot take a single lock that spans multiple machines, and you cannot use a CPU atomic instruction across a network.
The Naive Solutions and Why They Fall Short
The most direct solution is to route all writes through a single node. Node A is the designated owner of the counter. All increments go to A. Reads can come from any node. This actually works. Redis uses exactly this model: it is single-threaded, so all operations are serialized, and INCR is genuinely atomic. Redis gets away with this because a single-threaded event loop can handle hundreds of thousands of operations per second, and the simplicity of the model eliminates an entire class of bugs.
The problem with a single designated node is availability. If node A is unreachable, no one can increment the counter. During a network partition, a design that requires routing all writes through one node will choose consistency over availability: it will refuse to accept writes rather than risk lost updates. For a counter tracking page views on a content site, this is probably the wrong tradeoff.
The other approach is to use Compare-And-Swap (CAS). CAS is an atomic operation that checks whether a variable has an expected value and, if it does, updates it to a new value. The counter increment becomes: read the current value, compute the new value, attempt to CAS from old to new. If the CAS fails because another write modified the value in the meantime, retry. CAS-based increment is safe and correct, and it is how most lock-free algorithms are built.
CAS works well when contention is low. When contention is high, many threads or nodes are retrying their CAS operations, and under heavy concurrent load the retry storm can degrade performance badly. CAS-based counters can also fail during network partitions: if the CAS operation cannot reach the node holding the authoritative value, it cannot proceed.
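The retry loop can be sketched as follows. This is a single-process illustration: the `CASRegister` class uses a lock to stand in for the atomic hardware or network primitive, and is not a real distributed register.

```python
import threading

class CASRegister:
    """A register with compare-and-swap; a lock stands in for the
    atomic primitive that hardware or a coordination service provides."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Atomically: update only if the value is still what we expect.
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

def increment(register):
    # Retry until our CAS wins; under contention this loop can spin.
    while True:
        old = register.read()
        if register.compare_and_swap(old, old + 1):
            return

counter = CASRegister()
threads = [
    threading.Thread(target=lambda: [increment(counter) for _ in range(1000)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.read())  # 4000: no increments lost despite the races
```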
CRDTs: The Mathematical Solution
The insight behind Conflict-free Replicated Data Types (CRDTs) is that you can design data structures that avoid conflicts by construction, so that merging two diverged replicas is always well-defined and always produces the correct result.
Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski formalized this in their 2011 paper "A Comprehensive Study of Convergent and Commutative Replicated Data Types." The key mathematical concept is a join semilattice.
A semilattice is a set with a binary operation (the join, or merge) that is:
- Commutative: merge(A, B) == merge(B, A). Order of merging does not matter.
- Associative: merge(A, merge(B, C)) == merge(merge(A, B), C). Grouping of merges does not matter.
- Idempotent: merge(A, A) == A. Merging something with itself does not change anything.
These three properties mean that no matter what order you receive updates in, no matter how many times you replay the same update, and no matter which way you group your merges, you will always end up at the same state. This is the formal definition of eventual consistency with convergence guarantees.
A CRDT is a data structure designed so that all its operations preserve the semilattice properties. The "conflict-free" in the name does not mean conflicts never happen. It means the data structure is designed so that concurrent modifications can always be merged without losing information.
G-Counters: Grow-Only Counters
The simplest CRDT counter is the G-Counter (grow-only counter). The design is elegant.
Instead of maintaining a single counter value, each node maintains a vector of counters: one entry per node in the cluster. Node A only increments its own entry. Node B only increments its own entry. The total counter value is the sum of all entries.
```python
class GCounter:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> count

    def increment(self, delta=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + delta

    def value(self):
        return sum(self.counts.values())

    def merge(self, other_counts):
        for node, count in other_counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)
```
The merge operation is component-wise maximum. If node A says its local count is 5 and node B says node A's count is 3 (because B has not seen A's recent increments), after merging, both nodes agree that A's count is 5. You can never decrease a node's entry, and the merge always takes the higher value.
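A quick run shows divergence and convergence in action. The class body is repeated from the listing above so the snippet runs standalone:

```python
class GCounter:
    # Repeated from the listing in the text so this snippet is self-contained.
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, delta=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + delta

    def value(self):
        return sum(self.counts.values())

    def merge(self, other_counts):
        for node, count in other_counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

a = GCounter("A")
b = GCounter("B")

a.increment(); a.increment(); a.increment()  # A records 3 local increments
b.increment()                                # B records 1

# Before gossip, each replica only knows its own slot.
print(a.value(), b.value())   # 3 1

# Exchange state in both directions; merge is component-wise max.
a.merge(b.counts)
b.merge(a.counts)
print(a.value(), b.value())   # 4 4

# Idempotent: re-merging the same state changes nothing.
a.merge(b.counts)
print(a.value())              # 4
```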
This satisfies all three semilattice properties. Commutativity: max(a, b) == max(b, a). Associativity: max(a, max(b, c)) == max(max(a, b), c). Idempotency: max(a, a) == a.
The critical property is that each node only writes to its own slot in the vector. This is why there are no conflicts. Two nodes can never disagree about a node's contribution to the counter, because only that node increments its own slot. When they merge, they simply take the maximum of each slot, which is always the most up-to-date value.
Nodes gossip their state to each other periodically. Each node sends its full counter vector to one or more neighbors. Upon receiving a vector, the recipient merges it with its local state using component-wise max. Over time, all nodes converge to the same total.
PN-Counters: Supporting Decrements
G-Counters are limited: they can only grow. Many useful counters need to support decrements. A user's follower count goes up and down. A stock quantity decreases as items are purchased and increases as inventory is restocked.
The PN-Counter (Positive-Negative counter) solves this with two G-Counters: one for increments, one for decrements.
```python
class PNCounter:
    def __init__(self, node_id):
        self.positive = GCounter(node_id)
        self.negative = GCounter(node_id)

    def increment(self, delta=1):
        self.positive.increment(delta)

    def decrement(self, delta=1):
        self.negative.increment(delta)

    def value(self):
        return self.positive.value() - self.negative.value()

    def merge(self, other_positive, other_negative):
        self.positive.merge(other_positive)
        self.negative.merge(other_negative)
```
Each node tracks how much it has added (in the positive G-Counter) and how much it has subtracted (in the negative G-Counter). The net value is positive minus negative. Merging merges each G-Counter independently. The CRDT properties are preserved because we are composing two G-Counters, each of which has those properties.
A subtlety: a replica's local value can lag the true global value in either direction. If node A has decremented heavily but node B has not yet received those decrements, B reads a value that is too high; conversely, a replica that has received decrements but not the matching increments can even read a negative value while the true global value is positive. This is the price of eventual consistency: reads may be stale. Writes, however, are always accepted, which is what makes the system available under partition.
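The stale-read behavior is easy to demonstrate. Both class bodies are repeated from the listings above so the snippet runs standalone:

```python
class GCounter:
    # Repeated from the text so this snippet is self-contained.
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, delta=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + delta

    def value(self):
        return sum(self.counts.values())

    def merge(self, other_counts):
        for node, count in other_counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

class PNCounter:
    # Repeated from the text so this snippet is self-contained.
    def __init__(self, node_id):
        self.positive = GCounter(node_id)
        self.negative = GCounter(node_id)

    def increment(self, delta=1):
        self.positive.increment(delta)

    def decrement(self, delta=1):
        self.negative.increment(delta)

    def value(self):
        return self.positive.value() - self.negative.value()

    def merge(self, other_positive, other_negative):
        self.positive.merge(other_positive)
        self.negative.merge(other_negative)

a = PNCounter("A")
b = PNCounter("B")

for _ in range(10):
    a.increment()

# B gossips with A now and learns the 10 increments.
b.merge(a.positive.counts, a.negative.counts)

for _ in range(7):
    a.decrement()   # these decrements have not reached B yet

print(a.value())    # 3, the true global value
print(b.value())    # 10, stale: B has not seen the decrements

# After the next gossip round, B converges.
b.merge(a.positive.counts, a.negative.counts)
print(b.value())    # 3
```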
How Gossip and Convergence Work Together
A CRDT by itself is just a data structure. For a distributed counter to be useful, nodes need to exchange their state. The standard approach is gossip: periodically, each node picks one or more random neighbors and sends them its current state. The neighbor merges the received state with its own and responds.
The gossip protocol has nice convergence properties. With n nodes, the expected time for a single update to reach every node is O(log n) gossip rounds, so roughly t · log2(n) seconds with a gossip period of t seconds. With 100 nodes and a one-second gossip period, a single increment is known everywhere in about 7 seconds (log2(100) ≈ 6.6). This is the "eventually" in eventual consistency.
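A seeded toy simulation of push gossip illustrates the logarithmic spread. It uses the G-Counter from earlier (repeated so the snippet runs standalone); real gossip protocols add failure detection, anti-entropy, and delta compression that are omitted here.

```python
import random

class GCounter:
    # Repeated from the text so this snippet is self-contained.
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, delta=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + delta

    def value(self):
        return sum(self.counts.values())

    def merge(self, other_counts):
        for node, count in other_counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

random.seed(42)
n = 100
nodes = [GCounter(str(i)) for i in range(n)]
nodes[0].increment()   # a single update at one node

rounds = 0
while any(node.value() != 1 for node in nodes) and rounds < 50:
    rounds += 1
    for node in nodes:
        peer = random.choice(nodes)  # pick a random gossip partner
        peer.merge(node.counts)      # push local state to the peer

# Spread is O(log n): for 100 nodes this typically takes on the
# order of ten rounds, not one hundred.
print(rounds)
```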
The tradeoff is staleness on reads. A client reading from any replica might see a value that is seconds behind the global truth. For many applications, this is acceptable. Page view counts, like counts, follower counts: users do not notice if the number they see is a few seconds old. For financial balances or inventory counts where the stale read could cause real harm, eventual consistency is not acceptable, and you need something stronger.
Riak, the distributed database that pioneered CRDT adoption in production systems, exposes G-Counters, PN-Counters, sets, maps, and flags as first-class data types. When you store a CRDT in Riak, the database handles the merging automatically on read. If two replicas diverged due to a partition, Riak merges them using the CRDT's defined merge function when they reconnect. No data is lost, and no manual conflict resolution is needed.
The Tradeoffs You Are Making
CRDTs make a specific tradeoff that is worth being explicit about. They guarantee that all replicas will eventually converge to the same state. They do not guarantee that any replica shows the true current global state at any given moment.
Consider a PN-Counter tracking the number of active sessions in a distributed web service. A user logs in, and node A increments. The user logs out, and node A decrements. If you read from node B before it has received the gossip from node A, you see one more active session than actually exists. If your system makes decisions based on the current session count (rate limiting, capacity planning), those decisions may be wrong.
The distributed systems literature calls this the difference between strong consistency and eventual consistency. Strongly consistent systems guarantee that any read reflects the most recent write. This typically requires coordination: before returning a read, the system confirms with a majority of nodes that this is the current value. Coordination takes time, and during a partition, a strongly consistent system that cannot reach its majority will refuse to serve reads or writes.
Eventual consistency trades that guarantee for availability. During a partition, both sides of the partition continue to accept reads and writes. They diverge temporarily. When the partition heals, they converge. No data is lost, but reads may be stale.
For counters specifically, CRDTs are the right answer when you prioritize availability over instantaneous accuracy. SoundCloud uses CRDTs for play count aggregation: it is better to always show a count (even if slightly stale) than to fail to show one. League of Legends uses distributed counters for real-time stat tracking in games, where the game server architecture demands that each node can operate independently.
Beyond Counters
CRDTs are not limited to counters. The same principles apply to sets (a G-Set is a set that only supports add, a 2P-Set supports both add and remove), maps, sequences, and even more complex structures like collaborative text editing.
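The G-Set and 2P-Set mentioned above can be sketched in a few lines. This is a minimal illustration: in a 2P-Set, removals are recorded as tombstones, merge is set union on both sides, and a removed element can never be re-added.

```python
class GSet:
    """Grow-only set: merge is set union, which is commutative,
    associative, and idempotent."""
    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other_items):
        self.items |= other_items

class TwoPSet:
    """2P-Set: one G-Set of additions, one G-Set of removals (tombstones).
    An element is present if added and never removed."""
    def __init__(self):
        self.added = GSet()
        self.removed = GSet()

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        if item in self.added.items:
            self.removed.add(item)   # tombstone: permanent

    def __contains__(self, item):
        return item in self.added.items and item not in self.removed.items

    def merge(self, other):
        self.added.merge(other.added.items)
        self.removed.merge(other.removed.items)

a = TwoPSet()
b = TwoPSet()

a.add("x")                   # replica A adds "x"
b.add("x"); b.remove("x")    # replica B adds then removes "x" concurrently
a.add("y")

a.merge(b)
b.merge(a)

print("x" in a, "y" in b)    # False True: the tombstone wins everywhere
```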
The insight that makes collaborative text editing in tools like Google Docs and Figma work at scale is a CRDT for text sequences. Each character insertion is given a unique identifier tied to the node that inserted it and the position it was inserted at. Merging concurrent inserts means interleaving the characters in a deterministic way based on their identifiers. Users can type simultaneously without coordination, and the document converges to the same state on all clients.
This is the power of the semilattice: by designing your data structure so that the merge operation has the right mathematical properties, you eliminate the need for coordination entirely. The data structure itself enforces convergence. Coordination is replaced by mathematics.
The practical skill this track teaches is recognizing when a problem can be framed as a CRDT problem. If your state can only grow (or can be represented as the difference of two growing quantities), if reads can tolerate some staleness, and if you need to accept writes under partition, a CRDT will give you availability and convergence without coordination. That is a powerful combination.
Build it yourself
Reading about distributed systems is useful. Building them is how you actually learn.
Start The Counter track