Distributed Caches: What Goes Wrong When You Add a Second Node

A cache on a single machine is simple. Store frequently read data in memory, serve it fast, evict old entries when memory fills. A cache distributed across ten machines introduces problems that took the industry a decade to solve properly.

The problems are not just about data placement and load balancing, though those are hard. The deeper problems are about consistency. When data lives in multiple places and can be updated independently, you have to decide what "correct" means for a read. The answer depends on your application, and getting it wrong produces bugs that only appear under load.

Why Caches Exist at the Scale of Distributed Systems

At small scale, you cache because the database is slow. A query that takes 50 milliseconds can be served in 1 millisecond from memory. If that query runs 10,000 times per second, serving it uncached demands 500 seconds of database compute every wall-clock second; served from cache, the same traffic costs 10. The cache saves you 490 seconds of database time per second — a load no database could absorb without it.

At large scale, you cache because the database cannot handle the read load at all. Facebook's memcached deployment, described in their 2013 paper, was serving billions of reads per second, heavily skewed toward a small number of hot objects — celebrity profiles, popular posts. No database could serve that directly. The data had to live closer to the application servers.

Memcached was Facebook's answer. They ran it on thousands of servers, sharding objects across the fleet. A read went to the cache first; if the cache missed, it went to MySQL and the result was written back. This pattern is so common it has a name: look-aside caching.

The Consistency Problem

Look-aside caching is simple but creates an immediate consistency problem. When the underlying data changes in the database, the cached copy is stale. How long can you tolerate stale reads?

For some use cases — user profile views, product catalog prices — a few seconds of staleness is fine. The user reloads the page and gets fresh data. For others — account balances, inventory counts, any data where incorrect reads have real consequences — staleness is not acceptable.

The naive solution to staleness is cache invalidation: when you write to the database, also delete the corresponding cache entry. The next read will miss and populate the cache with fresh data.

Cache invalidation is famously hard. Phil Karlton's oft-quoted observation that there are only two hard things in computer science ("cache invalidation, and naming things") is a joke, but the underlying point is real. Invalidation in a distributed system introduces a race condition.

The sequence is: write to database, delete from cache. Between the write completing and the cache delete completing, another request reads the old value from cache. This is a short window but it exists. At high request rates, it gets hit constantly.

The more serious race condition is: read misses cache, read goes to database, write updates database, write deletes cache, read completes and populates cache with the now-stale old value. The cache is now wrong and will stay wrong until the next invalidation.
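
Spelled out with plain dicts standing in for the cache and the database, the interleaving looks like this (a deterministic re-enactment, not real concurrency):

```python
db = {"k": "v1"}
cache = {}

# 1. Reader misses the cache and reads the old value from the database.
reader_value = db["k"]        # "v1"

# 2. Writer updates the database and invalidates the cache.
db["k"] = "v2"
cache.pop("k", None)          # nothing cached yet, so nothing to delete

# 3. Reader finally populates the cache with its stale read.
cache["k"] = reader_value

assert db["k"] == "v2" and cache["k"] == "v1"  # stale until the next invalidation
```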

Facebook's solution, described in their 2013 NSDI paper on Memcached at scale, was lease tokens. When a cache miss triggers a database read, the cache issues a lease token for that key. Any subsequent write to that key invalidates the lease. When the database read completes and tries to populate the cache, it presents the lease token. If the lease was invalidated by a write, the population is rejected and the key is left empty, forcing the next reader to fetch fresh data.
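
A minimal sketch of the idea — LeasingCache and its methods are illustrative names, not the memcached protocol:

```python
import uuid

class LeasingCache:
    def __init__(self):
        self.data = {}     # key -> cached value
        self.leases = {}   # key -> outstanding lease token

    def get(self, key):
        if key in self.data:
            return self.data[key], None
        token = uuid.uuid4().hex          # miss: issue a lease for this key
        self.leases[key] = token
        return None, token

    def set_with_lease(self, key, value, token):
        if self.leases.get(key) != token:
            return False                  # a write voided the lease; reject the stale set
        del self.leases[key]
        self.data[key] = value
        return True

    def invalidate(self, key):            # called on every database write
        self.data.pop(key, None)
        self.leases.pop(key, None)        # voids any outstanding lease
```

Replaying the earlier race against this sketch: the slow reader's set_with_lease is rejected because the writer's invalidate voided its token, so the stale value never lands in the cache.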

Cache Aside vs Read-Through vs Write-Through

There are three main patterns for integrating a cache with a database. They trade different consistency guarantees and operational complexity.

Cache aside (also called look-aside): the application reads from the cache and falls back to the database on a miss; on a write, it updates the database and then invalidates the cache entry. The cache is lazy — a key is only populated when someone reads it. Simple to implement, and it keeps the database and cache as entirely separate systems, but it is prone to the race conditions described above.
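
As a sketch, assuming a cache client with get/set/delete and a database handle with query/execute (hypothetical interfaces):

```python
def get_user(cache, db, user_id, ttl=300):
    """Read path: try the cache, fall back to the database on a miss."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = db.query("SELECT * FROM users WHERE id = %s", user_id)
        cache.set(key, value, ttl)        # populate lazily
    return value

def update_user(cache, db, user_id, fields):
    """Write path: update the database, then invalidate the cached copy."""
    db.execute("UPDATE users SET ... WHERE id = %s", fields, user_id)
    cache.delete(f"user:{user_id}")       # next read repopulates
```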

Read-through: the cache sits in front of the database. Applications always read from the cache. On a miss, the cache fetches from the database, stores the result, and returns it. Applications never talk to the database directly for reads. This centralizes the cache-population logic but requires the cache to understand your data model.
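
In sketch form, with the injected loader standing in for whatever database lookup applies:

```python
class ReadThroughCache:
    def __init__(self, loader):
        self.loader = loader   # e.g. lambda key: a database lookup for that key
        self.data = {}

    def get(self, key):
        if key not in self.data:
            self.data[key] = self.loader(key)   # the cache fetches on a miss
        return self.data[key]
```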

Write-through: every write goes to both the cache and the database, usually in the same operation. Reads are always cache hits (assuming the object was written at least once and has not been evicted). Consistency is strong. The cost is write latency — every write waits for both the cache write and the database write to complete.
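
A sketch, assuming the database handle exposes a put method:

```python
class WriteThroughCache:
    def __init__(self, db):
        self.db = db
        self.data = {}

    def put(self, key, value):
        self.db.put(key, value)   # durable write first...
        self.data[key] = value    # ...then the cache, so reads always hit

    def get(self, key):
        return self.data.get(key)
```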

Write-behind (also called write-back): writes go to the cache synchronously but to the database asynchronously. Reads are fast, writes are fast, but there is a window where the cache has data that the database does not. If the cache fails before the async write completes, that data is lost. Used when you can tolerate some data loss and need very low write latency.
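
A sketch with a background flush thread; the pending queue is exactly the data-loss window described above:

```python
import queue
import threading

class WriteBehindCache:
    def __init__(self, db):
        self.db = db                  # assumed to expose put(key, value)
        self.data = {}
        self.pending = queue.Queue()
        threading.Thread(target=self._flush, daemon=True).start()

    def put(self, key, value):
        self.data[key] = value        # acknowledged before the database write
        self.pending.put((key, value))

    def _flush(self):
        while True:
            key, value = self.pending.get()
            self.db.put(key, value)   # a crash before this line loses the write
```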

Eviction Policies

A cache is bounded memory. When it fills, something has to be evicted. The choice of what to evict determines your hit rate in practice.

LRU (Least Recently Used) evicts the entry that has not been accessed for the longest time, on the intuition that recently accessed data is likely to be accessed again. It is the most widely used policy; Redis offers an approximated LRU that evicts by sampling (its out-of-the-box maxmemory policy is actually noeviction, which rejects writes rather than evicting).
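
A minimal LRU, using Python's ordered dict as the recency list:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()     # iteration order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)    # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used
```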

LFU (Least Frequently Used) evicts the entry that has been accessed the fewest times. Better for workloads with consistent hot keys, but more expensive to track accurately.
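
A naive LFU for contrast — eviction below costs O(n) per eviction; production implementations keep frequency buckets to make it cheap:

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.counts = {}

    def get(self, key):
        if key not in self.data:
            return None
        self.counts[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # least-used key
            del self.data[victim]
            del self.counts[victim]
        self.data[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```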

TinyLFU, used by Caffeine (the Java cache library that succeeded Guava's cache), combines frequency counting with recency via an admission window. It achieves significantly higher hit rates than LRU on most real workloads while using similar memory.

The difference matters. At an 80% hit rate the database sees only 20% of reads — one fifth of the uncached load; at 90% it sees one tenth. Improving hit rate from 80% to 90% therefore cuts database load in half. The eviction policy is not a minor implementation detail.

Thundering Herd

When a hot key expires or is invalidated, hundreds or thousands of concurrent requests might all miss the cache simultaneously and hit the database. This is the thundering herd problem.

The simplest mitigation is probabilistic early expiration: as a cached item approaches its TTL, start serving some percentage of requests from the database and refreshing the cache, rather than waiting until the TTL expires. This spreads the refresh cost over time instead of concentrating it at the expiration moment.
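
One well-known formulation (sometimes called XFetch, after a 2015 paper by Vattani, Chierichetti, and Lowenstein) has each reader independently decide to refresh early, with probability rising as expiry approaches. A sketch — the function and parameter names here are illustrative:

```python
import math
import random

def should_refresh_early(now, expiry, recompute_cost, beta=1.0):
    """Return True if this reader should refresh the entry before it expires.

    recompute_cost is roughly how long the database fetch takes; beta > 1
    makes early refresh more aggressive. 1 - random() lies in (0, 1], so
    its log is finite and non-positive, and the subtracted term is >= 0.
    """
    return now - recompute_cost * beta * math.log(1.0 - random.random()) >= expiry
```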

A more aggressive approach is locking: when a cache miss occurs, one request acquires a lock and fetches from the database. Other requests for the same key wait for the lock to release and then read the freshly populated value. Redis has a common pattern for this using SETNX (set if not exists) as a distributed lock.
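
A sketch of that pattern with the redis-py client — get_or_load and the lock-key naming are illustrative, and real code would cap the retry loop:

```python
import time
import redis

r = redis.Redis()

def get_or_load(key, load_from_db, ttl=300, lock_ttl=10):
    value = r.get(key)
    if value is not None:
        return value
    # SET with nx=True is the modern form of SETNX; only one caller wins.
    # lock_ttl bounds how long a crashed fetcher can block everyone else.
    if r.set(f"lock:{key}", "1", nx=True, ex=lock_ttl):
        try:
            value = load_from_db(key)
            r.set(key, value, ex=ttl)
        finally:
            r.delete(f"lock:{key}")
        return value
    time.sleep(0.05)                  # lost the race: wait and re-read
    return get_or_load(key, load_from_db, ttl, lock_ttl)
```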

What This Track Builds

The Caches track asks you to implement a distributed cache with configurable eviction and a consistency protocol. You will build the mechanics of distributed key assignment (which cache node owns which key), handle cache misses correctly, and implement TTL-based expiration.
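
Key assignment is typically done with consistent hashing, so adding or removing a node remaps only a fraction of the keys. A minimal ring with virtual nodes, as a sketch:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node appears vnodes times on the ring to smooth the distribution.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def owner(self, key):
        # The first virtual node clockwise from the key's hash owns the key.
        i = bisect.bisect(self.hashes, self._hash(key)) % len(self.hashes)
        return self.ring[i][1]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.owner("user:42"))   # one of the three nodes, stable across calls
```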

The practical skills transfer directly. Redis, Memcached, Varnish — all of them are implementations of the same basic data structure with different consistency models and eviction policies. Understanding the tradeoffs at the implementation level lets you make better decisions about which cache, which consistency mode, and which eviction policy to use for your specific workload.


Ready to build it? The Caches track builds a working distributed cache. You will implement consistent hashing for key distribution, LRU eviction, TTL expiration, and write invalidation. The same concepts power Redis, Memcached, and every content delivery network.

Build it yourself

Reading about distributed systems is useful. Building them is how you actually learn.

Start the Caches track