ARCHIVED from builddistributedsystem.com on 2026-04-28 — URL: https://builddistributedsystem.com/tracks/caches

Caches

Intermediate · Scalability · 5 tasks

Learn how caching improves read performance in distributed systems. Build request node caches, global caches, and distributed caches with consistent hashing. Master cache eviction and invalidation strategies.


Interview Prep

Common interview questions for Backend / Infrastructure Engineer roles that map directly to what you build in this track, each followed by a model answer.

Model Answer: the cache-aside pattern and its stale-read race

Cache-aside: reads check the cache first; on a miss they fetch from the DB and populate the cache; writes update the DB and invalidate the cache entry. Race condition: a read misses and fetches from the DB; before it populates the cache, a write updates the DB and deletes the cache key; the read then populates the cache with the now-stale value. Mitigations: (1) TTL-based expiry as a backstop; (2) lease tokens (Facebook's memcached approach): the cache issues a token on a miss, a write invalidates outstanding tokens, and a read whose token is no longer valid aborts its population; (3) shorter TTLs for frequently written keys.
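The lease mechanism can be sketched in-process. Here a plain dict stands in for memcached; `LeaseCache` and its method names are illustrative, not a real client API:

```python
import itertools

class LeaseCache:
    """Toy stand-in for a memcached-style cache with lease tokens."""

    def __init__(self):
        self._data = {}      # key -> cached value
        self._leases = {}    # key -> outstanding lease token
        self._tokens = itertools.count(1)

    def get(self, key):
        """Return (value, lease). On a miss, hand out a lease token."""
        if key in self._data:
            return self._data[key], None
        token = next(self._tokens)
        self._leases[key] = token
        return None, token

    def set_with_lease(self, key, value, token):
        """Populate only if the lease is still valid; otherwise abort."""
        if self._leases.get(key) == token:
            self._data[key] = value
            del self._leases[key]
            return True
        return False

    def invalidate(self, key):
        """Called on write: delete the value and void any outstanding lease."""
        self._data.pop(key, None)
        self._leases.pop(key, None)

cache = LeaseCache()
db = {"user:1": "alice"}

value, lease = cache.get("user:1")   # miss -> lease issued
loaded = db["user:1"]                # slow DB fetch happens here
# Meanwhile a writer updates the DB and invalidates the cache,
# which voids the reader's lease:
db["user:1"] = "alice-v2"
cache.invalidate("user:1")
# The reader's populate is rejected, so the stale value never lands.
assert cache.set_with_lease("user:1", loaded, lease) is False
```

The key property: the write does not need to know a read is in flight; voiding the token is enough to make the racing population a no-op.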

Model Answer: cache stampede (thundering herd) on key expiry

Scenario: a popular cached item (trending article, flash sale product) expires. Thousands of requests simultaneously miss the cache and hit the database. The DB is overwhelmed; many requests time out; the system appears to go down even though the underlying data is fine. Solutions: (1) Probabilistic early expiration: as TTL approaches, a percentage of requests refresh the cache before expiry, spreading the load, (2) Lock-based: first miss acquires a lock and fetches from DB; others wait and read the freshly populated value, (3) Stale-while-revalidate: serve stale content while background refresh happens, (4) Longer TTLs with explicit invalidation instead of time-based expiry.
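Probabilistic early expiration (option 1) is commonly implemented with the XFetch rule from Vattani et al.: refresh when `now - delta * beta * ln(u) >= expiry`, where `delta` is the cost of the last recompute and `u` is uniform random. A minimal sketch — the function name and defaults are illustrative:

```python
import math
import random
import time

def should_refresh_early(expiry_ts, delta, beta=1.0, now=None):
    """XFetch-style probabilistic early expiration.

    expiry_ts: when the cached value's TTL ends (seconds).
    delta:     how long the last recompute took (seconds).
    beta:      > 1 refreshes more eagerly, < 1 less eagerly.

    Returns True with probability approaching 1 as expiry nears, so
    concurrent readers refresh at scattered times instead of all at once.
    """
    now = time.time() if now is None else now
    # -log(1 - U) is an Exponential(1) sample; scale it by delta * beta.
    gap = -delta * beta * math.log(1.0 - random.random())
    return now + gap >= expiry_ts
```

A request that gets `True` recomputes the value and resets the TTL; every other request keeps serving the cached copy, so the origin sees at most a handful of refreshes instead of a stampede.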

Model Answer: designing for cache failure

Without resilience: all requests hit the DB, which may be sized for cache-hit rates (e.g., only handling 10% of traffic directly). DB becomes overloaded, requests time out, cascading failure. With resilience: circuit breaker opens after Redis failures exceed threshold, routing all traffic to DB with a reduced feature set. Cache should be a performance optimization, not a reliability dependency for correct operation. Design for cache failure: implement request coalescing, DB connection pool limits, and graceful degradation. Use Redis Sentinel or Cluster for high availability.
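A circuit breaker around the cache client can be sketched as follows. The thresholds, timings, and the `get_user` helper are illustrative assumptions, and a plain dict stands in for the DB in the usage path:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after `threshold` consecutive
    failures, half-opens (allows probes) after `reset_after` seconds."""

    def __init__(self, threshold=5, reset_after=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """Is a call to the cache permitted right now?"""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.reset_after:
            return True    # half-open: allow probes through
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()

def get_user(key, cache_get, db_get, breaker):
    """Cache-aside read that skips the cache while the breaker is open."""
    if breaker.allow():
        try:
            value = cache_get(key)
            breaker.record_success()
            if value is not None:
                return value
        except ConnectionError:
            breaker.record_failure()
    return db_get(key)   # degraded path: serve directly from the DB
```

When the breaker is open, requests go straight to the DB without paying the cache timeout on every call, which is what turns a Redis outage into degraded latency instead of cascading failure.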

Model Answer: write-through vs write-back

Write-through: every write goes to both cache and DB synchronously. Cache is always consistent with DB. Higher write latency (two writes per operation). Safe — no data loss on cache failure. Use when read/write ratio is balanced and you cannot tolerate stale reads. Write-back (write-behind): writes go to cache immediately; DB write is deferred/async. Lower write latency. Risk: data in cache but not DB is lost if cache fails. Use when write throughput is critical and you can tolerate some data loss (metrics, analytics, IoT sensor data). Not suitable for financial data.
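The two policies differ only in when the backing-store write happens, which a toy pair of classes makes concrete (class names and the dict-backed store are illustrative):

```python
class WriteThroughCache:
    """Synchronous write to both cache and store: always consistent,
    but every write pays the backing-store latency."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def write(self, key, value):
        self.store[key] = value   # store write on the request path
        self.cache[key] = value

class WriteBackCache:
    """Write to the cache only; dirty keys are flushed later.
    Anything in `dirty` is lost if the process dies before flush()."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)       # store write is deferred

    def flush(self):
        """Background flush: persist all dirty keys, e.g. on a timer."""
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()
```

The window between `write` and `flush` is exactly the data-loss exposure described above — acceptable for metrics, not for financial data.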

Model Answer: rate limiting with Redis

Fixed window: INCR key; if count > limit, reject; set TTL = window duration on the first INCR. Problem: boundary burst (up to 2x the limit across a window boundary). Sliding window with a sorted set: ZADD key timestamp member; ZREMRANGEBYSCORE key 0 (now - window); ZCARD key; if count < limit, add and proceed — wrap the sequence in a Lua script or MULTI/EXEC so it executes atomically (pipelining alone batches round trips but does not guarantee atomicity). Token bucket with Redis: store (tokens, last_refill_time); on each request, compute the tokens accrued since the last refill, cap at the maximum, and subtract 1 if one is available; implement with EVALSHA for atomicity. Cloudflare's production rate limiter approximates a sliding window cheaply by weighting the previous fixed window's count by how much of it still overlaps the current window, correcting the boundary problem without per-request sorted-set storage.
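The token-bucket variant translates directly into code. This in-memory sketch mirrors the (tokens, last_refill_time) state that would live in a Redis hash, where the same logic would run inside a Lua script for atomicity; names and defaults are illustrative:

```python
import time

class TokenBucket:
    """Token-bucket rate limiter. `rate` tokens accrue per second up to
    `capacity` (the burst size); each allowed request spends one token."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start full
        self.last_refill = clock()
        self.clock = clock

    def allow(self):
        now = self.clock()
        # Refill based on elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In Redis, read-modify-write of the two fields must be atomic across clients, which is why the answer above reaches for EVALSHA; in a single process, the same logic would need a lock around `allow`.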

Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.

Common Mistakes

The top 5 mistakes builders make in this track — and exactly how to fix them, each with its root cause and the correct approach.

Mistake 1: Cache stampede when a hot key expires

Why it happens

Every request finds the key missing, decides to fetch from origin, and all fire at once before any one of them has repopulated the cache.

The fix

Use probabilistic early expiration (PER) or a lock-based approach: the first miss acquires a lock and fetches the value; other misses wait for the lock holder to populate the cache.

Mistake 2: Plain LRU wiped out by sequential scans

Why it happens

A pure LRU treats a one-time sequential scan the same as a repeated hot key. The scan promotes millions of cold entries to the top of the LRU list, flushing the actual hot set.

The fix

Use LRU-K (LRU based on the K-th most recent access) or a segmented LRU (SLRU) that requires an entry to be accessed twice before it enters the protected segment.
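A segmented LRU is small enough to sketch directly. Here `OrderedDict` implements each segment; the segment sizes and the promotion/demotion policy are the illustrative parts:

```python
from collections import OrderedDict

class SLRUCache:
    """Segmented LRU: new keys enter a probationary segment; only a
    second access promotes them to the protected segment, so a one-pass
    scan cannot evict the hot set."""

    def __init__(self, probation_size, protected_size):
        self.probation = OrderedDict()
        self.protected = OrderedDict()
        self.probation_size = probation_size
        self.protected_size = protected_size

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)
            self._promote(key, value)      # second access: promote
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        else:
            self.probation.pop(key, None)
            self._insert_probation(key, value)

    def _promote(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_size:
            # Demote the protected LRU back to probation, don't drop it.
            old_key, old_value = self.protected.popitem(last=False)
            self._insert_probation(old_key, old_value)

    def _insert_probation(self, key, value):
        self.probation[key] = value
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)   # evict probation LRU
```

A scan's write-once keys churn through probation and are evicted there; the hot set, having been touched twice, sits untouched in the protected segment.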

Mistake 3: Mass invalidation triggers a reload storm

Why it happens

Invalidating a cache entry shared by many clients causes all of them to miss simultaneously and race to reload it.

The fix

Use an atomic set-if-absent primitive (e.g., Redis `SET key value NX PX ms`) as a short-lived lock so that only one client fetches from the origin. Others should serve a slightly stale value while the reload is in flight.

Mistake 4: Unbounded cache growth

Why it happens

If TTLs are never set or set to very large values, keys accumulate without bound.

The fix

Every key must have a TTL or the cache must enforce a maximum capacity with an eviction policy. Prefer short TTLs for mutable data and longer TTLs for immutable data.

Mistake 5: Cache and database disagree after a failed write

Why it happens

In a write-through design, the write must succeed on both the cache and the database. If the database write fails after the cache write succeeds, the cache holds an uncommitted value.

The fix

Write to the database first, then invalidate (not update) the cache. This "cache-aside" or "write-invalidate" pattern ensures the cache never holds a value the database does not have.

Comparison Mode

Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track, each with a detailed breakdown and a verdict.

| Dimension | Write-Through | Write-Back | Write-Around |
| --- | --- | --- | --- |
| How it works | Write to cache and backing store synchronously | Write to cache only; flush to backing store later | Write directly to backing store, bypass cache |
| Write latency | High (waits for backing store) | Low (write to cache only) | Medium (backing store latency, no cache overhead) |
| Data loss risk | None | High — dirty cache lost on crash | None |
| Read-after-write from cache | Yes | Yes | No — first read is a cache miss |
| Cache pollution from write-once data | Yes | Yes | No — large writes do not pollute the cache |
| Best for | Read-heavy, consistency-critical data | Write-heavy workloads (DB buffer pools) | Large batch writes, log storage |

Verdict: Write-through for safety. Write-back for write-heavy throughput with durability handled at the storage layer. Write-around when written data is rarely re-read.
| Dimension | LRU | LFU | ARC |
| --- | --- | --- | --- |
| Evicts | Least recently used entry | Least frequently used entry | Dynamically balances recency vs frequency |
| Implementation complexity | Simple (doubly-linked list + hashmap) | Complex (frequency buckets) | Complex (four LRU lists) |
| Handles scan resistance | No — sequential scan evicts hot entries | Yes — hot entries have high frequency | Yes — ghost lists absorb scan traffic |
| Cold start | Good — recency bootstraps quickly | Poor — all new entries have frequency = 1 | Good — recency arm covers cold start |
| Adapts to access pattern changes | Yes (recent entries favoured) | Slow — old high-frequency entries stick around | Yes — adapts T1/T2 split dynamically |
| Used in | Redis (allkeys-lru), most caches | Redis (allkeys-lfu, Redis 4.0+) | ZFS ARC, IBM storage systems |

Verdict: LRU is the default and works well for most workloads. LFU or ARC when you have mixed hot/cold access patterns and scan traffic that pollutes the LRU list.
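For reference, the "doubly-linked list + hashmap" LRU in the table reduces to a few lines in Python, since `OrderedDict` combines both structures. A sketch only — note that Redis itself uses approximate LRU via random sampling rather than an exact list:

```python
from collections import OrderedDict

class LRUCache:
    """Exact LRU: OrderedDict plays the role of the doubly-linked
    list + hashmap pair, giving O(1) get, put, and eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```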

Concepts Covered

caching · local cache · TTL · shared cache · cache coherence · single point of truth · consistent hashing · partitioning · horizontal scaling · LRU · eviction · cache capacity · invalidation · consistency · write-through · write-behind

Prerequisites

It is recommended to complete the previous tracks before starting this one. Concepts build progressively throughout the curriculum.