Sharad Raj


Cache Strategies in Distributed Systems

28 Feb 2026

Basic TTL caching is not enough.

You set a 60-second TTL on your Redis keys. Traffic is smooth. Then, in one moment, the cache expires and your database is hit by a sudden flood of requests. The system goes down. Users see 503s. You wonder: “What went wrong?”

The problem is simple: many keys expiring at the same time.

The Problem: Same-Time Expiry

When thousands of cache entries share the same TTL, they expire together. At that exact moment, every request that would have been a cache hit becomes a cache miss. All of them hit the database at once.

Think of it like this: if every season ticket in a stadium expires on the same day, the whole crowd lines up at the ticket counter at once. That counter is your database.

Basic TTL assumes keys expire at different times. In practice, they often don’t. You need ways to spread the load and protect the database.


Strategy 1: TTL Jitter

Problem: All keys expire at the same moment.

Solution: Add random time to the TTL. Instead of TTL = 60s, use TTL = 60s + random(0, 15)s. Keys now expire at 60s, 62s, 65s, 74s, and so on, spread over time instead of all at once.

When to use: Use for any cache. Easy to add, big payoff. Add jitter whenever you set a TTL.
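A minimal sketch of jitter in Python. The 0–15 second window mirrors the example above; the helper name and the redis-py call in the comment are illustrative, not from the article:

```python
import random


def ttl_with_jitter(base_ttl: int, max_jitter: int = 15) -> int:
    """Return the base TTL plus a random 0..max_jitter seconds.

    Keys written in the same burst now expire across a window
    instead of at a single instant.
    """
    return base_ttl + random.randint(0, max_jitter)


# Usage with a Redis client would look something like:
#   r.set("product:42", payload, ex=ttl_with_jitter(60))
```

A jitter window of roughly 10–25% of the base TTL is a common rule of thumb (an assumption here, not a rule from the article): big enough to spread expiries, small enough that freshness barely changes.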


Strategy 2: Probability-Based Early Expiration

Problem: Even with jitter, a popular key can still cause a burst when it expires.

Solution: Before the key expires, start recomputing it probabilistically. As time passes, the probability of early recomputation increases. By the time TTL actually expires, the cache is likely already refreshed.

Concept: Imagine a 60-second TTL. At 50 seconds, you might fetch again with 10% chance. At 55 seconds, 30%. At 59 seconds, 80%. The idea: spread the fetches over the end of the TTL instead of letting them all happen at expiry.

When to use: Hot keys that get read a lot, where you want fresh data but no expiry burst. Keep it simple: no need for exact math in production; the idea is “fetch earlier, with increasing chance.”


Strategy 3: Mutex / Cache Locking

Problem: When cache expires, 50,000 requests all try to recompute. All 50,000 hit the DB.

Solution: Only one request is allowed to fetch. The first request gets a lock (mutex). The rest wait. The winner fetches from DB, fills the cache, frees the lock. The others then read from the freshly filled cache.

Logic: “Only one request allowed to fetch at a time.”

When to use: Costly fetch (DB query, external API). You want exactly one fetch per expiry, with others waiting for the result.
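An in-process sketch of the locking idea (the class and method names are made up for illustration; a distributed version would use something like Redis `SET key token NX PX ttl` as the lock instead of a thread mutex):

```python
import threading


class SingleFlightCache:
    """On a miss, only the first caller runs the fetch; concurrent
    callers wait for its result instead of hitting the DB."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()
        self._inflight = {}  # key -> Event signalling "fetch done"

    def get(self, key, fetch):
        with self._lock:
            if key in self._data:
                return self._data[key]          # cache hit
            event = self._inflight.get(key)
            if event is None:
                # We won the lock: register the in-flight fetch.
                event = threading.Event()
                self._inflight[key] = event
                winner = True
            else:
                winner = False
        if winner:
            value = fetch()                     # exactly one DB hit
            with self._lock:
                self._data[key] = value
                del self._inflight[key]
            event.set()                         # wake the waiters
            return value
        event.wait()                            # losers wait
        with self._lock:
            return self._data[key]
```

The essential trick is that the in-flight registration happens under the lock, so there can only ever be one winner per miss; everyone else reads the winner’s result.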


Strategy 4: Stale-While-Revalidate (SWR)

Problem: Cache miss = user waits for DB. With many users, that means timeouts and 503s.

Solution: Never let the user see a miss. When TTL expires, keep serving the old value to users. In the background, send a single request to refresh the cache. Users might see data that’s a few seconds old, but the system stays up.

This is how CDNs and platforms like Netflix and Hotstar stay online: they choose staying up over being perfect.

When to use: Content where “slightly old” is fine (scores, product listings, info). Choose serving old data over failing.
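The serve-stale, refresh-in-background flow can be sketched in-process like this (class name and structure are illustrative; a Redis- or CDN-backed version follows the same shape):

```python
import threading
import time


class SWRCache:
    """Stale-while-revalidate: once the TTL passes, reads still
    return the old value immediately while a single background
    thread refreshes the entry."""

    def __init__(self, ttl: float):
        self._ttl = ttl
        self._lock = threading.Lock()
        self._entries = {}      # key -> (value, stored_at)
        self._refreshing = set()

    def get(self, key, fetch):
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                expired = (time.time() - stored_at) >= self._ttl
                if expired and key not in self._refreshing:
                    self._refreshing.add(key)
                    threading.Thread(
                        target=self._refresh, args=(key, fetch),
                        daemon=True,
                    ).start()
                return value    # stale or fresh: user never waits
        # True cold miss: nothing to serve, fetch inline once.
        value = fetch()
        with self._lock:
            self._entries[key] = (value, time.time())
        return value

    def _refresh(self, key, fetch):
        value = fetch()
        with self._lock:
            self._entries[key] = (value, time.time())
            self._refreshing.discard(key)
```

The `_refreshing` set is what keeps it to a single background request per key; without it, every stale read would spawn its own refresh.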


Strategy 5: Cache Warming / Pre-Warming

Problem: Cold cache at traffic spike. First wave of users triggers DB load.

Solution: Don’t wait for users. Fill the cache before the spike. Run a script or cron job minutes before the event. When users arrive, the cache is already warm.

When to use: Spikes you know are coming: product launches, flash sales, live events. Warm the cache before users arrive.
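A warming pass can be as simple as a loop over the keys you expect to be hot. The key list and `fetch` here are placeholders for your own catalog query and DB call:

```python
def warm_cache(cache: dict, keys, fetch):
    """Pre-fill the cache before an expected spike.

    Run from a cron job or deploy hook a few minutes before the
    event. Returns how many keys were actually fetched, skipping
    keys that are already warm.
    """
    warmed = 0
    for key in keys:
        if key not in cache:
            cache[key] = fetch(key)  # one controlled DB hit per key
            warmed += 1
    return warmed
```

Because warming runs before traffic arrives, the DB load happens at a time you choose, at a rate you control, instead of inside the spike.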


Tradeoffs

| Strategy | Freshness | Speed | Consistency | Best For |
| --- | --- | --- | --- | --- |
| TTL Jitter | Same as TTL | Fast | Catches up later | Default, all caches |
| Probabilistic Early | Better | Fast | Catches up later | Hot keys, read-heavy |
| Mutex | Same as TTL | Slight wait for some | Strong | Costly fetch |
| SWR | Slightly old | Fastest | Catches up later | CDN, stay up |
| Cache Warming | Fresh at spike | Fastest | Strong | Known events |

You can’t get the best of all three: freshness, speed, and consistency. Choose based on what matters most for your system.


When to Use Which

| Scenario | Strategy |
| --- | --- |
| Default cache setup | TTL Jitter |
| Hot key, avoid expiry spike | Probabilistic Early Expiration |
| Costly DB/API fetch | Mutex |
| CDN, live scores, product listings | SWR |
| Flash sale, product launch, live event | Cache Warming |
| Important paths | Jitter + Mutex + Warming |

Start with TTL Jitter everywhere. Add Mutex for costly keys. Use SWR when staying up matters more than fresh data. Use Cache Warming for known traffic spikes. Combine them where needed; they work together.

Basic TTL caching fails when many keys expire together. The fix is to spread expiry (jitter, early fetch by chance), limit fetches (mutex), avoid users seeing a miss (SWR), and load before spikes (cache warming).

Focus on how requests flow, when the DB is hit, and how to keep it safe. The exact implementation depends on your stack (Redis, Memcached, a CDN), but the ideas stay the same.