Cache Strategies in Distributed Systems
28 Feb 2026
Basic TTL caching is not enough.
You set a 60-second TTL on your Redis keys. Traffic is smooth. Then, in one moment, the cache expires and your database gets hit by a sudden flood of requests. The system goes down. Users see 503s. You wonder: “What went wrong?”
The problem is simple: many keys expiring at the same time.
The Problem: Same-Time Expiry
When thousands of cache entries share the same TTL, they expire together. At that exact moment, every request that would have been a cache hit becomes a cache miss. All of them hit the database at once.
Think of it like this:
- Netflix drops a new season at midnight. Millions of users hit “Play” at 00:00:01. The metadata for Episode 1 is cached with a 5-minute TTL. At 00:05:01, the cache expires. Every user refreshing gets a miss. The origin server gets hit hard.
- E-commerce flash sale at 12:00 PM. Product listings are cached for 2 minutes. At 12:02:01, all product keys expire together. Thousands of “Add to Cart” requests skip the cache and hit the DB. Checkout breaks.
- IPL live on Hotstar. The score is cached for 30 seconds. At the 30-second mark, millions of users refresh. Cache miss for all. The score API and DB collapse.
Basic TTL assumes keys expire at different times. In practice, they often don’t. You need ways to spread the load and protect the database.
Strategy 1: TTL Jitter
Problem: All keys expire at the same moment.
Solution: Add random time to the TTL. Instead of TTL = 60s, use TTL = 60s + random(0, 15)s. Keys now expire at 60s, 62s, 65s, 74s, and so on, spread over time instead of all at once.
When to use: Use for any cache. Easy to add, big payoff. Add jitter whenever you set a TTL.
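A minimal sketch of jittered TTLs in Python. The function and parameter names (`ttl_with_jitter`, `base`, `max_jitter`) are illustrative, not from any library:

```python
import random

def ttl_with_jitter(base=60, max_jitter=15):
    """Base TTL plus a uniform random offset, so keys written at
    the same moment expire at different moments."""
    return base + random.uniform(0, max_jitter)

# With a real Redis client this would look something like:
#   r.set(key, value, ex=int(ttl_with_jitter()))
```

One line of change at every `set` call, and the synchronized expiry spike turns into a 15-second spread.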
Strategy 2: Probability-Based Early Expiration
Problem: Even with jitter, a popular key can still cause a burst when it expires.
Solution: Before the key expires, start recomputing it probabilistically. As time passes, the probability of early recomputation increases. By the time TTL actually expires, the cache is likely already refreshed.
Concept: Imagine a 60-second TTL. At 50 seconds, you might fetch again with 10% chance. At 55 seconds, 30%. At 59 seconds, 80%. The idea: spread the fetches over the end of the TTL instead of letting them all happen at expiry.
When to use: Hot keys that get read a lot, where you want fresh data but no expiry burst. Keep it simple: there is no need for exact math in production; the idea is “fetch earlier, with increasing chance.”
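The ramp described above can be sketched as follows. This is an assumption-laden toy: the power-curve shape and the `steepness` parameter are my own illustrative choices, not a standard; production systems often use the XFetch formula from the cache-stampede literature instead.

```python
import random

def should_refresh_early(age, ttl, steepness=2.0):
    """Return True with a probability that climbs from 0 (fresh
    entry) to ~1 (entry at its TTL). Higher `steepness` delays
    most early refreshes toward the end of the TTL."""
    fraction = min(age / ttl, 1.0)
    return random.random() < fraction ** steepness
```

On every cache hit, call this with the entry's age; if it returns True, recompute the value in the background even though the TTL has not expired yet.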
Strategy 3: Mutex / Cache Locking
Problem: When cache expires, 50,000 requests all try to recompute. All 50,000 hit the DB.
Solution: Only one request is allowed to fetch. The first request gets a lock (mutex). The rest wait. The winner fetches from DB, fills the cache, frees the lock. The others then read from the freshly filled cache.
Logic: “Only one request allowed to fetch at a time.”
When to use: Costly fetch (DB query, external API). You want exactly one fetch per expiry, with others waiting for the result.
Strategy 4: Stale-While-Revalidate (SWR)
Problem: Cache miss = user waits for DB. With many users, that means timeouts and 503s.
Solution: Never let the user see a miss. When TTL expires, keep serving the old value to users. In the background, send a single request to refresh the cache. Users might see data that’s a few seconds old, but the system stays up.
This is how CDNs and platforms like Netflix and Hotstar work: they prioritize staying up over being perfectly fresh.
When to use: Content where “slightly old” is fine (scores, product listings, info). Choose serving old data over failing.
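A minimal SWR sketch, again using an in-memory dict in place of a real cache and a hypothetical `fetch_origin` in place of the DB call. Stale reads return immediately; a single background thread refreshes the entry:

```python
import time
import threading

CACHE = {}  # key -> (value, expires_at)
_refreshing = set()
_refresh_lock = threading.Lock()

def fetch_origin(key):
    return f"fresh-{key}"

def _refresh(key, ttl):
    try:
        CACHE[key] = (fetch_origin(key), time.time() + ttl)
    finally:
        with _refresh_lock:
            _refreshing.discard(key)

def get_swr(key, ttl=30):
    entry = CACHE.get(key)
    if entry is None:
        # True cold miss: the very first request must fetch.
        value = fetch_origin(key)
        CACHE[key] = (value, time.time() + ttl)
        return value
    value, expires_at = entry
    if time.time() >= expires_at:
        # Stale: serve the old value, refresh once in the background.
        with _refresh_lock:
            if key not in _refreshing:
                _refreshing.add(key)
                threading.Thread(target=_refresh, args=(key, ttl)).start()
    return value
```

The `_refreshing` set plays the same role as the mutex in Strategy 3: no matter how many requests see the stale entry, only one background refresh runs.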
Strategy 5: Cache Warming / Pre-Warming
Problem: Cold cache at traffic spike. First wave of users triggers DB load.
Solution: Don’t wait for users. Fill the cache before the spike. Run a script or cron job minutes before the event. When users arrive, the cache is already warm.
- Netflix release: Warm up info and CDN edges before midnight.
- E-commerce sale: Load sale products into cache before 12:00 PM.
- IPL match: Warm score and match info before the first ball.
When to use: Spikes you know are coming: product launches, flash sales, live events. Warm the cache before users arrive.
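A warming job can be as simple as the sketch below, run from cron a few minutes before the event. Everything here is a hypothetical stand-in (`cache` dict for Redis, `load_product_from_db` for the real DB call); note the jitter folded in so the warmed keys do not all expire together later, combining Strategy 1 with Strategy 5:

```python
import time
import random

cache = {}  # stand-in for Redis: key -> (value, expires_at)

def load_product_from_db(pid):
    # Hypothetical DB call; returns the row to be cached.
    return {"id": pid, "price": 999}

def warm_cache(product_ids, ttl=300, max_jitter=30):
    """Pre-fill the cache before the spike so the first wave of
    users hits a warm cache instead of the DB."""
    for pid in product_ids:
        expires_at = time.time() + ttl + random.uniform(0, max_jitter)
        cache[f"product:{pid}"] = (load_product_from_db(pid), expires_at)
```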
Tradeoffs
| Strategy | Freshness | Speed | Same for All | Best For |
|---|---|---|---|---|
| TTL Jitter | Same as TTL | Fast | Catches up later | Default, all caches |
| Probabilistic Early | Better | Fast | Catches up later | Hot keys, read-heavy |
| Mutex | Same as TTL | Slight wait for some | Strong | Costly fetch |
| SWR | Slightly old | Fastest | Catches up later | CDN, stay up |
| Cache Warming | Fresh at spike | Fastest | Strong | Known events |
- Freshness: How new is the data?
- Speed: How fast does the user get a response?
- Same for All: Do all users see the same value at the same time?
You can’t get the best of all three. Choose based on what matters most for your system.
When to Use Which
| Scenario | Strategy |
|---|---|
| Default cache setup | TTL Jitter |
| Hot key, avoid expiry spike | Probabilistic Early Expiration |
| Costly DB/API fetch | Mutex |
| CDN, live scores, product listings | SWR |
| Flash sale, product launch, live event | Cache Warming |
| Use more than one for important paths | Jitter + Mutex + Warming |
Start with TTL Jitter everywhere. Add Mutex for costly keys. Use SWR when staying up matters more than fresh data. Use Cache Warming for known traffic spikes. Use more than one if needed; they work together.
Basic TTL caching fails when many keys expire together. The fix is to spread expiry (jitter, early fetch by chance), limit fetches (mutex), avoid users seeing a miss (SWR), and load before spikes (cache warming).
Focus on how requests flow, when the DB is hit, and how to keep it safe. The exact way you build it depends on your stack (Redis, Memcached, CDN), but the ideas stay the same.