Cache Strategies in Distributed Systems
28 Feb 2026
Basic TTL caching is not enough.
You set a 60-second TTL on your Redis keys. Traffic is smooth. Then, in one moment, the cache expires and your database gets hit by a sudden flood of requests. The system goes down. Users see 503s. You wonder: “What went wrong?”
The problem is simple: many keys expiring at the same time.
The Problem: Same-Time Expiry
When thousands of cache entries share the same TTL, they expire together. At that exact moment, every request that would have been a cache hit becomes a cache miss. All of them hit the database at once.
Think of it like this:
- Netflix drops a new season at midnight. Millions of users hit “Play” at 00:00:01. The metadata for Episode 1 is cached with a 5-minute TTL. At 00:05:01, the cache expires. Every user refreshing gets a miss. The origin server gets hit hard.
- E-commerce flash sale at 12:00 PM. Product listings are cached for 2 minutes. At 12:02:01, all product keys expire together. Thousands of “Add to Cart” requests skip the cache and hit the DB. Checkout breaks.
- IPL live on Hotstar. The score is cached for 30 seconds. At the 30-second mark, millions of users refresh. Cache miss for all. The score API and DB collapse.
Basic TTL assumes keys expire at different times. In practice, they often don’t. You need ways to spread the load and protect the database.
Strategy 1: TTL Jitter
Problem: All keys expire at the same moment.
Solution: Add random time to the TTL. Instead of TTL = 60s, use TTL = 60s + random(0, 15)s. Keys now expire at 60s, 62s, 65s, 74s, and so on, spread over time instead of all at once.
When to use: Use for any cache. Easy to add, big payoff. Add jitter whenever you set a TTL.
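A minimal sketch of jittered TTLs in Python. The function and parameter names (`ttl_with_jitter`, `base`, `max_jitter`) are illustrative, not from any library:

```python
import random

def ttl_with_jitter(base=60, max_jitter=15):
    """Base TTL plus a uniform random offset, so keys written at
    the same moment expire at different moments."""
    return base + random.uniform(0, max_jitter)

# With a real Redis client this would look something like:
#   r.set(key, value, ex=int(ttl_with_jitter()))
```

One line of change at every `set` call, and the synchronized expiry spike turns into a 15-second spread.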
Strategy 2: Probability-Based Early Expiration
Problem: Even with jitter, a popular key can still cause a burst when it expires.
Solution: Before the key expires, start recomputing it probabilistically. As time passes, the probability of early recomputation increases. By the time TTL actually expires, the cache is likely already refreshed.
Concept: Imagine a 60-second TTL. At 50 seconds, you might fetch again with 10% chance. At 55 seconds, 30%. At 59 seconds, 80%. The idea: spread the fetches over the end of the TTL instead of letting them all happen at expiry.
When to use: Hot keys that get read a lot, where you want fresh data but no expiry burst. Keep it simple: there is no need for exact math in production; the idea is “fetch earlier, with increasing chance.”
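The ramp described above can be sketched as follows. This is an assumption-laden toy: the power-curve shape and the `steepness` parameter are my own illustrative choices, not a standard; production systems often use the XFetch formula from the cache-stampede literature instead.

```python
import random

def should_refresh_early(age, ttl, steepness=2.0):
    """Return True with a probability that climbs from 0 (fresh
    entry) to ~1 (entry at its TTL). Higher `steepness` delays
    most early refreshes toward the end of the TTL."""
    fraction = min(age / ttl, 1.0)
    return random.random() < fraction ** steepness
```

On every cache hit, call this with the entry's age; if it returns True, recompute the value in the background even though the TTL has not expired yet.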
Strategy 3: Mutex / Cache Locking
Problem: When cache expires, 50,000 requests all try to recompute. All 50,000 hit the DB.
Solution: Only one request is allowed to fetch. The first request gets a lock (mutex). The rest wait. The winner fetches from DB, fills the cache, frees the lock. The others then read from the freshly filled cache.
Logic: “Only one request allowed to fetch at a time.”
When to use: Costly fetch (DB query, external API). You want exactly one fetch per expiry, with others waiting for the result.
Strategy 4: Stale-While-Revalidate (SWR)
Problem: Cache miss = user waits for DB. With many users, that means timeouts and 503s.
Solution: Never let the user see a miss. When TTL expires, keep serving the old value to users. In the background, send a single request to refresh the cache. Users might see data that’s a few seconds old, but the system stays up.
This is how CDNs and platforms like Netflix and Hotstar work: they prioritize staying up over being perfectly fresh.
When to use: Content where “slightly old” is fine (scores, product listings, info). Choose serving old data over failing.
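A minimal SWR sketch, again using an in-memory dict in place of a real cache and a hypothetical `fetch_origin` in place of the DB call. Stale reads return immediately; a single background thread refreshes the entry:

```python
import time
import threading

CACHE = {}  # key -> (value, expires_at)
_refreshing = set()
_refresh_lock = threading.Lock()

def fetch_origin(key):
    return f"fresh-{key}"

def _refresh(key, ttl):
    try:
        CACHE[key] = (fetch_origin(key), time.time() + ttl)
    finally:
        with _refresh_lock:
            _refreshing.discard(key)

def get_swr(key, ttl=30):
    entry = CACHE.get(key)
    if entry is None:
        # True cold miss: the very first request must fetch.
        value = fetch_origin(key)
        CACHE[key] = (value, time.time() + ttl)
        return value
    value, expires_at = entry
    if time.time() >= expires_at:
        # Stale: serve the old value, refresh once in the background.
        with _refresh_lock:
            if key not in _refreshing:
                _refreshing.add(key)
                threading.Thread(target=_refresh, args=(key, ttl)).start()
    return value
```

The `_refreshing` set plays the same role as the mutex in Strategy 3: no matter how many requests see the stale entry, only one background refresh runs.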
Strategy 5: Cache Warming / Pre-Warming
Problem: Cold cache at traffic spike. First wave of users triggers DB load.
Solution: Don’t wait for users. Fill the cache before the spike. Run a script or cron job minutes before the event. When users arrive, the cache is already warm.
- Netflix release: Warm up info and CDN edges before midnight.
- E-commerce sale: Load sale products into cache before 12:00 PM.
- IPL match: Warm score and match info before the first ball.
When to use: Spikes you know are coming: product launches, flash sales, live events. Warm the cache before users arrive.
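A warming job can be as simple as the sketch below, run from cron a few minutes before the event. Everything here is a hypothetical stand-in (`cache` dict for Redis, `load_product_from_db` for the real DB call); note the jitter folded in so the warmed keys do not all expire together later, combining Strategy 1 with Strategy 5:

```python
import time
import random

cache = {}  # stand-in for Redis: key -> (value, expires_at)

def load_product_from_db(pid):
    # Hypothetical DB call; returns the row to be cached.
    return {"id": pid, "price": 999}

def warm_cache(product_ids, ttl=300, max_jitter=30):
    """Pre-fill the cache before the spike so the first wave of
    users hits a warm cache instead of the DB."""
    for pid in product_ids:
        expires_at = time.time() + ttl + random.uniform(0, max_jitter)
        cache[f"product:{pid}"] = (load_product_from_db(pid), expires_at)
```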
Tradeoffs
| Strategy | Freshness | Speed | Same for All | Best For |
|---|---|---|---|---|
| TTL Jitter | Same as TTL | Fast | Catches up later | Default, all caches |
| Probabilistic Early | Better | Fast | Catches up later | Hot keys, read-heavy |
| Mutex | Same as TTL | Slight wait for some | Strong | Costly fetch |
| SWR | Slightly old | Fastest | Catches up later | CDN, stay up |
| Cache Warming | Fresh at spike | Fastest | Strong | Known events |
- Freshness: How new is the data?
- Speed: How fast does the user get a response?
- Same for All: Do all users see the same value at the same time?
You can’t get the best of all three. Choose based on what matters most for your system.
When to Use Which
| Scenario | Strategy |
|---|---|
| Default cache setup | TTL Jitter |
| Hot key, avoid expiry spike | Probabilistic Early Expiration |
| Costly DB/API fetch | Mutex |
| CDN, live scores, product listings | SWR |
| Flash sale, product launch, live event | Cache Warming |
| Use more than one for important paths | Jitter + Mutex + Warming |
Start with TTL Jitter everywhere. Add Mutex for costly keys. Use SWR when staying up matters more than fresh data. Use Cache Warming for known traffic spikes. Use more than one if needed; they work together.
Basic TTL caching fails when many keys expire together. The fix is to spread expiry (jitter, early fetch by chance), limit fetches (mutex), avoid users seeing a miss (SWR), and load before spikes (cache warming).
Focus on how requests flow, when the DB is hit, and how to keep it safe. The exact way you build it depends on your stack (Redis, Memcached, CDN), but the ideas stay the same.