Cache & Caching Strategies

Modern applications like streaming platforms, e-commerce sites, and social media handle millions of requests per second. If every request directly hits the database, the system quickly becomes slow and expensive.

This is why caching is one of the most important concepts in system design.

Caching stores frequently accessed data in a fast storage layer (like Redis or Memcached) so that future requests can be served much faster.

What is Caching?

Before we dive into caching strategies, we need to truly understand what a cache is and why it's orders of magnitude faster than a database. This goes all the way down to hardware.

"A cache is not magic — it's physics. Data stored closer to the CPU, in faster memory, with no network in between."

The Memory Hierarchy — Why Location = Speed

Every computer has a memory hierarchy. The closer memory is to the CPU, the faster it is but also the smaller and more expensive it gets. Understanding this is the foundation of understanding caching.

FASTEST / SMALLEST / MOST EXPENSIVE
────────────────────────────────────────────────────────────
Level                      Typical size    Typical latency
L1 Cache (CPU on-chip)     ~32 KB          ~1 ns (nanoseconds)
L2 Cache (CPU on-chip)     ~256 KB         ~5 ns
L3 Cache (CPU shared)      ~8 MB           ~20 ns
RAM (main memory)          ~16 GB          ~100 ns
SSD (local disk)           ~1 TB           ~100 µs (microseconds)
Database (remote disk)     ~∞              ~10 ms+ (milliseconds)
────────────────────────────────────────────────────────────
SLOWEST / LARGEST / CHEAPEST

A database query travels across the network, goes through query parsing and index lookups, often reads from disk (SSD or HDD), and returns over the network again. That is why it typically takes 10 to 200 ms. An in-memory cache like Redis keeps its data in RAM on a nearby server, so response time drops to under 1 ms. Depending on the query, that is roughly a 10x to 1000x difference.
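The latency figures above can be turned into a quick back-of-the-envelope estimate. The sketch below is only an illustration: `LatencyEstimate`, its method names, and the idea of weighting by a hit ratio (the fraction of reads served from the cache) are assumptions added here, not something measured in this article.

```java
// Hypothetical helper for rough latency arithmetic; all numbers are illustrative.
final class LatencyEstimate {

    // How many times faster the cache is than the database for a single read.
    static double speedup(double dbMs, double cacheMs) {
        return dbMs / cacheMs;
    }

    // Average read latency when a fraction `hitRatio` of reads is served from the cache.
    // Assumption: a miss pays for the cache lookup and then for the database query.
    static double expectedLatencyMs(double hitRatio, double cacheMs, double dbMs) {
        if (hitRatio < 0.0 || hitRatio > 1.0) {
            throw new IllegalArgumentException("hitRatio must be between 0 and 1");
        }
        double hitCost = cacheMs;
        double missCost = cacheMs + dbMs;
        return hitRatio * hitCost + (1.0 - hitRatio) * missCost;
    }
}
```

With a 1 ms cache and a 10 ms database, a 90% hit ratio gives an average of about 2 ms per read under these assumptions, which is why even a modest hit ratio pays off.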

Why is RAM (Redis) So Much Faster Than a Database?

There are 4 core reasons why an in-memory cache like Redis is dramatically faster than a traditional disk-based database:

1. No Disk I/O
Disk-based databases often have to read from or write to disk, and even SSDs are roughly 1000x slower than RAM.
Redis keeps its whole dataset in RAM, so reads need zero disk access.

2. Shorter Request Path (No Hop to Disk)
A DB query: App → Network → DB Server → Parse → Disk → Return
A cache hit: App → Network → Redis RAM → Return

3. No Query Parsing / Execution Planning
SQL databases parse your query, build an execution plan, check indexes, and join tables.
Redis does a plain key lookup, which is O(1) on average.

4. Data Format Ready to Serve
Cache stores pre-serialized, pre-computed results.
DB stores raw normalized rows that need joining & processing.

What Exactly is a Cache in Software Systems?

In system design, a cache is a key-value store that sits between your application and your database. The key is usually a string (like "user:42") and the value is the serialized result (like a JSON object). Lookup is a plain hash map operation with O(1) average time complexity.
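To make the key-value idea concrete, here is a minimal in-process sketch. `SimpleKeyValueCache` is a hypothetical class written for this article; it assumes string keys and already serialized string values, and it leaves out expiry and size limits.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal cache: a thread-safe hash map from string keys to serialized values.
final class SimpleKeyValueCache {
    private final ConcurrentHashMap<String, String> entries = new ConcurrentHashMap<>();

    // Store (or overwrite) the serialized value for a key such as "user:42".
    void put(String key, String serializedValue) {
        entries.put(key, serializedValue);
    }

    // Average O(1) hash lookup; an empty Optional signals a cache miss.
    Optional<String> get(String key) {
        return Optional.ofNullable(entries.get(key));
    }

    // Remove an entry, for example after the underlying data has changed.
    void remove(String key) {
        entries.remove(key);
    }
}
```

Calling `put("user:42", json)` and later `get("user:42")` returns the stored JSON without touching the database, while `get` on an unknown key returns an empty `Optional`.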

Real caches like Redis build on this hash-map principle and add a single-threaded event loop for command execution (no lock contention), advanced data structures (lists, sets, sorted sets), persistence options, and built-in replication and clustering.

Types of Caches in System Design:

1. In-Process / Local Cache (lives inside your app's memory)
Examples: Guava Cache, Caffeine, HashMap
Speed: Fastest possible (no network). Size: Limited to JVM heap.

2. Distributed Cache (shared across multiple app servers)
Examples: Redis, Memcached, Hazelcast
Speed: ~1ms over network. Size: Scalable to TBs.

3. CDN Cache (edge servers close to users worldwide)
Examples: Cloudflare, AWS CloudFront, Fastly
Speed: <10ms from any location. Caches static assets & pages.

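4. Database Cache (the DB engine caches data or query results internally)
Examples: PostgreSQL shared_buffers and the InnoDB buffer pool (cached data pages), MySQL query cache (cached result sets; removed in MySQL 8.0)
Speed: Skips disk reads for recently used data or, with a query cache, for repeated identical queries.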

When NOT to Use a Cache

Data changes with every single request
Data must always be 100% real-time accurate (e.g. stock trades)
Each user sees completely unique data (low reuse)
Security-sensitive data that must not be shared across users

Caching Strategies:

There are several caching strategies used in system design depending on data consistency needs, read/write patterns, and performance requirements.

Let's explore the most common ones.

1. Cache-Aside (Lazy Loading)

Cache-Aside is the most widely used caching strategy. The application code is responsible for loading data into the cache. The cache does not interact with the database directly.

Example:

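public Product getProductById(String productId) {
    // 1. Try the cache first.
    Product product = cache.get(productId);
    if (product != null) {
        log.debug("Cache Hit");
        return product;
    }

    // 2. Cache miss: load from the database, which remains the source of truth.
    log.debug("Cache Miss - Fetching from DB");
    product = database.getProduct(productId);

    // 3. Populate the cache for later requests. Skip unknown products so null is never cached.
    if (product != null) {
        cache.put(productId, product);
    }

    return product;
}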

Pros

Cache only stores what's actually requested
Cache failures don't break the app
Works great for read-heavy workloads

Cons

First request is slow (cache miss)
Possibility of stale data

2. Read-Through Cache

Similar to Cache-Aside, but here the cache itself fetches data from the database on a miss, so the application never reads from the database directly. Think of the cache as a smart proxy.

The key difference from Cache-Aside: the application code is cleaner because it never has to fetch from the DB itself. But you need a cache library that supports this (like NCache or Ehcache).

Pros

Application code is simple & clean
Consistent data loading logic in one place

Cons

First read is always a miss (same cold start)
Requires cache library with DB integration
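The read-through idea can be sketched without any particular library. `ReadThroughCache` below is a hypothetical class, not the NCache or Ehcache API; the `loader` function is an assumption that stands in for the database lookup the cache performs on a miss.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through cache: callers only ever talk to the cache.
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for the database lookup

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // On a hit the stored value is returned. On a miss the loader runs once,
        // and its result is stored (a null result is returned but not cached).
        return entries.computeIfAbsent(key, loader);
    }
}
```

The application would construct the cache once with something like `new ReadThroughCache<>(id -> repository.findById(id))`, where `repository` is a placeholder for whatever data access code exists, and then call only `get`.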

3. Write-Through Cache

Every time data is written, it goes to both the cache and the database as part of the same operation. The write only succeeds when both are updated. This ensures the cache is always in sync.

Pros

Cache is always consistent with DB
No stale reads after writes
No data loss on cache failure

Cons

Write latency is higher (two writes)
Cache may fill with data that's never read
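A minimal write-through sketch follows. `WriteThroughStore` is a hypothetical class, and a plain `Map` stands in for the real database; both are assumptions for illustration. The sketch writes to the database first, so a failed database write leaves the cache untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical write-through store: every write updates the database and the cache together.
final class WriteThroughStore {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database; // stand-in for a real database client

    WriteThroughStore(Map<String, String> database) {
        this.database = database;
    }

    // The caller only gets control back after both writes have completed.
    synchronized void put(String key, String value) {
        database.put(key, value); // if this throws, the cache is not updated
        cache.put(key, value);
    }

    // Reads of previously written keys are served from the cache.
    synchronized String get(String key) {
        return cache.get(key);
    }
}
```

The extra write latency listed in the cons is visible here: `put` does two writes before it returns.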

4. Write-Behind (Write-Back)

Write to the cache first and return immediately. The database is updated asynchronously in the background. This makes writes blazing fast, but introduces a window where cache and DB are out of sync.

Pros

Ultra-fast writes — no DB wait
Reduces DB write load significantly
Can batch multiple writes into one DB call

Cons

Data loss if cache crashes before DB sync
Complex to implement reliably
Inconsistency window is a risk
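The sketch below illustrates write-behind under simplifying assumptions: `WriteBehindCache` is a hypothetical class, a `Map` stands in for the database, and `flush()` is called explicitly where a production system would run it on a background timer or worker thread.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical write-behind cache: writes land in memory and reach the database later.
final class WriteBehindCache {
    private final Map<String, String> cache = new HashMap<>();
    // Pending writes keyed by cache key, so repeated writes to one key collapse into one DB write.
    private final Map<String, String> dirty = new LinkedHashMap<>();
    private final Map<String, String> database; // stand-in for a real database client

    WriteBehindCache(Map<String, String> database) {
        this.database = database;
    }

    // Fast path: update memory only and return immediately.
    synchronized void put(String key, String value) {
        cache.put(key, value);
        dirty.put(key, value);
    }

    synchronized String get(String key) {
        return cache.get(key);
    }

    // Write all pending entries to the database in one batch; returns how many were written.
    synchronized int flush() {
        int written = dirty.size();
        database.putAll(dirty);
        dirty.clear();
        return written;
    }
}
```

Anything still in `dirty` when the process dies is lost, which is exactly the data-loss risk listed in the cons.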

5. Write-Around Cache

Writes go directly to the database, bypassing the cache completely. Data only enters the cache when it's read (on cache miss). Best when data is written once and rarely or never read again.

Pros

Cache not polluted with write-only data
Great for large, infrequently-read data

Cons

First read after write always misses cache
Higher read latency initially
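Here is a small write-around sketch. `WriteAroundStore` is a hypothetical class with a `Map` standing in for the database. Dropping any existing cached copy on write is an added assumption, made so that a later read does not return the old value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical write-around store: writes skip the cache, reads fill it lazily.
final class WriteAroundStore {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database; // stand-in for a real database client

    WriteAroundStore(Map<String, String> database) {
        this.database = database;
    }

    // Writes go straight to the database; the new value is not cached.
    synchronized void put(String key, String value) {
        database.put(key, value);
        cache.remove(key); // assumption: invalidate any stale cached copy
    }

    // Reads populate the cache on a miss, as in Cache-Aside.
    synchronized String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String loaded = database.get(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    synchronized boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

Right after a `put`, `isCached` is false for that key; only the first `get` brings the value into the cache.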

6. Refresh-Ahead Cache

The cache proactively refreshes data before it expires, based on a prediction of what will be needed next. Hot data stays warm, so reads of popular keys rarely hit an expired entry. It's like a chef pre-cooking the most popular dishes before the dinner rush.

Pros

Near-zero latency for frequently accessed data
Eliminates thundering herd on expiry

Cons

May refresh data that won't be used (waste)
Requires predicting access patterns accurately
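One simple way to approximate refresh-ahead is to reload entries that are close to expiry and have been read since their last load. The sketch below makes several assumptions: `RefreshAheadCache` is a hypothetical class, "recently read" is used as the prediction of future demand, time is passed in explicitly to keep the example deterministic, and `refreshAhead` would be driven by a background scheduler in practice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical refresh-ahead cache with explicit timestamps (milliseconds).
final class RefreshAheadCache {
    private static final class CachedValue {
        String value;
        long loadedAtMillis;
        boolean accessedSinceLoad;
    }

    private final Map<String, CachedValue> entries = new HashMap<>();
    private final Function<String, String> loader; // stands in for the database lookup
    private final long ttlMillis;
    private final long refreshAfterMillis; // should be smaller than ttlMillis

    RefreshAheadCache(Function<String, String> loader, long ttlMillis, long refreshAfterMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.refreshAfterMillis = refreshAfterMillis;
    }

    synchronized String get(String key, long nowMillis) {
        CachedValue entry = entries.get(key);
        if (entry == null || nowMillis - entry.loadedAtMillis >= ttlMillis) {
            entry = load(key, nowMillis); // miss or expired: blocking load
        }
        entry.accessedSinceLoad = true;
        return entry.value;
    }

    // Reload entries that are near expiry and were read since their last load.
    synchronized int refreshAhead(long nowMillis) {
        int refreshed = 0;
        for (Map.Entry<String, CachedValue> e : entries.entrySet()) {
            CachedValue entry = e.getValue();
            long age = nowMillis - entry.loadedAtMillis;
            if (entry.accessedSinceLoad && age >= refreshAfterMillis && age < ttlMillis) {
                entry.value = loader.apply(e.getKey());
                entry.loadedAtMillis = nowMillis;
                entry.accessedSinceLoad = false;
                refreshed++;
            }
        }
        return refreshed;
    }

    private CachedValue load(String key, long nowMillis) {
        CachedValue entry = new CachedValue();
        entry.value = loader.apply(key);
        entry.loadedAtMillis = nowMillis;
        entries.put(key, entry);
        return entry;
    }
}
```

Entries that nobody has read since the last load are left to expire, which limits the wasted refreshes mentioned in the cons.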

7. Cache Eviction Policies

Cache memory is limited. When it fills up, we need to evict (remove) some entries to make room for new ones. The policy you choose dramatically affects cache performance. Below are some common cache eviction policies.

LRU (Least Recently Used) → Evict the item accessed longest ago
LFU (Least Frequently Used) → Evict the item with the fewest accesses
FIFO (First In, First Out) → Evict the item that was added first
TTL (Time To Live) → Evict the item when its expiry timer runs out
MRU (Most Recently Used) → Evict the most recently accessed item (rare, special cases)
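As an illustration of LRU, Java's `LinkedHashMap` can keep entries in access order and evict the eldest one when a size limit is exceeded. `LruCache` is a hypothetical class name; the sketch is not thread-safe and is meant only to show the policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache built on LinkedHashMap's access-order mode.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: get() and put() move an entry to the "most recent" end.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `b` is the entry that was accessed longest ago.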

Conclusion

Caching is a critical building block in scalable system design.

By using the right caching strategy, systems can:

Reduce database load
Improve response time
Scale to millions of users
Lower infrastructure costs

Understanding these caching strategies helps engineers design high-performance distributed systems used by companies like Netflix, Amazon, and Instagram.
