Caching Strategies

A cache trades freshness for speed: you keep a copy of an expensive-to-produce value close to where it is consumed so the next reader does not pay the full cost. In Node you probably reach for an in-process LRU (lru-cache) and a shared store (ioredis against Redis). Rust’s equivalents are moka for fast in-process caching and the redis crate for a shared, cross-instance cache. This chapter covers both tiers, how to set TTLs, and how to invalidate without leaving stale data behind.

Quick Overview

There are two caches you will almost always combine in a production service. An in-process (L1) cache lives in the application’s own memory: zero network hops, no serialization, but private to one instance and lost on restart. A shared (L2) cache like Redis is reachable by every instance and survives application restarts, at the cost of a network round-trip and (de)serialization. moka is a concurrent, bounded cache with size-based eviction and time-based expiry, plus built-in request coalescing, so a cache stampede cannot fire the same expensive load a hundred times. The redis crate gives you the same GET, SET key value EX ttl, and DEL commands you already know from ioredis. The hard part is never the storage; it is invalidation: deciding when a cached copy is wrong and removing it everywhere.

TypeScript/JavaScript Example

A typical Node service with a two-tier cache: an in-process lru-cache in front of Redis (ioredis), with TTLs and explicit invalidation on write.

// npm install lru-cache ioredis
import { LRUCache } from "lru-cache";
import Redis from "ioredis";

interface User {
  id: number;
  name: string;
}

const redis = new Redis(process.env.REDIS_URL ?? "redis://127.0.0.1:6379");

// L1: in-process LRU, bounded to 10k entries, each living 60s.
const l1 = new LRUCache<string, User>({
  max: 10_000,
  ttl: 60_000, // milliseconds
});

let dbCalls = 0;
async function loadUserFromDb(id: number): Promise<User> {
  dbCalls++;
  // Pretend this is a slow query.
  await new Promise((r) => setTimeout(r, 10));
  return { id, name: `user-${id}` };
}

async function getUser(id: number): Promise<User> {
  const key = `user:${id}`;

  // 1. L1 lookup.
  const local = l1.get(key);
  if (local) return local;

  // 2. L2 (Redis) lookup.
  const cached = await redis.get(key);
  if (cached) {
    const user = JSON.parse(cached) as User;
    l1.set(key, user);
    return user;
  }

  // 3. Miss in both tiers: load and back-fill, with a TTL on Redis.
  const user = await loadUserFromDb(id);
  await redis.set(key, JSON.stringify(user), "EX", 300);
  l1.set(key, user);
  return user;
}

// On a write, invalidate BOTH tiers: this instance's L1 and the shared Redis key.
// Other instances' L1 copies remain until their own TTL expires.
async function invalidateUser(id: number): Promise<void> {
  l1.delete(`user:${id}`);
  await redis.del(`user:${id}`);
}

Key points:

lru-cache bounds memory by entry count and supports a per-cache ttl.
ioredis exposes get / set ... EX / del — the raw Redis command surface.
Cache-aside (a.k.a. lazy loading) is the dominant pattern: read cache, fall through to the source, back-fill.
Two concurrent cache misses for the same key both run loadUserFromDb — lru-cache does not coalesce them.
Invalidation is manual and easy to get wrong: forget one tier and you serve stale data.
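
The cache-aside flow in the key points above does not depend on any particular library. As a minimal sketch in Rust, the language the rest of this chapter uses, a plain HashMap can stand in for the cache. The names get_cache_aside and load are illustrative only and belong to neither lru-cache nor moka.

```rust
use std::collections::HashMap;

/// Cache-aside lookup: read the cache, fall through to the source on a miss,
/// then back-fill so the next reader gets a hit.
/// `load` is a hypothetical stand-in for a slow database query.
fn get_cache_aside(
    cache: &mut HashMap<u64, String>,
    id: u64,
    load: impl FnOnce(u64) -> String,
) -> String {
    // 1. Hit: return the cached copy without touching the source.
    if let Some(hit) = cache.get(&id) {
        return hit.clone();
    }
    // 2. Miss: pay the full cost once.
    let value = load(id);
    // 3. Back-fill for later readers.
    cache.insert(id, value.clone());
    value
}
```

Calling it twice with the same id invokes load only on the first call. Note that this sketch has no bound, no TTL, and no protection against two concurrent misses, which are exactly the gaps the rest of the chapter addresses.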

Rust Equivalent

The idiomatic in-process cache is moka, which is concurrent (built for multi-threaded async servers), bounded, and TTL-aware. Its standout feature versus lru-cache is read-through with stampede protection: try_get_with runs the loader at most once per key even under a thundering herd. The current stable toolchain is Rust 1.96.0 on the 2024 edition; cargo new selects it automatically.

cargo add moka --features future
cargo add tokio --features full

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

use moka::future::Cache;

// A "database" that is slow and counts how often it is hit.
#[derive(Clone)]
struct Db {
    calls: Arc<AtomicU64>,
}

impl Db {
    async fn load_user(&self, id: u64) -> String {
        self.calls.fetch_add(1, Ordering::Relaxed);
        // Pretend this is a slow network/database round-trip.
        tokio::time::sleep(Duration::from_millis(10)).await;
        format!("user-{id}")
    }
}

#[tokio::main]
async fn main() {
    let db = Db { calls: Arc::new(AtomicU64::new(0)) };

    // A bounded cache: at most 10_000 entries, each living for 60 seconds.
    let cache: Cache<u64, String> = Cache::builder()
        .max_capacity(10_000)
        .time_to_live(Duration::from_secs(60))
        .build();

    // `try_get_with` is the read-through pattern: on a miss it runs the
    // closure, stores the result, and, crucially, coalesces concurrent
    // callers for the same key so the closure runs at most once.
    let load = |id: u64| {
        let db = db.clone();
        async move { Ok::<_, std::convert::Infallible>(db.load_user(id).await) }
    };

    // First call for key 42: a miss, so the DB is hit.
    let a = cache.try_get_with(42, load(42)).await.unwrap();
    // Second call: a hit, served from memory, DB untouched.
    let b = cache.try_get_with(42, load(42)).await.unwrap();

    println!("a = {a}");
    println!("b = {b}");
    println!("db calls = {}", db.calls.load(Ordering::Relaxed));

    // Explicit invalidation removes a single key.
    cache.invalidate(&42).await;
    let c = cache.try_get_with(42, load(42)).await.unwrap();
    println!("c = {c}");
    println!("db calls after invalidate = {}", db.calls.load(Ordering::Relaxed));
}

Real output:

a = user-42
b = user-42
db calls = 1
c = user-42
db calls after invalidate = 2

The second try_get_with(42, ...) was a hit, so db calls stayed at 1. After invalidate(&42), the next read missed and the DB was hit again, bumping the count to 2.

Detailed Explanation

`Cache::builder()` and bounds

let cache: Cache<u64, String> = Cache::builder()
    .max_capacity(10_000)
    .time_to_live(Duration::from_secs(60))
    .build();

moka caches are bounded by design. max_capacity sets the maximum number of entries (or a maximum total weight if you supply a weigher), and moka’s TinyLFU-based policy takes access frequency into account when deciding what to keep, which usually yields a better hit ratio than a plain LRU. This is a deliberate contrast with a naive Map-as-cache, which grows without limit until you run out of memory. The cache is internally Arc-shared, so cache.clone() is cheap: every clone points at the same underlying store, exactly like cloning an Arc.

Note: The future::Cache is Send + Sync and designed to be stored in shared application state (for example an axum State) and cloned into every request handler. You do not wrap it in a Mutex.

TTL vs. TTI

.time_to_live(Duration::from_secs(60))  // expire 60s after the last WRITE
.time_to_idle(Duration::from_secs(300)) // expire 300s after the last ACCESS (read or write)

time_to_live (TTL) counts from when an entry was inserted or last overwritten; time_to_idle (TTI) counts from the last access, whether a read or a write. Use TTL to bound staleness (“this data is never more than 60 seconds old”); use TTI to keep hot keys warm while letting cold ones fall out. You can set both, in which case an entry expires when either limit is reached. An expired entry is never returned: a get of an expired key returns None even if the entry has not been physically removed yet. The removal itself happens during moka’s housekeeping, which you can also trigger explicitly with run_pending_tasks:

use std::time::Duration;

use moka::future::Cache;

#[tokio::main]
async fn main() {
    let cache: Cache<String, i32> = Cache::builder()
        .max_capacity(100)
        .time_to_live(Duration::from_millis(50))
        .build();

    cache.insert("key".to_string(), 1).await;
    println!("right after insert: {:?}", cache.get("key").await);

    // Wait past the TTL.
    tokio::time::sleep(Duration::from_millis(80)).await;
    println!("after TTL:        {:?}", cache.get("key").await);

    cache.run_pending_tasks().await;
    println!("entry_count:      {}", cache.entry_count());
}

Real output:

right after insert: Some(1)
after TTL:        None
entry_count:      0
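
The “either limit” rule for TTL and TTI can be written down as a small pure function. This is only a model of the rule using the standard library, not moka’s internal code; EntryTimes and is_expired are hypothetical names, and treating a limit as reached at exactly the configured duration is an assumption of the sketch.

```rust
use std::time::{Duration, Instant};

/// Timestamps a cache would track per entry (hypothetical model).
struct EntryTimes {
    written_at: Instant,
    last_access: Instant,
}

/// An entry is expired as soon as EITHER configured limit is reached.
fn is_expired(
    entry: &EntryTimes,
    now: Instant,
    ttl: Option<Duration>,
    tti: Option<Duration>,
) -> bool {
    // TTL counts from the write, no matter how often the entry is read.
    let ttl_reached = ttl.is_some_and(|limit| now.duration_since(entry.written_at) >= limit);
    // TTI counts from the most recent access, so reads keep a hot key alive.
    let tti_reached = tti.is_some_and(|limit| now.duration_since(entry.last_access) >= limit);
    ttl_reached || tti_reached
}
```

With a 60-second TTL and a 300-second TTI, an entry that was read five seconds ago is still expired once 60 seconds have passed since the write: frequent reads extend the idle timer but never the TTL.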

Read-through and stampede protection

The single most important moka feature is try_get_with (fallible loader) and get_with (infallible loader). On a miss they run your loader, store the result, and coalesce concurrent callers for the same key: even if a hundred tasks ask for a cold key at once, the loader runs exactly once and the rest await its result. This is the cure for a cache stampede (the “thundering herd” that hammers your database the instant a popular key expires).

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

use moka::future::Cache;

#[tokio::main]
async fn main() {
    let loads = Arc::new(AtomicU64::new(0));
    let cache: Cache<u64, String> = Cache::builder()
        .max_capacity(1_000)
        .time_to_live(Duration::from_secs(30))
        .build();

    // Fire 50 concurrent requests for the SAME key while the cache is cold.
    let mut handles = Vec::new();
    for _ in 0..50 {
        let cache = cache.clone();
        let loads = loads.clone();
        handles.push(tokio::spawn(async move {
            cache
                .try_get_with(7u64, async move {
                    // Only ONE task should ever run this block per key.
                    loads.fetch_add(1, Ordering::Relaxed);
                    tokio::time::sleep(Duration::from_millis(20)).await;
                    Ok::<_, std::convert::Infallible>("expensive-result".to_string())
                })
                .await
                .unwrap()
        }));
    }

    for h in handles {
        h.await.unwrap();
    }

    // Despite 50 concurrent callers, the loader ran exactly once.
    println!("loader executions = {}", loads.load(Ordering::Relaxed));
}

Real output:

loader executions = 1

The Node lru-cache has no equivalent guarantee out of the box; you must add your own in-flight-promise deduplication. moka gives it to you for free.
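
To see what coalescing involves when you have to build it yourself, here is a rough single-flight sketch using only the standard library and OS threads. It is not how moka is implemented; SingleFlight, get_or_load, and count_loads are hypothetical names, and the map never evicts, so it illustrates the coalescing idea rather than a usable cache.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;
use std::time::Duration;

/// Hypothetical single-flight map: one `OnceLock` slot per key.
/// It never evicts, so it only illustrates coalescing, not a full cache.
struct SingleFlight {
    slots: Mutex<HashMap<u64, Arc<OnceLock<String>>>>,
}

impl SingleFlight {
    fn new() -> Self {
        Self { slots: Mutex::new(HashMap::new()) }
    }

    fn get_or_load(&self, key: u64, load: impl FnOnce() -> String) -> String {
        // Hold the map lock only long enough to find or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            slots.entry(key).or_default().clone()
        };
        // `get_or_init` runs at most one initializer; other callers block
        // until the value is ready and then read it.
        slot.get_or_init(load).clone()
    }
}

/// Starts `callers` threads that all request the same cold key and returns
/// how many times the loader actually ran.
fn count_loads(callers: usize) -> usize {
    let flight = SingleFlight::new();
    let loads = AtomicUsize::new(0);
    thread::scope(|scope| {
        for _ in 0..callers {
            scope.spawn(|| {
                flight.get_or_load(7, || {
                    loads.fetch_add(1, Ordering::Relaxed);
                    thread::sleep(Duration::from_millis(20)); // simulated slow load
                    "expensive-result".to_string()
                })
            });
        }
    });
    loads.load(Ordering::Relaxed)
}
```

The key design choice is that the map lock is released before the slow load starts, so requests for other keys are not blocked behind it; only callers of the same key wait on the shared slot.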

The shared (Redis) tier

For a cache that every instance shares and that survives restarts, use Redis through the redis crate. The cache-aside flow is identical to the Node version — read cache, fall through to the source, write back with a TTL — but the Redis reply types are statically typed.

cargo add redis --features tokio-comp,connection-manager
cargo add serde --features derive
cargo add serde_json
cargo add tokio --features full

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};

use redis::AsyncCommands;
use redis::aio::ConnectionManager;
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
struct User {
    id: u64,
    name: String,
}

// Simulated slow data source.
struct Db {
    calls: AtomicU64,
}

impl Db {
    async fn load_user(&self, id: u64) -> User {
        self.calls.fetch_add(1, Ordering::Relaxed);
        User { id, name: format!("user-{id}") }
    }
}

// Cache-aside read-through against Redis with a 60s TTL.
async fn get_user(
    conn: &mut ConnectionManager,
    db: &Db,
    id: u64,
) -> redis::RedisResult<User> {
    let key = format!("user:{id}");

    // 1. Try the cache.
    let cached: Option<String> = conn.get(&key).await?;
    if let Some(json) = cached {
        return Ok(serde_json::from_str(&json).expect("corrupt cache entry"));
    }

    // 2. Miss: load from the source of truth.
    let user = db.load_user(id).await;

    // 3. Populate the cache with a TTL so stale data self-heals.
    let json = serde_json::to_string(&user).expect("serializable");
    let _: () = conn.set_ex(&key, json, 60).await?;

    Ok(user)
}

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379/")?;
    let mut conn = ConnectionManager::new(client).await?;

    // Clean slate for a deterministic demo.
    let _: () = redis::cmd("FLUSHDB").query_async(&mut conn).await?;

    let db = Arc::new(Db { calls: AtomicU64::new(0) });

    let a = get_user(&mut conn, &db, 42).await?; // miss -> DB
    let b = get_user(&mut conn, &db, 42).await?; // hit  -> Redis
    println!("a = {a:?}");
    println!("b = {b:?}");
    println!("db calls = {}", db.calls.load(Ordering::Relaxed));

    // Invalidate on write: delete the key so the next read repopulates.
    let _: () = conn.del("user:42").await?;
    let c = get_user(&mut conn, &db, 42).await?; // miss again -> DB
    println!("c = {c:?}");
    println!("db calls after invalidate = {}", db.calls.load(Ordering::Relaxed));

    Ok(())
}

Run against a local Redis (docker run -p 6379:6379 redis), the real output is:

a = User { id: 42, name: "user-42" }
b = User { id: 42, name: "user-42" }
db calls = 1
c = User { id: 42, name: "user-42" }
db calls after invalidate = 2

A few things to notice:

ConnectionManager is a cheap-to-clone, multiplexed, auto-reconnecting connection. Clone it into each handler instead of opening a new socket per request — that is the production-correct counterpart to an ioredis client (which is also a long-lived multiplexed connection).
set_ex(key, value, 60) maps to Redis SET key value EX 60. The TTL is your safety net: even if you forget to invalidate, the entry self-destructs in 60 seconds, so the worst-case staleness is bounded.
The turbofish-free let _: () = annotations are load-bearing. Redis replies are polymorphic, so you must tell the compiler what type to decode the reply into (see Common Pitfalls).
Values are serialized with serde_json. Redis stores bytes; you choose the encoding (JSON here, but bincode or MessagePack are faster and smaller for internal-only data).

Tip: For high-throughput services, put a connection pool (bb8 or deadpool) in front of Redis rather than a single ConnectionManager, the same way you would size an ioredis cluster client. See the database section for pooling patterns that apply equally to Redis.

Key Differences

Concern	TypeScript / Node	Rust
In-process cache	`lru-cache` (LRU)	`moka` (TinyLFU, concurrent)
Bounding	`max` entries / `ttl`	`max_capacity` + `time_to_live` / `time_to_idle`
Stampede protection	manual in-flight dedup	built into `try_get_with` / `get_with`
Concurrency	single-threaded event loop	true multi-threaded; `moka` is lock-light
Shared cache	`ioredis`	`redis` crate + `ConnectionManager`
Redis reply typing	dynamic (`string | null`)	static (`Option<String>`, must annotate)
Value requirement	any JS value	`K: Hash + Eq`, `V: Clone` (see pitfalls)
Eviction visibility	mostly opaque	`entry_count`, `run_pending_tasks`, listeners

The deepest conceptual difference is concurrency. Node’s single event loop means an in-process cache never has data races; you just mutate a Map. Rust servers are genuinely multi-threaded, so a cache shared across tasks must be thread-safe. moka is engineered for exactly this — it is internally Arc-shared and uses sharded, mostly lock-free structures, so you clone it freely across tasks without a Mutex. A second difference is type discipline at the Redis boundary: where ioredis hands you string | null and you cast, the redis crate forces you to name the decode target, which catches “I expected a list but got a string” bugs at compile time.
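
To make the concurrency point concrete, this is roughly what a hand-rolled thread-safe cache looks like with only the standard library: a HashMap behind an RwLock, shared through an Arc. It is a baseline sketch for comparison, and SharedCache, fill_from_threads, and lookup are illustrative names. Every write takes an exclusive lock on the whole map, which is the kind of contention a purpose-built concurrent cache is designed to avoid.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// The simplest correct shared cache: one lock around the whole map.
type SharedCache = Arc<RwLock<HashMap<u64, String>>>;

/// Spawns `workers` threads that each insert one entry into the same cache.
fn fill_from_threads(workers: u64) -> SharedCache {
    let cache: SharedCache = Arc::new(RwLock::new(HashMap::new()));
    let handles: Vec<_> = (0..workers)
        .map(|id| {
            // Each thread gets its own handle to the same underlying map.
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                // Exclusive lock: all other readers and writers wait here.
                cache.write().unwrap().insert(id, format!("user-{id}"));
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    cache
}

/// Readers share the lock with each other but still block behind a writer.
fn lookup(cache: &SharedCache, id: u64) -> Option<String> {
    cache.read().unwrap().get(&id).cloned()
}
```

Without the RwLock this would not compile at all, because the compiler rejects unsynchronized mutation of a HashMap from several threads. That compile-time check is the Rust counterpart of the safety Node gets from its single event loop.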

Warning: A cache is shared mutable state. The one thing it must never do is store something whose validity depends on the request that created it (a per-user token, a request-scoped permission). Cache the data, authorize per request. This is a classic cache-poisoning vector — see the security section.

Common Pitfalls

1. Forgetting to annotate Redis reply types

Redis commands are generic over the reply type. If the compiler cannot infer it, you get a confusing error mentioning the never type !:

use redis::AsyncCommands;
use redis::aio::ConnectionManager;

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379/")?;
    let mut conn = ConnectionManager::new(client).await?;

    // does not compile (E0277): no type tells redis how to decode the reply.
    conn.set("k", "v").await?;

    Ok(())
}

The real compiler error:

error[E0277]: the trait bound `!: FromRedisValue` is not satisfied
    --> src/main.rs:10:10
     |
  10 |     conn.set("k", "v").await?;
     |          ^^^ the trait `FromRedisValue` is not implemented for `!`
...
     = help: did you intend to use the type `()` here instead?

The fix is to annotate the discarded reply — SET returns OK, which you decode as ():

let _: () = conn.set("k", "v").await?;

This trips up nearly every newcomer. The compiler even suggests (); take its advice.

2. Caching a non-`Clone` value

moka returns a clone of the stored value on every get, so the value type must be Clone. Trying to cache something like a live TCP connection fails to compile:

use moka::sync::Cache;

// A value type that is NOT Clone.
struct Connection {
    _socket: std::net::TcpStream,
}

fn main() {
    // does not compile (E0277): Connection does not implement Clone.
    let cache: Cache<u64, Connection> = Cache::builder().max_capacity(10).build();
    println!("{:?}", cache.get(&1).is_some());
}

The real compiler error:

error[E0277]: the trait bound `Connection: Clone` is not satisfied
   --> src/main.rs:11:41
    |
 11 |     let cache: Cache<u64, Connection> = Cache::builder().max_capacity(10).build();
    |                                         ^^^^^^^^^^^^^^^^ the trait `Clone` is not implemented for `Connection`
...
note: required by a bound in `moka::sync::Cache::<K, V>::builder`

For large values, do not pay the deep clone on every hit — wrap the value in Arc<T> and cache Arc<T>. Cloning an Arc is a cheap atomic refcount bump, not a copy of the data. (This is why the real-world example below caches Arc<Product>.)
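
The difference between cloning a value and cloning an Arc around it can be checked with the standard library alone. In this small sketch, Product and hit are hypothetical names; hit models what a cache lookup on an Arc<Product> value amounts to.

```rust
use std::sync::Arc;

/// A value that would be costly to deep-clone on every cache hit.
/// Note that it does not even need to implement `Clone` itself.
struct Product {
    name: String,
    description: String,
}

/// What a cache hit on `Arc<Product>` amounts to: a new handle to the same
/// allocation. No `String` inside `Product` is copied.
fn hit(stored: &Arc<Product>) -> Arc<Product> {
    Arc::clone(stored)
}
```

Two handles returned by hit compare equal under Arc::ptr_eq, and Arc::strong_count on the stored value rises by one per outstanding handle. A side benefit is that Arc<T> is Clone for any T, so wrapping also lets you cache types that cannot implement Clone, provided sharing a single instance is what you actually want.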

3. The unbounded “cache” that is actually a memory leak

A HashMap<K, V> you only ever insert into is not a cache — it is a leak. Without an eviction bound it grows until the process is OOM-killed. Always set max_capacity (and usually a TTL) on a moka cache, and always set an EX on Redis keys. An entry with no expiry is a promise to remember it forever.
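
What turns a map into a cache is the eviction step on insert. The sketch below adds the simplest possible bound, first-in-first-out eviction, to a HashMap. BoundedCache is a hypothetical type for illustration, and FIFO is deliberately much cruder than the policy moka uses; in a real service you would use moka’s max_capacity rather than this.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical bounded map with FIFO eviction (simpler than moka's policy).
struct BoundedCache {
    capacity: usize,
    map: HashMap<u64, String>,
    insertion_order: VecDeque<u64>,
}

impl BoundedCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), insertion_order: VecDeque::new() }
    }

    fn insert(&mut self, key: u64, value: String) {
        // Overwriting an existing key does not grow the map.
        if self.map.insert(key, value).is_some() {
            return;
        }
        self.insertion_order.push_back(key);
        // Over the bound: drop the oldest key so memory stays flat.
        if self.map.len() > self.capacity {
            if let Some(oldest) = self.insertion_order.pop_front() {
                self.map.remove(&oldest);
            }
        }
    }

    fn get(&self, key: u64) -> Option<&String> {
        self.map.get(&key)
    }

    fn len(&self) -> usize {
        self.map.len()
    }
}
```

Inserting a thousand distinct keys into a BoundedCache::new(2) leaves exactly two entries, whereas the same loop against a bare HashMap leaves a thousand and keeps growing for as long as new keys arrive.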

4. Invalidating only one tier

With a two-tier cache, a write that deletes the Redis key but leaves the L1 copy in every instance’s memory will serve stale data for up to the L1 TTL. Either keep L1 TTLs short (seconds, not minutes) so staleness self-heals, or publish invalidation events (Redis pub/sub) that each instance subscribes to and uses to clear its L1. Short L1 TTL is simpler and usually good enough.
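
The event-based option can be modeled in-process to show its shape. In the sketch below, standard-library channels stand in for Redis pub/sub, and each Instance represents one application process with a private L1. InvalidationBus, Instance, and apply_invalidations are hypothetical names, and nothing here talks to Redis.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Stand-in for a pub/sub channel: fans each message out to every subscriber.
struct InvalidationBus {
    subscribers: Vec<Sender<u64>>,
}

impl InvalidationBus {
    fn new() -> Self {
        Self { subscribers: Vec::new() }
    }

    fn subscribe(&mut self) -> Receiver<u64> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    fn publish(&self, id: u64) {
        for subscriber in &self.subscribers {
            // A send error only means that subscriber has gone away.
            let _ = subscriber.send(id);
        }
    }
}

/// One application instance with its private L1 and its subscription.
struct Instance {
    l1: HashMap<u64, String>,
    inbox: Receiver<u64>,
}

impl Instance {
    fn new(bus: &mut InvalidationBus) -> Self {
        Self { l1: HashMap::new(), inbox: bus.subscribe() }
    }

    /// Apply every pending invalidation to the local L1.
    fn apply_invalidations(&mut self) {
        while let Ok(id) = self.inbox.try_recv() {
            self.l1.remove(&id);
        }
    }
}
```

Between publish and apply_invalidations an instance can still serve the old value, and a subscriber that is disconnected when the message is sent never sees it. That is why the L1 TTL should stay in place as a backstop even when you publish invalidation events.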

5. Caching errors and `None`s by accident

If your loader can fail, decide deliberately whether to cache the failure. try_get_with does not cache an Err — the next call retries the loader, which is usually what you want for transient errors. But if you cache an Option and store None on a miss, you have implemented negative caching, which protects you from a flood of lookups for keys that do not exist. Make that choice on purpose, and give negative entries a shorter TTL than positive ones (a missing record may appear at any moment).

Best Practices

Always bound the cache. max_capacity plus a TTL/TTI on moka; EX on every Redis key. Treat an unbounded cache as a bug.
Use try_get_with / get_with for read-through, not a manual get-then-insert. You get stampede protection and correct concurrent behavior for free.
Cache Arc<T> for large values so a hit is a refcount bump, not a deep clone.
Bound staleness with TTL, not just invalidation. Invalidation is best-effort; the TTL is the guarantee. Pick the longest staleness your product can tolerate and set the TTL to that.
Give the cache its own type. Wrap the L1/L2 logic in a struct with get / invalidate methods so call sites cannot accidentally read one tier and forget the other.

Use per-entry TTLs via the Expiry trait when different keys need different lifetimes (hot config vs. rarely-changing reference data):

use std::time::{Duration, Instant};

use moka::Expiry;
use moka::sync::Cache;

// A cached value that carries its own desired lifetime.
#[derive(Clone)]
struct Cached {
    value: String,
    ttl: Duration,
}

// Implement per-entry expiration: each entry decides its own TTL.
struct PerEntryExpiry;

impl Expiry<String, Cached> for PerEntryExpiry {
    fn expire_after_create(
        &self,
        _key: &String,
        value: &Cached,
        _created_at: Instant,
    ) -> Option<Duration> {
        Some(value.ttl)
    }
}

fn main() {
    let cache: Cache<String, Cached> = Cache::builder()
        .max_capacity(1_000)
        .expire_after(PerEntryExpiry)
        .build();

    cache.insert(
        "short".to_string(),
        Cached { value: "a".into(), ttl: Duration::from_millis(30) },
    );
    cache.insert(
        "long".to_string(),
        Cached { value: "b".into(), ttl: Duration::from_secs(60) },
    );

    std::thread::sleep(Duration::from_millis(50));
    cache.run_pending_tasks();

    // Read the values back so the field is actually used.
    println!("short present: {}", cache.get("short").is_some());
    if let Some(c) = cache.get("long") {
        println!("long still holds: {}", c.value);
    }
}

Real output:

short present: false
long still holds: b

Pick the right moka flavor. Use moka::future::Cache inside an async (tokio) server; use moka::sync::Cache (the sync feature) for synchronous or CPU-bound code with no runtime.
Choose a compact serialization for the L2 tier. JSON is debuggable; bincode/MessagePack are faster and smaller for internal-only data you never read by hand.

Real-World Example

A production-flavored two-tier cache: a fast per-process moka L1 in front of a shared Redis L2, fronting a repository. L1 holds Arc<Product> so hits are cheap, Redis holds JSON so every instance can share entries, and invalidate clears both tiers on a write. This is the shape you would store in an axum State and call from handlers.

cargo add moka --features future
cargo add redis --features tokio-comp,connection-manager
cargo add serde --features derive
cargo add serde_json
cargo add tokio --features full

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

use moka::future::Cache;
use redis::AsyncCommands;
use redis::aio::ConnectionManager;
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
struct Product {
    id: u64,
    name: String,
    price_cents: u64,
}

// The source of truth (a database, an upstream API, ...).
struct Repo {
    db_hits: AtomicU64,
}

impl Repo {
    async fn fetch(&self, id: u64) -> Product {
        self.db_hits.fetch_add(1, Ordering::Relaxed);
        Product { id, name: format!("Widget {id}"), price_cents: 999 + id }
    }
}

// A two-tier cache: a fast per-process L1 (moka) backed by a shared L2 (Redis).
#[derive(Clone)]
struct ProductCache {
    l1: Cache<u64, Arc<Product>>,
    redis: ConnectionManager,
    repo: Arc<Repo>,
}

impl ProductCache {
    fn key(id: u64) -> String {
        format!("product:{id}")
    }

    async fn get(&self, id: u64) -> Arc<Product> {
        // L1: in-process, no network, no serialization.
        if let Some(hit) = self.l1.get(&id).await {
            return hit;
        }

        // L2: shared Redis. Many app instances can reuse the same entry.
        let mut redis = self.redis.clone();
        let cached: Option<String> = redis.get(Self::key(id)).await.unwrap_or(None);
        if let Some(json) = cached {
            if let Ok(p) = serde_json::from_str::<Product>(&json) {
                let arc = Arc::new(p);
                self.l1.insert(id, arc.clone()).await;
                return arc;
            }
        }

        // Miss in both tiers: load from the source and back-fill both caches.
        let product = self.repo.fetch(id).await;
        let json = serde_json::to_string(&product).expect("serializable");
        let _: Result<(), _> = redis.set_ex(Self::key(id), json, 300).await;

        let arc = Arc::new(product);
        self.l1.insert(id, arc.clone()).await;
        arc
    }

    // Invalidate both tiers on a write: this instance's L1 and the shared
    // Redis key. Other instances' L1 entries age out via their own TTL.
    async fn invalidate(&self, id: u64) {
        self.l1.invalidate(&id).await;
        let mut redis = self.redis.clone();
        let _: Result<(), _> = redis.del(Self::key(id)).await;
    }
}

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379/")?;
    let mut conn = ConnectionManager::new(client).await?;
    let _: () = redis::cmd("FLUSHDB").query_async(&mut conn).await?;

    let cache = ProductCache {
        l1: Cache::builder()
            .max_capacity(50_000)
            .time_to_live(Duration::from_secs(60))
            .build(),
        redis: conn,
        repo: Arc::new(Repo { db_hits: AtomicU64::new(0) }),
    };

    let p1 = cache.get(7).await; // miss both -> DB
    let p2 = cache.get(7).await; // L1 hit
    println!("p1 == p2: {}", p1 == p2);
    println!("db hits after two gets: {}", cache.repo.db_hits.load(Ordering::Relaxed));

    // Simulate a second process: clear L1 only, Redis still has the value.
    cache.l1.invalidate(&7).await;
    let p3 = cache.get(7).await; // L1 miss, L2 (Redis) hit -> no DB call
    println!("p3 == p1: {}", p3 == p1);
    println!("db hits after L1 eviction: {}", cache.repo.db_hits.load(Ordering::Relaxed));

    // Write-path invalidation clears both tiers.
    cache.invalidate(7).await;
    let _ = cache.get(7).await; // miss both -> DB again
    println!("db hits after invalidate: {}", cache.repo.db_hits.load(Ordering::Relaxed));

    Ok(())
}

Run against a local Redis, the real output is:

p1 == p2: true
db hits after two gets: 1
p3 == p1: true
db hits after L1 eviction: 1
db hits after invalidate: 2

This proves the tiers: two reads hit the DB once (L1 absorbs the second). After evicting L1 (simulating a fresh instance or a restart), the read is served from Redis with no DB hit — the count stays at 1. Only after invalidating both tiers does the next read fall through to the DB again, bumping the count to 2. The whole ProductCache is Clone and Send + Sync, so you store one in axum State and clone it into every handler — see the web APIs section for wiring shared state.

Exercises

Exercise 1: Memoize an expensive computation

Difficulty: Beginner

Objective: Use a moka::future::Cache to compute a value once and serve repeat requests from memory.

Instructions: Build a cache keyed by u64. Use get_with to compute fib(n) (iteratively) on a miss, incrementing a shared counter each time the loader actually runs. Request the same key three times and prove the loader ran exactly once.

Solution

// cargo add moka --features future
// cargo add tokio --features full
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

use moka::future::Cache;

#[tokio::main]
async fn main() {
    let computations = Arc::new(AtomicU64::new(0));

    let cache: Cache<u64, u64> = Cache::builder()
        .max_capacity(1_000)
        .time_to_live(Duration::from_secs(600))
        .build();

    async fn slow_fib(n: u64) -> u64 {
        match n {
            0 => 0,
            1 => 1,
            _ => {
                let (mut a, mut b) = (0u64, 1u64);
                for _ in 2..=n {
                    (a, b) = (b, a + b);
                }
                b
            }
        }
    }

    let mut last = 0;
    // Ask for fib(90) three times; only the first should compute it.
    for _ in 0..3 {
        let computations = computations.clone();
        last = cache
            .get_with(90u64, async move {
                computations.fetch_add(1, Ordering::Relaxed);
                slow_fib(90).await
            })
            .await;
    }

    println!("fib(90) = {last}");
    println!("computations = {}", computations.load(Ordering::Relaxed));
}

Real output:

fib(90) = 2880067194370816120
computations = 1

Exercise 2: Negative caching with per-entry TTLs

Difficulty: Intermediate

Objective: Cache “not found” results with a shorter TTL than successful results, using the Expiry trait.

Instructions: Define an enum Entry { Hit(String), Miss }. Implement Expiry so Hit lives 300 seconds and Miss lives only 5 seconds. Insert one of each, then read them back and prove a Miss is stored (negative caching) so the next lookup of a known-missing key does not hit the backend.

Solution

// cargo add moka --features sync
use std::time::{Duration, Instant};

use moka::Expiry;
use moka::sync::Cache;

#[derive(Clone, Debug)]
enum Entry {
    Hit(String),
    // A cached "not found" so repeated lookups of a missing key don't
    // hammer the backend (negative caching).
    Miss,
}

struct TieredExpiry;

impl Expiry<u64, Entry> for TieredExpiry {
    fn expire_after_create(
        &self,
        _key: &u64,
        value: &Entry,
        _created_at: Instant,
    ) -> Option<Duration> {
        match value {
            Entry::Hit(_) => Some(Duration::from_secs(300)), // real data: 5 min
            Entry::Miss => Some(Duration::from_secs(5)),      // misses: short TTL
        }
    }
}

fn main() {
    let cache: Cache<u64, Entry> = Cache::builder()
        .max_capacity(10_000)
        .expire_after(TieredExpiry)
        .build();

    cache.insert(1, Entry::Hit("found".to_string()));
    cache.insert(2, Entry::Miss);

    if let Some(Entry::Hit(v)) = cache.get(&1) {
        println!("key 1 -> {v}");
    }
    println!(
        "key 2 is negatively cached: {}",
        matches!(cache.get(&2), Some(Entry::Miss))
    );
}

Real output:

key 1 -> found
key 2 is negatively cached: true

Exercise 3: Conditional Redis write with `SET NX EX`

Difficulty: Advanced

Objective: Use Redis’s atomic SET ... NX EX to implement a “set only if absent, with TTL” — the building block for a distributed lock or request-dedup key.

Instructions: Open a ConnectionManager to a local Redis. Use a raw SET key value NX EX 30 command and decode the reply as Option<String> (it is Some("OK") on success, None when the key already exists). Acquire the key from “worker-a”, then attempt to acquire it from “worker-b” and show the second attempt fails while the first holder remains.

Solution

// cargo add redis --features tokio-comp,connection-manager
// cargo add tokio --features full
use redis::AsyncCommands;
use redis::aio::ConnectionManager;

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1:6379/")?;
    let mut conn = ConnectionManager::new(client).await?;
    let _: () = redis::cmd("FLUSHDB").query_async(&mut conn).await?;

    // SET key value NX EX 30: atomically set only if absent, with a 30s TTL.
    // The reply is the string "OK" on success or nil when the key existed.
    let first: Option<String> = redis::cmd("SET")
        .arg("lock:order:7")
        .arg("worker-a")
        .arg("NX")
        .arg("EX")
        .arg(30)
        .query_async(&mut conn)
        .await?;
    println!("first acquire: {first:?}");

    // A second worker cannot take the lock until it expires or is released.
    let second: Option<String> = redis::cmd("SET")
        .arg("lock:order:7")
        .arg("worker-b")
        .arg("NX")
        .arg("EX")
        .arg(30)
        .query_async(&mut conn)
        .await?;
    println!("second acquire: {second:?}");

    let holder: String = conn.get("lock:order:7").await?;
    println!("lock held by: {holder}");

    Ok(())
}

Real output:

first acquire: Some("OK")
second acquire: None
lock held by: worker-a

This SET NX EX is the kernel of a simple distributed lock and of request deduplication for background jobs — a job that should run at most once writes a unique key with NX before starting.
