Recipe 12: Content Delivery With Automatic Popularity-Based Scaling

Situation

CDNs and caches face a skewed access distribution:

a small set of objects is extremely hot
most objects are cold

Classic cache eviction policies (LRU variants) require maintaining a global recency structure and doing work on every access. They are powerful, but they add complexity and overhead.

In a lease-based model, you can use TTL as a built-in eviction boundary:

objects that remain popular get renewed
objects that become cold expire naturally

This recipe focuses on TTL-as-eviction. “Flash crowd scale-out” is handled in Recipe 17.

What You Build

A cache where each object is stored in a TTL-scoped key-value entry:

read hit: TTL refreshed automatically on access
read miss: fetch from origin and insert with a TTL
cold objects: stop being accessed and expire naturally via tick()

Building Blocks

grafos_kv::{KvBuilder, FabricKvStore}
grafos_leasekit::{RenewalManager, RenewalPolicy}
grafos_observe for hit rate and churn

Design

FabricKvStore Handles TTL Natively

FabricKvStore provides per-key TTL out of the box. Each key tracks its creation time and TTL; on get(), the TTL is refreshed (created_at reset to now). This means hot objects stay alive simply by being read, with no explicit renewal logic in your application.

Cold objects are evicted by calling tick(), which scans the hot tier and removes expired entries.
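
The refresh-on-read and scan-on-tick behaviour described above can be modelled with the standard library alone. The sketch below is a toy stand-in, not the grafos_kv implementation: the TtlCache type, the explicit now parameter (seconds), and the choice to treat an entry as expired once its age reaches the TTL are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// One cached value plus the bookkeeping described above.
struct Entry {
    value: Vec<u8>,
    created_at: u64, // seconds; reset on every successful read
    ttl_secs: u64,
}

/// Toy model of a TTL-scoped store (hypothetical, std-only).
struct TtlCache {
    entries: HashMap<Vec<u8>, Entry>,
}

impl TtlCache {
    fn new() -> Self {
        TtlCache { entries: HashMap::new() }
    }

    fn put_with_ttl(&mut self, key: &[u8], value: &[u8], ttl_secs: u64, now: u64) {
        let entry = Entry { value: value.to_vec(), created_at: now, ttl_secs };
        self.entries.insert(key.to_vec(), entry);
    }

    /// Returns the value and refreshes its TTL; an expired entry is removed instead.
    fn get(&mut self, key: &[u8], now: u64) -> Option<Vec<u8>> {
        let entry = self.entries.get_mut(key)?;
        // Assumption: an entry is expired once its age reaches the TTL.
        if now.saturating_sub(entry.created_at) >= entry.ttl_secs {
            self.entries.remove(key);
            return None;
        }
        entry.created_at = now; // a read keeps a hot object alive
        Some(entry.value.clone())
    }

    /// Removes every expired entry and returns how many were evicted.
    fn tick(&mut self, now: u64) -> usize {
        let before = self.entries.len();
        self.entries
            .retain(|_, e| now.saturating_sub(e.created_at) < e.ttl_secs);
        before - self.entries.len()
    }
}
```

In this model, a key that is read at least once per TTL window never expires, while a key that stops being read disappears at the first tick() after its TTL has elapsed.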

Lease Renewal for the Backing Store

The FabricKvStore is backed by a FabricHashMap over leased DRAM. Use a RenewalManager to keep the underlying shard leases alive as long as the cache is in use.

Avoid Renew-On-Every-Request

The KV store already handles per-key TTL refresh on access. For the backing lease renewal, use RenewalPolicy with a threshold (e.g. renew at 75% elapsed) to avoid unnecessary churn:

use grafos_leasekit::RenewalPolicy;

let policy = RenewalPolicy {
    renew_at_fraction: 0.75,
    jitter_fraction: 0.10,
    ..RenewalPolicy::default()
};
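
To make the threshold concrete, here is a std-only sketch of how a renewal point could be derived from the two fields above. How grafos_leasekit actually applies jitter_fraction is not stated in this recipe; the sketch assumes symmetric jitter expressed as a fraction of the lease duration. PolicySketch, renew_offset_secs, should_renew, and the caller-supplied unit_jitter are hypothetical names.

```rust
/// Hypothetical mirror of the two policy fields used above.
struct PolicySketch {
    renew_at_fraction: f64,
    jitter_fraction: f64,
}

/// Seconds after lease start at which renewal becomes due.
/// `unit_jitter` is a caller-supplied value in [-1.0, 1.0], e.g. from an RNG.
fn renew_offset_secs(policy: &PolicySketch, lease_secs: f64, unit_jitter: f64) -> f64 {
    let base = policy.renew_at_fraction * lease_secs;
    // Assumption: jitter is symmetric and scaled by the lease duration.
    let jitter = policy.jitter_fraction * lease_secs * unit_jitter.clamp(-1.0, 1.0);
    // Never schedule renewal before the lease starts or after it ends.
    (base + jitter).clamp(0.0, lease_secs)
}

/// True once enough of the lease has elapsed that renewal is due.
fn should_renew(
    policy: &PolicySketch,
    elapsed_secs: f64,
    lease_secs: f64,
    unit_jitter: f64,
) -> bool {
    elapsed_secs >= renew_offset_secs(policy, lease_secs, unit_jitter)
}
```

Under these assumptions, a 60-second lease with renew_at_fraction 0.75 and jitter_fraction 0.10 becomes due for renewal somewhere between 39 and 51 seconds after it starts, which spreads renewals from many shards instead of aligning them.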

Walkthrough (Implementation Sketch)

Core grafOS API Path

The cache is a FabricKvStore; key TTL is maintained by KV operations, while the backing lease lifetime is driven by RenewalManager.

use grafos_kv::{FabricKvStore, KvBuilder};
use grafos_leasekit::{RenewalManager, RenewalPolicy};

let mut cache: FabricKvStore = KvBuilder::new()
    .hot_buckets(256)
    .default_ttl_secs(300)
    .build()?;

cache.put_with_ttl(b"/models/embedder.bin", b"bytes", 600)?;
let hit = cache.get(b"/models/embedder.bin")?;
let evicted = cache.tick()?;

// `shard_lease_id`, `shard_expiry`, and `now` are supplied by the surrounding application.
let mut renewals = RenewalManager::new();
renewals.register(shard_lease_id, shard_expiry, RenewalPolicy::default());
let summary = renewals.tick(now);
let _ = (hit, evicted, summary);
Ok::<(), grafos_std::FabricError>(())

1. Create the Cache

use grafos_kv::{KvBuilder, FabricKvStore};

let mut cache: FabricKvStore = KvBuilder::new()
    .hot_buckets(256)
    .default_ttl_secs(300)
    .build()?;

2. On Cache Miss

Fetch from origin, insert with a TTL appropriate for the content type:

let content = fetch_from_origin(&url)?;
cache.put_with_ttl(url.as_bytes(), &content, 600)?; // 10 min TTL
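
One way to pick "a TTL appropriate for the content type" is a small lookup keyed on the origin's Content-Type header. The sketch below is an illustration only: ttl_secs_for is a hypothetical helper, and the TTL values are placeholders rather than recommendations (300 and 600 simply echo the values used elsewhere in this recipe).

```rust
/// Hypothetical TTL classes; the numbers are illustrative only.
fn ttl_secs_for(content_type: &str) -> u64 {
    // Ignore parameters such as "; charset=utf-8" and normalise case.
    let mime = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();

    match mime.as_str() {
        "text/html" => 60,         // pages change often
        "application/json" => 30,  // API responses go stale quickly
        m if m.starts_with("image/") || m.starts_with("video/") => 3600,
        "application/octet-stream" => 600, // large binaries, as in the 10 min example
        _ => 300,                  // fall back to the cache-wide default
    }
}
```

The result can be passed as the third argument to put_with_ttl() in place of the hard-coded 600.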

3. On Cache Hit

Serve bytes. The TTL is refreshed automatically by get():

if let Some(bytes) = cache.get(url.as_bytes())? {
    serve(bytes);
}

4. Background Maintenance

Periodically prune expired entries and renew backing leases:

use std::{thread::sleep, time::Duration};

use grafos_leasekit::{RenewalManager, RenewalPolicy};

let mut renewal_mgr = RenewalManager::new();
renewal_mgr.register(shard_lease_id, shard_expiry, RenewalPolicy::default());

loop {
    // Drop expired entries from the hot tier.
    let evicted = cache.tick()?;
    // Renew backing leases; `now` must be re-read on every iteration.
    let summary = renewal_mgr.tick(now);
    // summary.renewed, summary.failed, summary.near_expiry
    sleep(Duration::from_secs(5));
}

Failure Modes

expired entry on get(): get() returns None and the entry is removed from the hot tier automatically
Disconnected: treat as miss or fail depending on SLO
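
The "miss or fail" choice for a disconnected store can be made explicit as a policy value rather than scattered through request handlers. This is a minimal sketch with hypothetical types: CacheError stands in for whatever error the store really returns, and OnDisconnect and resolve are names invented here.

```rust
/// Hypothetical error type standing in for the store's real error.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum CacheError {
    Disconnected,
    Other(String),
}

/// What the serving path should do when the cache is unreachable.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum OnDisconnect {
    TreatAsMiss, // fall through to origin; favours availability
    Fail,        // surface the error; protects the origin from a stampede
}

/// Collapses a lookup result according to the chosen policy.
fn resolve(
    lookup: Result<Option<Vec<u8>>, CacheError>,
    policy: OnDisconnect,
) -> Result<Option<Vec<u8>>, CacheError> {
    match (lookup, policy) {
        // Only the disconnected case is rewritten, and only under TreatAsMiss.
        (Err(CacheError::Disconnected), OnDisconnect::TreatAsMiss) => Ok(None),
        // Hits, ordinary misses, and all other errors pass through unchanged.
        (other, _) => other,
    }
}
```

Treating a disconnect as a miss sends the request to origin, so it suits SLOs that prioritise availability; failing fast suits deployments where the origin cannot absorb the full request load.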

Observability

Track:

hit rate (get returns Some vs None)
evictions per tick
active keys (cache.keys()?.len())
remaining TTL for hot objects (cache.ttl(key)?)
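
The first two signals can be accumulated with a few counters before being exported through whatever metrics pipeline is in use. The sketch below is std-only and does not use grafos_observe; CacheStats and its methods are hypothetical names.

```rust
/// Counters for hit rate and eviction churn. Names are illustrative.
#[derive(Debug, Default)]
struct CacheStats {
    hits: u64,
    misses: u64,
    evictions: u64,
}

impl CacheStats {
    /// Call with the result of every lookup: Some is a hit, None is a miss.
    fn record_get<T>(&mut self, result: &Option<T>) {
        if result.is_some() {
            self.hits += 1;
        } else {
            self.misses += 1;
        }
    }

    /// Call with the number of entries a maintenance pass evicted.
    fn record_tick(&mut self, evicted: u64) {
        self.evictions += evicted;
    }

    /// Fraction of lookups that hit, or None before the first lookup.
    fn hit_rate(&self) -> Option<f64> {
        let total = self.hits + self.misses;
        if total == 0 {
            None
        } else {
            Some(self.hits as f64 / total as f64)
        }
    }
}
```

A falling hit rate combined with a rising eviction count per tick usually suggests that TTLs are too short for the working set.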

Variations

multi-tier cache: enable the persistence feature for hot/cold tiers (hot in DRAM, cold spilling to block storage)
per-key TTL classes: use put_with_ttl() with different TTLs for different content types
structured data: use put_struct()/get_struct() for typed cache entries