Recipe 17: Live Event Mode (Flash-Crowd Autopilot)
Situation
Traffic spikes are unpredictable and short-lived. The worst outcomes are:
- manual incident response to add capacity
- capacity lingering after the event
You want the system to scale out and back in automatically.
What You Build
A hot-object replication pattern:
- start with one cached copy
- when load/latency exceeds thresholds, acquire more leases and replicate
- route reads across replicas
- stop renewing extra replicas when demand drops; let TTL expire
Building Blocks
- MemBuilder leases for replicas
- FabricDns for routing endpoints
- grafos_observe for p99, lease churn, replica counts
Design
Control Loop
Inputs:
- QPS
- p95/p99 latency
- error rate
Actions:
- add replica (acquire lease, copy bytes, register)
- remove replica (stop renewing)
Safety
Avoid thrash:
- hysteresis thresholds
- minimum time between scale actions
Also add:
- hard cap on replicas per object
- cool-down period after a scale action
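The safety rules above can be sketched as a single decision function. This is a minimal illustration, not a reference controller: the names `ScalePolicy`, `ScaleState`, and `decide` are hypothetical, p99 latency is used as the only input for brevity, and every threshold is a placeholder value.

```python
from dataclasses import dataclass


@dataclass
class ScalePolicy:
    # All values are illustrative assumptions, not recommendations.
    scale_out_p99_ms: float = 50.0  # upper threshold: add a replica above this
    scale_in_p99_ms: float = 20.0   # lower threshold: drop a replica below this
    cooldown_s: float = 30.0        # minimum time between scale actions
    max_replicas: int = 8           # hard cap per object
    min_replicas: int = 1           # always keep the original cached copy


@dataclass
class ScaleState:
    replicas: int = 1
    last_action_at: float = float("-inf")


def decide(policy: ScalePolicy, state: ScaleState, p99_ms: float, now: float) -> int:
    """Return +1 (add a replica), -1 (remove a replica), or 0 (hold)."""
    if now - state.last_action_at < policy.cooldown_s:
        return 0  # still cooling down from the previous action
    if p99_ms > policy.scale_out_p99_ms and state.replicas < policy.max_replicas:
        delta = 1
    elif p99_ms < policy.scale_in_p99_ms and state.replicas > policy.min_replicas:
        delta = -1
    else:
        return 0  # inside the hysteresis band, or already at a cap
    state.replicas += delta
    state.last_action_at = now
    return delta
```

The gap between the two thresholds is the hysteresis band: a latency reading that sits between them never triggers an action, so the controller does not oscillate around a single cut-off.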
Replica Placement (Locality)
Even in a fabric, locality matters:
- replicas in the same rack reduce tail latency for a rack-local flash crowd
- a remote replica may help throughput but add latency
Placement is policy, not mechanism. The recipe assumes you can either prefer nearby leases or adapt by observing latency and selecting the best-performing replica set over time.
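One way to adapt by observation is to keep a smoothed latency per replica and prefer the lowest. The sketch below assumes an exponentially weighted moving average; `LatencyTracker` and the replica names are hypothetical, and the smoothing factor is an arbitrary choice.

```python
class LatencyTracker:
    """Exponentially weighted moving average of observed latency per replica."""

    def __init__(self, alpha: float = 0.2) -> None:
        self.alpha = alpha  # weight of the newest sample (assumed value)
        self.ewma_ms: dict[str, float] = {}

    def observe(self, replica: str, latency_ms: float) -> None:
        prev = self.ewma_ms.get(replica)
        if prev is None:
            self.ewma_ms[replica] = latency_ms  # first sample seeds the average
        else:
            self.ewma_ms[replica] = self.alpha * latency_ms + (1 - self.alpha) * prev

    def best(self, k: int) -> list[str]:
        """Return up to k replicas with the lowest smoothed latency."""
        return sorted(self.ewma_ms, key=lambda r: self.ewma_ms[r])[:k]


tracker = LatencyTracker(alpha=0.5)
tracker.observe("rack-a", 2.0)
tracker.observe("rack-b", 10.0)
tracker.observe("remote", 40.0)
preferred = tracker.best(2)  # the two replicas with the lowest smoothed latency
```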
Routing Model
You need a routing layer that can:
- discover the current replica set
- distribute reads across replicas
Simple approach:
- FabricDns name -> list of replica endpoints (or a per-object name)
- client selects replica by hash (request id) or least-loaded measurement
More advanced:
- coordinator pushes replica set updates to clients via watch-like broadcasts
Walkthrough
1. Detect Hot Object
Detect with any cheap signal:
- QPS over the last N seconds
- p95/p99 latency increase
- origin fetch rate
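As an illustration of the first signal, a sliding-window request counter might look like the following. `HotObjectDetector` and its thresholds are hypothetical; a production detector would more likely use bucketed counters than store one timestamp per request.

```python
from collections import deque


class HotObjectDetector:
    """Counts requests for one object over the last `window_s` seconds."""

    def __init__(self, window_s: float = 10.0, hot_qps: float = 100.0) -> None:
        self.window_s = window_s
        self.hot_qps = hot_qps  # assumed threshold; tune per workload
        self._stamps: deque = deque()

    def record(self, now: float) -> None:
        self._stamps.append(now)

    def qps(self, now: float) -> float:
        cutoff = now - self.window_s
        # Drop timestamps that have fallen out of the window.
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        return len(self._stamps) / self.window_s

    def is_hot(self, now: float) -> bool:
        return self.qps(now) > self.hot_qps
```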
2. Allocate Replica Leases
When scaling out:
- acquire a new memory lease for the object bytes
- copy object bytes into that lease
- register the replica endpoint in discovery (FabricDns or equivalent)
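The ordering of these steps matters: a replica should become discoverable only after its bytes are in place. The sketch below uses in-memory stand-ins to show that ordering. `InMemoryLeasePool`, `InMemoryDiscovery`, `Lease`, and the `lease-<id>` endpoint format are all hypothetical and do not reflect the MemBuilder or FabricDns APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    lease_id: int
    expires_at: float
    data: bytearray


@dataclass
class InMemoryLeasePool:
    """Hypothetical stand-in for a memory-lease provider."""
    ttl_s: float = 60.0
    _next_id: int = 0

    def acquire(self, size: int, now: float) -> Lease:
        self._next_id += 1
        return Lease(self._next_id, now + self.ttl_s, bytearray(size))


@dataclass
class InMemoryDiscovery:
    """Hypothetical stand-in for a discovery service."""
    endpoints: dict[str, list[str]] = field(default_factory=dict)

    def register(self, name: str, endpoint: str) -> None:
        self.endpoints.setdefault(name, []).append(endpoint)


def scale_out(pool: InMemoryLeasePool, discovery: InMemoryDiscovery,
              object_id: str, payload: bytes, now: float) -> Lease:
    lease = pool.acquire(len(payload), now)  # 1. acquire a lease sized for the object
    lease.data[:] = payload                  # 2. copy the object bytes into it
    # 3. publish only after the copy, so readers never see a half-filled replica
    discovery.register(object_id, f"lease-{lease.lease_id}")
    return lease
```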
3. Renew While Hot
Renew replica leases while demand is above threshold. Avoid per-request renewal; instead:
- renew when the remaining time is below 25% of the lease TTL
- apply jitter so all replicas do not renew at once
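A minimal sketch of the renewal rule, assuming jitter is applied by giving each replica its own renewal threshold slightly above 25%, so that jittered replicas only ever renew earlier, never later. The function names and the 10% spread are illustrative assumptions.

```python
import random
from typing import Optional


def should_renew(expires_at: float, ttl_s: float, now: float,
                 threshold: float = 0.25) -> bool:
    """True once less than `threshold` of the TTL remains."""
    return (expires_at - now) < threshold * ttl_s


def jittered_threshold(base: float = 0.25, spread: float = 0.10,
                       rng: Optional[random.Random] = None) -> float:
    """Pick a per-replica threshold between base and base + spread.

    Drawing this once per replica staggers renewals without ever letting
    a lease fall below the base threshold before it is renewed.
    """
    rng = rng or random.Random()
    return base + rng.uniform(0.0, spread)
```

A renewal loop would call `should_renew` on a timer (for example, once per second per replica) instead of on the request path.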
4. Route Reads
Clients select among replicas:
- consistent hashing (stable distribution)
- random choice (good enough)
- least-loaded (requires feedback)
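For the stable-distribution option, one simple technique is rendezvous (highest-random-weight) hashing, used here in place of a hash ring because it needs no extra data structure. The function name `pick_replica` and the choice of SHA-256 are assumptions made for this sketch.

```python
import hashlib


def pick_replica(key: str, replicas: list[str]) -> str:
    """Rendezvous hashing: choose the replica with the highest hash(replica, key)."""
    if not replicas:
        raise ValueError("no replicas available")

    def score(replica: str) -> int:
        digest = hashlib.sha256(f"{replica}|{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(replicas, key=score)
```

When a replica is added, a key either stays where it was or moves to the new replica; when one is removed, only the keys it owned are reassigned. The key can be a request id or a client id, depending on how sticky the routing should be.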
5. Scale Back In
When demand drops below the lower threshold for long enough:
- stop renewing extra replicas
- optionally deregister their endpoints
- let leases expire naturally
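Scale-in is passive: nothing is torn down, the extra leases are simply left to lapse. The sketch below models leases as a mapping from replica id to expiry time. Both helper names are hypothetical, and keeping the oldest replicas first is an arbitrary choice.

```python
def replicas_to_keep(replica_ids: list[str], target: int) -> set[str]:
    """Keep the first `target` replicas (at least one); the rest stop renewing."""
    return set(replica_ids[:max(target, 1)])


def sweep_expired(leases: dict[str, float], now: float) -> list[str]:
    """Remove leases whose expiry time has passed and return their ids."""
    expired = [rid for rid, expires_at in leases.items() if expires_at <= now]
    for rid in expired:
        del leases[rid]
    return expired


leases = {"r1": 100.0, "r2": 100.0, "r3": 100.0}
keep = replicas_to_keep(list(leases), target=1)
for rid in keep:
    leases[rid] = 90.0 + 60.0  # only the kept replica is renewed at t=90
gone = sweep_expired(leases, now=100.0)  # r2 and r3 lapse on their own
```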
Failure Modes
- Replica lease expires unexpectedly: client falls back to another replica or origin.
- Discovery staleness: clients may attempt dead replicas; implement quick failover and retry.
- Thrash: fix with hysteresis and cooldown.
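The first two failure modes share one client-side remedy: try another replica, then the origin. A minimal sketch, assuming the caller supplies `fetch` and `fetch_origin` callables (both hypothetical) and that a dead replica surfaces as `ConnectionError` or `TimeoutError`:

```python
from typing import Callable, Sequence


def read_with_failover(replicas: Sequence[str],
                       fetch: Callable[[str], bytes],
                       fetch_origin: Callable[[], bytes],
                       max_attempts: int = 2) -> bytes:
    """Try up to `max_attempts` replicas in order, then fall back to origin."""
    for endpoint in replicas[:max_attempts]:
        try:
            return fetch(endpoint)
        except (ConnectionError, TimeoutError):
            continue  # stale discovery entry or expired lease: try the next one
    return fetch_origin()
```

Capping the number of replica attempts bounds the added latency when discovery is badly stale.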
Observability
Track:
- replicas_active{object_id=...}
- replica_scale_out_total, replica_scale_in_total
- per-object hit rate and origin fetch rate
- lease churn and renewal errors
Variations
- stripe an object across multiple leases for parallel reads (for very large objects)
- multi-region: maintain independent replica sets per locality domain
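For the striping variation, the main bookkeeping is the mapping from stripe index to byte range. A small sketch, assuming contiguous stripes of near-equal size, one per lease; `stripe_ranges` is a hypothetical helper.

```python
def stripe_ranges(size: int, stripes: int) -> list[tuple[int, int]]:
    """Split [0, size) into contiguous (start, end) ranges whose lengths differ by at most 1."""
    if stripes < 1:
        raise ValueError("stripes must be >= 1")
    base, extra = divmod(size, stripes)
    ranges = []
    start = 0
    for i in range(stripes):
        end = start + base + (1 if i < extra else 0)  # first `extra` stripes get one more byte
        ranges.append((start, end))
        start = end
    return ranges
```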