
Recipe 40 - Replicated Session Continuation

Problem: A user signs in through one allowed cloud or region, then that domain becomes unavailable. The session should continue from another allowed domain without rebuilding a cloud-specific stack of Redis replication, DNS failover, dedupe tables, and stale-write guards.

Solution: Model the session as a replicated resource:

  • append every session command to a replicated log;
  • project the current session view into a replicated map;
  • use the shared idempotency store for request/effect dedupe;
  • save large snapshots through replicated checkpoint/object state;
  • authorize failover domains with PlacementPolicy.

The compiled recipe lives in cookbook/recipe-40-replicated-session-continuation. It uses public grafos-replicated resource handles. There are no mocks, hidden sync loops, or provider fallbacks.

use cookbook_recipe_40_replicated_session_continuation::{
    aws_zone, gcp_region, CrossDomainSessionStore,
};
use grafos_replicated::LogOffset;

let mut sessions = CrossDomainSessionStore::new()?;

// Create the session and write an attribute through the AWS domain.
sessions.create_session_from_domain(
    aws_zone(),
    "cmd-1",
    "sess-1",
    "user-1",
    10,
)?;
let write = sessions.set_attribute_from_domain(
    aws_zone(),
    "cmd-2",
    "sess-1",
    "cart",
    "full",
)?;

// Restore from the GCP domain, requiring the view to have reached the write's offset.
let restored = sessions
    .restore_session_from_domain(gcp_region(), "sess-1", write.offset)?
    .expect("session view is available from the explicitly allowed GCP domain");
assert_eq!(restored.last_event_offset, LogOffset(1));
# Ok::<(), cookbook_recipe_40_replicated_session_continuation::SessionStoreError>(())

Core grafOS API Path

CrossDomainSessionStore is a typed session service over four public replicated resources:

use fabricbios_core::lease::FenceEpoch;
use grafos_replicated::{
    LogicalResourceName, PolicyHash, ReplicaHealth, ReplicaId, ReplicaLocator,
    ReplicaPolicy, ReplicaRole, ReplicaSetLocator, ReplicatedCheckpoint,
    ReplicatedFabricLog, ReplicatedIdempotencyStore, ReplicatedMap,
    ResourceGeneration, SchemaId,
};
use cookbook_recipe_40_replicated_session_continuation::{
    aws_zone, gcp_region, session_placement, SessionEvent, SessionView,
};

// Placement and quorum policy shared by all four resources.
let placement = session_placement();
let replicas = ReplicaPolicy::new(placement)
    .min_replicas(2)
    .write_quorum(2)
    .read_quorum(1);

// Two voting replicas, one in each allowed domain.
let generation = ResourceGeneration(1);
let locator = ReplicaSetLocator::new(
    generation,
    vec![
        ReplicaLocator {
            replica_id: ReplicaId::new("sessions-aws-a"),
            domain: aws_zone(),
            role: ReplicaRole::Voter,
            health: ReplicaHealth::Healthy,
            epoch: FenceEpoch(1),
            content_generation: generation.0,
        },
        ReplicaLocator {
            replica_id: ReplicaId::new("sessions-gcp-central"),
            domain: gcp_region(),
            role: ReplicaRole::Voter,
            health: ReplicaHealth::Healthy,
            epoch: FenceEpoch(1),
            content_generation: generation.0,
        },
    ],
);
let writer_epoch = FenceEpoch(1);

// Session command log.
let events = ReplicatedFabricLog::<SessionEvent>::new(
    LogicalResourceName::new("sessions"),
    SchemaId::new("session-event.v1"),
    writer_epoch,
    replicas.clone(),
    locator.clone(),
)?;

// Materialized session views, keyed by session id.
let views = ReplicatedMap::<String, SessionView>::new(
    LogicalResourceName::new("session-views"),
    SchemaId::new("session-view.v1"),
    writer_epoch,
    replicas.clone(),
    locator.clone(),
)?;

// Request/effect dedupe.
let effects = ReplicatedIdempotencyStore::new(
    LogicalResourceName::new("session-effects"),
    SchemaId::new("session-effect.v1"),
    writer_epoch,
    replicas.clone(),
    locator.clone(),
)?;

// Large snapshots.
let checkpoints = ReplicatedCheckpoint::new(
    LogicalResourceName::new("session-checkpoints"),
    SchemaId::new("session-checkpoint.v1"),
    writer_epoch,
    replicas,
    locator,
    PolicyHash([40; 32]),
)?;
# let _ = (events, views, effects, checkpoints);
# Ok::<(), grafos_replicated::ReplicatedError>(())

Session commands append to ReplicatedFabricLog, materialize into ReplicatedMap, dedupe through ReplicatedIdempotencyStore, and restore large snapshots through ReplicatedCheckpoint. A failover read uses ReadConsistency::AtLeastOffset(write.offset) rather than trusting a stale local view.
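The freshness rule in the paragraph above can be illustrated with a small standard-library model. This is only a sketch of the idea, not the grafos-replicated implementation: `Consistency`, `ReadError`, and `ReplicaView` are hypothetical names invented for this example, and the session view is reduced to a plain string.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a read-consistency level (not the grafos-replicated type).
#[derive(Clone, Copy, Debug)]
pub enum Consistency {
    /// Serve whatever this replica has applied so far.
    Local,
    /// Serve only if this replica has applied the event at this offset or a later one.
    AtLeastOffset(u64),
}

#[derive(Debug, PartialEq)]
pub enum ReadError {
    /// The replica has not yet applied the requested offset.
    NotYetVisible { applied: Option<u64>, required: u64 },
}

/// One domain's materialized copy of the session views (hypothetical model).
#[derive(Default)]
pub struct ReplicaView {
    /// Offset of the last log event applied to `views`, if any.
    applied: Option<u64>,
    views: HashMap<String, String>,
}

impl ReplicaView {
    /// Apply one projected event; this model assumes offsets arrive in order.
    pub fn apply(&mut self, offset: u64, session: &str, view: &str) {
        self.views.insert(session.to_string(), view.to_string());
        self.applied = Some(offset);
    }

    /// Read a session view, refusing to answer from a replica that lags the requested offset.
    pub fn read(&self, session: &str, level: Consistency) -> Result<Option<&str>, ReadError> {
        if let Consistency::AtLeastOffset(required) = level {
            if self.applied.map_or(true, |applied| applied < required) {
                return Err(ReadError::NotYetVisible { applied: self.applied, required });
            }
        }
        Ok(self.views.get(session).map(String::as_str))
    }
}
```

In this model a replica that has applied only offset 0 still answers a `Local` read, but returns `NotYetVisible` for `AtLeastOffset(1)` until the second event has been applied. That is the difference between trusting a stale local view and waiting for a specific committed write.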

What The Recipe Proves

  • A session written in one allowed failure domain can be read from another allowed failure domain once the materialized view has reached the requested log offset.
  • A cloud or region that is not authorized by placement fails closed with DomainUnavailable; the recipe does not fall back to another provider on the caller's behalf.
  • Duplicate session commands replay the original accepted offset and do not append a second event.
  • Checkpoint restore reads the latest replicated checkpoint bytes from another allowed domain.
  • Freshness reads use ReadConsistency::AtLeastOffset, so callers can demand “restore only after this committed session event is visible.”
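The fail-closed behavior in the list above can be sketched as a plain allow-list check. This is an illustration under assumptions, not the recipe's code: `Placement` and `PlacementError` are hypothetical types, and the domain labels used with them are invented. Only the error name `DomainUnavailable` is taken from the text.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
pub enum PlacementError {
    /// The caller's domain is not in the allowed set.
    DomainUnavailable { domain: String },
}

/// Hypothetical allow-list of failure domains.
pub struct Placement {
    allowed: BTreeSet<String>,
}

impl Placement {
    pub fn new<I, S>(domains: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self { allowed: domains.into_iter().map(Into::into).collect() }
    }

    /// Fail closed: an unlisted domain gets an error, never a substitute domain.
    pub fn authorize(&self, domain: &str) -> Result<(), PlacementError> {
        if self.allowed.contains(domain) {
            Ok(())
        } else {
            Err(PlacementError::DomainUnavailable { domain: domain.to_string() })
        }
    }
}
```

The design point is that `authorize` has no fallback branch. A caller arriving from an unlisted domain sees the error and has to decide what to do; the store never reroutes the request on its own.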

Runtime Shape

  1. The ingress handler checks that its domain is authorized by placement.
  2. The command fingerprint is derived from the logical resource name, resource kind, operation, schema id, and canonical JSON command payload.
  3. The idempotency store rejects repeated keys with different fingerprints and replays repeated accepted commands.
  4. The command appends to the replicated log.
  5. The map projection applies the event with CAS against the current session view.
  6. The accepted offset is recorded in the idempotency store.
  7. A failover reader asks for the session view at or beyond the offset it needs.
  8. Optional snapshots publish through the replicated checkpoint/object path.
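Steps 2 through 6 above can be sketched in-process with the standard library. Everything here is hypothetical: `ToySessions`, `View`, `SubmitError`, and the resource-kind string are invented for illustration, and `DefaultHasher` is only a stand-in for whatever hash the recipe actually uses for command fingerprints. The payload string approximates canonical JSON with fixed key order. Placement checks, failover reads, and checkpoints (steps 1, 7, and 8) are left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq)]
pub enum SubmitError {
    /// The idempotency key was already used for a different command.
    FingerprintMismatch,
    /// The projection lost a compare-and-swap against a newer view.
    StaleView,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct View {
    pub attrs: HashMap<String, String>,
    pub last_event_offset: u64,
}

/// Step 2: fingerprint over the fields the recipe lists (stand-in hash, not cryptographic).
pub fn fingerprint(resource: &str, kind: &str, operation: &str, schema: &str, payload: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    (resource, kind, operation, schema, payload).hash(&mut hasher);
    hasher.finish()
}

#[derive(Default)]
pub struct ToySessions {
    log: Vec<(String, String, String)>,   // (session, attribute, value)
    views: HashMap<String, View>,         // session -> current view
    effects: HashMap<String, (u64, u64)>, // idempotency key -> (fingerprint, accepted offset)
}

impl ToySessions {
    pub fn set_attribute(&mut self, key: &str, session: &str, attr: &str, value: &str) -> Result<u64, SubmitError> {
        let payload = format!("{{\"attr\":{:?},\"session\":{:?},\"value\":{:?}}}", attr, session, value);
        let fp = fingerprint("sessions", "log", "set_attribute", "session-event.v1", &payload);

        // Step 3: replay an identical repeat, reject a key reused for a different command.
        if let Some(&(seen_fp, offset)) = self.effects.get(key) {
            return if seen_fp == fp { Ok(offset) } else { Err(SubmitError::FingerprintMismatch) };
        }

        // Step 4: append to the log; the new event's offset is its index.
        let offset = self.log.len() as u64;
        self.log.push((session.to_string(), attr.to_string(), value.to_string()));

        // Step 5: project into the view with CAS against the version read here.
        let expected = self.views.get(session).map(|v| v.last_event_offset);
        let mut next = self.views.get(session).cloned().unwrap_or_default();
        next.attrs.insert(attr.to_string(), value.to_string());
        next.last_event_offset = offset;
        if !self.cas_view(session, expected, next) {
            return Err(SubmitError::StaleView);
        }

        // Step 6: record the accepted offset so later duplicates replay it.
        self.effects.insert(key.to_string(), (fp, offset));
        Ok(offset)
    }

    /// Install `next` only if the stored view is still at the `expected` offset.
    fn cas_view(&mut self, session: &str, expected: Option<u64>, next: View) -> bool {
        let current = self.views.get(session).map(|v| v.last_event_offset);
        if current != expected {
            return false;
        }
        self.views.insert(session.to_string(), next);
        true
    }

    pub fn log_len(&self) -> usize {
        self.log.len()
    }

    pub fn view(&self, session: &str) -> Option<&View> {
        self.views.get(session)
    }
}
```

In this single-threaded model the CAS always succeeds; it is included to show where a concurrent writer would be detected. Submitting the same key and payload twice returns the same offset and leaves the log length unchanged, while reusing a key with a different payload is rejected before anything is appended.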

This recipe is the cross-domain companion to Recipe 27. Recipe 27 remains the lease-local session-renewal pattern; it is not a cross-region or cross-provider continuation contract by itself.

Tests

Run it with:

cargo test -p cookbook-recipe-40-replicated-session-continuation

The tests cover:

  • AWS-to-GCP session resume under explicit placement;
  • disallowed provider fail-closed behavior;
  • duplicate command replay;
  • replicated checkpoint restore;
  • read freshness enforcement;
  • real SHA-256 content hashes.

See also:

  • docs/cookbook/recipe-27-resilient-session-store.md
  • crates/grafos-replicated