Recipe 41 - Durable Async Report API
What You Build
Build a report API that accepts long-running work without losing requests during an availability-zone, region, or provider failure.
The service has three ordinary pieces:
- POST /reports accepts a request and returns a receipt only after the command is quorum-committed.
- GET /reports/{request_id} reads status from replicated state, with a freshness requirement tied to the receipt offset.
- Worker processes in explicitly allowed domains complete accepted requests and record the external effect once.
This replaces the usual pile of cloud queue, status database, dedupe table, worker checkpoint, and failover script with one replicated-resource design: ordered commands, map-backed status, idempotent acceptance, idempotent effects, and explicit placement.
The compiled recipe lives in cookbook/recipe-41-durable-async-api and uses
public grafos-replicated handles. There are no mocks, hidden sync loops, or
provider fallbacks.
Program
    use cookbook_recipe_41_durable_async_api::{
        aws_zone, gcp_region, AsyncApiError, ReportApiService, ReportSubmission, RequestState,
    };

    fn main() -> Result<(), AsyncApiError> {
        let mut service = ReportApiService::cross_provider()?;

        let receipt = service.submit_report(
            aws_zone(),
            ReportSubmission {
                request_id: "req-2026-04-acct-1".into(),
                account_id: "acct-1".into(),
                month: "2026-04".into(),
            },
        )?;

        assert_eq!(receipt.status_path, "/reports/req-2026-04-acct-1");

        service.worker_tick(gcp_region(), "worker-gcp")?;

        let status = service
            .read_status(gcp_region(), &receipt)?
            .expect("request status");

        assert!(matches!(status.state, RequestState::Completed { .. }));
        Ok(())
    }

The important part is the receipt. It carries the committed log offset. A caller that retries against another allowed domain can ask for status at that offset instead of accepting a stale local projection.
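The freshness rule can be illustrated with a std-only model. This is a minimal sketch, not the grafos-replicated API: Receipt, Projection, ReadError, and read_at_least are hypothetical names, and the projection is a plain in-memory map. It only shows why a replica that has not yet applied the receipt's offset should refuse to answer rather than return whatever it has.

```rust
use std::collections::HashMap;

/// Hypothetical receipt: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Receipt {
    request_id: String,
    accepted_offset: u64,
}

/// Hypothetical per-domain projection of the status map.
#[derive(Debug, Default)]
struct Projection {
    /// Highest log offset this replica has applied, if any.
    applied_offset: Option<u64>,
    status: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
enum ReadError {
    /// The replica has not caught up to the offset the caller requires.
    NotYetAtOffset { required: u64, applied: Option<u64> },
}

impl Projection {
    /// Apply one committed entry at `offset` (assumed to arrive in order).
    fn apply(&mut self, offset: u64, request_id: &str, state: &str) {
        self.status.insert(request_id.to_string(), state.to_string());
        self.applied_offset = Some(offset);
    }

    /// Answer only if this replica has applied at least the receipt's offset.
    fn read_at_least(&self, receipt: &Receipt) -> Result<Option<&str>, ReadError> {
        match self.applied_offset {
            Some(applied) if applied >= receipt.accepted_offset => {
                Ok(self.status.get(&receipt.request_id).map(String::as_str))
            }
            applied => Err(ReadError::NotYetAtOffset {
                required: receipt.accepted_offset,
                applied,
            }),
        }
    }
}
```

A caller holding a receipt with accepted_offset 3 gets NotYetAtOffset from a fresh replica, and the recorded status once that replica has applied offset 3 or later.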
Core grafOS API Path
The service facade is built from a replicated command log, a replicated status map, and an idempotency store:
    use fabricbios_core::lease::FenceEpoch;
    use grafos_replicated::{
        LogicalResourceName, ReplicatedFabricLog, ReplicatedIdempotencyStore, ReplicatedMap,
        SchemaId,
    };
    use cookbook_recipe_41_durable_async_api::{
        cross_provider_profile, ApiCommand, RequestRecord,
    };

    let profile = cross_provider_profile();
    let writer_epoch = FenceEpoch(1);
    let replicas = profile.replica_policy;
    let locator = profile.locator;

    let commands = ReplicatedFabricLog::<ApiCommand>::new(
        LogicalResourceName::new("report-requests"),
        SchemaId::new("report-command.v1"),
        writer_epoch,
        replicas.clone(),
        locator.clone(),
    )?;

    let requests = ReplicatedMap::<String, RequestRecord>::new(
        LogicalResourceName::new("report-request-status"),
        SchemaId::new("report-status.v1"),
        writer_epoch,
        replicas.clone(),
        locator.clone(),
    )?;

    let effects = ReplicatedIdempotencyStore::new(
        LogicalResourceName::new("report-effects"),
        SchemaId::new("report-effect.v1"),
        writer_epoch,
        replicas,
        locator,
    )?;
    # let _ = (commands, requests, effects);
    # Ok::<(), grafos_replicated::ReplicatedError>(())

Submission reserves an idempotency key, appends an ApiCommand, writes
Accepted into the status map, and completes the idempotency record with the
accepted offset. Status reads use ReadConsistency::AtLeastOffset from the
receipt, which is the grafOS detail the HTTP-shaped facade is meant to teach.
Service Flow
- Ingress checks that its failure domain is allowed by the service placement policy.
- The request id and report payload produce a canonical fingerprint.
- The idempotency store reserves the request id.
- The command appends to ReplicatedFabricLog<ApiCommand>.
- ReplicatedMap<String, RequestRecord> records Accepted at the committed offset.
- The idempotency record is completed with the accepted offset.
- The API returns a receipt with request_id, accepted_offset, duplicate, and status_path.
- Workers scan accepted commands, reserve an effect key, perform the work, and CAS the request status to Completed.
- Status reads use ReadConsistency::AtLeastOffset(receipt.accepted_offset).
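The submission half of the flow above can be sketched in memory. This is a simplified model under stated assumptions, not the recipe's implementation: Service, Submission, SubmitError, and fingerprint are hypothetical, the log is a Vec, and reserving and completing the idempotency record are collapsed into one step. DefaultHasher stands in for a canonical fingerprint; its output is not guaranteed stable across Rust releases, so a real service would want a fixed, versioned hash. Setting duplicate to true on replay is also an assumption about how the receipt field is used.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, PartialEq, Hash)]
struct Submission {
    request_id: String,
    account_id: String,
    month: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Receipt {
    request_id: String,
    accepted_offset: u64,
    duplicate: bool,
    status_path: String,
}

#[derive(Debug, PartialEq)]
enum SubmitError {
    /// Same request id, different payload.
    IdempotencyConflict,
}

#[derive(Default)]
struct Service {
    /// Stand-in for the ordered command log; the index is the offset.
    log: Vec<Submission>,
    /// request id -> (payload fingerprint, accepted offset).
    idempotency: HashMap<String, (u64, u64)>,
    /// request id -> status label.
    status: HashMap<String, &'static str>,
}

/// Stand-in fingerprint over the request id and payload.
fn fingerprint(submission: &Submission) -> u64 {
    let mut hasher = DefaultHasher::new();
    submission.hash(&mut hasher);
    hasher.finish()
}

impl Service {
    fn submit(&mut self, submission: Submission) -> Result<Receipt, SubmitError> {
        let fp = fingerprint(&submission);
        let request_id = submission.request_id.clone();
        let status_path = format!("/reports/{}", request_id);

        // Retry path: replay the recorded offset, or reject a changed payload.
        if let Some((seen_fp, offset)) = self.idempotency.get(&request_id).copied() {
            if seen_fp != fp {
                return Err(SubmitError::IdempotencyConflict);
            }
            return Ok(Receipt { request_id, accepted_offset: offset, duplicate: true, status_path });
        }

        // First acceptance: append, record Accepted, remember the offset.
        let offset = self.log.len() as u64;
        self.log.push(submission);
        self.status.insert(request_id.clone(), "Accepted");
        self.idempotency.insert(request_id.clone(), (fp, offset));
        Ok(Receipt { request_id, accepted_offset: offset, duplicate: false, status_path })
    }
}
```

Submitting the same Submission twice leaves one entry in the log and returns the same accepted_offset; changing the month under the same request id yields IdempotencyConflict.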
Placement Variants
The same service code can run under different resilience envelopes. The recipe crate exposes these profiles so the policy choice is visible in code:
    use cookbook_recipe_41_durable_async_api::{
        cross_provider_profile, cross_region_profile, multi_az_profile, single_az_profile,
        AsyncApiError, DurableAsyncApi,
    };

    fn main() -> Result<(), AsyncApiError> {
        let single_az = DurableAsyncApi::from_profile(single_az_profile())?;
        let multi_az = DurableAsyncApi::from_profile(multi_az_profile())?;
        let cross_region = DurableAsyncApi::from_profile(cross_region_profile())?;
        let cross_provider = DurableAsyncApi::from_profile(cross_provider_profile())?;
        let _ = (single_az, multi_az, cross_region, cross_provider);
        Ok(())
    }

- single_az_profile() is useful for local development or a deliberately narrow deployment. A worker in another AZ is refused.
- multi_az_profile() allows movement inside one AWS region across distinct availability zones.
- cross_region_profile() allows movement between AWS regions.
- cross_provider_profile() allows AWS ingress and GCP workers because the program explicitly authorized both providers.
Placement is authorization, not a suggestion. A request for a domain outside
the profile fails closed with DomainUnavailable; the recipe does not try
another cloud behind the caller’s back.
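A fail-closed placement check can be sketched as a plain allow-list lookup. The types below (FailureDomain, PlacementProfile, PlacementError) are hypothetical stand-ins rather than the recipe crate's definitions; the point is only that an unlisted domain produces an error and no alternative domain is substituted.

```rust
use std::collections::HashSet;

/// Hypothetical failure-domain identifier.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct FailureDomain {
    provider: &'static str,
    location: &'static str,
}

#[derive(Debug, PartialEq)]
enum PlacementError {
    /// The requested domain is not in the profile.
    DomainUnavailable(FailureDomain),
}

/// Hypothetical profile: the set of domains the program authorized.
struct PlacementProfile {
    allowed: HashSet<FailureDomain>,
}

impl PlacementProfile {
    fn new(allowed: impl IntoIterator<Item = FailureDomain>) -> Self {
        Self { allowed: allowed.into_iter().collect() }
    }

    /// Fail closed: either the exact domain is authorized or the call errors.
    /// There is deliberately no "nearest allowed domain" fallback.
    fn authorize(&self, domain: &FailureDomain) -> Result<(), PlacementError> {
        if self.allowed.contains(domain) {
            Ok(())
        } else {
            Err(PlacementError::DomainUnavailable(domain.clone()))
        }
    }
}
```

With a profile that lists one AWS zone and one GCP region, authorize succeeds for those two and returns DomainUnavailable for anything else, which is the behavior ingress and worker ticks rely on.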
Failure Behavior
- Client retry after timeout: submitting the same request id and payload returns the original receipt and does not append another command.
- Changed retry payload: submitting the same request id with a different payload fails closed with IdempotencyConflict.
- Ingress domain unavailable: POST /reports fails with DomainUnavailable.
- Worker in an unauthorized domain: the worker tick fails with DomainUnavailable; the scheduler is not asked to find a different provider.
- Worker repeats completed work: no second result is produced because the status map is already Completed and the effect key is idempotent.
- Failover status read is stale: AtLeastOffset fails until the projection has reached the receipt’s accepted offset.
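The "worker repeats completed work" case can be modeled with an effect-key set and a compare-and-swap on the status. This is a sketch under assumptions: Worker, State, and the "report-store/" key prefix are hypothetical, and the external effect is reduced to a counter so the once-only property is observable.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
enum State {
    Accepted,
    Completed { result: String },
}

#[derive(Default)]
struct Worker {
    /// Reserved effect keys; inserting an existing key is a no-op.
    effects: HashSet<String>,
    status: HashMap<String, State>,
    /// Counts how many times the external effect actually ran.
    reports_written: u32,
}

impl Worker {
    /// Compare-and-swap: replace the status only if it still equals `expected`.
    fn cas(&mut self, request_id: &str, expected: &State, next: State) -> bool {
        match self.status.get_mut(request_id) {
            Some(current) if *current == *expected => {
                *current = next;
                true
            }
            _ => false,
        }
    }

    /// Returns true only when this tick moved the request to Completed.
    fn tick(&mut self, request_id: &str) -> bool {
        if self.status.get(request_id) != Some(&State::Accepted) {
            return false; // unknown or already completed: nothing to do
        }
        // The effect key names the external system and the request.
        let effect_key = format!("report-store/{}", request_id);
        if self.effects.insert(effect_key) {
            self.reports_written += 1; // the external effect runs at most once
        }
        let result = format!("report for {}", request_id);
        self.cas(request_id, &State::Accepted, State::Completed { result })
    }
}
```

A second tick for the same request returns false and leaves reports_written at 1. If a worker had reserved the key but stopped before the CAS, a later tick would still complete the status without repeating the effect.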
Run And Verify
Run the compiled recipe:
    cargo test -p cookbook-recipe-41-durable-async-api

The tests prove:
- POST /reports style submission, worker completion, and status read through the ReportApiService facade;
- accepted work can complete from another explicitly allowed cloud;
- placement profiles change the allowed failure domains without changing service logic;
- duplicate request receipt replay;
- changed duplicate payload fail-closed behavior;
- disallowed ingress fail-closed behavior;
- completed work is not processed twice;
- freshness reads wait for the requested committed offset.
Adapt It
Change these knobs first:
- Placement profile: choose single-AZ, multi-AZ, cross-region, or cross-provider based on where the service is allowed to run.
- Quorum: adjust ReplicaPolicy only when the durability/latency tradeoff is intentional.
- Request id: use a stable client idempotency key, not a random server id, if callers retry after network timeouts.
- Effect key: include the external system and request id so a worker retry cannot write the same report twice.
- Freshness: use AtLeastOffset(receipt.accepted_offset) for user-visible reads after failover.
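The quorum knob can be reasoned about with simple arithmetic. The sketch below does not describe how ReplicaPolicy is defined; QuorumPolicy and its fields are hypothetical, and it assumes a write commits once a fixed number of replicas acknowledge it. It shows the tradeoff in numbers: a larger write quorum means more copies exist at commit time, but fewer replicas may be unavailable before writes stall.

```rust
/// Hypothetical quorum policy: `write_quorum` acknowledgements out of `replicas`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct QuorumPolicy {
    replicas: usize,
    write_quorum: usize,
}

impl QuorumPolicy {
    /// Rejects quorums of zero or larger than the replica count.
    fn new(replicas: usize, write_quorum: usize) -> Option<Self> {
        if write_quorum == 0 || write_quorum > replicas {
            None
        } else {
            Some(Self { replicas, write_quorum })
        }
    }

    /// Simple majority: more than half of the replicas.
    fn majority(replicas: usize) -> Option<Self> {
        Self::new(replicas, replicas / 2 + 1)
    }

    /// A write is committed once enough replicas have acknowledged it.
    fn is_committed(&self, acks: usize) -> bool {
        acks >= self.write_quorum
    }

    /// How many replicas can be unreachable while writes still commit.
    fn tolerated_unavailable(&self) -> usize {
        self.replicas - self.write_quorum
    }
}
```

For three replicas, a majority quorum of 2 keeps accepting writes with one replica down, while a quorum of 3 tolerates none.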
See also:
- Recipe 39: Cross-Cloud Order Pipeline
- Recipe 40: Replicated Session Continuation
- crates/grafos-replicated