Pipelines & Workflows

ETL, streams, message bus, MapReduce, event pipelines, durable workflow engines.

When to read this section

You are building something that moves data through stages: an ETL job, an event pipeline, a message bus, a workflow engine. The naive approach (a long-lived daemon with checkpoint files) breaks across upgrades, drains, and provider migrations. These recipes show pipelines whose stages relocate, whose backpressure is real, and whose teardown does not strand resources.
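As a rough illustration of what "stages" and "backpressure" mean here, the sketch below wires three stages together with bounded queues using plain Python asyncio. It is not the API used by the recipes in this section: the names `produce`, `transform`, `consume`, `run_pipeline`, and `SENTINEL` are hypothetical, and the sketch does not attempt stage relocation or resource teardown. It only shows that a full queue makes the upstream stage wait instead of buffering without limit.

```python
import asyncio

# Hypothetical end-of-stream marker; not part of any recipe's API.
SENTINEL = object()


async def produce(out_q, items):
    """Source stage: push items downstream, then signal end of stream."""
    for item in items:
        # put() suspends while the queue is full, so a slow downstream
        # stage slows the producer down. That waiting is the backpressure.
        await out_q.put(item)
    await out_q.put(SENTINEL)


async def transform(in_q, out_q):
    """Middle stage: double each item and forward it."""
    while True:
        item = await in_q.get()
        if item is SENTINEL:
            await out_q.put(SENTINEL)  # propagate shutdown downstream
            return
        await out_q.put(item * 2)


async def consume(in_q, results):
    """Sink stage: collect items until the end-of-stream marker arrives."""
    while True:
        item = await in_q.get()
        if item is SENTINEL:
            return
        results.append(item)


async def run_pipeline(items, maxsize=2):
    """Run the three stages concurrently over bounded queues."""
    q1 = asyncio.Queue(maxsize=maxsize)
    q2 = asyncio.Queue(maxsize=maxsize)
    results = []
    await asyncio.gather(
        produce(q1, items),
        transform(q1, q2),
        consume(q2, results),
    )
    return results


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([1, 2, 3])))
```

The `maxsize` bound is the design choice that matters: with an unbounded queue the producer never waits, and a stalled stage shows up as memory growth rather than as a slowed source.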

Suggested order

  1. Stream Pipeline — the smallest stages-and-queues program.
  2. Event-Driven Workflow Engine in 200 Lines — the smallest durable-state workflow.
  3. MapReduce Job That Finishes With Zero Cleanup — burst compute that owns its lifetime.
  4. A Message Bus That Forgets on a Schedule and Event-Driven Pipeline Without Network Hops — production variants.
  5. ETL Pipeline With Automatic Resource Teardown — the cost-aware variant.
  6. Mirrored ETL Pipeline — the multi-cloud variant.
  7. Self-Healing Stream Pipeline and Self-Healing Pipeline That Moves Mid-Flight — the failure-recovery variants. Read these only after you have a green basic pipeline; they answer “what happens when a stage’s node drains.”

What’s not here

Real-time audio with deterministic latency: see Placement, Scaling & Operations.

Multi-tenant batched inference (a different shape of “pipeline”): see GPU & Inference.