Tag: gpu

Recipes that run on a leased GPU. They cover single-tenant CUDA basics, shared inference servers, correctness and memory handling, per-request audit attribution, and revocation and recovery after a lease is lost mid-decode.

20 recipes carry this tag, ordered by recipe number:

Recipe 5 — An Inference Server That Shares GPUs Without Containers · gpu-and-inference/shared-inference
Recipe 11 — A Development Environment That Leases a GPU for Five Minutes · gpu-and-inference/basics
Recipe 18 — Borrowed GPU Studio · gpu-and-inference/basics
Recipe 29 — Running a CUDA Kernel on a Leased GPU · gpu-and-inference/basics
Recipe 30 — Leasing a Slice of a GPU for a Multi-Kernel Workload · gpu-and-inference/basics
Recipe 32 — 1000 GPUs for One Second · gpu-and-inference/basics
Recipe 33 — GPU as a Pure Function Call · gpu-and-inference/basics
Recipe 34 — Parallel GPU Sessions for Multi-Kernel Burst · gpu-and-inference/basics
Recipe 52 — Clean Preemptible GPU Training Job · gpu-and-inference/revocation-and-recovery
Recipe 53 — Per-Request Leased Inference Gateway · gpu-and-inference/shared-inference
Recipe 56 — GPU Generation Targeting · gpu-and-inference/shared-inference
Recipe 61 — Verifying Engine Output Against a Canonical Reference · gpu-and-inference/correctness-and-memory
Recipe 62 — Loading an LLM Without f32-at-Load Memory Blowup · gpu-and-inference/correctness-and-memory
Recipe 63 — Handling Mid-Kernel Lease Revocation in a Decode Loop · gpu-and-inference/revocation-and-recovery
Recipe 64 — Detecting a FENCED Lease State After Revocation · gpu-and-inference/revocation-and-recovery
Recipe 65 — Composing Two Fabric-Leased Engines for Speculative Decode · gpu-and-inference/correctness-and-memory
Recipe 66 — Batching Four Tenants Into One Decode Forward Pass · gpu-and-inference/shared-inference
Recipe 67 — Hot-Rebind Inference Continuity After a Lease Revocation · gpu-and-inference/revocation-and-recovery
Recipe 68 — Continuous Batching With Per-Tenant Quotas · gpu-and-inference/shared-inference
Recipe 69 — Per-Request Audit Attribution for Inference · gpu-and-inference/audit-and-attribution
