Tag: gpu
Recipes that run on a leased GPU. Includes single-tenant CUDA basics, shared inference servers, and the revocation/recovery story for mid-decode lease loss.
20 recipes carry this tag, ordered by recipe number:
- Recipe 5 — An Inference Server That Shares GPUs Without Containers · gpu-and-inference/shared-inference
- Recipe 11 — A Development Environment That Leases a GPU for Five Minutes · gpu-and-inference/basics
- Recipe 18 — Borrowed GPU Studio · gpu-and-inference/basics
- Recipe 29 — Running a CUDA Kernel on a Leased GPU · gpu-and-inference/basics
- Recipe 30 — Leasing a Slice of a GPU for a Multi-Kernel Workload · gpu-and-inference/basics
- Recipe 32 — 1000 GPUs for One Second · gpu-and-inference/basics
- Recipe 33 — GPU as a Pure Function Call · gpu-and-inference/basics
- Recipe 34 — Parallel GPU Sessions for Multi-Kernel Burst · gpu-and-inference/basics
- Recipe 52 — Clean Preemptible GPU Training Job · gpu-and-inference/revocation-and-recovery
- Recipe 53 — Per-Request Leased Inference Gateway · gpu-and-inference/shared-inference
- Recipe 56 — GPU Generation Targeting · gpu-and-inference/shared-inference
- Recipe 61 — Verifying Engine Output Against a Canonical Reference · gpu-and-inference/correctness-and-memory
- Recipe 62 — Loading an LLM Without f32-at-Load Memory Blowup · gpu-and-inference/correctness-and-memory
- Recipe 63 — Handling Mid-Kernel Lease Revocation in a Decode Loop · gpu-and-inference/revocation-and-recovery
- Recipe 64 — Detecting a FENCED Lease State After Revocation · gpu-and-inference/revocation-and-recovery
- Recipe 65 — Composing Two Fabric-Leased Engines for Speculative Decode · gpu-and-inference/correctness-and-memory
- Recipe 66 — Batching Four Tenants Into One Decode Forward Pass · gpu-and-inference/shared-inference
- Recipe 67 — Hot-Rebind Inference Continuity After a Lease Revocation · gpu-and-inference/revocation-and-recovery
- Recipe 68 — Continuous Batching With Per-Tenant Quotas · gpu-and-inference/shared-inference
- Recipe 69 — Per-Request Audit Attribution for Inference · gpu-and-inference/audit-and-attribution