Resource Isolation And Exclusivity Semantics

This note describes a more explicit way to model performance isolation in fabricBIOS and grafOS.

The core idea is:

  • capacity answers “how much hardware is reserved?”
  • execution mode answers “how does the workload run on that hardware?”
  • isolation / exclusivity policy answers “how much interference from other workloads is allowed?”

These concerns are related, but they are not the same thing. The repo already separates lease width from tasklet execution width for CPU tasklets. This document argues that isolation should also be represented explicitly instead of being inferred indirectly from lease size.

Related documents:

  • docs/spec/resource-types.md
  • docs/tasklet-profile-v0.md
  • docs/grafos/tasklet-parallelism-model.md
  • docs/grafos/shared-memory-tasklet-model.md

1. Problem statement

Today it is easy to read the wrong intent into a request such as “lease 4 CPU cores.”

That request could mean several different things:

  • “run my workload in parallel across 4 workers”
  • “reserve enough capacity for 4 separate tasklets”
  • “give my single-threaded tasklet stronger isolation from other tenants”
  • “reserve scheduler headroom because this stage may later fan out”

Those are different intents. A single numeric lease width is a poor proxy for all of them.

The repo has already addressed part of this problem for CPU tasklets:

  • num_cores reserves CPU capacity
  • tasklet execution width stays separate
  • shared-memory tasklets are an explicit execution mode, not an accidental consequence of .cores(n)

That same style should be carried further. Users who want performance predictability should be able to ask for isolation directly rather than smuggling that request through lease width.


2. Design goals

This model should:

  • make user intent explicit
  • reduce accidental over-reservation of hardware
  • preserve the lease-width vs execution-width separation
  • work across CPU and GPU resources
  • let schedulers reason about density vs predictability
  • expose operator-visible policy in inventory and logs

This model should not:

  • turn fabricBIOS into a process scheduler
  • imply POSIX-like threading semantics
  • promise hardware isolation properties that the runtime cannot actually enforce

3. Common model

Every resource request should be decomposable into three axes.

3.1 Capacity

How much of the resource is reserved.

Examples:

  • CPU: num_cores = 4
  • MEM: mem_bytes = 256 MiB
  • GPU: vram_bytes = 8 GiB, compute_slices = 1

3.2 Execution mode

How software is allowed to execute against that capacity.

Examples:

  • CPU tasklet profile v0: single-threaded tasklet execution
  • CPU tasklet profile v1: explicit shared_memory_tasklet
  • GPU: stateless kernel submit vs persistent session vs future graph/session execution modes

3.3 Isolation / exclusivity policy

How much cross-tenant sharing is allowed for the reserved hardware.

Examples:

  • best-effort sharing
  • whole-core exclusivity
  • strict isolated placement
  • GPU session-exclusive
  • GPU device-exclusive

The important rule is:

  • capacity does not imply execution mode
  • capacity does not imply isolation
  • isolation does not imply parallel execution
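The rule above can be made concrete as a type sketch: a request is the product of three independent axes, and no field implies another. None of these names are current fabricBIOS API; they are illustrative only.

```rust
// Hypothetical sketch of the three-axis model. No axis implies another:
// capacity, execution mode, and isolation are set independently.

#[derive(Debug, Clone, PartialEq)]
enum Capacity {
    CpuCores(u32),
    MemBytes(u64),
    GpuVramBytes(u64),
}

#[derive(Debug, Clone, PartialEq)]
enum ExecutionMode {
    SingleThreadedTasklet,
    SharedMemoryTasklet,
    GpuKernelSubmit,
    GpuPersistentSession,
}

#[derive(Debug, Clone, PartialEq)]
enum IsolationClass {
    BestEffort,
    WholeCore,
    StrictIsolated,
    SessionExclusive,
    DeviceExclusive,
}

/// A request carries all three axes explicitly instead of overloading one number.
#[derive(Debug, Clone)]
struct ResourceRequest {
    capacity: Capacity,
    execution: ExecutionMode,
    isolation: IsolationClass,
}

fn main() {
    // Single-threaded but isolation-sensitive: small capacity, strong isolation.
    let req = ResourceRequest {
        capacity: Capacity::CpuCores(1),
        execution: ExecutionMode::SingleThreadedTasklet,
        isolation: IsolationClass::WholeCore,
    };
    println!("{req:?}");
}
```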

4. CPU interpretation

For CPUs, the following distinctions matter.

4.1 Single-threaded but isolation-sensitive work

Some workloads cannot parallelize well, but still benefit from reduced interference:

  • low-jitter control loops
  • latency-sensitive parsers or codecs
  • cache-sensitive single-threaded compute
  • workloads harmed by SMT sibling contention

Those workloads may want:

  • execution width = 1
  • capacity = small
  • isolation = strong

That intent is better expressed as an isolation policy than as “lease more cores and hope that implies exclusivity.”

4.2 CPU policy classes

The current Linux path already uses CPU isolation policies such as:

  • BestEffort
  • WholeCore
  • StrictIsolated

That is the right general shape. Over time, CPU requests should read more like:

  • capacity: 1 core
  • execution mode: single-threaded tasklet
  • isolation: whole_core

or:

  • capacity: 8 cores
  • execution mode: shared_memory_tasklet
  • isolation: strict_isolated
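One useful property of these policy classes is that they form a strictness ordering, which admission logic can compare directly. The sketch below assumes that ordering; the `node_can_serve` helper is hypothetical, not repo code.

```rust
// Sketch (not repo code): CPU isolation classes ordered by strictness,
// so admission logic can compare a request against what a node enforces.
// Deriving Ord on a fieldless enum compares by declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CpuIsolation {
    BestEffort,     // reserved capacity may share cores/siblings with other tenants
    WholeCore,      // reserved cores (including SMT siblings) belong to this lease
    StrictIsolated, // whole cores plus stronger placement/topology constraints
}

/// Fail closed: a node may serve a request only if it can enforce a class
/// at least as strict as the one requested.
fn node_can_serve(node_max: CpuIsolation, requested: CpuIsolation) -> bool {
    node_max >= requested
}

fn main() {
    assert!(node_can_serve(CpuIsolation::StrictIsolated, CpuIsolation::WholeCore));
    assert!(!node_can_serve(CpuIsolation::BestEffort, CpuIsolation::WholeCore));
}
```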

4.3 Bare-metal consequence

On runtimes that still execute tasklets single-threaded, leasing multiple CPU cores should not be the only way to ask for stronger performance isolation.

If the runtime cannot turn wider capacity into useful execution width, the user should still have a clear way to ask for:

  • exclusive core ownership
  • no cross-tenant sibling sharing
  • stronger cache/topology constraints

That argues for explicit isolation/exclusivity fields rather than only wider leases.

Bare-metal semantics locked: see docs/spec/bare-metal-cpu-lease-semantics.md. Bare-metal runtimes reject wider-than-execution-width CPU leases for single-threaded tasklets; clients use TLV_LEASE_CPU_ISOLATION or rely on the daemon-wide --cpu-isolation-policy default. Linux is unaffected.
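The locked rejection rule can be sketched as a validation step. The error type and function names here are illustrative, not the actual daemon code; only the rule itself (reject wider-than-execution-width CPU leases for single-threaded tasklets) comes from the spec.

```rust
// Sketch of the locked bare-metal rule (names hypothetical): a runtime that
// executes tasklets single-threaded rejects CPU leases wider than the
// execution width instead of silently treating the extra cores as isolation.

#[derive(Debug, PartialEq)]
enum LeaseError {
    /// Requested lease width exceeds what a single-threaded tasklet can use.
    WiderThanExecutionWidth { requested: u32, execution_width: u32 },
}

fn validate_bare_metal_cpu_lease(requested_cores: u32) -> Result<(), LeaseError> {
    const EXECUTION_WIDTH: u32 = 1; // single-threaded tasklet runtime
    if requested_cores > EXECUTION_WIDTH {
        // Fail closed: the caller should request isolation explicitly
        // (e.g. via a per-lease isolation field) rather than wider capacity.
        return Err(LeaseError::WiderThanExecutionWidth {
            requested: requested_cores,
            execution_width: EXECUTION_WIDTH,
        });
    }
    Ok(())
}

fn main() {
    assert!(validate_bare_metal_cpu_lease(1).is_ok());
    assert!(validate_bare_metal_cpu_lease(4).is_err());
}
```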


5. GPU interpretation

The same conceptual split applies to GPUs.

5.1 Capacity

Examples:

  • VRAM bytes
  • compute partitions / slices
  • queue slots or session slots

5.2 Execution mode

Examples:

  • one-shot kernel submit
  • persistent GPU session
  • future multi-stage session or command graph mode

5.3 Isolation / exclusivity

Examples:

  • shared: device may multiplex other tenants
  • session-exclusive: a session has exclusive residency/state for its lifetime
  • device-exclusive: one tenant gets the whole accelerator
  • future partition-exclusive: one tenant gets an isolated hardware partition if the device exposes one

This is more honest than trying to encode GPU isolation indirectly through raw VRAM or “number of GPUs” alone.

GPU wire shape locked: see docs/spec/gpu-exclusivity-wire-format.md. Per-lease GPU exclusivity is carried as TLV_LEASE_GPU_EXCLUSIVITY (0x0903) on the existing LeaseAllocRequest params blob. Absent TLV inherits daemon-wide --gpu-share-mode. Unsupported classes fail closed. Daemon mode is a permission envelope, not a default-only hint.
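A minimal sketch of the two GPU-side mechanics follows. The tag value 0x0903 comes from the wire-format note; the length/value layout, the class byte values, and the envelope-comparison rule shown here are illustrative assumptions, not the locked encoding.

```rust
// Sketch only: appending a per-lease GPU exclusivity TLV to a params blob,
// and treating the daemon-wide share mode as a permission envelope rather
// than a default-only hint.

#[derive(Debug, Clone, Copy, PartialEq)]
enum GpuExclusivity {
    Shared = 0,
    SessionExclusive = 1,
    DeviceExclusive = 2,
}

const TLV_LEASE_GPU_EXCLUSIVITY: u16 = 0x0903;

/// Append a [tag: u16 LE][len: u16 LE][value] record (assumed layout).
fn append_gpu_exclusivity(params: &mut Vec<u8>, class: GpuExclusivity) {
    params.extend_from_slice(&TLV_LEASE_GPU_EXCLUSIVITY.to_le_bytes());
    params.extend_from_slice(&1u16.to_le_bytes()); // value length
    params.push(class as u8);
}

/// Permission envelope: a lease may request a class only if it is no more
/// exclusive than what the daemon-wide mode allows.
fn envelope_allows(daemon_mode: GpuExclusivity, requested: GpuExclusivity) -> bool {
    (requested as u8) <= (daemon_mode as u8)
}

fn main() {
    let mut params = Vec::new();
    append_gpu_exclusivity(&mut params, GpuExclusivity::SessionExclusive);
    assert_eq!(&params[..2], &0x0903u16.to_le_bytes());

    assert!(envelope_allows(GpuExclusivity::DeviceExclusive, GpuExclusivity::SessionExclusive));
    assert!(!envelope_allows(GpuExclusivity::Shared, GpuExclusivity::SessionExclusive));
}
```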


6. Inventory and request consequences

If this model is adopted, the system should expose and consume isolation more explicitly.

6.1 Inventory / discovery

Inventory should advertise:

  • what isolation classes exist for each resource type
  • which class is currently active by default
  • whether the runtime can actually enforce the requested class or can only approximate it on a best-effort basis

For CPU, some of this already exists through isolation/topology flags. Equivalent GPU advertisement will be needed as GPU sessions become richer.
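A possible shape for that advertisement is sketched below. All field names are hypothetical; the point is only that supported classes, the active default, and enforceability are distinct facts a node should publish.

```rust
// Hypothetical inventory shape: what a node could advertise per resource
// type so clients and the scheduler do not have to guess. Field names are
// illustrative, not the current inventory schema.

#[derive(Debug, Clone, PartialEq)]
enum Enforcement {
    Enforced,   // the runtime can actually guarantee the class
    BestEffort, // the runtime will try, but cannot guarantee it
}

#[derive(Debug, Clone)]
struct IsolationAdvertisement {
    resource_type: String,          // e.g. "cpu", "gpu"
    supported_classes: Vec<String>, // e.g. ["best_effort", "whole_core"]
    default_class: String,          // class currently active by default
    enforcement: Enforcement,       // enforced vs best effort
}

fn main() {
    let ad = IsolationAdvertisement {
        resource_type: "cpu".into(),
        supported_classes: vec!["best_effort".into(), "whole_core".into()],
        default_class: "best_effort".into(),
        enforcement: Enforcement::Enforced,
    };
    // A request for a class the node does not advertise must fail closed.
    assert!(!ad.supported_classes.contains(&"strict_isolated".to_string()));
}
```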

6.2 Lease requests

Lease requests should be able to express:

  • capacity
  • execution mode where relevant
  • requested isolation / exclusivity class

A request should fail closed when the node cannot honor the requested isolation/exclusivity semantics.

CPU wire shape locked: see docs/spec/cpu-isolation-wire-format.md. Per-lease CPU isolation is carried as TLV_LEASE_CPU_ISOLATION (0x0902) on the existing LeaseAllocRequest params blob, mirroring TLV_LEASE_INTENT_KV_CACHE. Unsupported classes fail closed. Implementation is a separate wave; see §7 of that note.
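On the receiving side, the daemon can resolve the effective isolation class from the params blob. The tag value 0x0902 and the absent-TLV fallback to the daemon-wide default come from the spec notes above; the single-byte value encoding and the class numbering in this sketch are assumptions for illustration.

```rust
// Sketch: resolving the effective CPU isolation for a lease from a
// [tag u16 LE][len u16 LE][value] params blob (assumed record layout).

const TLV_LEASE_CPU_ISOLATION: u16 = 0x0902;

#[derive(Debug, Clone, Copy, PartialEq)]
enum CpuIsolation {
    BestEffort,
    WholeCore,
    StrictIsolated,
}

/// Scan the blob for the CPU isolation TLV. Absent TLV inherits the
/// daemon-wide default; an unknown class byte fails closed (None).
fn effective_isolation(params: &[u8], daemon_default: CpuIsolation) -> Option<CpuIsolation> {
    let mut i = 0;
    while i + 4 <= params.len() {
        let tag = u16::from_le_bytes([params[i], params[i + 1]]);
        let len = u16::from_le_bytes([params[i + 2], params[i + 3]]) as usize;
        let value = params.get(i + 4..i + 4 + len)?;
        if tag == TLV_LEASE_CPU_ISOLATION {
            return match *value.first()? {
                0 => Some(CpuIsolation::BestEffort),
                1 => Some(CpuIsolation::WholeCore),
                2 => Some(CpuIsolation::StrictIsolated),
                _ => None, // unsupported class: fail closed
            };
        }
        i += 4 + len;
    }
    Some(daemon_default) // TLV absent: inherit the daemon-wide policy
}

fn main() {
    // TLV present: tag 0x0902, len 1, value 1 (WholeCore in this sketch).
    let params = [0x02, 0x09, 0x01, 0x00, 0x01];
    assert_eq!(
        effective_isolation(&params, CpuIsolation::BestEffort),
        Some(CpuIsolation::WholeCore)
    );
    // TLV absent: the daemon default wins.
    assert_eq!(
        effective_isolation(&[], CpuIsolation::BestEffort),
        Some(CpuIsolation::BestEffort)
    );
}
```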

6.3 Scheduler / admission

Schedulers should be able to trade:

  • density
  • fairness
  • predictability

without guessing user intent from lease width alone.

Scheduler policy locked: see docs/spec/scheduler-isolation-policy.md. Adopts a filter→score→adapt pipeline shared with Phase 48.7 affinity. Isolation is filter-only (binary), not scored. Tenant priority is orthogonal to per-lease isolation. Rejection reasons are structured and distinguish permanent (no node supports class) from transient (contended) from policy (daemon-mode conflict). Inventory advertisement is a hard prerequisite — until nodes advertise supported classes, the scheduler fails closed on any request stricter than BestEffort/Shared.
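The filter-only behavior and the three rejection categories can be sketched as below. The `Node` fields and `filter` function are illustrative types, not the scheduler's actual data model; only the binary-filter rule and the permanent/transient/policy distinction come from the locked policy.

```rust
// Sketch of the locked scheduler behavior: isolation is a binary filter
// (never a score), and rejections are structured so callers can tell
// permanent from transient from policy failures.

#[derive(Debug, PartialEq)]
enum Rejection {
    Permanent, // no node supports the requested class
    Transient, // supporting nodes exist but are currently contended
    Policy,    // daemon-wide mode forbids the requested class
}

#[derive(Debug)]
struct Node {
    supports_class: bool,    // advertised support for the requested class
    daemon_mode_allows: bool, // daemon-wide mode permits the class
    has_free_capacity: bool, // currently uncontended
}

/// Filter step only: a node either passes or it does not; isolation never
/// contributes to scoring. Returns the index of the first passing node.
fn filter(nodes: &[Node]) -> Result<usize, Rejection> {
    if !nodes.iter().any(|n| n.supports_class) {
        return Err(Rejection::Permanent);
    }
    if !nodes.iter().any(|n| n.supports_class && n.daemon_mode_allows) {
        return Err(Rejection::Policy);
    }
    nodes
        .iter()
        .position(|n| n.supports_class && n.daemon_mode_allows && n.has_free_capacity)
        .ok_or(Rejection::Transient)
}

fn main() {
    let contended = [Node {
        supports_class: true,
        daemon_mode_allows: true,
        has_free_capacity: false,
    }];
    assert_eq!(filter(&contended), Err(Rejection::Transient));
}
```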


7. SDK consequences

The SDK should not force users to communicate isolation indirectly via lease width if that is not what they mean.

For example, the long-term CPU API should trend toward something like:

  CpuBuilder::new()
      .single_core()
      .isolation(IsolationPolicy::WholeCore)
      .lease_secs(60)
      .acquire()?;

rather than encouraging:

  CpuBuilder::new()
      .cores(4) // hoping this means "give my single thread a quieter core"
      .lease_secs(60)
      .acquire()?;

Likewise for GPU, session exclusivity should eventually be requested directly instead of inferred from larger capacity asks.
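A GPU analogue of the CPU builder might look like the sketch below. `GpuBuilder` and its methods do not exist yet; this is only a direction, with exclusivity as a named field rather than something inferred from a large VRAM ask.

```rust
// Hypothetical SDK direction for GPUs (no such builder exists today):
// exclusivity is requested by name, separately from capacity.

#[derive(Debug, Clone, Copy, PartialEq)]
enum GpuExclusivity {
    Shared,
    SessionExclusive,
    DeviceExclusive,
}

#[derive(Debug)]
struct GpuBuilder {
    vram_bytes: u64,
    exclusivity: GpuExclusivity,
    lease_secs: u64,
}

impl GpuBuilder {
    fn new() -> Self {
        Self { vram_bytes: 0, exclusivity: GpuExclusivity::Shared, lease_secs: 0 }
    }
    fn vram_bytes(mut self, n: u64) -> Self { self.vram_bytes = n; self }
    fn exclusivity(mut self, e: GpuExclusivity) -> Self { self.exclusivity = e; self }
    fn lease_secs(mut self, s: u64) -> Self { self.lease_secs = s; self }
}

fn main() {
    // Capacity and exclusivity are stated separately and explicitly.
    let req = GpuBuilder::new()
        .vram_bytes(8 * 1024 * 1024 * 1024)
        .exclusivity(GpuExclusivity::SessionExclusive)
        .lease_secs(60);
    assert_eq!(req.exclusivity, GpuExclusivity::SessionExclusive);
}
```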


8. Summary

The repo should evolve toward a common resource model in which:

  • capacity, execution mode, and isolation are distinct axes
  • CPU isolation is requested explicitly instead of inferred from wider single-threaded leases
  • GPU exclusivity follows the same pattern
  • unsupported isolation classes fail closed

This does not require every resource type to implement the same policies. It only requires the system to expose the concept consistently and honestly.


9. Initial follow-on work

The next concrete steps are:

  1. define the common vocabulary in spec/docs
  2. decide how CPU lease requests expose isolation separately from execution width
  3. define the analogous GPU exclusivity vocabulary
  4. align inventory and scheduler reporting with that model
  5. update SDK builders so examples express isolation explicitly

This should be tracked as a dedicated follow-on phase rather than being folded implicitly into unrelated CPU or GPU work.