fabricBIOS Specification v0.1
Abstract
fabricBIOS is a minimal firmware specification for disaggregated computing fabrics. It enables nodes to advertise hardware resources, establish trust, exchange capability tokens, and create lease-based bindings to standard data planes (RDMA, NVMe-oF, vendor GPU fabrics, and future fabrics such as CXL).
fabricBIOS is not an operating system. It exposes resources and enforces access control and lease expiry; policy, scheduling, and placement decisions live above.
The name reflects its role: a “BIOS” for the fabric—the layer below the OS that exports hardware resources over the network.
See Premium Dataplane Methodology for the canonical reference on fabricBIOS’s premium dataplane model (RDMA, NVMe-oF, SR-IOV, GPU, CXL).
1. Design Principles
- Minimal TCB: Small enough to verify, audit, and run in firmware/DPU, or behind a small trusted proxy.
- Mechanism, not policy: Exposes capabilities and enforces safety-critical limits only. Scheduling, isolation, paging, fairness, and optimization are OS concerns.
- Leverage existing data planes: Discover and authorize RDMA/NVMe-oF; do not reimplement bulk transfer.
- Node-addressed, resource-identified: Nodes have IPv6 addresses; resources have stable UUIDs.
- Strong security: Cryptographic signatures, audience binding, anti-replay, and lease-based revocation.
- Lease-oriented: Data-plane bindings are leases with explicit lifetime, renewal, and mandatory teardown.
- Interop-first: Unambiguous wire format, byte order, signature rules, fragmentation behavior, and replay handling.
- Profiles for feasibility: Multiple secure transport profiles accommodate both DPU-class devices and minimal endpoints.
2. Scope
2.1 Responsibilities
| Responsibility | Description |
|---|---|
| Discovery | Advertise node identity, locality, and resource inventory |
| Trust bootstrap | Identity, attestation hooks, fabric enrollment |
| Capability exchange | Mint, attenuate, validate, revoke capability tokens |
| Lease management | Create, renew, revoke leases; enforce expiry |
| Data plane binding | Provide endpoints/credentials for RDMA/NVMe-oF/vendor protocols |
| Safety enforcement | Rate-limit unauthenticated traffic; enforce lease teardown; reject invalid tokens |
2.2 Non-responsibilities
| Not fabricBIOS | Why |
|---|---|
| CPU execution / scheduling | Policy; requires isolation, preemption, accounting |
| Memory paging | Policy; OS decides faulting, caching, placement |
| Filesystem semantics | Policy; fabricBIOS exposes block endpoints |
| QoS fairness | Policy; OS/fabric decides allocation; fabricBIOS may enforce safety limits only |
| Process isolation | OS/hypervisor concern |
| Global optimization | OS planners/optimizers |
CPU resources: fabricBIOS advertises CPU topology/capacity so a higher-layer OS can schedule compute across nodes, but fabricBIOS does not execute workloads.
3. Addressing Model
3.1 Node Addressing
Each node has one primary IPv6 address for the fabricBIOS control endpoint:
    fd00:FABRIC:SITE:NODE::1/64
Conventions:
- Operator allocates a /48 ULA prefix (e.g., fd00:abcd::/48)
- Sites/racks get /56 or /60
- Nodes get /64; ::1 is the fabricBIOS control endpoint
Example:
    fd00:abcd:0001:0001::1   ← Site 1, Node 1
    fd00:abcd:0001:0002::1   ← Site 1, Node 2
    fd00:abcd:0002:0001::1   ← Site 2, Node 1
3.2 Resource Identification
Resources are identified by 128-bit UUIDs carried in protocol payloads.
- UUIDs are transmitted as 16 bytes in canonical RFC 4122 byte order.
- A structured UUID layout is recommended but not required:
    Resource UUID (128 bits):
    ┌────────────────┬────────────────┬────────────────┐
    │   Type (16)    │   Node (48)    │ Instance (64)  │
    └────────────────┴────────────────┴────────────────┘
4. Wire Format
4.1 Common Header
All fabricBIOS messages share a common header.
All multi-byte integer fields are big-endian (network byte order).
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Version    |   Msg Type    |             Flags             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Payload Length                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                        Request ID (64)                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                          Nonce (64)                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Payload (variable)                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                   Signature (Ed25519, 64B)                    |
    |                       (if SIGNED flag)                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Field definitions:
| Field | Size | Description |
|---|---|---|
| Version | 8 bits | Protocol version (0x01) |
| Msg Type | 8 bits | Message type (§4.2) |
| Flags | 16 bits | Flags (§4.3) |
| Payload Length | 32 bits | Payload bytes only (excludes signature) |
| Request ID | 64 bits | Correlation ID; responses echo this |
| Nonce | 64 bits | Anti-replay; semantics depend on flags (§4.5) |
| Frag Offset | 32 bits | If FRAG_V2: byte offset within full payload |
| Frag Total Len | 32 bits | If FRAG_V2: total unfragmented payload size |
| Payload | var | Type-specific payload |
| Signature | 64 bytes | Ed25519 signature if SIGNED flag set |
If FRAG_V2 is unset, Frag Offset and Frag Total Len are omitted from the header.
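The header layout above can be sketched with Python's `struct` module. This is a minimal illustration, not a reference implementation: it assumes flag bit 0 is the least significant bit of the 16-bit Flags field and that the two fragment fields sit directly after Nonce, neither of which the text above states explicitly. The function names are hypothetical.

```python
import struct

# Flag bits from §4.3. ASSUMPTION: bit 0 is the least significant bit.
SIGNED = 1 << 0
FRAG_V2 = 1 << 5
RESERVED_MASK = 0xFFC0  # bits 6-15 MUST be zero

_BASE = struct.Struct(">BBHIQQ")  # version, msg_type, flags, payload_len, request_id, nonce
_FRAG = struct.Struct(">II")      # frag_offset, frag_total_len (present only with FRAG_V2)


def pack_header(msg_type, flags, payload_len, request_id, nonce,
                frag_offset=0, frag_total_len=0, version=0x01):
    """Serialize the common header in network byte order."""
    if flags & RESERVED_MASK:
        raise ValueError("reserved flag bits must be zero")
    out = _BASE.pack(version, msg_type, flags, payload_len, request_id, nonce)
    if flags & FRAG_V2:
        # ASSUMPTION: fragment metadata immediately follows Nonce.
        out += _FRAG.pack(frag_offset, frag_total_len)
    return out


def parse_header(data):
    """Return (fields, header_length); raise ValueError on a malformed header."""
    if len(data) < _BASE.size:
        raise ValueError("truncated header")
    version, msg_type, flags, payload_len, request_id, nonce = _BASE.unpack_from(data)
    if version != 0x01:
        raise ValueError("unsupported version")
    if flags & RESERVED_MASK:
        raise ValueError("reserved flag bits set")
    fields = {"version": version, "msg_type": msg_type, "flags": flags,
              "payload_len": payload_len, "request_id": request_id, "nonce": nonce}
    size = _BASE.size
    if flags & FRAG_V2:
        if len(data) < size + _FRAG.size:
            raise ValueError("truncated fragment metadata")
        fields["frag_offset"], fields["frag_total_len"] = _FRAG.unpack_from(data, size)
        size += _FRAG.size
    return fields, size
```

Under these assumptions the fixed header is 24 bytes, or 32 bytes when FRAG_V2 is set.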
4.1.1 Canonical variable-length encoding (v0)
- bytes := u32 len + len bytes (reject over-limit).
- list := u16 count + repeated items (reject over-limit).
- tlv := u8 type + u16 len + len bytes; unknown TLVs MUST be skippable.
- Optional fields use a u8 present flag (0=absent, 1=present) followed by the field bytes.
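The encoding rules above might be implemented roughly as follows. The size limit `MAX_BYTES` and all function names are hypothetical; the spec only says over-limit values are rejected and leaves the limit to the implementation.

```python
import struct

MAX_BYTES = 64 * 1024  # ASSUMPTION: illustrative limit, not defined by the spec


def encode_bytes(data):
    """bytes := u32 len + len bytes."""
    if len(data) > MAX_BYTES:
        raise ValueError("over limit")
    return struct.pack(">I", len(data)) + data


def decode_bytes(buf, pos=0):
    """Return (value, next_position)."""
    if pos + 4 > len(buf):
        raise ValueError("truncated length")
    (n,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    if n > MAX_BYTES or pos + n > len(buf):
        raise ValueError("bad length")
    return buf[pos:pos + n], pos + n


def encode_tlv(tlv_type, value):
    """tlv := u8 type + u16 len + len bytes."""
    if len(value) > 0xFFFF:
        raise ValueError("TLV value too long")
    return struct.pack(">BH", tlv_type, len(value)) + value


def decode_tlvs(buf, known_types):
    """Parse back-to-back TLVs, skipping any type not in known_types."""
    out, pos = {}, 0
    while pos < len(buf):
        if pos + 3 > len(buf):
            raise ValueError("truncated TLV header")
        tlv_type, n = struct.unpack_from(">BH", buf, pos)
        pos += 3
        if pos + n > len(buf):
            raise ValueError("truncated TLV value")
        if tlv_type in known_types:
            out[tlv_type] = buf[pos:pos + n]
        pos += n  # unknown TLVs are stepped over using their length
    return out


def encode_optional_u64(value):
    """u8 present flag (0 or 1), followed by the field bytes when present."""
    return b"\x00" if value is None else b"\x01" + struct.pack(">Q", value)
```

Because every TLV carries its own length, a parser can skip types it does not understand without knowing their internal layout.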
4.2 Message Types
| Code | Name | Direction | Signed |
|---|---|---|---|
| 0x01 | ANNOUNCE | Node → Fabric | Required |
| 0x02 | SOLICIT | Client → Fabric/Relay | Optional (recommended in routed fabrics) |
| 0x03 | WITHDRAW | Node → Fabric | Required |
| 0x10 | REQUEST | Client → Node | Required (for any op beyond discovery) |
| 0x11 | RESPONSE | Node → Client | Required |
| 0x20 | REVOKE_BROADCAST | Node → Fabric | Required |
4.3 Flags
| Bit | Name | Meaning |
|---|---|---|
| 0 | SIGNED | Signature trailer present |
| 1 | COMPRESSED | Payload is zstd-compressed |
| 2 | CONTINUED | Fragmented message: more fragments follow |
| 3 | FINAL | Fragmented message: last fragment |
| 4 | NONCE_IS_TIMESTAMP | Nonce is UNIX seconds (u64). If unset, Nonce is random u64 |
| 5 | FRAG_V2 | Fragment metadata present (offset + total length) |
| 6–15 | Reserved | MUST be zero |
4.4 Signature and Compression Rules
- If COMPRESSED is set, payload bytes are compressed before signing.
- Receivers MUST verify the signature before attempting decompression or deep parsing.
- Signature is computed over the exact on-wire bytes of header fields + payload (excluding the signature bytes).
- If fragmented, each fragment is signed independently over that fragment’s header+payload.
4.5 Anti-Replay and Nonce Handling
fabricBIOS supports two nonce modes:
Timestamp mode (NONCE_IS_TIMESTAMP=1):
- Nonce is UNIX seconds.
- Receiver MUST reject messages outside an allowed skew window (default ±300s).
- Receiver SHOULD keep a small replay cache (sender_id, request_id, nonce) within the skew window to reject duplicates.
Random mode (NONCE_IS_TIMESTAMP=0):
- Nonce is random u64.
- Receiver MUST maintain a bounded replay cache per sender for at least REPLAY_WINDOW seconds (default 300s) or MAX_NONCES entries (implementation-defined), evicting oldest.
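One way to combine both nonce modes in a single receiver-side check is sketched below. It is a simplification under stated assumptions: it uses one shared cache keyed by (sender_id, request_id, nonce) rather than a separate cache per sender, and the `MAX_NONCES` value is hypothetical. The class and method names are illustrative.

```python
import time
from collections import OrderedDict

SKEW_SECONDS = 300   # default skew window from §4.5
REPLAY_WINDOW = 300  # default replay window from §4.5
MAX_NONCES = 4096    # ASSUMPTION: the spec leaves this implementation-defined


class ReplayGuard:
    """Tracks recently seen (sender_id, request_id, nonce) tuples."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._seen = OrderedDict()  # key -> time first seen, oldest first

    def accept(self, sender_id, request_id, nonce, nonce_is_timestamp):
        now = self._clock()
        if nonce_is_timestamp and abs(now - nonce) > SKEW_SECONDS:
            return False  # outside the allowed skew window
        # Drop entries that have aged out of the replay window.
        while self._seen and next(iter(self._seen.values())) < now - REPLAY_WINDOW:
            self._seen.popitem(last=False)
        key = (sender_id, request_id, nonce)
        if key in self._seen:
            return False  # duplicate within the window
        self._seen[key] = now
        # Keep the cache bounded by evicting the oldest entries.
        while len(self._seen) > MAX_NONCES:
            self._seen.popitem(last=False)
        return True
```

Injecting the clock keeps the logic testable; a production receiver would also need to decide what happens when the size bound evicts entries that are still inside the window.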
4.6 Anti-DoS Requirements
Implementations MUST:
- Rate-limit unsigned messages (≤10/sec per source IP by default).
- For signed messages, perform cheap header sanity checks (version, flags, payload length bounds), then validate signature before parsing payload.
- Reject invalid or replayed nonces (§4.5).
- Drop malformed headers without processing payload.
- Enforce a maximum UDP payload size for discovery/control; large responses MUST use fragmentation flags (§5.6).
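The ordering these requirements imply (cheap header checks, then signature verification, then decompression) can be sketched as below. The `verify` and `decompress` callables are injected placeholders: the spec calls for Ed25519 and zstd, which are not in the Python standard library, so this sketch does not pick a concrete implementation. The payload bound and the choice to reject FRAG_V2 frames are assumptions made to keep the example short.

```python
import struct

MAX_PAYLOAD = 1200  # ASSUMPTION: bound borrowed from the MTU guidance in §5.6
SIGNED, COMPRESSED, FRAG_V2 = 1 << 0, 1 << 1, 1 << 5  # assumes bit 0 is the LSB
HEADER_LEN, SIG_LEN = 24, 64


def receive(datagram, verify, decompress):
    """Return the plaintext payload, or None if the datagram must be dropped.

    verify(signed_bytes, signature) -> bool and decompress(data) -> bytes are
    supplied by the caller (hypothetical hooks for Ed25519 and zstd).
    """
    if len(datagram) < HEADER_LEN:
        return None
    version, _msg_type, flags, payload_len = struct.unpack_from(">BBHI", datagram)
    # Step 1: cheap header sanity checks.
    if version != 0x01 or flags & 0xFFC0 or payload_len > MAX_PAYLOAD:
        return None
    if not flags & SIGNED or flags & FRAG_V2:
        return None  # this sketch handles only signed, unfragmented frames
    end = HEADER_LEN + payload_len
    if len(datagram) != end + SIG_LEN:
        return None
    # Step 2: the signature covers header + payload exactly as sent (§4.4).
    if not verify(datagram[:end], datagram[end:]):
        return None
    # Step 3: only now is the payload decompressed or parsed.
    payload = datagram[HEADER_LEN:end]
    return decompress(payload) if flags & COMPRESSED else payload
```

The point of the ordering is that an attacker who cannot produce a valid signature never reaches the decompressor or the payload parser.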
5. Discovery Protocol
5.1 Transport
Discovery uses UDP port 5700.
5.2 Multicast Groups
fabricBIOS defines well-known IPv6 multicast group IDs (private-by-spec). The group ID encodes ASCII “fbio” (0x66 62 69 6f) plus suffix 0x0001.
- Link-local: ff02::6662:696f:0001
- Site-local: ff05::6662:696f:0001
- Organization-local: ff08::6662:696f:0001
In routed fabrics, multicast may be unavailable; discovery relays are recommended and often required (§5.5).
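As a quick check of how the group ID maps onto the addresses above, the following sketch builds each address from the ASCII bytes "fbio" and the 0x0001 suffix. The function name is hypothetical.

```python
import ipaddress

# ASCII "fbio" (0x66 0x62 0x69 0x6f) followed by the 16-bit suffix 0x0001.
GROUP_ID = b"fbio" + (0x0001).to_bytes(2, "big")


def multicast_group(scope):
    """Build ffXX::6662:696f:0001 for an IPv6 multicast scope value (2, 5, or 8)."""
    prefix = bytes([0xFF, scope])  # 0xFF, then flags nibble 0 and the scope nibble
    return ipaddress.IPv6Address(prefix + bytes(8) + GROUP_ID)
```

For example, `multicast_group(2)` equals `ipaddress.IPv6Address("ff02::6662:696f:0001")`.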
5.3 ANNOUNCE
ANNOUNCE is sent on boot, periodically (default 30s), and on resource change. Under the normative default profiles in §8.1-§8.4, ANNOUNCE MUST be signed. The only exception is the explicit trusted-fabric exception profile in §8.5.
ANNOUNCE Payload:
    node_id         : u128
    node_addr       : 16B IPv6
    fabric_id       : u64
    sequence        : u64 (monotonic per node)
    locality        : LocalityInfo
    attestation_tlv : TLV (optional)
    resources       : ResourceSummary[]
LocalityInfo:
    rack_id  : u32
    row_id   : u32
    site_id  : u32
    geo_hash : u64 (optional)
    custom   : 32B operator-defined
ResourceSummary:
    resource_id : u128
    type        : u16
    flags       : u16
    capacity    : u64
    available   : u64
    descriptors : DescriptorTLV[]
    endpoints   : EndpointTLV[] (optional; may be omitted and fetched via GET_INVENTORY)
5.4 SOLICIT
SOLICIT queries for resources. Signing SOLICIT is RECOMMENDED outside a single trusted L2 domain.
SOLICIT Payload:
    query_type : u8 (0=all, 1=by_type, 2=by_node, 3=by_locality)
    filters    : Filter[]
Filter:
    field : u8
    op    : u8 (EQ, GT, LT, CONTAINS)
    value : 32B
5.4.1 Filter Field Registry (v0)
Filter field IDs are u8 with the following ranges:
- 0x00 reserved
- 0x01..=0x3F core registry (this document)
- 0x40..=0x7F experimental/extension
- 0x80..=0xFF vendor-specific
Defined fields:
| Field | Name | Value encoding | Ops |
|---|---|---|---|
| 0x01 | RESOURCE_TYPE | value[0..2] = u16 BE resource type code | EQ |
| 0x02 | NODE_ID | value[0..16] = u128 BE | EQ |
| 0x03 | SITE_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x04 | ROW_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x05 | RACK_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x06 | LOCALITY_CUSTOM | value[0..32] opaque | EQ, CONTAINS |
| 0x07 | RESOURCE_FLAGS | value[0..2] = u16 BE bitmask | EQ, CONTAINS |
CONTAINS for LOCALITY_CUSTOM uses non-zero bytes as required matches. RESOURCE_FLAGS
uses a bitmask where FENCED=0x0001 and DEGRADED=0x0002; CONTAINS requires all bits set.
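The two CONTAINS behaviours can be sketched as follows. The LOCALITY_CUSTOM rule is read here as "every non-zero byte of the filter value must equal the byte at the same position in the node's custom field"; that positional reading is an assumption, as are the function names.

```python
import struct

FENCED, DEGRADED = 0x0001, 0x0002  # RESOURCE_FLAGS bits from §5.4.1


def flags_filter_value(mask):
    """Encode a u16 bitmask into the fixed 32-byte filter value, zero-padded."""
    return struct.pack(">H", mask).ljust(32, b"\x00")


def flags_contains(resource_flags, value):
    """CONTAINS on RESOURCE_FLAGS: every bit in the mask must be set."""
    (mask,) = struct.unpack_from(">H", value)
    return (resource_flags & mask) == mask


def custom_contains(custom, value):
    """CONTAINS on LOCALITY_CUSTOM: each non-zero filter byte must match in place.

    ASSUMPTION: matching is positional; zero bytes in the filter act as wildcards.
    """
    return all(v == 0 or c == v for c, v in zip(custom, value))
```

With this reading, a filter of `b"rack"` padded with zeros matches any custom field that starts with those four bytes.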
5.5 Discovery Relay
Multicast is frequently unreliable in routed leaf-spine networks. fabricBIOS supports a unicast discovery relay.
- Anycast relay address (operator-configured): fd00:FABRIC::ffff:1
- Nodes periodically unicast ANNOUNCE to the relay.
- Clients unicast SOLICIT to the relay.
- Relay responds with RESPONSE messages containing aggregated ANNOUNCE payloads (may be chunked).
In routed fabrics, a discovery relay SHOULD be considered required unless the operator provisions and validates multicast routing reliability.
5.5.1 Relay Discovery Profile (RESPONSE)
The relay discovery profile defines how relays encode inventory responses using MsgType::RESPONSE.
This keeps RESPONSE as the message type while specifying a fixed payload layout for interop.
Payload encoding:
    count : u16
    repeated count times:
      announce_payload_bytes : bytes (u32 length + bytes of AnnouncePayload encoding)
Rules:
- Each list entry MUST be a complete AnnouncePayload encoded per this spec.
- Relays MAY split inventory across multiple RESPONSE frames to honor MTU guidance (§5.6).
- Inventory ordering is implementation-defined; clients MUST treat the list as an unordered snapshot and MAY sort by node_id for stable presentation.
- Relays SHOULD include at most one entry per node_id, choosing the most recently observed ANNOUNCE (or highest sequence when available).
- Inventory may be truncated to fit relay caps; absence of a node in a RESPONSE does not imply withdrawal or fencing.
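A relay's inventory payload and its one-entry-per-node rule could look roughly like this. The AnnouncePayload bytes are treated as opaque blobs, and the function names are hypothetical.

```python
import struct


def encode_inventory(announces):
    """announces: list of already-encoded AnnouncePayload byte strings."""
    out = [struct.pack(">H", len(announces))]  # count : u16
    for blob in announces:
        out.append(struct.pack(">I", len(blob)) + blob)  # u32 length + bytes
    return b"".join(out)


def decode_inventory(payload):
    """Return the list of AnnouncePayload blobs carried in a relay RESPONSE."""
    (count,) = struct.unpack_from(">H", payload)
    pos, items = 2, []
    for _ in range(count):
        (n,) = struct.unpack_from(">I", payload, pos)
        pos += 4
        if pos + n > len(payload):
            raise ValueError("truncated entry")
        items.append(payload[pos:pos + n])
        pos += n
    return items


def latest_per_node(observed):
    """observed: iterable of (node_id, sequence, blob); keep the highest sequence."""
    best = {}
    for node_id, sequence, blob in observed:
        if node_id not in best or sequence > best[node_id][0]:
            best[node_id] = (sequence, blob)
    return [blob for _, blob in best.values()]
```

Clients should still treat the decoded list as an unordered snapshot, since the order here is just dictionary insertion order.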
5.5.2 Relay verification requirements
Relays enforce the same replay protections as any other receiver:
- Relays MUST reject replayed or invalid nonces for ANNOUNCE, WITHDRAW, and SOLICIT.
- Relays SHOULD support a pinned trust bundle mode for verifying ANNOUNCE/WITHDRAW signatures. When configured with a trust bundle, relays MUST reject unsigned or invalidly signed frames.
5.5.3 Relay limits and defaults
Relays MUST bound in-memory caches and rate-limit unauthenticated traffic. Recommended defaults (subject to tuning):
- max_nodes_cached=4096
- max_inventory_bytes=1MiB
- replay_max_entries=4096
- rate_limit_max_senders=4096
- unsigned_per_sender_per_sec=10
- sig_verifies_per_sec=2000
Control servers SHOULD use comparable replay and rate-limit defaults for control-plane requests.
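A bounded per-sender limiter for unsigned traffic, using the two relevant defaults above, might be sketched like this. It uses a fixed one-second window rather than a token bucket, which is a design choice of this example and not something the spec prescribes; the class name is hypothetical.

```python
import time
from collections import OrderedDict

UNSIGNED_PER_SENDER_PER_SEC = 10  # recommended default from §5.5.3
RATE_LIMIT_MAX_SENDERS = 4096     # recommended default from §5.5.3


class UnsignedRateLimiter:
    """Fixed one-second window counter per sender, with a bounded sender table."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._windows = OrderedDict()  # sender -> (window start second, count)

    def allow(self, sender):
        second = int(self._clock())
        start, count = self._windows.get(sender, (second, 0))
        if start != second:
            start, count = second, 0  # a new one-second window begins
        if count >= UNSIGNED_PER_SENDER_PER_SEC:
            return False
        self._windows[sender] = (start, count + 1)
        self._windows.move_to_end(sender)
        # Bound memory by forgetting the least recently admitted sender.
        while len(self._windows) > RATE_LIMIT_MAX_SENDERS:
            self._windows.popitem(last=False)
        return True
```

Evicting a sender also resets its counter, so a real relay may want a stricter policy when the table is under pressure from many spoofed sources.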
5.6 Chunking and MTU Guidance
To avoid IP fragmentation, implementations SHOULD limit UDP payloads to ≤1200 bytes. For larger inventories/responses:
- Use CONTINUED and FINAL fragmentation flags plus FRAG_V2 metadata.
- When FRAG_V2 is set, each fragment includes Frag Offset + Frag Total Len.
- FINAL MUST be set on the fragment where offset + len == total_len.
- Receivers MUST reassemble fragments keyed by (sender, request_id) and reject overlaps.
- Fragments with the same key MUST agree on msg_type, nonce, and cleared flags (and total len when present); mismatches cause the in-progress reassembly to be dropped.
- Implementations SHOULD bound in-flight reassembly and drop incomplete entries after a short timeout to avoid unbounded memory growth.
- Capability negotiation: responders MAY emit FRAG_V2 fragments only when the requester advertises FRAG_V2 support; otherwise they fall back to legacy CONTINUED/FINAL fragments.
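The FRAG_V2 rules above can be illustrated with a small splitter and a reassembly buffer for a single (sender, request_id) key. This sketch assumes a 1200-byte per-fragment payload budget and ignores header and signature overhead, timeouts, and the msg_type/nonce agreement check; all names are hypothetical.

```python
CONTINUED, FINAL, FRAG_V2 = 1 << 2, 1 << 3, 1 << 5  # assumes bit 0 is the LSB
MAX_FRAGMENT = 1200  # ASSUMPTION: payload budget per fragment, headers not counted


def fragment(payload, max_len=MAX_FRAGMENT):
    """Yield (flags, offset, total_len, chunk) tuples covering payload."""
    total = len(payload)
    for offset in range(0, total, max_len):
        chunk = payload[offset:offset + max_len]
        last = offset + len(chunk) == total  # FINAL where offset + len == total_len
        yield (FRAG_V2 | (FINAL if last else CONTINUED), offset, total, chunk)


class Reassembly:
    """Collects fragments for one (sender, request_id) key."""

    def __init__(self, total_len):
        self.total_len = total_len
        self.parts = {}  # offset -> chunk

    def add(self, offset, total_len, chunk):
        """Return the full payload once every byte is covered, else None."""
        if not chunk or total_len != self.total_len or offset + len(chunk) > total_len:
            raise ValueError("inconsistent fragment")
        for o, c in self.parts.items():
            if offset < o + len(c) and o < offset + len(chunk):
                raise ValueError("overlapping fragment")
        self.parts[offset] = chunk
        if sum(len(c) for c in self.parts.values()) < self.total_len:
            return None
        return b"".join(self.parts[o] for o in sorted(self.parts))
```

Because overlaps are rejected and every fragment must fit inside `total_len`, covering `total_len` bytes implies the fragments tile the payload exactly, so arrival order does not matter.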
6. Resource Types
| Code | Type | Capacity Unit | Notes |
|---|---|---|---|
| 0x0001 | CPU | Core count | Advertised only; no remote exec |
| 0x0002 | MEM | Bytes | RDMA-accessible memory pool |
| 0x0003 | GPU | Compute units | Vendor fabric / RDMA-based |
| 0x0004 | NVME | Bytes | Block storage via NVMe-oF |
| 0x0005 | FPGA | Slots | Vendor-defined binding |
| 0x0006 | PMEM | Bytes | Persistent memory pool |
| 0x0007 | CXL_MEM | Bytes | CXL memory pooling/switch binding (extension) |
| 0x00FF | VENDOR | Vendor-defined | Extension space |
7. Trust and Identity
7.1 Node Identity
NodeIdentity:
    node_id     : u128
    public_key  : 32B (Ed25519)
    certificate : Certificate
Certificate:
    subject    : u128 (node_id)
    issuer     : u128 (CA id)
    issued_at  : u64
    expires_at : u64
    public_key : 32B
    extensions : ExtensionTLV[]
    signature  : 64B (Ed25519 by CA)
Canonical identifier formats:
- Hex string: 0x + 32 lowercase hex digits (zero-padded).
- URN string: urn:fabricbios:node:0x<32-hex>.
- Tooling MAY accept bare hex input, but MUST normalize to the canonical 0x form for display.
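The canonical identifier formats can be sketched as a pair of formatters and a lenient parser. The function names are hypothetical; accepting the URN form as parser input is a convenience of this example.

```python
URN_PREFIX = "urn:fabricbios:node:"


def format_node_id(node_id):
    """Canonical hex form: 0x + 32 lowercase, zero-padded hex digits."""
    if not 0 <= node_id < 1 << 128:
        raise ValueError("node_id must fit in 128 bits")
    return f"0x{node_id:032x}"


def node_urn(node_id):
    """URN form: urn:fabricbios:node:0x<32-hex>."""
    return URN_PREFIX + format_node_id(node_id)


def parse_node_id(text):
    """Accept canonical, bare-hex, or URN input and return the integer id."""
    text = text.strip().lower()
    if text.startswith(URN_PREFIX):
        text = text[len(URN_PREFIX):]
    if text.startswith("0x"):
        text = text[2:]
    if not text or len(text) > 32 or any(c not in "0123456789abcdef" for c in text):
        raise ValueError("not a node id")
    return int(text, 16)
```

Tooling would call `format_node_id(parse_node_id(user_input))` to normalize bare hex into the canonical form before display.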
TLS identity mapping (all profiles using TLS/mTLS):
- The peer certificate SAN URI MUST include urn:fabricbios:node:0x<32-hex>.
- Implementations MUST fail closed if the SAN URI is missing or ambiguous.
7.2 Controller Discovery and Trust Bootstrap
Enrollment requires locating and trusting the fabric controller before a fabric certificate exists. Operators choose one:
A) Pinned Fabric CA (recommended):
- Node firmware contains fabric CA public key (or hash).
- Controller presents a chain anchored at that CA.
B) Pinned Controller Key (recommended for small fabrics):
- Node firmware contains controller public key (or hash).
C) TOFU (allowed only if explicitly enabled):
- Node pins controller key on first contact.
- Subsequent enrollment requires same key.
- Operators SHOULD disable TOFU in high-security deployments.
Controller anycast address may be configured, e.g. fd00:FABRIC::ffff:2, or provided by DHCPv6 option/local config.
7.3 Fabric Enrollment
- Node boots with manufacturer identity.
- Node contacts controller, verifies identity per §7.2.
- Controller verifies manufacturer identity and optional attestation.
- Controller issues fabric certificate binding node_id to public_key.
- Node signs ANNOUNCE/WITHDRAW/REVOKE_BROADCAST and control responses using fabric identity.
7.4 Attestation (Optional)
Attestation is carried as a TLV:
    type     : u8 (0=none, 1=TPM2, 2=SGX, 3=SEV, 4=TDX)
    evidence : bytes
Higher layers may impose policy using attestation evidence.
8. Control Plane Transports and Profiles
Lease management operations require an authenticated reliable transport. Default policy (2026-02-19): QUIC/TLS 1.3 is the standard control-plane transport for general-purpose nodes (including Pi-class bare metal and Linux nodes). Legacy profiles remain documented for migration only.
8.1 Profile FULL (normative default)
- Discovery: UDP 5700
- Control + lease ops: QUIC 5701 (TLS 1.3, mTLS)
8.2 Profile COMPAT (legacy migration only)
- Discovery: UDP 5700
- Control + lease ops: TCP 5701 + TLS 1.3 (mTLS)
- Status: retained only for migration/backward compatibility; SHOULD NOT be used for new deployments.
8.3 Profile PROXIED (constrained bring-up only)
- Firmware supports discovery signing + a minimal local interface to a trusted on-node proxy.
- Proxy terminates QUIC/TLS or TCP/TLS and performs lease ops + device programming.
- Status: implementation aid for constrained bring-up; SHOULD NOT be the steady-state profile for Pi/Linux nodes.
8.4 Normative Requirement
Lease management operations (ALLOC/BIND/RENEW/FREE) MUST use an authenticated reliable transport, satisfied by:
- QUIC/TLS 1.3, or
- TCP/TLS 1.3, or
- a trusted proxy that provides one of the above.
General-purpose nodes SHOULD implement QUIC/TLS 1.3 directly.
UDP is acceptable for discovery and limited low-risk control only.
8.5 Trusted-Fabric Exception Profile (non-default)
This is an explicit exception profile for physically isolated trusted segments under a single administrative domain. It does not replace the normative default in §8.1-§8.4. Operators MUST opt in deliberately and document the accepted risks.
Allowed relaxations in this exception profile are intentionally narrow:
- Control + lease ops MAY omit TLS only on the trusted segment, using the same transport framing and lease semantics, when the operator provides equivalent physical/network isolation.
- ANNOUNCE and SOLICIT MAY omit signature requirements on that trusted segment.
- Local east-west firewalling between fabric nodes MAY be relaxed when the segment itself is dedicated and isolated.
The following remain mandatory even in this exception profile:
- capability token validation,
- lease TTL enforcement,
- revoke/expiry teardown,
- fencing on teardown failure,
- audit logging,
- anti-replay for compatibility dataplanes where it exists.
WITHDRAW, REVOKE_BROADCAST, and the general secure default for Pi/Linux-class nodes remain governed by the normative posture above unless an operator has explicitly enabled and documented the trusted-fabric exception.
9. Capability Tokens
9.1 Token Structure
    version     : u8
    token_id    : u128
    resource_id : u128
    audience    : u128 (node_id or service_id)
    permissions : u32
    issued_at   : u64
    expires_at  : u64 (default max TTL 300s)
    issuer      : u128 (issuer node_id)
    caveats     : CaveatTLV[]
    signature   : 64B (Ed25519 by issuer)
9.2 Audience Binding (Normative)
A token is valid only when presented by its audience.
- Over QUIC/TLS or TCP/TLS: audience binding is satisfied by the authenticated peer identity mapped to node_id/service_id.
- Over UDP: REQUEST MUST include presenter proof (see §11.2).
audience = 0 indicates a bearer token and MUST be restricted to short TTL and SHOULD include SOURCE_IP caveats.
9.3 Permissions
| Bit | Name |
|---|---|
| 0 | READ |
| 1 | WRITE |
| 2 | ADMIN |
| 3 | DELEGATE |
| 4 | EXCLUSIVE |
Reserved bits MUST be zero.
9.4 Caveats (TLV)
Caveat TLV:
    type   : u8
    length : u16
    data   : bytes
Caveat types:
- TIME_BOUND
- SOURCE_IP
- RANGE (offset/length for MEM/NVME)
- RATE_LIMIT (safety limit; MAY be enforced by fabricBIOS)
- DEPTH (max delegation depth)
- AUDIENCE (additional audience constraint)
9.5 Delegation (Attenuation)
Derived tokens can only restrict:
- add caveats
- narrow permissions
- set a new audience
Verifier checks parent validity, DELEGATE permission, delegator identity, and restriction-only semantics.
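The restriction-only part of that check can be sketched as below. This is deliberately partial: signature verification, delegator identity, audience handling, and the DEPTH caveat are out of scope, caveats are modelled as opaque (type, data) pairs, and the rule that a derived token may not outlive its parent is an assumption of this example. The `Token` class is a hypothetical stand-in for the wire structure.

```python
from dataclasses import dataclass

# Permission bits from §9.3 (bit 0 = READ ... bit 4 = EXCLUSIVE).
READ, WRITE, ADMIN, DELEGATE, EXCLUSIVE = (1 << n for n in range(5))


@dataclass(frozen=True)
class Token:
    permissions: int
    expires_at: int
    caveats: frozenset = frozenset()  # simplified: set of (type, data) pairs


def is_valid_attenuation(parent, child):
    """True if child only restricts parent."""
    return (
        bool(parent.permissions & DELEGATE)                   # parent may delegate
        and (child.permissions & ~parent.permissions) == 0    # no new permission bits
        and child.expires_at <= parent.expires_at             # ASSUMPTION: no longer lifetime
        and parent.caveats <= child.caveats                   # caveats only added
    )
```

A derived token that drops a parent caveat or gains a permission bit fails the check, which is what "restriction-only semantics" requires.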
9.6 Token Revocation
- Tokens SHOULD have short TTL (default max 300s).
- Immediate revocation can be broadcast:
REVOKE_BROADCAST Payload:
    issuer    : u128
    token_ids : u128[]
    until     : u64
Receivers cache revocations until the until timestamp and reject revoked token_ids.
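A receiver-side revocation cache might be sketched as follows. It assumes the REVOKE_BROADCAST signature has already been verified before `apply_broadcast` is called, and it does not bound the cache size; the class and method names are hypothetical.

```python
import time


class RevocationCache:
    """Remembers revoked token_ids until their 'until' timestamp passes."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._revoked = {}  # token_id -> until (UNIX seconds)

    def apply_broadcast(self, token_ids, until):
        """Record a verified REVOKE_BROADCAST payload."""
        for token_id in token_ids:
            # Keep the later deadline if the same token is revoked twice.
            self._revoked[token_id] = max(until, self._revoked.get(token_id, 0))

    def is_revoked(self, token_id):
        until = self._revoked.get(token_id)
        if until is None:
            return False
        if self._clock() >= until:
            del self._revoked[token_id]  # entry expired; stop tracking it
            return False
        return True
```

Since tokens default to a maximum TTL of 300s, setting `until` at or beyond a token's own expiry means the entry is dropped only after the token is already unusable.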
10. Lease Management
10.1 Lease Model
Any data-plane binding (e.g., LEASE_ALLOC, NVME_BIND, GPU_BIND, CXL_BIND) creates a lease.
A token authorizes control operations; a lease governs data-plane lifetime.
Lease:
    lease_id    : u128
    resource_id : u128
    holder      : u128 (node_id/service_id)
    granted_at  : u64
    expires_at  : u64
    binding     : DataPlaneBinding
10.2 Timing Defaults
- duration: 60s (range 10s–3600s)
- renewal window: last 20%
- grace: 10s (range 0–60s)
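To make the defaults concrete: a 60s lease granted at t=1000 expires at t=1060 and its renewal window (the last 20%) opens at t=1048. The sketch below computes these values. How grace interacts with teardown is not spelled out above, so treating it as extra time after expiry before teardown must complete is an assumption; the `LeaseTimes` class is hypothetical.

```python
from dataclasses import dataclass

MIN_DURATION, MAX_DURATION, MAX_GRACE = 10, 3600, 60  # ranges from §10.2


@dataclass(frozen=True)
class LeaseTimes:
    granted_at: int
    duration: int = 60  # default from §10.2
    grace: int = 10     # default from §10.2

    def __post_init__(self):
        if not MIN_DURATION <= self.duration <= MAX_DURATION:
            raise ValueError("duration out of range")
        if not 0 <= self.grace <= MAX_GRACE:
            raise ValueError("grace out of range")

    @property
    def expires_at(self):
        return self.granted_at + self.duration

    @property
    def renew_from(self):
        # The renewal window is the last 20% of the lease (integer seconds).
        return self.expires_at - self.duration // 5

    @property
    def teardown_deadline(self):
        # ASSUMPTION: grace is added after expiry before teardown must complete.
        return self.expires_at + self.grace
```

A holder would typically schedule its renew call shortly after `renew_from` so that a failed attempt can be retried before `expires_at`.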
10.3 Teardown (Normative)
On expiry or revoke:
- fabricBIOS MUST tear down the data-plane authorization such that subsequent data-plane operations fail.
Examples:
- RDMA: invalidate/rotate rkey and/or destroy/poison QP; deregister memory
- NVMe-oF: disconnect controller session; revoke auth material
- GPU fabric: revoke endpoint/session credentials
- CXL: remove/disable mapping window or decoder rule (per platform)
10.4 Teardown Failure: FENCED State
If teardown fails or hardware enters an unsafe state, fabricBIOS MUST fence the resource:
- No new leases are granted for that resource.
- Existing leases remain invalid at the control plane.
- Resource is reported as FENCED/DEGRADED in discovery (ResourceSummary flags).
- fabricBIOS SHOULD attempt remediation (reset session/device) if supported.
Recommended status code:
- RESOURCE_FENCED: resource is fenced due to teardown failure or hardware fault.
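The teardown and fencing behaviour can be sketched as a small state machine. The `teardown` and `remediate` callables stand in for platform-specific work (invalidating an rkey, resetting a device) and are hypothetical; the status values are taken from the status code table in §11.3.

```python
from enum import Enum

OK, RESOURCE_BUSY, RESOURCE_FENCED = 0x0000, 0x0004, 0x0008  # from §11.3


class ResourceState(Enum):
    AVAILABLE = "available"
    LEASED = "leased"
    FENCED = "fenced"


class Resource:
    def __init__(self, teardown, remediate=None):
        self.state = ResourceState.AVAILABLE
        self._teardown = teardown    # revokes data-plane access; raises on failure
        self._remediate = remediate  # optional; returns True if the reset worked

    def grant(self):
        """Return a status code for a lease request on this resource."""
        if self.state is ResourceState.FENCED:
            return RESOURCE_FENCED   # no new leases while fenced
        if self.state is ResourceState.LEASED:
            return RESOURCE_BUSY
        self.state = ResourceState.LEASED
        return OK

    def expire_or_revoke(self):
        """Run teardown; fence the resource if teardown fails."""
        try:
            self._teardown()
            self.state = ResourceState.AVAILABLE
        except Exception:
            self.state = ResourceState.FENCED
            if self._remediate is not None and self._remediate():
                self.state = ResourceState.AVAILABLE
```

The resource stays FENCED unless remediation explicitly succeeds, so a failed teardown can never silently return the resource to the pool.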
11. Control Plane Operations
11.1 Message Types and Ports
- UDP 5700: discovery + limited control
- QUIC 5701: control + lease management (required default)
- TCP 5701: legacy migration path only (optional)
11.2 REQUEST/RESPONSE Payloads
Normative wire encoding for REQUEST/RESPONSE is defined in docs/spec/fabricbios-wire-encoding-v0.md. The summary below is informative.
REQUEST Payload:
    op            : u16
    resource_id   : u128
    token         : CapabilityToken
    params        : bytes
    presenter_id  : u128 (REQUIRED on UDP; OPTIONAL on QUIC, where peer identity is TLS-authenticated)
    presenter_sig : 64B (REQUIRED on UDP; OPTIONAL on QUIC, where peer identity is TLS-authenticated)
On UDP, presenter_sig proves possession of the presenter_id private key and binds the request to the token. The signature MUST cover the on-wire REQUEST payload with presenter_sig zeroed. On QUIC, the TLS-authenticated peer identity satisfies the same binding requirement.
RESPONSE Payload:
    status : u16
    op     : u16 (echo)
    result : bytes
11.3 Status Codes
| Code | Name |
|---|---|
| 0x0000 | OK |
| 0x0001 | INVALID_TOKEN |
| 0x0002 | INSUFFICIENT_PERM |
| 0x0003 | RESOURCE_NOT_FOUND |
| 0x0004 | RESOURCE_BUSY |
| 0x0005 | CAPACITY_EXCEEDED |
| 0x0006 | LEASE_EXPIRED |
| 0x0007 | RATE_LIMITED |
| 0x0008 | RESOURCE_FENCED |
| 0x00FF | INTERNAL_ERROR |
11.4 Operations
Node-level (resource_id = 0):
- PING → uptime
- GET_INVENTORY → full inventory (chunked as needed)
Capability:
- CAP_REQUEST → mint token with perms, duration, audience, caveats
- CAP_REFRESH → refresh token expiry
- CAP_REVOKE → revoke token_id
Leases (illustrative; resource-specific):
- LEASE_ALLOC, LEASE_FREE, LEASE_RENEW, LEASE_QUERY
- NVME_BIND, NVME_UNBIND, NVME_RENEW
- GPU_BIND, GPU_UNBIND, GPU_RENEW
- CXL_BIND, CXL_UNBIND, CXL_RENEW (extension)
12. Data Plane Bindings
fabricBIOS returns binding credentials and endpoint descriptors. It does not implement bulk transfer.
12.1 RDMA Binding (Example)
    transport   : u8
    gid         : 16B
    qp_type     : u8
    qp_num      : u32
    psn         : u32
    mtu         : u16
    rkey        : u32
    remote_addr : u64
    length      : u64
    vendor_data : bytes (opaque)
remote_addr is a remote address in the RNIC registration context, not a CPU virtual address.
12.2 NVMe-oF Binding (Example)
    transport     : u8
    address       : 16B IPv6
    port          : u16
    nqn           : bytes
    controller_id : u16
    namespace_id  : u32
    auth_key      : bytes (optional)
12.3 GPU Binding (Vendor)
Vendor protocol endpoint + metadata blob.
12.4 CXL Binding (Extension)
A CXL binding MAY include:
- switch/port identifiers
- mapping window identifiers
- decoder configuration references
- required authentication material (if any)
Exact format is platform-specific until standardized.
13. Security Model
13.1 Threats and Mitigations
- Spoofed ANNOUNCE/WITHDRAW → signed + CA verification
- Token forgery → Ed25519 signatures
- Replay across principals → audience binding
- Replay in time → nonce checks + replay caches
- Stale data-plane access → lease expiry + teardown
- MITM control → authenticated reliable transport for leases
- DoS unauth → rate limiting + signature-before-decompress
13.2 Mandatory Requirements
A conforming implementation MUST:
- Sign ANNOUNCE/WITHDRAW/REVOKE_BROADCAST
- Verify tokens, audience binding, and expiry on control ops
- Verify signatures before decompression
- Enforce nonce validity and replay cache behavior
- Enforce lease expiry teardown; fence on teardown failure
- Support one of the secure control profiles (FULL is the normative default; COMPAT/PROXIED only where constrained)
14. Discovery Scaling Guidance
- Small L2 fabrics (≤100 nodes): link-local multicast may be sufficient.
- Routed fabrics (typical leaf-spine): relays are strongly recommended and often required.
- Large fabrics (1000+): deploy relays (often per rack/ToR), aggregate, and answer SOLICIT; do not rely on multicast as primary.
15. Relationship to Operating System (Compute Scheduling Across Nodes)
fabricBIOS is designed to support an OS that schedules compute across nodes by making CPU capacity/topology and locality discoverable and by providing secure discovery/trust/bootstrap so the OS can deploy its own compute control plane.
A composed OS (e.g., a resource-graph OS) can:
- discover CPUs and locality
- deploy OS agents to nodes
- schedule work across nodes using its own execution model
- bind remote memory/storage/accelerators via leases
- react to withdrawals and fenced resources deterministically
16. Implementation Guidance
16.1 Size Expectations (Typical)
| Platform | Typical Size |
|---|---|
| Reference daemon | 1–5 MB |
| DPU firmware | 200 KB–1 MB |
| Minimal embedded | 100–500 KB |
16.2 Deployment Targets
Linux daemon (dev/test), DPU (primary), BMC (FULL preferred; PROXIED where constrained), FPGA (research).
16.3 MVP Milestones
- ANNOUNCE/SOLICIT + relay
- CAP_REQUEST/CAP_REFRESH
- LEASE_ALLOC with RDMA binding
- Lease renewal + expiry teardown + fencing on failure
- NVME_BIND next
Appendix A: Constants
Ports
- 5700/UDP: discovery + limited control
- 5701/QUIC: secure control (FULL profile, normative default)
- 5701/TCP: legacy migration only (COMPAT profile)
Multicast Groups
- ff02::6662:696f:0001 (link-local)
- ff05::6662:696f:0001 (site-local)
- ff08::6662:696f:0001 (org-local)
Resource Types
- CPU 0x0001
- MEM 0x0002
- GPU 0x0003
- NVME 0x0004
- FPGA 0x0005
- PMEM 0x0006
- CXL_MEM 0x0007
- VENDOR 0x00FF