fabricBIOS Specification v1.1

Note: v1.1 supersedes v0 (fabricBIOS-design-document.md). The v0 document is preserved for historical reference only.

Abstract

fabricBIOS is a minimal firmware-level control-plane specification for disaggregated computing fabrics. It enables nodes to:

  • advertise hardware resources and locality,
  • establish and maintain fabric trust,
  • mint and validate capability tokens,
  • create lease-bound bindings to existing data planes (e.g., RDMA, NVMe-oF, vendor accelerator fabrics).

fabricBIOS is not an operating system. It exposes resources and enforces safety-critical limits (authorization, anti-replay, bounded parsing, lease expiry, mandatory teardown, fencing). Scheduling, placement, fairness, paging, and global optimization live above this layer.


1. Layering and Design Principles

1.1 Layering model (informative)

A useful mental model is three layers:

  1. Physical / Fabric substrate (fabricBIOS)
    Identity, discovery, capabilities, leases, teardown, fencing.
  2. Fabric control and policy (higher layer)
    Placement, scheduling, admission control, optimization, tenant policy, economic policy.
  3. Workload runtime
    Application logic, scaling decisions, traffic shaping, state management.

fabricBIOS exists to make the substrate explicit, verifiable, and bounded, so higher layers can safely and deterministically program disaggregated resources.

1.2 Design principles (normative intent)

  1. Minimal trusted computing base: small enough to audit and run in firmware/DPU, or behind a small trusted proxy.
  2. Mechanism, not policy: provides primitives (capabilities, leases, teardown, fencing), not scheduling/placement policy.
  3. Leverage existing data planes: authorize/bind; do not reimplement bulk transfer.
  4. Node-addressed, resource-identified: nodes have stable identities; resources have stable IDs.
  5. Strong security: signatures, audience binding, anti-replay, bounded parsing, least privilege.
  6. Lease-oriented: bindings have explicit lifetime, renewal, revocation, and mandatory teardown.
  7. Interop-first: unambiguous wire format, byte order, signing rules, fragmentation behavior, replay handling.
  8. Profiles for feasibility: multiple secure transport profiles accommodate DPUs and minimal endpoints.

2. Conformance, Terminology, and Document Conventions

2.1 Normative language

The key words MUST, MUST NOT, REQUIRED, SHALL, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL are to be interpreted as described in RFC 2119.

2.2 Normative vs informative sections

Unless explicitly labeled (informative), all requirements in this specification are normative.

2.3 Conformance levels

An implementation is conformant if it satisfies all MUST requirements in:

  • Wire protocol (§5),
  • Discovery (§6),
  • Trust and enrollment (§8),
  • Secure transports and profiles (§9),
  • Capability tokens (§10),
  • Lease management (§11),
  • Control plane operations (§12),
  • Security requirements (§14).

Discovery-only “listeners” that do not implement leases are permitted for monitoring, but they are not considered conformant fabricBIOS nodes.


3. Scope

3.1 Responsibilities

| Responsibility | Description |
| --- | --- |
| Discovery | Advertise node identity, locality, and resource inventory |
| Trust bootstrap | Establish fabric trust and maintain identity |
| Capability exchange | Mint, attenuate, validate, and revoke capability tokens |
| Lease management | Create, renew, revoke leases; enforce expiry |
| Data-plane binding | Provide endpoints/credentials for existing data planes |
| Safety enforcement | Rate-limit unauthenticated traffic; bound parsing; enforce teardown |

3.2 Non-responsibilities

| Not fabricBIOS | Why |
| --- | --- |
| CPU execution/scheduling | Policy; requires isolation, preemption, accounting |
| Memory paging | Policy; higher layers decide caching/placement |
| Filesystem semantics | Policy; fabricBIOS exposes block endpoints |
| QoS fairness | Policy; fabricBIOS may enforce safety limits only |
| Process isolation | OS/hypervisor concern |
| Global optimization | Higher-layer concern |

4. Addressing and Identifiers

4.1 Node addressing

Each node exposes a primary IPv6 address for the fabricBIOS control endpoint. Operators typically allocate a ULA prefix and assign each node a /64; the node’s fabricBIOS endpoint is commonly interface identifier ::1 within that /64 (that is, <node-prefix>::1, not the loopback address).

4.2 Resource identification

Resources are identified by 128-bit UUIDs carried in protocol payloads.

  • UUIDs are transmitted as 16 bytes in canonical RFC 4122 byte order.
  • A structured UUID layout is RECOMMENDED but not required.

5. Wire Protocol

5.1 Integer encoding and endianness

All multi-byte integer fields are big-endian (network byte order).

5.2 Canonical variable-length encodings

These encodings are normative:

  • bytes := u32 len + len bytes (reject if len exceeds implementation limit).
  • list := u16 count + repeated items (reject if count exceeds implementation limit).
  • tlv := u8 type + u16 len + len bytes.
    • Unknown TLV types MUST be skippable (unknown TLVs are not fatal unless explicitly stated).
  • Optional fields use a u8 present flag (0=absent, 1=present) followed by the field bytes.
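
The encodings above can be sketched as a pair of bounded helpers. This is a minimal illustration, not a reference implementation: the limits MAX_BYTES_LEN and the helper names are assumptions introduced here, and a real implementation would choose limits per message class.

```python
import struct

MAX_BYTES_LEN = 1 << 20  # hypothetical implementation limit for `bytes`


class ParseError(ValueError):
    """Raised when input is truncated or exceeds a configured bound."""


def enc_bytes(data: bytes) -> bytes:
    # bytes := u32 len (big-endian) + len bytes
    return struct.pack(">I", len(data)) + data


def dec_bytes(buf: bytes, off: int = 0):
    """Return (value, next_offset); reject oversized or truncated input."""
    if len(buf) - off < 4:
        raise ParseError("truncated length prefix")
    (n,) = struct.unpack_from(">I", buf, off)
    if n > MAX_BYTES_LEN:
        raise ParseError("length exceeds implementation limit")
    start, end = off + 4, off + 4 + n
    if end > len(buf):
        raise ParseError("truncated body")
    return buf[start:end], end


def enc_tlv(tlv_type: int, value: bytes) -> bytes:
    # tlv := u8 type + u16 len (big-endian) + len bytes
    return struct.pack(">BH", tlv_type, len(value)) + value


def dec_tlvs(buf: bytes, known_types: set):
    """Parse back-to-back TLVs, skipping (not rejecting) unknown types."""
    out, off = [], 0
    while off < len(buf):
        if len(buf) - off < 3:
            raise ParseError("truncated TLV header")
        tlv_type, n = struct.unpack_from(">BH", buf, off)
        off += 3
        if off + n > len(buf):
            raise ParseError("truncated TLV value")
        if tlv_type in known_types:
            out.append((tlv_type, buf[off:off + n]))
        off += n  # unknown TLVs are skipped by their declared length
    return out
```

Checking the declared length against both the configured limit and the remaining buffer before slicing keeps the parser's work bounded on untrusted input.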

5.3 Common message header

All fabricBIOS messages share a common header.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Version    |   Msg Type    |             Flags             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Payload Length                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                        Request ID (64)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                          Nonce (64)                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      [Frag Offset (32)]                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     [Frag Total Len (32)]                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Payload (variable)                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                   Signature (Ed25519, 64B)                    |
|                       (if SIGNED flag)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Field definitions:

| Field | Size | Description |
| --- | --- | --- |
| Version | 8 bits | Protocol version. This specification defines 0x01. |
| Msg Type | 8 bits | Message type (§5.4). |
| Flags | 16 bits | Flags (§5.5). |
| Payload Length | 32 bits | Payload bytes only (excludes signature). |
| Request ID | 64 bits | Correlation ID; responses echo this. |
| Nonce | 64 bits | Anti-replay; semantics depend on flags (§5.8). |
| Frag Offset | 32 bits | Present only if FRAG_V2 is set: byte offset within full payload. |
| Frag Total Len | 32 bits | Present only if FRAG_V2 is set: total unfragmented payload size. |
| Payload | var | Type-specific payload. |
| Signature | 64B | Ed25519 signature if SIGNED is set. |

Fail-closed requirements:

  • Receivers MUST reject unknown protocol versions.
  • Receivers MUST reject messages with any unknown/reserved flag bit set.
  • Receivers MUST reject messages whose declared lengths exceed configured bounds.
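
A minimal sketch of a fail-closed header parser for the layout above follows. It rests on assumptions that are not stated in this specification: flag "bit n" is read as the mask 1 << n, MAX_PAYLOAD_LEN is a hypothetical configured bound, and the Frame type and parse_frame name are illustrative only. Signature verification itself is out of scope here; the parser only isolates the bytes a signature would cover.

```python
import struct
from typing import NamedTuple, Optional

# Assumption: "bit n" in the flags table means the mask 1 << n.
SIGNED, COMPRESSED, CONTINUED, FINAL, NONCE_IS_TIMESTAMP, FRAG_V2 = (
    1 << i for i in range(6)
)
KNOWN_FLAGS = 0x003F
SUPPORTED_VERSION = 0x01
MAX_PAYLOAD_LEN = 1 << 20  # hypothetical configured bound
BASE_HEADER_LEN = 24       # 1 + 1 + 2 + 4 + 8 + 8 bytes
SIG_LEN = 64


class Frame(NamedTuple):
    msg_type: int
    flags: int
    request_id: int
    nonce: int
    frag_offset: Optional[int]
    frag_total_len: Optional[int]
    payload: bytes
    signed_region: bytes       # header + payload: the bytes a signature covers
    signature: Optional[bytes]


def parse_frame(buf: bytes) -> Frame:
    """Parse one frame, failing closed on unknown or out-of-bounds fields."""
    if len(buf) < BASE_HEADER_LEN:
        raise ValueError("short header")
    version, msg_type, flags, payload_len, request_id, nonce = struct.unpack_from(
        ">BBHIQQ", buf, 0
    )
    if version != SUPPORTED_VERSION:
        raise ValueError("unknown protocol version")
    if flags & ~KNOWN_FLAGS:
        raise ValueError("reserved flag bit set")
    if payload_len > MAX_PAYLOAD_LEN:
        raise ValueError("payload length exceeds configured bound")

    off = BASE_HEADER_LEN
    frag_offset = frag_total_len = None
    if flags & FRAG_V2:
        if len(buf) < off + 8:
            raise ValueError("short fragment metadata")
        frag_offset, frag_total_len = struct.unpack_from(">II", buf, off)
        off += 8

    end = off + payload_len
    expected = end + (SIG_LEN if flags & SIGNED else 0)
    if len(buf) != expected:
        raise ValueError("frame length does not match header")
    signature = buf[end:] if flags & SIGNED else None
    return Frame(msg_type, flags, request_id, nonce, frag_offset,
                 frag_total_len, buf[off:end], buf[:end], signature)
```

All checks run on fixed-size header fields before the payload is touched, which matches the ordering the anti-DoS rules call for.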

5.4 Message types

| Code | Name | Direction | Signed |
| --- | --- | --- | --- |
| 0x01 | ANNOUNCE | Node → Fabric | REQUIRED |
| 0x02 | SOLICIT | Client → Fabric/Relay | OPTIONAL (RECOMMENDED outside a single trusted L2) |
| 0x03 | WITHDRAW | Node → Fabric | REQUIRED |
| 0x10 | REQUEST | Client → Node | REQUIRED for any operation beyond discovery |
| 0x11 | RESPONSE | Node/Relay → Client | REQUIRED |
| 0x20 | REVOKE_BROADCAST | Node → Fabric | REQUIRED |

5.5 Flags

| Bit | Name | Meaning |
| --- | --- | --- |
| 0 | SIGNED | Signature trailer present |
| 1 | COMPRESSED | Payload is zstd-compressed |
| 2 | CONTINUED | Fragmented message: more fragments follow |
| 3 | FINAL | Fragmented message: last fragment |
| 4 | NONCE_IS_TIMESTAMP | Nonce is UNIX seconds (u64). If unset, Nonce is random u64 |
| 5 | FRAG_V2 | Fragment metadata present (offset + total length) |
| 6–15 | Reserved | MUST be zero (receivers MUST reject if non-zero) |

5.6 Signature and compression rules

  1. If COMPRESSED is set, payload bytes are compressed before signing.
  2. Receivers MUST verify the signature before attempting decompression or deep parsing.
  3. The signature is computed over the exact on-wire bytes of the header fields (including any fragment metadata fields when present) plus payload, excluding the signature bytes.
  4. If fragmented, each fragment is signed independently over that fragment’s header+payload.

5.7 Fragmentation and reassembly

Fragmentation is indicated by CONTINUED and FINAL and requires FRAG_V2 metadata.

Sending rules:

  • If a message is fragmented, FRAG_V2 MUST be set on every fragment.
  • Frag Total Len MUST be identical across all fragments for the same reassembly key, and each fragment’s Frag Offset plus payload length MUST NOT exceed it.
  • FINAL MUST be set on the fragment where frag_offset + payload_len == total_len.
  • CONTINUED MUST be set on every fragment except the final fragment.

Reassembly rules:

  • Receivers MUST reassemble fragments keyed by (sender_identity, request_id).
  • All fragments for a key MUST agree on: version, msg_type, nonce, and frag_total_len. Any mismatch MUST cause the in-progress reassembly to be dropped.
  • Receivers MUST reject overlapping fragments.
  • Receivers MUST bound in-flight reassembly by count and memory, and drop incomplete entries after a short timeout.

Required bounds (defaults):

  • MAX_INFLIGHT_REASSEMBLIES (default 1024)
  • MAX_REASSEMBLY_BYTES per entry (default 1 MiB)
  • REASSEMBLY_TIMEOUT_SEC (default 2–5 seconds for UDP discovery workloads)
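
The reassembly rules can be sketched as a small bounded table keyed by (sender_identity, request_id). This is an illustrative sketch under assumptions: the Reassembler class and its add method are hypothetical names, the timeout is fixed at the upper end of the default range, and an overlapping fragment is handled by dropping the whole in-progress entry, which is one conservative reading of "reject overlapping fragments".

```python
import time

MAX_INFLIGHT_REASSEMBLIES = 1024
MAX_REASSEMBLY_BYTES = 1 << 20   # 1 MiB per entry
REASSEMBLY_TIMEOUT_SEC = 5.0     # upper end of the 2-5 s default range


class Reassembler:
    def __init__(self, now=time.monotonic):
        self._now = now
        self._entries = {}  # (sender, request_id) -> entry dict

    def add(self, sender, request_id, invariant, frag_offset, frag_total_len, payload):
        """Add one verified fragment; return the full payload once complete, else None.

        `invariant` is the (version, msg_type, nonce) tuple that must agree
        across all fragments of one message.
        """
        now = self._now()
        # Drop incomplete entries that have outlived the timeout.
        for stale in [k for k, e in self._entries.items()
                      if now - e["started"] > REASSEMBLY_TIMEOUT_SEC]:
            del self._entries[stale]

        key = (sender, request_id)
        lo, hi = frag_offset, frag_offset + len(payload)
        if not payload or frag_total_len > MAX_REASSEMBLY_BYTES or hi > frag_total_len:
            self._entries.pop(key, None)
            return None

        entry = self._entries.get(key)
        if entry is None:
            if len(self._entries) >= MAX_INFLIGHT_REASSEMBLIES:
                return None  # table full: refuse new reassemblies
            entry = self._entries[key] = {
                "started": now, "invariant": invariant,
                "total": frag_total_len, "frags": {}, "have": 0,
            }

        mismatch = entry["invariant"] != invariant or entry["total"] != frag_total_len
        overlap = any(lo < off + len(p) and off < hi
                      for off, p in entry["frags"].items())
        if mismatch or overlap:
            del self._entries[key]  # drop the in-progress reassembly
            return None

        entry["frags"][lo] = payload
        entry["have"] += len(payload)
        if entry["have"] < entry["total"]:
            return None
        del self._entries[key]
        # No overlaps and every fragment in range, so full byte count means full coverage.
        return b"".join(p for _, p in sorted(entry["frags"].items()))
```

Storing only the bytes actually received, rather than preallocating frag_total_len per entry, keeps memory proportional to traffic that has already passed signature verification.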

5.8 Anti-replay and nonce handling

fabricBIOS supports two nonce modes:

Timestamp mode (NONCE_IS_TIMESTAMP=1):

  • Nonce is UNIX seconds.
  • Receiver MUST reject messages outside an allowed skew window (default ±300s).
  • Receiver SHOULD keep a replay cache keyed by (sender_identity, request_id, nonce) within the skew window.

Random mode (NONCE_IS_TIMESTAMP=0):

  • Nonce is random u64.
  • Receiver MUST maintain a bounded replay cache per sender that retains each nonce for at least REPLAY_WINDOW_SEC (default 300s), or until the cache holds MAX_NONCES entries, at which point the oldest entries are evicted.
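
Both nonce modes can be served by one bounded per-sender cache, sketched below. The ReplayCache class, the MAX_NONCES value, and the choice to key random-mode entries by nonce alone are assumptions made for illustration; the sketch also assumes `now` never decreases between calls and does not bound the number of distinct senders, which a real implementation would need to do.

```python
from collections import OrderedDict

SKEW_SEC = 300            # allowed timestamp skew (default ±300s)
REPLAY_WINDOW_SEC = 300   # retention for seen nonces
MAX_NONCES = 4096         # hypothetical per-sender bound


class ReplayCache:
    def __init__(self):
        self._by_sender = {}  # sender -> OrderedDict(key -> first-seen time)

    def accept(self, sender, request_id, nonce, is_timestamp, now):
        """Return True if the message is fresh, False if stale or replayed."""
        if is_timestamp and abs(now - nonce) > SKEW_SEC:
            return False  # outside the allowed skew window

        seen = self._by_sender.setdefault(sender, OrderedDict())
        # Entries are inserted in time order, so the oldest is always first.
        while seen and now - next(iter(seen.values())) > REPLAY_WINDOW_SEC:
            seen.popitem(last=False)

        key = (request_id, nonce) if is_timestamp else nonce
        if key in seen:
            return False  # replay
        seen[key] = now
        if len(seen) > MAX_NONCES:
            seen.popitem(last=False)  # evict oldest to stay bounded
        return True
```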

5.9 Mandatory anti-DoS requirements

Implementations MUST:

  1. Rate-limit unsigned messages (default ≤10/sec per source IP).
  2. Perform cheap sanity checks (version, flags, length bounds) before any expensive work.
  3. Verify signature before decompression or deep parsing.
  4. Drop malformed headers without payload processing.
  5. Enforce maximum payload length limits for each message class.
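
Requirement 1 is commonly met with a per-source token bucket. The sketch below is one such approach, assuming a burst equal to the per-second rate; the UnsignedRateLimiter name and the max_sources bound are hypothetical, and the crude "clear everything when full" policy is only there to keep the state table bounded against spoofed source addresses.

```python
class UnsignedRateLimiter:
    """Token bucket per source IP for unsigned messages."""

    def __init__(self, rate_per_sec=10.0, burst=10.0, max_sources=65536):
        self.rate = rate_per_sec
        self.burst = burst
        self.max_sources = max_sources  # hypothetical bound on tracked sources
        self._state = {}                # ip -> (tokens, last_seen_time)

    def allow(self, source_ip, now):
        """Return True if one more unsigned message from source_ip may be processed."""
        if source_ip not in self._state and len(self._state) >= self.max_sources:
            self._state.clear()  # crude bound; a real limiter would evict selectively
        tokens, last = self._state.get(source_ip, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._state[source_ip] = (tokens, now)
            return False
        self._state[source_ip] = (tokens - 1.0, now)
        return True
```

The limiter is meant to run before signature verification or any payload parsing, so that dropped traffic costs only a dictionary lookup.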

6. Discovery

6.1 Transport

Discovery uses UDP port 5700.

6.2 Multicast groups

Well-known IPv6 multicast group IDs:

  • Link-local: ff02::6662:696f:0001
  • Site-local: ff05::6662:696f:0001
  • Organization-local: ff08::6662:696f:0001

6.3 ANNOUNCE

ANNOUNCE is sent on boot, periodically (default 30s), and on material resource change. ANNOUNCE MUST be signed.

ANNOUNCE payload:

node_id : u128
node_addr : 16B IPv6
fabric_id : u64
sequence : u64 (monotonic per node)
locality : LocalityInfo
attestation : AttestationTLV (optional)
resources : ResourceSummary[]
features : u32 (optional; presence-gated)

LocalityInfo:

rack_id : u32
row_id : u32
site_id : u32
geo_hash : u64 (optional; presence-gated)
custom : 32B operator-defined

Feature advertisement (optional): If present, features is a bitmask advertising protocol feature support (e.g., compression, fragmentation requirements). Unknown bits are ignored by receivers.

6.4 WITHDRAW

WITHDRAW announces node departure or permanent unavailability. It MUST be signed.

WITHDRAW payload:

node_id : u128
sequence : u64
reason : u16

6.5 SOLICIT

SOLICIT queries for resources. Signing SOLICIT is RECOMMENDED outside a single trusted L2 domain.

SOLICIT payload:

query_type : u8 (0=all, 1=by_type, 2=by_node, 3=by_locality)
filters : Filter[]

Filter:

field : u8
op : u8 (EQ, GT, LT, CONTAINS)
value : 32B

6.5.1 Filter field registry

Filter field IDs are u8 with ranges:

  • 0x00 reserved
  • 0x01..=0x3F core registry
  • 0x40..=0x7F experimental/extension
  • 0x80..=0xFF vendor-specific

Core fields:

| Field | Name | Value encoding | Ops |
| --- | --- | --- | --- |
| 0x01 | RESOURCE_TYPE | value[0..2] = u16 BE | EQ |
| 0x02 | NODE_ID | value[0..16] = u128 BE | EQ |
| 0x03 | SITE_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x04 | ROW_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x05 | RACK_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x06 | LOCALITY_CUSTOM | value[0..32] opaque | EQ, CONTAINS |
| 0x07 | RESOURCE_FLAGS | value[0..2] = u16 BE bitmask | EQ, CONTAINS |

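For LOCALITY_CUSTOM, CONTAINS treats each non-zero byte of value as a required match against the corresponding byte of the node’s custom field; zero bytes are ignored. For RESOURCE_FLAGS, value is a bitmask, and CONTAINS matches when every bit set in value is also set in the resource’s flags.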
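
A sketch of filter evaluation for three of the core fields follows. Several details are assumptions rather than specification text: the numeric op codes (the specification names EQ, GT, LT, CONTAINS but does not number them), the reading of GT/LT as "advertised value greater/less than the filter value", and the byte-position matching used for LOCALITY_CUSTOM.

```python
import struct

# Hypothetical numeric op codes; not assigned by this specification.
EQ, GT, LT, CONTAINS = 0, 1, 2, 3


def match_u32(op: int, value: bytes, actual: int) -> bool:
    """SITE_ID / ROW_ID / RACK_ID: compare against value[0..4] read as u32 BE."""
    (want,) = struct.unpack_from(">I", value, 0)
    if op == EQ:
        return actual == want
    if op == GT:
        return actual > want   # assumed direction: advertised > filter value
    if op == LT:
        return actual < want
    return False               # op not defined for this field


def match_locality_custom(op: int, value: bytes, custom: bytes) -> bool:
    """LOCALITY_CUSTOM: 32 opaque bytes; zero bytes in `value` act as wildcards."""
    if op == EQ:
        return value == custom
    if op == CONTAINS:
        return all(v == 0 or v == c for v, c in zip(value, custom))
    return False


def match_resource_flags(op: int, value: bytes, flags: int) -> bool:
    """RESOURCE_FLAGS: value[0..2] is a u16 BE bitmask."""
    (mask,) = struct.unpack_from(">H", value, 0)
    if op == EQ:
        return flags == mask
    if op == CONTAINS:
        return (flags & mask) == mask  # every requested bit must be set
    return False
```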

6.6 Discovery relay

In routed fabrics, multicast may be unavailable or unreliable; relays are commonly required.

  • Nodes periodically unicast ANNOUNCE to the relay.
  • Clients unicast SOLICIT to the relay.
  • The relay responds with RESPONSE messages containing aggregated ANNOUNCE payloads (may be chunked).

6.6.1 Relay discovery profile (RESPONSE payload)

Relays encode inventory responses using MsgType::RESPONSE with this fixed payload layout:

count : u16
repeated count times:
announce_payload_bytes : bytes (u32 length + bytes of ANNOUNCE payload encoding)

Rules:

  • Each list entry MUST be a complete ANNOUNCE payload encoded per this specification.
  • Relays MAY split inventory across multiple RESPONSE frames to honor MTU guidance (§6.7).
  • Clients MUST treat the list as an unordered snapshot.
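
The relay RESPONSE payload layout can be encoded and decoded as sketched below. MAX_ENTRIES and MAX_ENTRY_BYTES are hypothetical client-side bounds, and the function names are illustrative; each returned entry is an opaque ANNOUNCE payload that still has to be parsed and validated separately.

```python
import struct

MAX_ENTRIES = 4096        # hypothetical client-side bound
MAX_ENTRY_BYTES = 65536   # hypothetical client-side bound


def encode_inventory(announce_payloads):
    """count : u16, then per entry a u32 length followed by the ANNOUNCE payload."""
    out = [struct.pack(">H", len(announce_payloads))]
    for payload in announce_payloads:
        out.append(struct.pack(">I", len(payload)) + payload)
    return b"".join(out)


def decode_inventory(buf: bytes):
    """Return the list of ANNOUNCE payloads; reject truncated or oversized input."""
    if len(buf) < 2:
        raise ValueError("truncated count")
    (count,) = struct.unpack_from(">H", buf, 0)
    if count > MAX_ENTRIES:
        raise ValueError("too many entries")
    off, entries = 2, []
    for _ in range(count):
        if off + 4 > len(buf):
            raise ValueError("truncated entry length")
        (n,) = struct.unpack_from(">I", buf, off)
        off += 4
        if n > MAX_ENTRY_BYTES or off + n > len(buf):
            raise ValueError("entry exceeds bounds")
        entries.append(buf[off:off + n])
        off += n
    if off != len(buf):
        raise ValueError("trailing bytes after last entry")
    return entries
```

Because clients treat the list as an unordered snapshot, a caller would typically index the decoded entries by node_id rather than rely on their position.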

6.6.2 Relay verification requirements

  • Relays MUST enforce nonce/replay protections.
  • If configured with trust material, relays MUST reject unsigned or invalidly signed frames.

6.6.3 Relay limits and defaults

Relays MUST bound caches and rate-limit unauthenticated traffic. Recommended defaults:

  • max_nodes_cached=4096
  • max_inventory_bytes=1MiB
  • replay_max_entries=4096
  • unsigned_per_sender_per_sec=10
  • sig_verifies_per_sec=2000

6.7 Chunking and MTU guidance

To avoid IP fragmentation, implementations SHOULD limit UDP payloads to ≤1200 bytes. For larger inventories/responses:

  • Use fragmentation flags and FRAG_V2 metadata (§5.7).
  • Receivers MUST apply bounded reassembly behavior (§5.7).

7. Resource Model

7.1 ResourceSummary

resource_id : u128
type : u16
flags : u16
capacity : u64
available : u64
descriptors : DescriptorTLV[]
endpoints : EndpointTLV[] (optional; may be omitted and fetched via GET_INVENTORY)

7.2 Resource types

| Code | Type | Capacity unit | Notes |
| --- | --- | --- | --- |
| 0x0001 | CPU | Core count | Advertised only; no remote exec defined here |
| 0x0002 | MEM | Bytes | RDMA-accessible memory pool |
| 0x0003 | GPU | Compute units | Vendor fabric / RDMA-based |
| 0x0004 | NVME | Bytes | Block storage via NVMe-oF |
| 0x0005 | FPGA | Slots | Vendor-defined binding |
| 0x0006 | PMEM | Bytes | Persistent memory pool |
| 0x0007 | CXL_MEM | Bytes | CXL memory pooling/switch binding (extension) |
| 0x00FF | VENDOR | Vendor-defined | Extension space |

7.3 Resource flags

Resource flags are a u16 bitmask. Core flags and meanings are defined in Appendix B.7 (placeholder).


8. Trust, Identity, and Enrollment

8.1 Node identity

NodeIdentity:

node_id : u128
public_key : 32B (Ed25519)
certificate : Certificate (optional until enrolled)

Certificate:

subject : u128 (node_id)
issuer : u128 (CA id)
issued_at : u64
expires_at : u64
public_key : 32B
extensions : ExtensionTLV[]
signature : 64B (Ed25519 by issuer)

Canonical identifier string forms:

  • Hex string: 0x + 32 lowercase hex digits (zero-padded).
  • URN string: urn:fabricbios:node:0x<32-hex>.
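
The two canonical string forms can be produced and parsed as sketched below; the function names are illustrative and not part of this specification.

```python
NODE_URN_PREFIX = "urn:fabricbios:node:"


def node_id_hex(node_id: int) -> str:
    """Hex form: '0x' followed by 32 lowercase, zero-padded hex digits."""
    if not 0 <= node_id < (1 << 128):
        raise ValueError("node_id must fit in 128 bits")
    return f"0x{node_id:032x}"


def node_id_urn(node_id: int) -> str:
    """URN form: urn:fabricbios:node:0x<32-hex>."""
    return NODE_URN_PREFIX + node_id_hex(node_id)


def parse_node_urn(urn: str) -> int:
    """Strictly parse the URN form; reject uppercase, short, or long digit strings."""
    prefix = NODE_URN_PREFIX + "0x"
    digits = urn[len(prefix):]
    if (not urn.startswith(prefix) or len(digits) != 32
            or any(c not in "0123456789abcdef" for c in digits)):
        raise ValueError("not a canonical fabricBIOS node URN")
    return int(digits, 16)
```

Rejecting non-canonical spellings on input avoids two strings mapping to the same identity in logs or certificate SAN comparisons.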

8.2 Trust anchors and controller discovery

Enrollment requires locating and trusting the fabric controller before a fabric certificate exists. Operators choose one:

A) Pinned Fabric CA (RECOMMENDED): node firmware contains fabric CA public key (or hash).
B) Pinned Controller Key (RECOMMENDED for small fabrics): node firmware contains controller public key (or hash).
C) TOFU (OPTIONAL, must be explicitly enabled): node pins controller key on first contact; subsequent enrollment requires same key.

8.3 Enrollment state machine

Nodes implement the following states:

  • UNENROLLED: no fabric identity; may emit limited discovery with manufacturer identity only if allowed by policy.
  • ENROLLING: establishing trust with controller; generating or presenting node keys; requesting certificate.
  • ENROLLED: holds valid fabric certificate and uses it for all signed messages.
  • LOCKED: refuses enrollment changes and key material changes except via explicit local operator action.

Required behavior:

  • Nodes MUST NOT accept enrollment material from an unauthenticated controller per the configured trust anchor.
  • Nodes MUST fail closed if controller identity is missing or ambiguous under the chosen trust mode.
  • In LOCKED, nodes MUST reject remote attempts to reset, rotate, or re-enroll.

8.4 Enrollment protocol (minimum)

Enrollment is realized as authenticated control-plane operations over a secure transport (§9). At minimum, the controller MUST support:

  • issuing a certificate binding node_id to public_key,
  • renewing/replacing certificates,
  • revoking certificates (operator-driven).

Exact opcodes and payloads are listed in Appendix B.9 (placeholders included).

8.5 Attestation (optional)

Attestation is carried as a TLV:

type : u8
evidence : bytes

Attestation type codes are listed in Appendix B.6 (placeholders included).


9. Secure Transports and Profiles

Lease management operations require an authenticated reliable transport. Default policy (2026-02-19): QUIC/TLS 1.3 is the standard control-plane transport for general-purpose nodes (including Pi-class bare metal and Linux nodes). Legacy profiles remain documented for migration only.

9.1 Profile FULL

  • Discovery: UDP 5700
  • Control + lease ops: QUIC on 5701 with TLS 1.3 (mutual authentication)

Requirements:

  • TLS version MUST be 1.3.
  • Peer identity MUST map to a fabric node/service identity (e.g., SAN URI form urn:fabricbios:node:... or equivalent mapping defined by the deployment).
  • Implementations SHOULD support session resumption. 0-RTT SHOULD be used only where anti-replay is preserved, for example by restricting early data to idempotent operations.

9.2 Profile COMPAT (legacy migration only)

  • Discovery: UDP 5700
  • Control + lease ops: TCP 5701 with TLS 1.3 (mutual authentication)
  • Status: retained only for migration/backward compatibility; SHOULD NOT be used for new deployments.

Requirements: same as FULL, except transport is TCP+TLS.

9.3 Profile PROXIED (constrained bring-up only)

  • Firmware supports signed discovery and a minimal local interface to a trusted on-node proxy.
  • Proxy terminates QUIC/TLS or TCP/TLS and performs lease ops and device programming.
  • Status: implementation aid for constrained bring-up; SHOULD NOT be the steady-state profile for Pi/Linux nodes.

Requirements:

  • The proxy MUST be within the trusted computing boundary and MUST enforce the same token, lease, and teardown rules as a native implementation.
  • The proxy MUST preserve auditability (log identity, token IDs, lease IDs, and op results).

9.4 Normative requirement

ALLOC/BIND/RENEW/FREE and explicit lease revocation operations MUST use an authenticated reliable transport, satisfied by FULL, COMPAT, or PROXIED.

General-purpose nodes SHOULD implement FULL directly.

UDP is acceptable only for discovery and strictly limited, low-risk control.


10. Capability Tokens

10.1 Token structure

version : u8
token_id : u128
resource_id : u128
audience : u128 (node_id or service_id)
permissions : u32
issued_at : u64
expires_at : u64 (default max TTL 300s)
issuer : u128 (issuer identity)
caveats : CaveatTLV[]
signature : 64B (Ed25519 by issuer)

10.2 Audience binding (required)

A token is valid only when presented by its audience.

  • Over QUIC/TLS or TCP/TLS: audience binding is satisfied by the authenticated peer identity.
  • Over UDP: REQUEST MUST include presenter proof (§12.3).

audience = 0 indicates a bearer token and MUST be restricted to short TTL and SHOULD include SOURCE_IP caveats.

10.3 Permissions

| Bit | Name |
| --- | --- |
| 0 | READ |
| 1 | WRITE |
| 2 | ADMIN |
| 3 | DELEGATE |
| 4 | EXCLUSIVE |

Reserved bits MUST be zero.

10.4 Caveats

Caveat TLV:

type : u8
length : u16
data : bytes

Caveat types and their exact numeric assignments are listed in Appendix B (placeholders included). The following caveat semantics are defined:

  • TIME_BOUND: restricts validity window within token lifetime.
  • SOURCE_IP: restricts source IP(s) allowed to present.
  • RANGE: restricts byte ranges for MEM/NVME.
  • RATE_LIMIT: safety limits a node MAY enforce.
  • DEPTH: maximum delegation depth.
  • AUDIENCE: additional audience constraint.

10.5 Delegation (attenuation)

Derived tokens can only restrict:

  • add caveats,
  • narrow permissions,
  • set a new audience.

Verifier checks parent validity, DELEGATE permission, delegator identity, and restriction-only semantics.
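
The restriction-only part of that check can be sketched as below. This covers only the DELEGATE requirement and attenuation semantics; parent signature validity and delegator identity are assumed to be verified elsewhere. The dict-based token shape, the rule that a child may not outlive its parent, and the reading of "bit n" as 1 << n are assumptions made for illustration.

```python
# Assumption: permission "bit n" means the mask 1 << n.
READ, WRITE, ADMIN, DELEGATE, EXCLUSIVE = (1 << i for i in range(5))


def is_restriction_only(parent: dict, child: dict) -> bool:
    """Return True if `child` only narrows `parent`.

    Tokens are dicts with 'permissions' (int), 'expires_at' (int) and
    'caveats' (list of (type, data) tuples).
    """
    if not parent["permissions"] & DELEGATE:
        return False                     # parent may not delegate at all
    if child["permissions"] & ~parent["permissions"]:
        return False                     # child gained a permission
    if child["expires_at"] > parent["expires_at"]:
        return False                     # assumed: child may not outlive parent
    # Every parent caveat must still be present; the child may add more.
    return all(c in child["caveats"] for c in parent["caveats"])
```

For example, a parent holding READ|WRITE|DELEGATE with one RANGE caveat can derive a READ-only child that keeps the RANGE caveat and adds a DEPTH caveat, but not a child that drops the RANGE caveat.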

10.6 Token revocation

  • Tokens SHOULD have short TTL (default max 300s).
  • Immediate revocation can be broadcast:

REVOKE_BROADCAST payload:

issuer : u128
token_ids : u128[]
until : u64

Receivers cache revocations until until and reject revoked token_ids.

Relationship to explicit lease revocation: REVOKE_BROADCAST distributes token revocations best-effort; it is not a guaranteed immediate teardown mechanism for a specific lease. For deterministic “recall now” behavior, use explicit lease revocation operations (§11.5).


11. Lease Management

11.1 Lease model

Any data-plane binding creates a lease. A token authorizes control operations; a lease governs data-plane lifetime.

Lease:

lease_id : u128
resource_id : u128
holder : u128 (node_id/service_id)
granted_at : u64
expires_at : u64
binding : DataPlaneBinding

11.2 Timing defaults

  • duration: 60s (range 10s–3600s)
  • renewal window: last 20%
  • grace: 10s (range 0–60s)
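
As a worked example of the defaults, a 60s lease granted at t=1000 expires at t=1060 and becomes renewable during its last 20%, that is, from t=1048. The helper below is a sketch with hypothetical names; it leaves the grace period out because its exact semantics are not defined in this section.

```python
MIN_DURATION_SEC, MAX_DURATION_SEC = 10, 3600


def lease_schedule(granted_at: int, duration: int = 60, renew_pct: int = 20):
    """Return (renew_from, expires_at) in the same time units as granted_at."""
    if not MIN_DURATION_SEC <= duration <= MAX_DURATION_SEC:
        raise ValueError("duration outside 10s-3600s range")
    expires_at = granted_at + duration
    # Renewal window is the last renew_pct percent of the lease (integer math).
    renew_from = expires_at - duration * renew_pct // 100
    return renew_from, expires_at
```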

11.3 Teardown (required)

On expiry or revoke:

  • fabricBIOS MUST tear down data-plane authorization such that subsequent data-plane operations fail.

Examples:

  • RDMA: invalidate/rotate keys and/or destroy/poison QPs; deregister memory
  • NVMe-oF: disconnect controller session; revoke auth material
  • Accelerator fabrics: revoke endpoint/session credentials
  • CXL: remove/disable mapping window or decoder rule (platform-specific)

11.4 Teardown failure and fenced state

If teardown fails or hardware enters an unsafe state, fabricBIOS MUST fence the resource:

  • No new leases are granted for that resource.
  • Existing leases on that resource are treated as invalid at the control plane.
  • Resource is reported as FENCED/DEGRADED in discovery.
  • Implementations SHOULD attempt remediation if supported.

11.5 Explicit lease revocation (emergency recall)

fabricBIOS supports early termination of an active lease prior to expiry. Two operations are defined:

  • LEASE_REVOKE: initiate teardown and return once teardown has been scheduled/initiated.
  • LEASE_REVOKE_SYNC: initiate teardown and block until teardown completes successfully, or the resource is fenced, or a caller-provided deadline is reached.

Normative guarantees:

  • After LEASE_REVOKE_SYNC returns OK, the node MUST guarantee the associated data-plane authorization has been revoked such that subsequent data-plane operations fail.
  • If teardown fails, the node MUST fence the resource and return RESOURCE_FENCED.
  • If teardown cannot be confirmed within the caller’s deadline, the node MUST either:
    • return TEARDOWN_TIMEOUT and guarantee teardown remains in progress with a bounded watchdog that will eventually transition the resource to either “torn down” or FENCED, or
    • fence immediately and return RESOURCE_FENCED.
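
The three permitted outcomes of LEASE_REVOKE_SYNC can be sketched as a small control loop. The callbacks (start_teardown, poll, fence, arm_watchdog) are hypothetical hooks into platform-specific teardown code, and the polling approach is only one way to wait; this sketch takes the TEARDOWN_TIMEOUT branch rather than fencing immediately on deadline.

```python
import time

# Status codes from the status table in this specification.
OK, RESOURCE_FENCED, TEARDOWN_TIMEOUT = 0x0000, 0x0008, 0x0009


def lease_revoke_sync(start_teardown, poll, fence, arm_watchdog, deadline_ms,
                      now=time.monotonic, sleep=time.sleep):
    """Run teardown to completion, fence on failure, or time out with a watchdog.

    poll() is assumed to return 'done', 'failed', or 'pending'.
    """
    if deadline_ms <= 0:
        raise ValueError("deadline_ms MUST be non-zero")
    start_teardown()
    deadline = now() + deadline_ms / 1000.0
    while True:
        state = poll()
        if state == "done":
            return OK                    # authorization is revoked
        if state == "failed":
            fence()                      # teardown failed: fence the resource
            return RESOURCE_FENCED
        if now() >= deadline:
            # Teardown continues; the watchdog must eventually reach
            # either "torn down" or FENCED.
            arm_watchdog()
            return TEARDOWN_TIMEOUT
        sleep(0.001)
```

OK is returned only after poll() reports completion, which is what lets callers rely on the guarantee that subsequent data-plane operations fail.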

Transport restriction:

  • LEASE_REVOKE and LEASE_REVOKE_SYNC MUST be accepted only over authenticated reliable transports (§9); they MUST NOT be accepted over UDP.

Auditability:

  • Implementations MUST emit an audit record for explicit lease revocation including: actor identity, lease_id, resource_id (if known), outcome, and time-to-teardown.

12. Control Plane Operations

12.1 Transports and ports

  • UDP 5700: discovery + strictly limited control
  • QUIC 5701: control + lease management (required default)
  • TCP 5701: legacy migration path only (COMPAT profile)

Operations that create/renew/free/revoke leases MUST use the authenticated reliable transport profiles.

12.2 REQUEST / RESPONSE payloads

Normative wire encoding for REQUEST/RESPONSE is defined in docs/spec/fabricbios-wire-encoding-v0.md. The summary below is informative.

REQUEST payload:

op : u16
resource_id : u128
token : CapabilityToken
params : bytes
presenter_id : u128 (REQUIRED on UDP; OPTIONAL on QUIC — peer identity is TLS-authenticated)
presenter_sig : 64B (REQUIRED on UDP; OPTIONAL on QUIC — peer identity is TLS-authenticated)

RESPONSE payload:

status : u16
op : u16 (echo)
result : bytes

Notes:

  • For operations that act on a lease by lease_id (e.g., LEASE_REVOKE_SYNC), resource_id SHOULD be set to zero and MUST be ignored by the receiver.

12.3 Presenter proof on UDP (required for UDP requests)

On UDP, presenter_sig proves possession of presenter_id private key and binds the request to the token.

Canonical signing procedure (normative):

  1. Serialize the REQUEST payload exactly as it appears on wire.
  2. Set the 64 bytes of presenter_sig in that serialized payload to all-zero bytes.
  3. Compute presenter_sig = Ed25519.Sign(presenter_sk, serialized_request_payload_with_zeroed_sig).
  4. Insert presenter_sig into the payload and transmit.

Verifiers MUST repeat the same zeroing procedure and verify the signature with presenter_id’s public key.
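
The zeroing procedure can be sketched independently of any particular Ed25519 library by passing the signing and verification primitives in as callables. The sketch assumes presenter_sig occupies the final 64 bytes of the serialized REQUEST payload, which follows the informative field order in §12.2 but should be confirmed against the normative wire encoding; `sign` and `verify` are hypothetical stand-ins for Ed25519 operations bound to the presenter's key pair.

```python
SIG_LEN = 64


def _zeroed(payload: bytes) -> bytes:
    """Return the payload with its trailing presenter_sig slot set to zero bytes."""
    if len(payload) < SIG_LEN:
        raise ValueError("payload too short to contain presenter_sig")
    return payload[:-SIG_LEN] + bytes(SIG_LEN)


def attach_presenter_sig(payload: bytes, sign) -> bytes:
    """Steps 1-4: zero the slot, sign the zeroed bytes, insert the signature.

    `sign(message) -> 64-byte signature` stands in for Ed25519 signing
    with the presenter's private key.
    """
    sig = sign(_zeroed(payload))
    if len(sig) != SIG_LEN:
        raise ValueError("signature must be 64 bytes")
    return payload[:-SIG_LEN] + sig


def check_presenter_sig(payload: bytes, verify) -> bool:
    """Repeat the zeroing and verify.

    `verify(message, signature) -> bool` stands in for Ed25519 verification
    with presenter_id's public key.
    """
    return verify(_zeroed(payload), payload[-SIG_LEN:])
```

Because the signed bytes include the token and params, a captured presenter_sig cannot be reattached to a different request.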

12.4 Status codes

| Code | Name |
| --- | --- |
| 0x0000 | OK |
| 0x0001 | INVALID_TOKEN |
| 0x0002 | INSUFFICIENT_PERM |
| 0x0003 | RESOURCE_NOT_FOUND |
| 0x0004 | RESOURCE_BUSY |
| 0x0005 | CAPACITY_EXCEEDED |
| 0x0006 | LEASE_EXPIRED |
| 0x0007 | RATE_LIMITED |
| 0x0008 | RESOURCE_FENCED |
| 0x0009 | TEARDOWN_TIMEOUT |
| 0x000A | LEASE_NOT_FOUND |
| 0x00FF | INTERNAL_ERROR |

TEARDOWN_TIMEOUT is only valid for LEASE_REVOKE_SYNC and MUST satisfy the guarantee in §11.5.

12.5 Operation model

Operations are identified by a u16 opcode. Exact opcodes and parameter/result encodings are listed in Appendix B.8 (placeholders included).

At minimum, conformant nodes MUST support:

  • PING
  • GET_INVENTORY
  • token mint/refresh/revoke
  • resource-specific ALLOC/BIND/RENEW/FREE for supported resource types
  • LEASE_REVOKE and LEASE_REVOKE_SYNC

12.6 Explicit lease revocation operation payloads (normative encodings)

The following payload formats are normative. Opcode numeric values are assigned in Appendix B.8 (placeholder).

12.6.1 LEASE_REVOKE params

lease_id : u128
reason : u16
flags : u16

Flags (bitmask):

  • bit 0: RETURN_BINDING_INFO (if set, include binding info fields in result when available)
  • bit 1: CANCEL_RENEWALS (if set, node MUST reject further renewals for this lease_id)
  • bits 2..15: reserved, MUST be zero

12.6.2 LEASE_REVOKE result

outcome : u8
resource_id_present : u8
resource_id : u128 (if present)
binding_info_present : u8
binding_kind : u16 (if present)
binding_id : u128 (if present)
reserved0 : u8 (must be 0; for alignment)

Outcome codes:

  • 0: REVOKED (teardown initiated; may still be in progress)
  • 1: ALREADY_EXPIRED (lease already expired; treated as success)
  • 2: NOT_FOUND (lease unknown; receiver MAY treat as success for idempotency)
  • 3: FENCED (resource fenced due to teardown failure)
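
The LEASE_REVOKE params and result layouts can be encoded and decoded as sketched below. The function names and the returned dict shape are illustrative; the sketch assumes the presence bytes use 0/1 as in §5.2 and that u128 values are big-endian like all other integers.

```python
import struct

RETURN_BINDING_INFO = 1 << 0
CANCEL_RENEWALS = 1 << 1
KNOWN_PARAM_FLAGS = RETURN_BINDING_INFO | CANCEL_RENEWALS


def encode_lease_revoke_params(lease_id: int, reason: int, flags: int) -> bytes:
    """lease_id : u128, reason : u16, flags : u16 (all big-endian)."""
    if flags & ~KNOWN_PARAM_FLAGS:
        raise ValueError("reserved flag bits MUST be zero")
    return lease_id.to_bytes(16, "big") + struct.pack(">HH", reason, flags)


def decode_lease_revoke_result(buf: bytes) -> dict:
    """Decode the result layout, rejecting truncated or malformed input."""
    off = 0

    def take(n: int) -> bytes:
        nonlocal off
        if off + n > len(buf):
            raise ValueError("truncated result")
        chunk = buf[off:off + n]
        off += n
        return chunk

    def present_flag() -> bool:
        flag = take(1)[0]
        if flag not in (0, 1):
            raise ValueError("present flag must be 0 or 1")
        return flag == 1

    result = {"outcome": take(1)[0], "resource_id": None, "binding": None}
    if result["outcome"] > 3:
        raise ValueError("unknown outcome code")
    if present_flag():
        result["resource_id"] = int.from_bytes(take(16), "big")
    if present_flag():
        (kind,) = struct.unpack(">H", take(2))
        result["binding"] = (kind, int.from_bytes(take(16), "big"))
    if take(1) != b"\x00" or off != len(buf):
        raise ValueError("reserved0 must be 0 and no trailing bytes allowed")
    return result
```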

12.6.3 LEASE_REVOKE_SYNC params

lease_id : u128
reason : u16
flags : u16
deadline_ms : u32
  • deadline_ms MUST be non-zero.
  • If deadline_ms exceeds implementation maximum, receiver MUST clamp to its maximum and proceed.

Flags are the same as LEASE_REVOKE.

12.6.4 LEASE_REVOKE_SYNC result

Same as LEASE_REVOKE result, except outcome code 0 (REVOKED) means teardown is complete and authorization is revoked.

If the operation cannot confirm teardown before the deadline, it MUST return TEARDOWN_TIMEOUT status and MAY include best-effort resource_id/binding info.


13. Data-Plane Bindings (examples)

fabricBIOS returns binding credentials and endpoint descriptors. It does not implement bulk transfer.

13.1 RDMA binding (example)

transport : u8
gid : 16B
qp_type : u8
qp_num : u32
psn : u32
mtu : u16
rkey : u32
remote_addr : u64
length : u64
vendor_data : bytes (opaque)

13.2 NVMe-oF binding (example)

transport : u8
address : 16B IPv6
port : u16
nqn : bytes
controller_id : u16
namespace_id : u32
auth_key : bytes (optional)

13.3 Accelerator binding (vendor)

Vendor protocol endpoint + metadata blob.

13.4 CXL binding (extension)

Platform-specific mapping and authentication material; exact format TBD by platform standards.


14. Security Considerations

14.1 Threats and mitigations

  • Spoofed discovery → signed frames and trust verification
  • Token forgery → Ed25519 signatures
  • Replay → nonce checks + bounded replay caches
  • Stale data-plane access → lease expiry + teardown + explicit revoke sync
  • MITM control plane → authenticated reliable transports for leases
  • Decompression bombs → verify signature before decompression + bounds
  • Memory abuse via fragmentation → bounded reassembly + overlap rejection
  • Control-plane recall abuse → rate-limit LEASE_REVOKE_SYNC and require ADMIN authorization

14.2 Mandatory requirements summary

A conformant implementation MUST:

  1. Sign ANNOUNCE/WITHDRAW/REVOKE_BROADCAST and all control responses.
  2. Verify tokens, audience binding, and expiry on control operations.
  3. Verify signatures before decompression and deep parsing.
  4. Enforce nonce validity and replay cache behavior.
  5. Enforce lease expiry teardown; fence on teardown failure.
  6. Support explicit lease revocation including a synchronous form (LEASE_REVOKE_SYNC) that provides deterministic teardown-or-fence behavior.
  7. Reject unknown versions and unknown/reserved flag bits.
  8. Implement at least one secure transport profile (FULL is the normative default; COMPAT/PROXIED only where constrained).

15. Operational and Performance Considerations (informative)

  • Bounded work on unauthenticated input is non-negotiable: length checks, rate limits, and early drops protect firmware-class devices.
  • Transition costs exist in real systems: renegotiating leases, cache refill, key rotation, and re-binding can dominate short timescales.
  • Measurement overhead (“observer effect”) can distort latency and throughput if telemetry is too intrusive; prefer low-overhead counters and sampling.
  • Tuning knobs commonly required: replay cache sizes, signature verify budgets, reassembly limits, inventory chunking policies, and explicit revoke rate limits.

To avoid interop drift, deployments SHOULD maintain:

  • Golden wire vectors for each message type (including fragmentation cases, compressed payloads, and invalid frames).
  • Negative test corpus (bad flags, bad versions, oversized lengths, replayed nonces, overlap fragments).
  • Fuzzing harness focusing on: header parsing, TLV parsing, reassembly, decompression, and token verification.
  • Conformance checklist aligned to §14.2.
  • Lease recall tests covering: LEASE_REVOKE_SYNC OK, LEASE_REVOKE_SYNC -> RESOURCE_FENCED, and LEASE_REVOKE_SYNC -> TEARDOWN_TIMEOUT.

Appendix A: Constants

Ports

  • 5700/UDP: discovery + limited control
  • 5701/QUIC: secure control (FULL profile, normative default)
  • 5701/TCP: legacy migration only (COMPAT profile)

Multicast groups

  • ff02::6662:696f:0001 (link-local)
  • ff05::6662:696f:0001 (site-local)
  • ff08::6662:696f:0001 (org-local)

Appendix B: Registries and Numeric Assignments (PLACEHOLDER — update before interoperability commitments)

This appendix is intentionally explicit about the remaining “registry work” and exact numeric assignments that must be finalized. The ranges and some assignments below are placeholders; finalize them and keep them stable.

B.1 TLV type ranges (u8)

Recommended ranges:

  • 0x00 reserved
  • 0x01..=0x3F core TLVs (this specification)
  • 0x40..=0x7F experimental/extension
  • 0x80..=0xFF vendor-specific

B.2 Caveat type registry (u8) — PLACEHOLDER

| Type code | Caveat name | Data encoding (normative once finalized) |
| --- | --- | --- |
| 0x01 | TIME_BOUND | u64 not_before + u64 not_after |
| 0x02 | SOURCE_IP | u8 family + addr bytes + optional prefix |
| 0x03 | RANGE | u64 offset + u64 length |
| 0x04 | RATE_LIMIT | u32 units_per_sec + u32 burst |
| 0x05 | DEPTH | u8 max_depth |
| 0x06 | AUDIENCE | u128 required_audience |
| 0x07..0x3F | (reserved) | |

B.3 DescriptorTLV registry (u8) — PLACEHOLDER

DescriptorTLV entries appear in ResourceSummary.descriptors.

| Type code | Descriptor name | Data encoding (placeholder) |
| --- | --- | --- |
| 0x01 | NAME | UTF-8 string bytes |
| 0x02 | MODEL | UTF-8 string bytes |
| 0x03 | SERIAL | UTF-8 string bytes |
| 0x04 | FW_VERSION | UTF-8 string bytes |
| 0x05 | CAPABILITIES | u64 bitmask (resource-type specific) |
| 0x06..0x3F | (reserved) | |

B.4 EndpointTLV registry (u8) — PLACEHOLDER

EndpointTLV entries appear in ResourceSummary.endpoints and/or may be returned via GET_INVENTORY.

| Type code | Endpoint name | Data encoding (placeholder) |
| --- | --- | --- |
| 0x01 | ENDPOINT_RDMA | RDMA endpoint blob (typed fields or opaque bytes) |
| 0x02 | ENDPOINT_NVME | NVMe-oF endpoint blob |
| 0x03 | ENDPOINT_ACCEL | Vendor endpoint blob |
| 0x04 | ENDPOINT_CXL | CXL endpoint blob |
| 0x05 | ENDPOINT_OPAQUE | Opaque bytes for vendor/private use |
| 0x06..0x3F | (reserved) | |

B.5 Certificate ExtensionTLV registry (u8) — PLACEHOLDER

| Type code | Extension name | Data encoding (placeholder) |
| --- | --- | --- |
| 0x01 | ROLE | u32 role bitmask |
| 0x02 | ALLOWED_PREFIXES | list of IPv6 prefixes |
| 0x03 | HW_ATTEST_POLICY | opaque bytes |
| 0x04..0x3F | (reserved) | |

B.6 Attestation type registry (u8) — PLACEHOLDER

| Type code | Attestation type |
| --- | --- |
| 0x00 | NONE |
| 0x01 | TPM2 |
| 0x02 | SGX |
| 0x03 | SEV |
| 0x04 | TDX |
| 0x05..0xFF | (reserved) |

B.7 Resource flag bits (u16) — PLACEHOLDER

| Bit | Name | Meaning |
| --- | --- | --- |
| 0 | FENCED | Resource is fenced; no new leases granted |
| 1 | DEGRADED | Resource is degraded; higher layers should avoid if possible |
| 2 | MAINT | Resource under maintenance |
| 3..15 | (reserved) | |

B.8 Operation opcode registry (u16) — PLACEHOLDER

This table is a placeholder map. Finalize the opcodes and define parameter/result encodings per opcode.

| Opcode | Name | resource_id | Params (placeholder) | Result (placeholder) |
| --- | --- | --- | --- | --- |
| 0x0001 | PING | 0 | empty | u64 uptime_sec |
| 0x0002 | GET_INVENTORY | 0 | optional filters | inventory bytes (chunkable) |
| 0x0100 | CAP_REQUEST | u128 | requested perms/ttl/caveats/audience | token bytes |
| 0x0101 | CAP_REFRESH | u128 | token_id | token bytes |
| 0x0102 | CAP_REVOKE | u128 | token_id | empty |
| 0x0200 | LEASE_ALLOC | u128 | size/constraints | lease + RDMA binding |
| 0x0201 | LEASE_FREE | u128 | lease_id | empty |
| 0x0202 | LEASE_RENEW | u128 | lease_id + ttl | updated lease |
| 0x0300 | NVME_BIND | u128 | size/constraints | lease + NVMe binding |
| 0x0301 | NVME_UNBIND | u128 | lease_id | empty |
| 0x0302 | NVME_RENEW | u128 | lease_id + ttl | updated lease |
| 0x0400 | LEASE_REVOKE | 0 | §12.6.1 | §12.6.2 |
| 0x0401 | LEASE_REVOKE_SYNC | 0 | §12.6.3 | §12.6.4 |
| 0x0500.. | (reserved) | | | |

B.9 Enrollment opcodes (u16) — PLACEHOLDER

| Opcode | Name | Params (placeholder) | Result (placeholder) |
| --- | --- | --- | --- |
| 0x1000 | ENROLL_REQUEST | CSR-like blob | pending/nonce |
| 0x1001 | ENROLL_ISSUE | node_id + pubkey + policy | certificate bytes |
| 0x1002 | ENROLL_ROTATE | new pubkey/CSR | certificate bytes |
| 0x1003 | ENROLL_REVOKE | node_id | empty |
| 0x1004 | ENROLL_LOCK | mode | empty |
| 0x1005 | ENROLL_RESET | reason | empty |

Appendix C: Implementation Guidance (informative)

C.1 Size expectations (typical)

| Platform | Typical size |
| --- | --- |
| Reference daemon | 1–5 MB |
| DPU firmware | 200 KB–1 MB |
| Minimal embedded | 100–500 KB |

C.2 Deployment targets

Linux daemon (dev/test), DPU (primary), BMC (FULL preferred; PROXIED where constrained), embedded controllers (PROXIED).

C.3 Suggested bring-up milestones

  1. ANNOUNCE/SOLICIT + relay discovery profile
  2. Token mint/refresh/revoke with one caveat type
  3. One lease type (e.g., LEASE_ALLOC) with correct teardown and fencing
  4. LEASE_REVOKE_SYNC behavior: success, timeout, and fenced outcomes
  5. Renewal + expiry teardown under fault injection
  6. Additional bindings (e.g., NVME_BIND), then vendor accelerator bindings