ADR-008: Mode Enforcement — Capability-Gated Tool Dispatch

Status: Accepted
Date: 2026-05-06
Accepted: 2026-05-06 (R84)
Round: R84
Supersedes: None
Superseded by: None

Implementation lands in a follow-up task. This ADR is the architectural decision; the patch that wires the runtime to it is a separate dispatch in a later round. ADR-008 alone does not change runtime behavior — it commits Colibri to a path.

Context

A whole-system code review surfaced Finding #1: Phase 0 ships a four-mode runtime contract (FULL, READONLY, TEST, MINIMAL) and a fully-typed capability matrix, but no tool dispatch path consults the matrix. The runtime accepts COLIBRI_MODE=READONLY and reports the mode in server_health, yet a READONLY server happily executes task_update, thought_record, and merkle_finalize. A user who sets READONLY to safely inspect a frozen database is silently lied to.

Where the gap lives in code

src/modes.ts:174-185 defines capabilitiesFor(mode) — an exhaustive switch over the closed RuntimeMode union, returning frozen capability records (canWriteDatabase, canDispatchExternalIO, canRunIntegrationTests, canAcceptMCPConnections). The function is correct and ready to consult.
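
For orientation, the shape described above might look roughly like the sketch below. It is not the code in src/modes.ts. The constant names other than TEST_CAPS and every flag value marked "assumed" are illustrative. The only values taken from this ADR are that READONLY and MINIMAL cannot write the database (they are the refuse-modes later in this document) and that TEST sets canRunIntegrationTests.

```typescript
// Illustrative sketch only; values marked "assumed" are not taken from src/modes.ts.
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

interface ModeCapabilities {
  readonly canWriteDatabase: boolean;
  readonly canDispatchExternalIO: boolean;
  readonly canRunIntegrationTests: boolean;
  readonly canAcceptMCPConnections: boolean;
}

const FULL_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: true,
  canDispatchExternalIO: true, // assumed
  canRunIntegrationTests: false, // assumed
  canAcceptMCPConnections: true, // assumed
});

const READONLY_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: false,
  canDispatchExternalIO: false, // assumed
  canRunIntegrationTests: false, // assumed
  canAcceptMCPConnections: true, // assumed
});

const TEST_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: true, // assumed: TEST is not a refuse-mode in this ADR
  canDispatchExternalIO: false, // assumed
  canRunIntegrationTests: true,
  canAcceptMCPConnections: false, // assumed
});

const MINIMAL_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: false,
  canDispatchExternalIO: false, // assumed
  canRunIntegrationTests: false, // assumed
  canAcceptMCPConnections: true, // assumed
});

function capabilitiesFor(mode: RuntimeMode): ModeCapabilities {
  switch (mode) {
    case "FULL":
      return FULL_CAPS;
    case "READONLY":
      return READONLY_CAPS;
    case "TEST":
      return TEST_CAPS;
    case "MINIMAL":
      return MINIMAL_CAPS;
    default: {
      // Exhaustiveness guard: adding a mode without a case fails to compile.
      const unreachable: never = mode;
      throw new Error(`unknown mode: ${String(unreachable)}`);
    }
  }
}
```

Because each record is frozen and returned by reference, a caller can read a capability once and hold the result without defensive copying.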

src/server.ts:229 calls it exactly once:

// Capability read — currently advisory (pure-mutex tool-lock per contract §2
// + T0 Q-1 decision). Kept as an explicit reference so the mode → capability
// dependency is compiled-in for future P0.2.3 / P0.4.2 wiring.
void capabilitiesFor(mode);

The return value is void-discarded. The 5-stage middleware chain composed inside registerColibriTool (src/server.ts:279-410) never reads ctx.mode or any capability field. Stage 1 (runWithToolLock, lines 412-437) is a per-tool mutex keyed on tool name only.

Where the gap lives in docs

docs/2-plugin/modes.md:31 reads:

The tool-lock stage of the α chain (stage 1 of tool-lock → schema-validate → audit-enter → dispatch → audit-exit) consults the active mode. A request whose tool is not in the admitted set is rejected with a typed ToolNotAdmittedError at the lock stage, before schema validation, dispatch, or auditing run.

docs/reference/mcp-tools-phase-0.md:930 describes server_health as returning the active mode and lists ERR_INVALID_MODE (line 988) as a documented error code; no tool currently emits it.

ADR-004:68 (the server_health row) implies mode is part of the operating contract.

docs/audits/p0-2-1-mcp-server-audit.md:122 and docs/audits/r82-f-deploy-boot-audit.md:44 already flagged the contradiction at audit time, with the original deferral noting capability gating “should happen at tool registration time, in P0.4.2 or P0.2.3”. Neither path was ever taken; capability gating disappeared and was never ADR’d. This ADR addresses that.

The full inventory and code citations are in docs/audits/adr-008-mode-enforcement-audit.md.

Decision

Adopt Option A: enforcement via a per-tool mutates config flag, checked at dispatch.

The mechanism in five points:

1. New optional field readonly mutates?: boolean on ColibriToolConfig in src/server.ts. Each tool declares its mutation posture at the registration site.
2. Default is false: a tool that does not declare itself is treated as read-only and is admitted in every mode. The default is safe for tools that really are read-only; a mutating tool that omits the flag is not gated (see Failure mode under Option A), so tools that mutate state must opt in explicitly.
3. Dispatch-stage check. registerColibriTool reads capabilitiesFor(ctx.mode) once at registration and binds enforcedMutates = (toolConfig.mutates ?? false) && !caps.canWriteDatabase. Stage 4 of the α middleware checks the bound flag: if enforcedMutates is true, the wrapped handler returns the ERR_READONLY_MODE envelope before the domain handler is invoked. Stages 1, 2, 3, and 5 still run, so the rejection is audited just like a successful call.
4. New error code ERR_READONLY_MODE joins the canonical envelope (docs/reference/mcp-tools-phase-0.md error-code list).
5. Per-tool registrations gain one keyword each. 5 mutating tools opt in. 9 read-only tools either declare mutates: false for clarity or omit the field and inherit the default.
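
The binding described in these points can be sketched as below. This is a deliberately reduced model, not the real registerColibriTool: registerTool, ToolConfig, Envelope, and the canWriteDatabase stand-in are hypothetical names, the middleware stages are omitted, and the set of writable modes is assumed from the refuse-modes named later in this ADR.

```typescript
// Hypothetical, simplified model of the mechanism; not the real src/server.ts.
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

interface ToolConfig {
  readonly mutates?: boolean; // the new optional field; absent means read-only
}

type Envelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details: Record<string, string> } };

// Stand-in for capabilitiesFor(mode).canWriteDatabase (writable modes are assumed).
function canWriteDatabase(mode: RuntimeMode): boolean {
  return mode === "FULL" || mode === "TEST";
}

function registerTool<T>(
  toolName: string,
  config: ToolConfig,
  mode: RuntimeMode,
  handler: () => T,
): () => Envelope<T> {
  // Computed once, at registration, and closed over by the wrapped handler.
  const enforcedMutates = (config.mutates ?? false) && !canWriteDatabase(mode);
  return () => {
    if (enforcedMutates) {
      // Refuse before the domain handler is invoked.
      return {
        ok: false,
        error: {
          code: "ERR_READONLY_MODE",
          message: `tool ${toolName} cannot run in mode ${mode}`,
          details: { tool: toolName, mode },
        },
      };
    }
    return { ok: true, data: handler() };
  };
}

// Usage: a mutating tool is refused in READONLY, a read-only tool is admitted.
let writes = 0;
const taskUpdate = registerTool("task_update", { mutates: true }, "READONLY", () => ++writes);
const taskGet = registerTool("task_get", {}, "READONLY", () => "task");
const refused = taskUpdate();
const admitted = taskGet();
```

Binding the flag at registration means the per-call cost is one closed-over boolean test, which is the basis for the runtime-cost claim under Option A.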

The void capabilitiesFor(mode) call at src/server.ts:229 is removed by the follow-up implementation — the read moves to the registration path.

Why dispatch-stage and not tool-lock?

docs/2-plugin/modes.md:31 originally specified Stage 1 (tool-lock). The original P0.2.1 audit (docs/audits/p0-2-1-mcp-server-audit.md:122) flagged this as a contradiction with middleware.md and s17, both of which treat Stage 1 as a pure mutex. Putting the check in Stage 4 keeps Stage 1 a pure mutex and respects the spec layering: rejected calls still receive correlation IDs, still get audit-entered, and still return through the standard envelope (as {ok: false, error}). ADR-008 explicitly supersedes the modes.md line 31 prose; because the Stage 1 check was documented but never implemented, the implementation task adds the check at Stage 4 and updates the doc to match.
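
The ordering argument can be illustrated with a toy model of the chain. The sketch below is synchronous and its stage bodies are placeholders (the real chain in src/server.ts is not reproduced here); runChain and CallResult are hypothetical names. It only shows where the refusal happens relative to the audit stages.

```typescript
// Simplified, synchronous model of the five-stage α chain; stage bodies are placeholders.
type Stage = "tool-lock" | "schema-validate" | "audit-enter" | "dispatch" | "audit-exit";

interface CallResult {
  readonly ok: boolean;
  readonly code: string | undefined;
  readonly trace: readonly Stage[];
  readonly handlerRan: boolean;
}

function runChain(enforcedMutates: boolean, handler: () => void): CallResult {
  const trace: Stage[] = [];
  let handlerRan = false;

  trace.push("tool-lock"); // stage 1: pure mutex, no mode logic
  trace.push("schema-validate"); // stage 2
  trace.push("audit-enter"); // stage 3: the attempt is recorded before any refusal

  trace.push("dispatch"); // stage 4: the mode check lives here
  let ok = true;
  let code: string | undefined;
  if (enforcedMutates) {
    ok = false;
    code = "ERR_READONLY_MODE"; // refuse without invoking the domain handler
  } else {
    handler();
    handlerRan = true;
  }

  trace.push("audit-exit"); // stage 5: runs for refusals and successes alike
  return { ok, code, trace, handlerRan };
}

// Usage: a refused call still passes through all five stages.
const refusedCall = runChain(true, () => {});
const admittedCall = runChain(false, () => {});
```

A Stage 1 check would instead return before audit-enter, which is the behavior the superseded modes.md prose described.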

Alternatives Considered

Option A — Enforcement via tool config flag (chosen)

Implementation cost. ~50 LOC in src/server.ts + 14 single-keyword annotations across the registration sites. One new error code.

Runtime cost. A single boolean read against a frozen capability record per dispatch. The capability record is a module-scoped singleton, so the read is hot in the inline cache. Negligible.

Test surface. A new mode-enforcement.test.ts suite asserting that each mutating tool is refused in READONLY and MINIMAL, and that each read-only tool is admitted in every mode. Floor: 5 refuse-cases × 2 modes + 9 admit-cases × 4 modes = 46 explicit assertions. Manageable.

Schema impact. One optional field on ColibriToolConfig. Optional, so the change is backward-compatible: any future tool registration that omits mutates keeps working.

Tool author ergonomics. The declaration sits at the registration site, next to title, description, and inputSchema. A tool author writing a new mutating tool must explicitly write mutates: true — the same place they write the schema. No remote opt-in to remember.

Failure mode. If an author forgets, the default is false (read-only), which means the tool will be admitted in all modes — a permissive failure for read-only tools but a problem for mutating tools that forget the keyword. The mitigation is a lint-level convention (mutating tools are clustered in known files: repository.ts, merkle.ts) and a test that asserts the count of declared mutates: true tools matches expectation. A future hygiene pass can add a tighter contract (e.g. require mutates: true | false to be set explicitly, no default) once the convention is established.
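
One possible form of the count-style mitigation is sketched below. ToolRegistration and undeclaredMutators are hypothetical names, and the real test would read the live registry rather than a hand-written array; the list of expected mutating tools is the five named in this ADR.

```typescript
// Hypothetical guard for the "forgotten keyword" failure mode.
interface ToolRegistration {
  readonly name: string;
  readonly mutates?: boolean;
}

// The five tools this ADR classifies as mutating.
const EXPECTED_MUTATING: readonly string[] = [
  "task_create",
  "task_update",
  "thought_record",
  "audit_session_start",
  "merkle_finalize",
];

// Returns the expected-mutating tools that are not registered with mutates: true.
function undeclaredMutators(registered: readonly ToolRegistration[]): string[] {
  const declared = new Set(
    registered.filter((r) => r.mutates === true).map((r) => r.name),
  );
  return EXPECTED_MUTATING.filter((toolName) => !declared.has(toolName));
}

// Usage: merkle_finalize was registered without the keyword and is reported.
const missing = undeclaredMutators([
  { name: "task_create", mutates: true },
  { name: "task_update", mutates: true },
  { name: "thought_record", mutates: true },
  { name: "audit_session_start", mutates: true },
  { name: "merkle_finalize" }, // keyword forgotten
  { name: "task_get" },
]);
```

Checking names rather than a bare count also catches the case where one tool loses the keyword while an unrelated tool gains it.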

Why this wins. The declaration is at the same site as the schema. A reviewer of a tool registration sees the mutation posture without leaving the file. The check itself is centralized in registerColibriTool rather than opt-in per handler, so the only thing an author can get wrong is the declaration, which is visible at the registration site and countable by a test, rather than a wrapper call that is silently absent from a handler body. The cost is tiny; the safety story becomes honest.

Option B — Per-tool opt-in via decorator/wrapper

A new helper requireCapability(name: keyof ModeCapabilities, fn) wraps tool handlers voluntarily. Tool authors call it from the handler body or pass the wrapped function to registerColibriTool.
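
A minimal sketch of what such a helper could look like follows. It is hypothetical: no such helper exists in the codebase, the capability record is passed in explicitly to keep the example self-contained, and throwing on refusal is an arbitrary choice made for brevity.

```typescript
// Hypothetical shape of the Option B helper; not an existing Colibri API.
interface ModeCapabilities {
  readonly canWriteDatabase: boolean;
  readonly canDispatchExternalIO: boolean;
}

type ToolHandler = () => string;

function requireCapability(
  caps: ModeCapabilities,
  capability: keyof ModeCapabilities,
  fn: ToolHandler,
): ToolHandler {
  return () => {
    if (!caps[capability]) {
      throw new Error(`capability ${capability} not available`);
    }
    return fn();
  };
}

const readonlyCaps: ModeCapabilities = { canWriteDatabase: false, canDispatchExternalIO: false };

// Wrapped: refused when the capability is absent.
const guardedUpdate = requireCapability(readonlyCaps, "canWriteDatabase", () => "updated");

// Not wrapped: nothing at the registration site reveals the omission, and the call succeeds.
const forgottenUpdate: ToolHandler = () => "updated";
```

The two handlers have the same type, so neither the compiler nor the registration call can tell them apart.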

Implementation cost. ~30 LOC for the helper. Per-tool: 1 line at the call site, but it must be added by hand at every mutating tool.

Runtime cost. Same single boolean read as A, plus one indirection through the wrapper.

Test surface. Same matrix as A, but with an additional class of failure (tool author forgot to call the wrapper) that requires a specific test or convention check.

Schema impact. None.

Tool author ergonomics. A tool author must remember to wrap their handler. The opt-in is invisible from the registration call site; a reader of registerColibriTool('task_update', { ... }) does not see the capability requirement unless they read the handler body.

Failure mode. Silent. A typo in the helper name (requireCapabiilty) is caught at import; a missing wrapper call is not caught at all.

Why rejected. Silent failure modes are exactly what Phase 0 has been trying to escape. The whole point of Finding #1 is that the runtime quietly disagrees with the docs. Option B reproduces the same class of bug at a smaller scale: every new mutating tool is one author-forgetfulness from being silently mode-blind.

Option C — Amend ADR-004 and modes.md to mark mode advisory

Document the modes as informational only. server_health reports the mode; tools do not gate on it. Update docs/2-plugin/modes.md to remove the “rejected with ToolNotAdmittedError” prose. Update docs/reference/mcp-tools-phase-0.md to remove the documented but never-shipped ERR_INVALID_MODE error code. Update colibri-system.md:58 if needed.

Implementation cost. ~50 LOC of doc edits across 4–6 files. No source-code changes.

Runtime cost. Zero.

Test surface. A doc-link audit and a search for the now-stale terms (ToolNotAdmittedError, ERR_INVALID_MODE).

Schema impact. None. The capability matrix in src/modes.ts becomes pure documentation — the void capabilitiesFor(mode) call could be deleted entirely.

Tool author ergonomics. No obligation to declare anything. No safety either.

Failure mode. A user who sets COLIBRI_MODE=READONLY for compliance evidence (e.g. “this CI run only inspected the DB; it did not write to it”) cannot rely on the runtime to enforce that claim. They must rely on test discipline and code review. READONLY becomes a performative label.

Why rejected. Phase 0’s legitimacy axis (ζ Decision Trail, η Proof Store) is built on the principle that the runtime makes verifiable claims about its own behavior. A mode label that is not enforced contradicts that principle. Amending the docs to remove the contract is feasible — the contract is not yet shipped — but it costs the system the right to say “READONLY is safe” without qualification. The audit (Step 1) found no evidence that A or B is unsafe, costly, or ergonomically wrong. C is the lowest-cost option only if you discount the safety story; once the safety story is counted, A is cheaper.

Tradeoff matrix

Axis	A — config flag	B — decorator/wrapper	C — amend docs
Implementation cost	~50 LOC + 14 annotations	~30 LOC + 5 wrapper calls (one per mutating tool)	~50 LOC docs only
Runtime cost	one boolean per dispatch	one boolean + one wrapper	zero
Test surface	mode × tool matrix	matrix + opt-in audit	doc-link audit
Schema impact	one optional field	none	none
Tool author ergonomics	declaration at registration site	opt-in invisible from registration	none
Failure mode if author forgets	permissive default (`mutates: false`): tool is admitted in all modes; caught by the declared-count test	silent bypass of gating	n/a
Auditor / `READONLY` user	guaranteed	guaranteed only if convention held	not guaranteed

Consequences

Positive

Closes Finding #1. READONLY becomes an enforced contract, not a label.
Tool authors declare intent at the registration site. A reviewer sees mutation posture without reading the handler body.
server_health becomes load-bearing. A client can read the mode and skip mutating calls in advance.
The ζ audit chain records denied calls. A READONLY rejection is an audit_events row, available for proof-grade verification.
Phase 1 κ Rule Engine inherits a clean boundary. When κ admission ships, mode-gating becomes one input among many. ADR-008 is the floor, not the ceiling.
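
As an illustration of the server_health point, a client could partition its planned calls before sending them. The sketch is hypothetical: HealthPayload is an assumed subset of the server_health payload, planCalls is not an existing API, and the writable modes are assumed to be the complement of this ADR's refuse-modes.

```typescript
// Hypothetical client-side helper; the payload shape is an assumed subset of server_health.
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

interface HealthPayload {
  readonly mode: RuntimeMode;
}

interface PlannedCall {
  readonly tool: string;
  readonly mutates: boolean;
}

// Modes assumed to have canWriteDatabase = true.
const WRITABLE_MODES: ReadonlySet<RuntimeMode> = new Set<RuntimeMode>(["FULL", "TEST"]);

// Splits a batch into calls worth sending and calls the server would refuse anyway.
function planCalls(health: HealthPayload, calls: readonly PlannedCall[]) {
  const writable = WRITABLE_MODES.has(health.mode);
  const send = calls.filter((c) => writable || !c.mutates);
  const skip = calls.filter((c) => !writable && c.mutates);
  return { send, skip };
}

// Usage: against a READONLY server only the read is sent.
const plan = planCalls({ mode: "READONLY" }, [
  { tool: "task_list", mutates: false },
  { tool: "task_update", mutates: true },
]);
```

This is a convenience only; the server-side check remains the enforcement point.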

Negative

One new error code in the canonical envelope (ERR_READONLY_MODE). Adds a cell to the error code reference table.
One new branch per dispatch. Negligible runtime cost.
Each mutating tool needs a one-keyword annotation. A small per-tool tax.
A test matrix for mode × admission. ~46 explicit cases. Manageable but non-zero.
The default of mutates: false means a forgetful author’s mutating tool will be admitted in all modes. Mitigated by a test that asserts the declared count matches expectation; can be tightened in a later round to mutates: true | false required (no default).

Neutral

The void capabilitiesFor(mode) call at src/server.ts:229 is removed in the follow-up. The function itself stays — its consumer moves from createServer (advisory) to registerColibriTool (load-bearing).
ADR-005 (δ Model Router stubs) is unaffected. The router has no mutation; it returns Claude.
ADR-007 (η session lifecycle, sibling round) is independent. ADR-008 does not constrain ADR-007’s implementation.

Implementation

The follow-up task is a single cohesive PR. Estimate: ~120 LOC code + ~250 LOC tests + ~30 LOC docs.

Source-code deltas

src/server.ts
- Add readonly mutates?: boolean to ColibriToolConfig.
- In registerColibriTool, compute caps = capabilitiesFor(ctx.mode) once and bind enforcedMutates = (toolConfig.mutates ?? false) && !caps.canWriteDatabase.
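- In the wrapped handler, if enforcedMutates, short-circuit inside stage 4, before the domain handler runs, with a typed envelope:
```ts
// Returned in place of the domain handler's result.
// `name` and `ctx` are assumed to be in scope inside registerColibriTool.
const envelope = {
  ok: false as const,
  error: {
    code: 'ERR_READONLY_MODE',
    message: `tool ${name} cannot run in mode ${ctx.mode}`,
    details: { tool: name, mode: ctx.mode },
  },
};
```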
- Remove void capabilitiesFor(mode) from createServer.
- Update the server_ping registration in bootstrap() with mutates: false.
src/tools/health.ts — mutates: false on server_health.
src/tools/merkle.ts — mutates: true on audit_session_start, merkle_finalize; mutates: false on merkle_root.
src/domains/tasks/repository.ts — mutates: true on task_create, task_update; mutates: false on task_get, task_list, task_next_actions.
src/domains/skills/repository.ts — mutates: false on skill_list.
src/domains/trail/repository.ts — mutates: true on thought_record; mutates: false on thought_record_list.
src/domains/trail/verifier.ts — mutates: false on audit_verify_chain.

The 14-tool mutation table

Tool	`mutates`	Reason
`server_ping`	`false`	health probe; no DB
`server_health`	`false`	read-only `SELECT COUNT(*)`
`task_create`	`true`	`INSERT`
`task_get`	`false`	`SELECT`
`task_update`	`true`	`UPDATE` (FSM-routed)
`task_list`	`false`	`SELECT`
`task_next_actions`	`false`	`SELECT`
`skill_list`	`false`	`SELECT` over `.agents/skills/` projections
`thought_record`	`true`	`INSERT` (hash-chained)
`thought_record_list`	`false`	`SELECT`
`audit_verify_chain`	`false`	`SELECT` walk + hash compute
`audit_session_start`	`true`	`INSERT` into `audit_sessions`
`merkle_finalize`	`true`	`INSERT` Merkle root + flag session as finalized
`merkle_root`	`false`	`SELECT`

5 mutating · 9 read-only.

Test deltas

src/__tests__/middleware/mode-enforcement.test.ts — new file. One describe per mode (FULL, READONLY, TEST, MINIMAL). Within each, one it per shipped tool asserting either admission or ERR_READONLY_MODE rejection. The test bootstraps a server with the target mode via createServer({ mode }), registers the full 14-tool surface, and exercises each tool path.
Floor cases (must exist):
- 5 mutating tools × 2 refuse-modes (READONLY, MINIMAL) = 10 explicit ERR_READONLY_MODE assertions
- 9 read-only tools × 4 admit-modes = 36 explicit admit assertions
- MINIMAL admits only server_ping and server_health per modes.md line 25 — this is a separate axis (the mode admits the tool at all). The follow-up task’s implementer decides whether to enforce this in code or leave it to documentation; the floor for ADR-008 is mutation gating, not the MINIMAL-only admission narrowing. (A second ADR could tighten MINIMAL later if needed.)
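
The floor cases can be derived mechanically from the 14-tool mutation table rather than written by hand. The sketch below assumes nothing beyond that table and the two refuse-modes; floorCases and FloorCase are hypothetical names, and the real suite would feed each case to a server created via createServer({ mode }).

```typescript
// Sketch: derive the floor cases from the 14-tool mutation table in this ADR.
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

const MUTATES: Readonly<Record<string, boolean>> = {
  server_ping: false,
  server_health: false,
  task_create: true,
  task_get: false,
  task_update: true,
  task_list: false,
  task_next_actions: false,
  skill_list: false,
  thought_record: true,
  thought_record_list: false,
  audit_verify_chain: false,
  audit_session_start: true,
  merkle_finalize: true,
  merkle_root: false,
};

const ALL_MODES: readonly RuntimeMode[] = ["FULL", "READONLY", "TEST", "MINIMAL"];
const REFUSE_MODES: readonly RuntimeMode[] = ["READONLY", "MINIMAL"]; // canWriteDatabase is false

interface FloorCase {
  readonly tool: string;
  readonly mode: RuntimeMode;
  readonly expect: "admit" | "ERR_READONLY_MODE";
}

function floorCases(): FloorCase[] {
  const cases: FloorCase[] = [];
  for (const [tool, mutates] of Object.entries(MUTATES)) {
    // Mutating tools contribute refusals in the refuse-modes only;
    // read-only tools contribute admissions in every mode.
    const modes = mutates ? REFUSE_MODES : ALL_MODES;
    for (const mode of modes) {
      cases.push({ tool, mode, expect: mutates ? "ERR_READONLY_MODE" : "admit" });
    }
  }
  return cases;
}
```

The result is 10 refusal cases and 36 admit cases. Admission of mutating tools in FULL and TEST is not part of the floor, though a suite iterating every tool in every mode would cover it as well.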

Doc deltas

docs/2-plugin/modes.md — line 31 updated. The new prose says: “The dispatch stage of the α chain reads capabilitiesFor(ctx.mode) and refuses any tool registered with mutates: true when canWriteDatabase is false. The refusal returns ERR_READONLY_MODE and is recorded in the ζ audit chain.”
docs/reference/mcp-tools-phase-0.md — ERR_READONLY_MODE added to the canonical error-code list near line 988. ERR_INVALID_MODE (a documented but unshipped error from the original draft) is either struck or relabelled as Phase 1+ deferred.

The follow-up task may consolidate the doc edits into the same PR or split them into a separate doc-only PR — implementer’s call.

What ADR-008 does not specify

Whether MINIMAL should reject tools that aren’t server_ping / server_health at admission time. ADR-008 only commits to mutation gating. Tool-set narrowing is a related but separable decision; if needed, a future ADR can extend the model.
Whether canDispatchExternalIO should also gate dispatch (relevant when ν Integrations exposes MCP tools, which is Phase 1+ per ADR-004 §“What was dropped”). ADR-008’s mechanism naturally extends (a mutates field becomes one of several capability tags), but the extension is out of scope here.
Whether TEST mode should require an explicit opt-in for integration tests. canRunIntegrationTests is set on TEST_CAPS but ADR-008 does not introduce a tool that gates on it. Phase 0’s test discipline runs in Jest workers, not via MCP tool dispatch.

These are explicit non-decisions. The follow-up task lands the mutation-gating slice and stops.

Verification

This decision is verified if and only if all of the following hold once the follow-up implementation task lands:

ColibriToolConfig in src/server.ts carries an optional mutates: boolean field.
registerColibriTool reads capabilitiesFor(ctx.mode) and rejects mutating tools when canWriteDatabase is false.
All 5 mutating tools declare mutates: true.
ERR_READONLY_MODE appears in the standard error envelope and the canonical error-code list.
A test (src/__tests__/middleware/mode-enforcement.test.ts) asserts task_update is refused with ERR_READONLY_MODE when COLIBRI_MODE=READONLY.
npm run build && npm run lint && npm test is green with the test count grown by the new suite.
docs/2-plugin/modes.md:31 reads as the new “dispatch-stage check” prose, not the old “tool-lock stage” prose.
docs/reference/mcp-tools-phase-0.md lists ERR_READONLY_MODE and removes (or re-scopes) ERR_INVALID_MODE.

Sigma performs this verification at the close of the round in which the follow-up implementation task lands.

References

Audit: docs/audits/adr-008-mode-enforcement-audit.md
Contract: docs/contracts/adr-008-mode-enforcement-contract.md
Packet: docs/packets/adr-008-mode-enforcement-packet.md
Verification: docs/verification/adr-008-mode-enforcement-verification.md
Code: src/modes.ts:174 (capabilitiesFor), src/server.ts:229 (the void call), src/server.ts:412-437 (Stage 1 mutex), src/tools/health.ts:130-139 (mode in payload).
Docs: docs/2-plugin/modes.md:31, docs/reference/mcp-tools-phase-0.md:930, docs/architecture/decisions/ADR-004-tool-surface.md:68, docs/colibri-system.md:58.
Prior audits: docs/audits/p0-2-1-mcp-server-audit.md:122, docs/contracts/p0-2-1-mcp-server-contract.md:74, docs/audits/r82-f-deploy-boot-audit.md:44.
Sibling ADR (independent): ADR-007 — η Session Lifecycle.

R84: answers Finding #1 of the whole-system code review. ADR-008 closes the documented-vs-implemented gap on COLIBRI_MODE. Status: Accepted. Implementation lands in a follow-up task per §Implementation.