ADR-008: Mode Enforcement — Capability-Gated Tool Dispatch
- Status: Accepted
- Date: 2026-05-06
- Accepted: 2026-05-06 (R84)
- Round: R84
- Supersedes: None
- Superseded by: None
Implementation lands in a follow-up task. This ADR is the architectural decision; the patch that wires the runtime to it is a separate dispatch in a later round. ADR-008 alone does not change runtime behavior — it commits Colibri to a path.
Context
A whole-system code review surfaced Finding #1: Phase 0 ships a four-mode runtime contract (FULL, READONLY, TEST, MINIMAL) and a fully-typed capability matrix, but no tool dispatch path consults the matrix. The runtime accepts COLIBRI_MODE=READONLY and reports the mode in server_health, yet a READONLY server happily executes task_update, thought_record, and merkle_finalize. A user who sets READONLY to safely inspect a frozen database is silently lied to.
Where the gap lives in code
src/modes.ts:174-185 defines capabilitiesFor(mode) — an exhaustive switch over the closed RuntimeMode union, returning frozen capability records (canWriteDatabase, canDispatchExternalIO, canRunIntegrationTests, canAcceptMCPConnections). The function is correct and ready to consult.
src/server.ts:229 calls it exactly once:
// Capability read — currently advisory (pure-mutex tool-lock per contract §2
// + T0 Q-1 decision). Kept as an explicit reference so the mode → capability
// dependency is compiled-in for future P0.2.3 / P0.4.2 wiring.
void capabilitiesFor(mode);
The return value is void-discarded. The 5-stage middleware chain composed inside registerColibriTool (src/server.ts:279-410) never reads ctx.mode or any capability field. Stage 1 (runWithToolLock, lines 412-437) is a per-tool mutex keyed on tool name only.
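As a rough illustration of the shape described above (not the actual contents of src/modes.ts), a closed-union switch returning frozen records might look like the sketch below. The field names and TEST_CAPS come from this ADR. The other constant names are hypothetical. The per-mode values are assumptions, except that canWriteDatabase is false for READONLY and MINIMAL and that canRunIntegrationTests is set for TEST, both of which this ADR implies.

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

interface ModeCapabilities {
  readonly canWriteDatabase: boolean;
  readonly canDispatchExternalIO: boolean;
  readonly canRunIntegrationTests: boolean;
  readonly canAcceptMCPConnections: boolean;
}

// Assumed values: placeholders apart from the two fields noted in the lead-in.
const FULL_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: true,
  canDispatchExternalIO: true,
  canRunIntegrationTests: false,
  canAcceptMCPConnections: true,
});
const READONLY_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: false,
  canDispatchExternalIO: false,
  canRunIntegrationTests: false,
  canAcceptMCPConnections: true,
});
const TEST_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: true,
  canDispatchExternalIO: false,
  canRunIntegrationTests: true,
  canAcceptMCPConnections: true,
});
const MINIMAL_CAPS: ModeCapabilities = Object.freeze({
  canWriteDatabase: false,
  canDispatchExternalIO: false,
  canRunIntegrationTests: false,
  canAcceptMCPConnections: true,
});

function capabilitiesFor(mode: RuntimeMode): ModeCapabilities {
  switch (mode) {
    case "FULL":
      return FULL_CAPS;
    case "READONLY":
      return READONLY_CAPS;
    case "TEST":
      return TEST_CAPS;
    case "MINIMAL":
      return MINIMAL_CAPS;
    default: {
      // Exhaustiveness guard: a new mode without a case fails to compile.
      const unreachable: never = mode;
      throw new Error(`unknown mode: ${String(unreachable)}`);
    }
  }
}
```

Because each record is a frozen module-level constant, repeated calls for the same mode return the same object, which is what makes a once-per-registration read cheap.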
Where the gap lives in docs
docs/2-plugin/modes.md:31 reads:
The tool-lock stage of the α chain (stage 1 of tool-lock → schema-validate → audit-enter → dispatch → audit-exit) consults the active mode. A request whose tool is not in the admitted set is rejected with a typed ToolNotAdmittedError at the lock stage, before schema validation, dispatch, or auditing run.
docs/reference/mcp-tools-phase-0.md:930 describes server_health as returning the active mode and lists ERR_INVALID_MODE (line 988) as a documented error code; no tool currently emits it.
ADR-004:68 (the server_health row) implies mode is part of the operating contract.
docs/audits/p0-2-1-mcp-server-audit.md:122 and docs/audits/r82-f-deploy-boot-audit.md:44 already flagged the contradiction at audit time, with the original deferral noting capability gating “should happen at tool registration time, in P0.4.2 or P0.2.3”. Neither path was ever taken; capability gating disappeared and was never ADR’d. This ADR addresses that.
The full inventory and code citations are in docs/audits/adr-008-mode-enforcement-audit.md.
Decision
Adopt Option A: enforcement via a per-tool mutates config flag, checked at dispatch.
The mechanism in five points:
- New optional field readonly mutates?: boolean on ColibriToolConfig in src/server.ts. Each tool declares its mutation posture at the registration site.
- Default is false: a tool that does not declare itself is treated as read-only. This is the safe default for READONLY and matches the principle of least authority. Tools that mutate state must opt in explicitly.
- Dispatch-stage check. Stage 4 of the α middleware reads capabilitiesFor(ctx.mode) once at registration and binds enforcedMutates = (toolConfig.mutates ?? false) && !caps.canWriteDatabase. If enforcedMutates is true, the wrapped handler returns the ERR_READONLY_MODE envelope before the domain handler is invoked. Stages 1, 2, 3, and 5 still run: the rejection is audited, just like a successful call.
- New error code ERR_READONLY_MODE joins the canonical envelope (docs/reference/mcp-tools-phase-0.md error-code list).
- Per-tool registrations gain one keyword each. 5 mutating tools opt in. 9 read-only tools either declare mutates: false for clarity or omit the field and inherit the default.
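The five points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the follow-up patch. The registerColibriTool signature here is simplified and hypothetical (the real one lives in src/server.ts), handlers are synchronous for brevity, and canWriteDatabase is reduced to a helper whose values follow the refuse-modes named in this ADR.

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

// Assumption: FULL and TEST may write; READONLY and MINIMAL may not.
const canWriteDatabase = (mode: RuntimeMode): boolean =>
  mode === "FULL" || mode === "TEST";

interface ColibriToolConfig {
  readonly title: string;
  readonly mutates?: boolean; // new optional field; omitted means read-only
}

type Envelope<T> =
  | { ok: true; data: T }
  | {
      ok: false;
      error: {
        code: "ERR_READONLY_MODE";
        message: string;
        details: { tool: string; mode: RuntimeMode };
      };
    };

// Hypothetical, simplified signature for illustration only.
function registerColibriTool<T>(
  ctx: { mode: RuntimeMode },
  name: string,
  toolConfig: ColibriToolConfig,
  handler: (args: unknown) => T,
): (args: unknown) => Envelope<T> {
  // Bound once at registration, not recomputed per call.
  const enforcedMutates =
    (toolConfig.mutates ?? false) && !canWriteDatabase(ctx.mode);

  return (args) => {
    if (enforcedMutates) {
      // Refuse before the domain handler is invoked.
      return {
        ok: false,
        error: {
          code: "ERR_READONLY_MODE",
          message: `tool ${name} cannot run in mode ${ctx.mode}`,
          details: { tool: name, mode: ctx.mode },
        },
      };
    }
    return { ok: true, data: handler(args) };
  };
}

const readonlyCtx = { mode: "READONLY" as const };
let writes = 0;

const taskUpdate = registerColibriTool(
  readonlyCtx,
  "task_update",
  { title: "Update task", mutates: true },
  () => {
    writes += 1;
    return "updated";
  },
);
const taskGet = registerColibriTool(
  readonlyCtx,
  "task_get",
  { title: "Get task" }, // mutates omitted, so the default (false) applies
  () => "task",
);

const refused = taskUpdate({});
const admitted = taskGet({});
```

In this sketch the mutating tool is refused without its handler running, while the undeclared tool is admitted under the default.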
The void capabilitiesFor(mode) call at src/server.ts:229 is removed by the follow-up implementation — the read moves to the registration path.
Why dispatch-stage and not tool-lock?
docs/2-plugin/modes.md:31 originally specified Stage 1 (tool-lock). The original P0.2.1 audit (docs/audits/p0-2-1-mcp-server-audit.md:122) flagged this as a contradiction with middleware.md and s17, both of which treat Stage 1 as a pure mutex. Putting the check in Stage 4 keeps Stage 1 a pure mutex and respects the spec layering: rejected calls still receive correlation IDs, still get audit-entered, and still return through the {ok, data} envelope. ADR-008 explicitly supersedes the modes.md line 31 prose; the implementation task moves the check from Stage 1 to Stage 4 and updates the doc.
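To make the layering argument concrete, here is a small sketch of the five stages with the gate placed at stage 4. It is an illustration under assumptions: the stages run synchronously, the audit chain is an in-memory array, the lock is a Set rather than a real mutex, and schema validation is reduced to an object check. None of these stand-ins reflect the actual implementation.

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";
type Envelope =
  | { ok: true; data: unknown }
  | { ok: false; error: { code: string } };

const auditLog: string[] = []; // stand-in for the ζ audit chain
const lockedTools = new Set<string>(); // stand-in for the per-tool mutex

function callTool(
  mode: RuntimeMode,
  tool: string,
  mutates: boolean,
  args: unknown,
  handler: (args: unknown) => unknown,
): Envelope {
  // Stage 1, tool-lock: keyed on tool name only, never on the mode.
  lockedTools.add(tool);
  try {
    // Stage 2, schema-validate: reduced here to "args is an object".
    if (typeof args !== "object" || args === null) {
      throw new TypeError(`invalid arguments for ${tool}`);
    }
    // Stage 3, audit-enter: runs even for calls that will be refused.
    auditLog.push(`enter:${tool}`);
    // Stage 4, dispatch: the mode gate sits in front of the domain handler.
    const writable = mode === "FULL" || mode === "TEST";
    const result: Envelope =
      mutates && !writable
        ? { ok: false, error: { code: "ERR_READONLY_MODE" } }
        : { ok: true, data: handler(args) };
    // Stage 5, audit-exit: records the outcome, refusal included.
    auditLog.push(`exit:${tool}:${result.ok ? "ok" : result.error.code}`);
    return result;
  } finally {
    lockedTools.delete(tool);
  }
}

let handlerRuns = 0;
const refusedCall = callTool("READONLY", "thought_record", true, {}, () => {
  handlerRuns += 1;
  return "recorded";
});
```

After the refused call, the log holds both an enter and an exit entry for thought_record, and the handler has not run. A stage-1 rejection would have produced neither entry.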
Alternatives Considered
Option A — Enforcement via tool config flag (chosen)
Implementation cost. ~50 LOC in src/server.ts + 14 single-keyword annotations across the registration sites. One new error code.
Runtime cost. A single boolean read against a frozen capability record per dispatch. The capability record is a module-scoped singleton, so the read is hot in the inline cache. Negligible.
Test surface. A new mode-enforcement.test.ts suite asserting that each mutating tool is refused in READONLY and MINIMAL, and that each read-only tool is admitted in every mode. Floor: 5 refuse-cases × 2 modes + 9 admit-cases × 4 modes = ~46 explicit assertions. Manageable.
Schema impact. One optional field on ColibriToolConfig. Optional, so the change is backward-compatible: any future tool registration that omits mutates keeps working.
Tool author ergonomics. The declaration sits at the registration site, next to title, description, and inputSchema. A tool author writing a new mutating tool must explicitly write mutates: true — the same place they write the schema. No remote opt-in to remember.
Failure mode. If an author forgets, the default is false (read-only), which means the tool will be admitted in all modes — a permissive failure for read-only tools but a problem for mutating tools that forget the keyword. The mitigation is a lint-level convention (mutating tools are clustered in known files: repository.ts, merkle.ts) and a test that asserts the count of declared mutates: true tools matches expectation. A future hygiene pass can add a tighter contract (e.g. require mutates: true | false to be set explicitly, no default) once the convention is established.
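The tighter contract mentioned above (no default, mutates always declared) could be approximated with a registration-time guard. The sketch below is hypothetical: assertMutatesDeclared is not an existing helper, and the config shape is trimmed to two fields.

```typescript
interface ColibriToolConfig {
  readonly title: string;
  readonly mutates?: boolean;
}

// Hypothetical guard: refuses a registration that leaves mutates undeclared,
// so a forgetful author gets an error at startup instead of a silent default.
function assertMutatesDeclared(name: string, config: ColibriToolConfig): void {
  if (config.mutates === undefined) {
    throw new Error(`tool ${name} must declare mutates: true | false`);
  }
}

// An explicit declaration passes, whichever value it carries.
assertMutatesDeclared("task_get", { title: "Get task", mutates: false });

// An omitted declaration is rejected.
let rejectedUndeclared = false;
try {
  assertMutatesDeclared("task_update", { title: "Update task" });
} catch {
  rejectedUndeclared = true;
}
```

Making the field required in the type would catch the same mistake at compile time; the runtime guard is shown only because it can coexist with the optional field during a transition.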
Why this wins. The declaration is at the same site as the schema. A reviewer of a tool registration sees the mutation posture without leaving the file. The check is centralized, not opt-in, so the failure mode is “registered tool runs without enforcement” rather than “registered tool silently bypasses enforcement”. The cost is tiny; the safety story becomes honest.
Option B — Per-tool opt-in via decorator/wrapper
A new helper requireCapability(name: keyof ModeCapabilities, fn) wraps tool handlers voluntarily. Tool authors call it from the handler body or pass the wrapped function to registerColibriTool.
Implementation cost. ~30 LOC for the helper. Per-tool: 1 line at the call site, but it must be added by hand at every mutating tool.
Runtime cost. Same single boolean read as A, plus one indirection through the wrapper.
Test surface. Same matrix as A, but with an additional class of failure (tool author forgot to call the wrapper) that requires a specific test or convention check.
Schema impact. None.
Tool author ergonomics. A tool author must remember to wrap their handler. The opt-in is invisible from the registration call site; a reader of registerColibriTool('task_update', { ... }) does not see the capability requirement unless they read the handler body.
Failure mode. Silent. A typo in the helper name (requireCapabiilty) is caught at import; a missing wrapper call is not caught at all.
Why rejected. Silent failure modes are exactly what Phase 0 has been trying to escape. The whole point of Finding #1 is that the runtime quietly disagrees with the docs. Option B reproduces the same class of bug at a smaller scale: every new mutating tool is one author-forgetfulness from being silently mode-blind.
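The silent-bypass risk of Option B can be shown in a few lines. This is a sketch of the rejected design under assumptions: requireCapability follows the signature given above, but its body, the module-scoped activeCaps record, and the reuse of ERR_READONLY_MODE as the refusal code are all hypothetical.

```typescript
interface ModeCapabilities {
  readonly canWriteDatabase: boolean;
  readonly canDispatchExternalIO: boolean;
}
type Handler = (args: unknown) => unknown;

// Assumed: capabilities of a READONLY server, held at module scope.
const activeCaps: ModeCapabilities = {
  canWriteDatabase: false,
  canDispatchExternalIO: false,
};

// Hypothetical Option B helper: the gate exists only where an author applies it.
function requireCapability(name: keyof ModeCapabilities, fn: Handler): Handler {
  return (args) =>
    activeCaps[name]
      ? fn(args)
      : { ok: false, error: { code: "ERR_READONLY_MODE", details: { capability: name } } };
}

let writes = 0;
const writeRow: Handler = () => {
  writes += 1;
  return { ok: true, data: "written" };
};

const guardedUpdate = requireCapability("canWriteDatabase", writeRow); // author remembered
const unguardedUpdate = writeRow; // author forgot the wrapper

guardedUpdate({}); // refused: writes stays 0
unguardedUpdate({}); // runs despite READONLY: writes becomes 1
```

Nothing in the type system or at the registration site distinguishes the two handlers, which is the class of failure the rejection above describes.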
Option C — Amend ADR-004 and modes.md to mark mode advisory
Document the modes as informational only. server_health reports the mode; tools do not gate on it. Update docs/2-plugin/modes.md to remove the “rejected with ToolNotAdmittedError” prose. Update docs/reference/mcp-tools-phase-0.md to remove the never-shipped ERR_INVALID_MODE error code. Update colibri-system.md:58 if needed.
Implementation cost. ~50 LOC of doc edits across 4–6 files. No source-code changes.
Runtime cost. Zero.
Test surface. A doc-link audit and a search for the now-stale terms (ToolNotAdmittedError, ERR_INVALID_MODE).
Schema impact. None. The capability matrix in src/modes.ts becomes pure documentation — the void capabilitiesFor(mode) call could be deleted entirely.
Tool author ergonomics. No obligation to declare anything. No safety either.
Failure mode. A user who sets COLIBRI_MODE=READONLY for compliance evidence (e.g. “this CI run only inspected the DB; it did not write to it”) cannot rely on the runtime to enforce that claim. They must rely on test discipline and code review. READONLY becomes a performative label.
Why rejected. Phase 0’s legitimacy axis (ζ Decision Trail, η Proof Store) is built on the principle that the runtime makes verifiable claims about its own behavior. A mode label that is not enforced contradicts that principle. Amending the docs to remove the contract is feasible — the contract is not yet shipped — but it costs the system the right to say “READONLY is safe” without qualification. The audit (Step 1) found no evidence that A or B is unsafe, costly, or ergonomically wrong. C is the lowest-cost option only if you discount the safety story; once the safety story is counted, A is cheaper.
Tradeoff matrix
| Axis | A — config flag | B — decorator/wrapper | C — amend docs |
|---|---|---|---|
| Implementation cost | ~50 LOC + 14 annotations | ~30 LOC + 14 wrapper calls | ~50 LOC docs only |
| Runtime cost | one boolean per dispatch | one boolean + one wrapper | zero |
| Test surface | mode × tool matrix | matrix + opt-in audit | doc-link audit |
| Schema impact | one optional field | none | none |
| Tool author ergonomics | declaration at registration site | opt-in invisible from registration | none |
| Failure mode if author forgets | safe default (mutates: false) | silent bypass of gating | n/a |
| Auditor / READONLY user | guaranteed | guaranteed only if convention held | not guaranteed |
Consequences
Positive
- Closes Finding #1. READONLY becomes an enforced contract, not a label.
- Tool authors declare intent at the registration site. A reviewer sees mutation posture without reading the handler body.
- server_health becomes load-bearing. A client can read the mode and skip mutating calls in advance.
- The ζ audit chain records denied calls. A READONLY rejection is an audit_events row, available for proof-grade verification.
- Phase 1 κ Rule Engine inherits a clean boundary. When κ admission ships, mode-gating becomes one input among many. ADR-008 is the floor, not the ceiling.
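The "skip mutating calls in advance" point could look like this on the client side. The sketch is hypothetical throughout: the health payload is reduced to its mode field, planCalls is an invented helper, and the client is assumed to know which tools mutate (the names are the five mutating tools listed in this ADR).

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

// Assumed minimal shape: only the mode field of the server_health payload.
interface HealthPayload {
  readonly mode: RuntimeMode;
}

// The five mutating tools named in this ADR.
const MUTATING_TOOLS: readonly string[] = [
  "task_create",
  "task_update",
  "thought_record",
  "audit_session_start",
  "merkle_finalize",
];

// Hypothetical helper: splits a wanted call list into calls worth sending
// and calls the server would refuse with ERR_READONLY_MODE anyway.
function planCalls(
  health: HealthPayload,
  wanted: readonly string[],
): { send: string[]; skipped: string[] } {
  const writable = health.mode === "FULL" || health.mode === "TEST";
  const send: string[] = [];
  const skipped: string[] = [];
  for (const tool of wanted) {
    const refusedByMode = !writable && MUTATING_TOOLS.indexOf(tool) !== -1;
    (refusedByMode ? skipped : send).push(tool);
  }
  return { send, skipped };
}

const plan = planCalls({ mode: "READONLY" }, ["task_get", "task_update", "merkle_root"]);
```

The server-side gate remains the source of truth; a client-side plan like this only avoids round trips that are known to fail.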
Negative
- One new error code in the canonical envelope (ERR_READONLY_MODE). Adds a cell to the error code reference table.
- One new branch per dispatch. Negligible runtime cost.
- Each mutating tool needs a one-keyword annotation. A small per-tool tax.
- A test matrix for mode × admission. ~46 explicit cases. Manageable but non-zero.
- The default of mutates: false means a forgetful author’s mutating tool will be admitted in all modes. Mitigated by a test that asserts the declared count matches expectation; can be tightened in a later round to mutates: true | false required (no default).
Neutral
- The void capabilitiesFor(mode) call at src/server.ts:229 is removed in the follow-up. The function itself stays; its consumer moves from createServer (advisory) to registerColibriTool (load-bearing).
- ADR-005 (δ Model Router stubs) is unaffected. The router has no mutation; it returns Claude.
- ADR-007 (η session lifecycle, sibling round) is independent. ADR-008 does not constrain ADR-007’s implementation.
Implementation
The follow-up task is a single cohesive PR. Estimate: ~120 LOC code + ~250 LOC tests + ~30 LOC docs.
Source-code deltas
- src/server.ts
  - Add readonly mutates?: boolean to ColibriToolConfig.
  - In registerColibriTool, compute caps = capabilitiesFor(ctx.mode) once and bind enforcedMutates = (toolConfig.mutates ?? false) && !caps.canWriteDatabase.
  - In the wrapped handler, if enforcedMutates, short-circuit before stage 4 with a typed envelope:

        const envelope = {
          ok: false as const,
          error: {
            code: 'ERR_READONLY_MODE',
            message: `tool ${name} cannot run in mode ${ctx.mode}`,
            details: { tool: name, mode: ctx.mode },
          },
        };

  - Remove void capabilitiesFor(mode) from createServer.
  - Update the server_ping registration in bootstrap() with mutates: false.
- src/tools/health.ts: mutates: false on server_health.
- src/tools/merkle.ts: mutates: true on audit_session_start, merkle_finalize; mutates: false on merkle_root.
- src/domains/tasks/repository.ts: mutates: true on task_create, task_update; mutates: false on task_get, task_list, task_next_actions.
- src/domains/skills/repository.ts: mutates: false on skill_list.
- src/domains/trail/repository.ts: mutates: true on thought_record; mutates: false on thought_record_list.
- src/domains/trail/verifier.ts: mutates: false on audit_verify_chain.
The 14-tool mutation table
| Tool | mutates | Reason |
|---|---|---|
| server_ping | false | health probe; no DB |
| server_health | false | read-only SELECT COUNT(*) |
| task_create | true | INSERT |
| task_get | false | SELECT |
| task_update | true | UPDATE (FSM-routed) |
| task_list | false | SELECT |
| task_next_actions | false | SELECT |
| skill_list | false | SELECT over .agents/skills/ projections |
| thought_record | true | INSERT (hash-chained) |
| thought_record_list | false | SELECT |
| audit_verify_chain | false | SELECT walk + hash compute |
| audit_session_start | true | INSERT into audit_sessions |
| merkle_finalize | true | INSERT Merkle root + flag session as finalized |
| merkle_root | false | SELECT |
5 mutating · 9 read-only.
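The count guard proposed under Option A's failure mode could be driven directly from this table. The sketch below is one possible shape, not the planned test: MUTATION_TABLE is a hypothetical fixture mirroring the table, and driftFromTable is an invented helper that compares what registrations actually declare against it.

```typescript
// Hypothetical fixture mirroring the 14-tool table above.
const MUTATION_TABLE: Readonly<Record<string, boolean>> = {
  server_ping: false,
  server_health: false,
  task_create: true,
  task_get: false,
  task_update: true,
  task_list: false,
  task_next_actions: false,
  skill_list: false,
  thought_record: true,
  thought_record_list: false,
  audit_verify_chain: false,
  audit_session_start: true,
  merkle_finalize: true,
  merkle_root: false,
};

// Counts tools whose declaration is exactly true.
function countMutating(declared: Readonly<Record<string, boolean | undefined>>): number {
  return Object.keys(declared).filter((tool) => declared[tool] === true).length;
}

// Lists tools whose effective posture (undefined falls back to false)
// differs from the table, e.g. a mutating tool that forgot its keyword.
function driftFromTable(declared: Readonly<Record<string, boolean | undefined>>): string[] {
  return Object.keys(MUTATION_TABLE)
    .filter((tool) => (declared[tool] ?? false) !== MUTATION_TABLE[tool])
    .sort();
}

// Simulate an author who registered task_update without mutates: true.
const forgetful: Record<string, boolean | undefined> = { ...MUTATION_TABLE, task_update: undefined };
const drift = driftFromTable(forgetful);
const mutatingCount = countMutating(MUTATION_TABLE);
```

A bare count catches a forgotten keyword only if nothing else changes at the same time; comparing by name, as driftFromTable does, also reports which tool drifted.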
Test deltas
- src/__tests__/middleware/mode-enforcement.test.ts: new file. One describe per mode (FULL, READONLY, TEST, MINIMAL). Within each, one it per shipped tool asserting either admission or ERR_READONLY_MODE rejection. The test bootstraps a server with the target mode via createServer({ mode }), registers the full 14-tool surface, and exercises each tool path.
- Floor cases (must exist):
  - 5 mutating tools × 2 refuse-modes (READONLY, MINIMAL) = 10 explicit ERR_READONLY_MODE assertions
  - 9 read-only tools × 4 admit-modes = 36 explicit admit assertions
  - MINIMAL admits only server_ping and server_health per modes.md line 25; this is a separate axis (the mode admits the tool at all). The follow-up task’s implementer decides whether to enforce this in code or leave it to documentation; the floor for ADR-008 is mutation gating, not the MINIMAL-only admission narrowing. (A second ADR could tighten MINIMAL later if needed.)
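The floor arithmetic above can be checked by generating the mode × tool matrix. This is a sketch of how the expectations might be tabulated, not the test file itself. buildMatrix and the Case shape are hypothetical, and the refuse-modes are taken from this ADR.

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

const MODES: readonly RuntimeMode[] = ["FULL", "READONLY", "TEST", "MINIMAL"];
const MUTATING: readonly string[] = [
  "task_create", "task_update", "thought_record",
  "audit_session_start", "merkle_finalize",
];
const READ_ONLY: readonly string[] = [
  "server_ping", "server_health", "task_get", "task_list", "task_next_actions",
  "skill_list", "thought_record_list", "audit_verify_chain", "merkle_root",
];

interface Case {
  readonly mode: RuntimeMode;
  readonly tool: string;
  readonly expect: "admit" | "ERR_READONLY_MODE";
  readonly floor: boolean; // true if the case is one of the required floor cases
}

function buildMatrix(): Case[] {
  const cases: Case[] = [];
  for (const mode of MODES) {
    const refuses = mode === "READONLY" || mode === "MINIMAL";
    for (const tool of MUTATING) {
      // Mutating tools: refused in the two refuse-modes (floor), admitted otherwise.
      cases.push({ mode, tool, expect: refuses ? "ERR_READONLY_MODE" : "admit", floor: refuses });
    }
    for (const tool of READ_ONLY) {
      // Read-only tools: admitted in every mode, all counted in the floor.
      cases.push({ mode, tool, expect: "admit", floor: true });
    }
  }
  return cases;
}

const matrix = buildMatrix();
const floorCount = matrix.filter((c) => c.floor).length;
const refusalCount = matrix.filter((c) => c.expect === "ERR_READONLY_MODE").length;
```

The full 14 × 4 grid has 56 cells. The 46 floor cases are the 10 refusals plus the 36 read-only admissions; the remaining 10 cells are mutating tools admitted in FULL and TEST.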
Doc deltas
- docs/2-plugin/modes.md: line 31 updated. The new prose says: “The dispatch stage of the α chain reads capabilitiesFor(ctx.mode) and refuses any tool registered with mutates: true when canWriteDatabase is false. The refusal returns ERR_READONLY_MODE and is recorded in the ζ audit chain.”
- docs/reference/mcp-tools-phase-0.md: ERR_READONLY_MODE added to the canonical error-code list near line 988. ERR_INVALID_MODE (a documented but unshipped error from the original draft) is either struck or relabelled as Phase 1+ deferred.
The follow-up task may consolidate the doc edits into the same PR or split them into a separate doc-only PR — implementer’s call.
What ADR-008 does not specify
- Whether MINIMAL should reject tools that aren’t server_ping / server_health at admission time. ADR-008 only commits to mutation gating. Tool-set narrowing is a related but separable decision; if needed, a future ADR can extend the model.
- Whether canDispatchExternalIO should also gate dispatch (relevant when ν Integrations exposes MCP tools, which is Phase 1+ per ADR-004 §“What was dropped”). ADR-008’s mechanism naturally extends (a mutates field becomes one of several capability tags), but the extension is out of scope here.
- Whether TEST mode should require an explicit opt-in for integration tests. canRunIntegrationTests is set on TEST_CAPS but ADR-008 does not introduce a tool that gates on it. Phase 0’s test discipline runs in Jest workers, not via MCP tool dispatch.
These are explicit non-decisions. The follow-up task lands the mutation-gating slice and stops.
Verification
This decision is verified if and only if all of the following hold once the follow-up implementation task lands:
- ColibriToolConfig in src/server.ts carries an optional mutates: boolean field.
- registerColibriTool reads capabilitiesFor(ctx.mode) and rejects mutating tools when canWriteDatabase is false.
- All 5 mutating tools declare mutates: true.
- ERR_READONLY_MODE appears in the standard error envelope and the canonical error-code list.
- A test (src/__tests__/middleware/mode-enforcement.test.ts) asserts task_update is refused with ERR_READONLY_MODE when COLIBRI_MODE=READONLY.
- npm run build && npm run lint && npm test is green with the test count grown by the new suite.
- docs/2-plugin/modes.md:31 reads as the new “dispatch-stage check” prose, not the old “tool-lock stage” prose.
- docs/reference/mcp-tools-phase-0.md lists ERR_READONLY_MODE and removes (or re-scopes) ERR_INVALID_MODE.
Sigma performs this verification at the close of the round in which the follow-up implementation task lands.
References
- Audit: docs/audits/adr-008-mode-enforcement-audit.md
- Contract: docs/contracts/adr-008-mode-enforcement-contract.md
- Packet: docs/packets/adr-008-mode-enforcement-packet.md
- Verification: docs/verification/adr-008-mode-enforcement-verification.md
- Code: src/modes.ts:174 (capabilitiesFor), src/server.ts:229 (the void call), src/server.ts:412-437 (Stage 1 mutex), src/tools/health.ts:130-139 (mode in payload).
- Docs: docs/2-plugin/modes.md:31, docs/reference/mcp-tools-phase-0.md:930, docs/architecture/decisions/ADR-004-tool-surface.md:68, docs/colibri-system.md:58.
- Prior audits: docs/audits/p0-2-1-mcp-server-audit.md:122, docs/contracts/p0-2-1-mcp-server-contract.md:74, docs/audits/r82-f-deploy-boot-audit.md:44.
- Sibling ADR (independent): ADR-007, η Session Lifecycle.
R84: answers Finding #1 of the whole-system code review. ADR-008 closes the documented-vs-implemented gap on COLIBRI_MODE. Status: Accepted. Implementation lands in a follow-up task per §Implementation.