1 — Transport: How Mutations Enter the World

Every change to Colibri’s world enters through the same door: the Model Context Protocol (MCP) over stdio.

MCP exchanges JSON-RPC 2.0 messages. A client writes a JSON-RPC request to Colibri’s stdin; the server processes it and writes a response to stdout. This transport guarantees that every mutation (every task created, every thought recorded, every proof finalized) is:

Named: mapped to one of exactly 14 MCP tool names
Validated: checked against a strict Zod schema before any handler sees it
Audited: written to the decision trail before business logic runs
Serialized: executed one at a time, never concurrently
Durable: written to the SQLite database or rejected with a structured error

There is no backdoor. No API key grants access to a privileged command. No internal code path bypasses the middleware chain. The 14 shipped tools are the entire surface. This is a closed surface by design (ADR-004 Option C): it is not extensible by plugins in Phase 0, and intentionally so.

What is MCP?

MCP (Model Context Protocol) is a lightweight protocol developed by Anthropic for AI models to invoke tools in external systems. From Colibri’s perspective:

Transport: stdio. Client writes JSON-RPC 2.0 envelopes to stdin; server writes responses to stdout.
Single client: In Phase 0, the client is a Claude agent session using @modelcontextprotocol/sdk. The server is Colibri.
Tool discovery: The client opens with an initialize handshake, requests the tool list with tools/list, and then sends tools/call requests.
No authentication: Phase 0 runs under the owner’s user account, and the process trusts its stdin. Auth arrives in Phase 3+ with π Governance.

The entire Colibri server is stateless with respect to MCP — it does not maintain a session socket or websocket. Each JSON-RPC call is independent; the server does not need to know about previous calls except through the database.
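The statelessness described above can be illustrated with a minimal sketch. It assumes newline-delimited JSON-RPC messages on stdio; the names `handleLine` and `errorResponse` are hypothetical and not taken from Colibri’s source. The error codes are the standard JSON-RPC 2.0 ones. The handler is a pure function of one input line, so nothing carries over between calls.

```typescript
// Hypothetical sketch of a stateless JSON-RPC line handler (not Colibri's code).
type JsonRpcId = string | number | null;

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  result?: unknown;
  error?: { code: number; message: string };
}

function errorResponse(id: JsonRpcId, code: number, message: string): JsonRpcResponse {
  return { jsonrpc: "2.0", id, error: { code, message } };
}

// One line in, one line out. No module-level state is read or written.
function handleLine(line: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    // -32700: JSON-RPC 2.0 "Parse error"
    return JSON.stringify(errorResponse(null, -32700, "Parse error"));
  }
  if (typeof parsed !== "object" || parsed === null) {
    // -32600: JSON-RPC 2.0 "Invalid Request"
    return JSON.stringify(errorResponse(null, -32600, "Invalid Request"));
  }
  const req = parsed as { jsonrpc?: unknown; id?: unknown; method?: unknown };
  const id: JsonRpcId =
    typeof req.id === "string" || typeof req.id === "number" ? req.id : null;
  if (req.jsonrpc !== "2.0" || typeof req.method !== "string") {
    return JSON.stringify(errorResponse(id, -32600, "Invalid Request"));
  }
  if (req.method !== "tools/call") {
    // -32601: JSON-RPC 2.0 "Method not found"
    return JSON.stringify(errorResponse(id, -32601, "Method not found"));
  }
  // Placeholder result; a real server would run its middleware chain here.
  const response: JsonRpcResponse = {
    jsonrpc: "2.0",
    id,
    result: { content: [{ type: "text", text: "{}" }] },
  };
  return JSON.stringify(response);
}
```

Because the function keeps no memory of earlier lines, any continuity between calls has to come from the database, which is the property the paragraph above relies on.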

The 14-Tool Closed Surface

Colibri Phase 0 exposes exactly 14 MCP tools. Every tool:

Accepts input validated by a Zod schema
Writes an audit record to the decision trail (ζ) before execution
Runs inside the 5-stage α middleware chain (tool-lock → schema-validate → audit-enter → dispatch → audit-exit)
Returns a structured JSON response with a result or an error envelope
Writes the outcome to audit before returning

The 14 shipped tools are grouped by domain:

Category 1: Task Management (5 tools shipped, concept β)

Task creation, updates, queries, and state transitions — the β Task Pipeline FSM. Shipped surface is 5 tools; the R74.5 plan listed 8 (see ADR-004 R75 Wave H amendment for the 3 deferred/closed tools).

Tool	Purpose
`task_create`	Create a new task in the INIT state
`task_list`	Query tasks with filters and pagination
`task_get`	Retrieve full details of a single task
`task_update`	Partial update; accepts `status` and routes transitions through `src/domains/tasks/state-machine.ts` (no separate `task_transition` tool in Phase 0 — merged during P0.3.4)
`task_next_actions`	Return the queue of unblocked tasks

Category 2: Audit & Proof (6 tools, concepts ζ and η)

Decision trails, Merkle trees, and audit verification: the proof-grade chain. Shipped surface is 6 tools (4 ζ + 2 η). The R74.5 plan listed a seventh tool, audit_session_end, which was not shipped; merkle_finalize is the session-close signal instead.

Tool	Purpose
`audit_session_start`	Open a proof-grade audit session
`thought_record`	Append a hash-chained decision to the trail
`thought_record_list`	Read the thought chain for a task or session
`audit_verify_chain`	Walk the SHA-256 chain and verify integrity
`merkle_finalize`	Build the Merkle tree after all thoughts are recorded; also closes the session
`merkle_root`	Return the final root hash
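The hash chaining behind `thought_record` and `audit_verify_chain` can be sketched roughly as follows. This is an illustration under stated assumptions, not Colibri’s implementation: the record shape, the all-zero genesis value, and the exact bytes fed to SHA-256 are all hypothetical. Only the general idea (each record commits to its predecessor’s hash, and verification walks the chain) comes from the table above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; Colibri's real schema is not shown in this section.
interface ThoughtRecord {
  seq: number;
  content: string;
  prevHash: string;
  hash: string;
}

// Assumed sentinel used as the predecessor of the first record.
const GENESIS_HASH = "0".repeat(64);

// Assumed hashing rule: SHA-256 over a JSON array of the record's fields.
function hashThought(seq: number, content: string, prevHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([seq, prevHash, content]))
    .digest("hex");
}

// Returns a new chain with one record appended; the input is not mutated.
function appendThought(chain: readonly ThoughtRecord[], content: string): ThoughtRecord[] {
  const last = chain.length > 0 ? chain[chain.length - 1] : undefined;
  const prevHash = last ? last.hash : GENESIS_HASH;
  const seq = chain.length + 1;
  return [...chain, { seq, content, prevHash, hash: hashThought(seq, content, prevHash) }];
}

// Walks the chain from the start and checks every link and every hash.
function verifyChain(chain: readonly ThoughtRecord[]): boolean {
  let expectedPrev = GENESIS_HASH;
  for (let i = 0; i < chain.length; i++) {
    const rec = chain[i];
    if (rec.seq !== i + 1) return false;
    if (rec.prevHash !== expectedPrev) return false;
    if (rec.hash !== hashThought(rec.seq, rec.content, rec.prevHash)) return false;
    expectedPrev = rec.hash;
  }
  return true;
}
```

Editing any earlier record changes its hash, which breaks the `prevHash` link of every record after it; that is what makes a walk of the chain sufficient to detect tampering.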

Category 3: Skill Registry (1 tool, concept ε)

Read-only discovery of available skills.

Tool	Purpose
`skill_list`	Enumerate canonical `colibri-*` skills from `.agents/skills/`

Category 4: System & Control (2 tools shipped, concepts α and γ)

Server health and capability declaration. (The R74.5 plan also listed server_info and server_shutdown; neither shipped — server_health absorbs the capability report, and shutdown is process-level via SIGTERM/SIGINT. See ADR-004 R75 Wave H amendment.)

Tool	Purpose
`server_ping`	Sub-100ms round-trip to verify the server is alive
`server_health`	Report DB status, middleware readiness, runtime mode, tool count, capability set, uptime

Not in Phase 0: server_info, server_shutdown, agent_spawn, agent_status, skill_get, skill_invoke, skill_register, skill_unregister, and any δ router tool are deferred or struck per ADR-004 R75 Wave H amendment and ADR-005. The full 60–80 tool ceiling is a Phase 1+ target; Phase 0 is intentionally minimal (14 shipped).
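One way to make a closed surface checkable at compile time is a literal union of tool names. The sketch below uses the 14 names from the tables above; the identifiers `SHIPPED_TOOLS`, `ToolName`, and `isShippedTool` are hypothetical and may not match how Colibri actually registers its tools.

```typescript
// Tool names are copied from the tables above; the helper names are hypothetical.
const SHIPPED_TOOLS = [
  // β Task Management
  "task_create",
  "task_list",
  "task_get",
  "task_update",
  "task_next_actions",
  // ζ and η Audit & Proof
  "audit_session_start",
  "thought_record",
  "thought_record_list",
  "audit_verify_chain",
  "merkle_finalize",
  "merkle_root",
  // ε Skill Registry
  "skill_list",
  // α and γ System & Control
  "server_ping",
  "server_health",
] as const;

// Union of the 14 literal names; any other string is a type error.
type ToolName = (typeof SHIPPED_TOOLS)[number];

// Runtime guard for names that arrive as untyped strings from a request.
function isShippedTool(name: string): name is ToolName {
  return (SHIPPED_TOOLS as readonly string[]).indexOf(name) !== -1;
}
```

With a guard like this, a request naming `server_shutdown` or `task_transition` is rejected before any dispatch logic has to consider it.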

Request-Response Flow

Every mutation follows this shape:

Request (from client to server):

{
  "jsonrpc": "2.0",
  "id": "call-12345",
  "method": "tools/call",
  "params": {
    "name": "task_create",
    "arguments": {
      "title": "Implement boot sequence",
      "project": "colibri",
      "priority": "high"
    }
  }
}

The client assembles the envelope using @modelcontextprotocol/sdk’s StdioClientTransport. Apart from the request id, the tool name and arguments are the only things that vary.

Response (from server to client):

{
  "jsonrpc": "2.0",
  "id": "call-12345",
  "result": {
    "content": [
      {
        "type": "text",
        "text": "{\"task_id\": \"T-0042\", \"status\": \"INIT\", \"created_at\": \"2026-04-10T14:30:00Z\"}"
      }
    ]
  }
}

On error, the response includes an error object instead of result, with structured details about what failed.
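The request and response shapes above can be captured in a short client-side sketch. It is written against the envelopes shown in this section rather than against the SDK, and the helper names `buildToolCall` and `unwrapToolResult` are hypothetical. It assumes the tool’s payload is the JSON string in the first text content item, as in the example response.

```typescript
// Hypothetical helpers mirroring the envelopes shown above (not the SDK's API).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: string;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface ToolCallResponse {
  jsonrpc: "2.0";
  id: string;
  result?: { content: { type: string; text?: string }[] };
  error?: { code: number; message: string; data?: unknown };
}

// Only the id, tool name, and arguments vary between requests.
function buildToolCall(
  id: string,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Returns the parsed payload, or throws if the response carries an error object.
function unwrapToolResult(response: ToolCallResponse): unknown {
  if (response.error) {
    throw new Error(`tool call failed (${response.error.code}): ${response.error.message}`);
  }
  const content = response.result ? response.result.content : [];
  for (const item of content) {
    if (item.type === "text" && item.text !== undefined) {
      // Assumption: the text item holds a JSON-encoded payload.
      return JSON.parse(item.text);
    }
  }
  throw new Error("response has no text content");
}
```

Applied to the example response above, `unwrapToolResult` would yield an object whose `task_id` is `"T-0042"` and whose `status` is `"INIT"`.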

The Middleware Chain Boundary

Every tool call crosses the 5-stage α middleware chain before reaching its domain handler. This is the contract:

Tool lock — serialize concurrent calls; only one handler runs at a time.
Schema validate — parse the request against the tool’s Zod schema; reject before the handler sees it.
Audit enter — record the call in the decision trail with a monotonic sequence number.
Dispatch — route the tool name to its handler (task, audit, skill, or system domain).
Audit exit — record the outcome, duration, and result hash.

All five stages run on every call, in order. A tool call that is not audited is not a tool call — it is a bug.
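The five stages can be sketched as a small promise-based pipeline. Everything here is illustrative: `createChain`, `ToolDef`, and `AuditEntry` are hypothetical names, the validator is a plain function standing in for a Zod schema, and the audit trail is an in-memory array rather than the SQLite-backed decision trail. The tool lock is modeled as a promise queue, which is one common way to serialize async work and may differ from Colibri’s actual mechanism.

```typescript
// Hypothetical sketch of the 5-stage chain (not Colibri's source).
interface ToolDef {
  validate: (input: unknown) => unknown; // stands in for a Zod schema parse
  handler: (args: unknown) => Promise<unknown>;
}

interface AuditEntry {
  seq: number;
  phase: "enter" | "exit";
  tool: string;
  ok?: boolean;
}

function createChain(tools: Record<string, ToolDef>) {
  const audit: AuditEntry[] = [];
  let seq = 0;
  let tail: Promise<unknown> = Promise.resolve();

  async function runStages(name: string, input: unknown): Promise<unknown> {
    // Assumption: unknown names are rejected before validation.
    if (!Object.prototype.hasOwnProperty.call(tools, name)) {
      throw new Error(`unknown tool: ${name}`);
    }
    const tool = tools[name];
    const args = tool.validate(input); // Stage 2: schema validate (throws if invalid)
    audit.push({ seq: ++seq, phase: "enter", tool: name }); // Stage 3: audit enter
    try {
      const result = await tool.handler(args); // Stage 4: dispatch
      audit.push({ seq: ++seq, phase: "exit", tool: name, ok: true }); // Stage 5: audit exit
      return result;
    } catch (err) {
      audit.push({ seq: ++seq, phase: "exit", tool: name, ok: false }); // Stage 5 on failure
      throw err;
    }
  }

  function call(name: string, input: unknown): Promise<unknown> {
    // Stage 1: tool lock. Each call starts only after the previous one settles.
    const run = tail.then(() => runStages(name, input));
    // Swallow the rejection on the queue copy so one failure does not block later calls.
    tail = run.catch(() => undefined);
    return run;
  }

  return { call, audit };
}
```

In this sketch a call that fails validation never reaches audit-enter, which follows from the stage order listed above; how Colibri itself records rejected calls is not specified in this section.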

Why Closed?

A closed surface (not extensible by plugins in Phase 0) ensures:

Audit completeness. Every possible mutation is named, numbered, and traceable.
Schema stability. Client code can rely on the exact Zod shapes — no surprise new fields.
Determinism. The complete set of valid operations is enumerable and testable.
Security boundary. The 14 shipped tools are the entire attack surface for Phase 0.

Plugins (skill hot-reload, custom integrations) arrive in Phase 1+ when src/domains/integrations/ and a registration mechanism exist. At that point the surface may grow, but the core will remain stable.

What Happens After

When a tool request arrives at the server, it:

Travels through the 5-stage α middleware chain (this section).
Reaches its domain handler, which reads/writes the SQLite database.
Returns a response back through the chain.
Is recorded in the audit trail before the response is sent.

The next section, 2 — Plugin: The Colibri Server, explains what happens inside the server after the request crosses the transport boundary.