Quick Start — Try Colibri
Status: Phase 0 at 100% on non-deferred tasks (28/28 shipped as of R75 Wave I, 2026-04-18). P0.5.1/P0.5.2 shipped as δ library-only stubs per ADR-005 §Decision (PR #149 scoring, PR #150 fallback); full multi-model routing lands in Phase 1.5. The MCP server boots over stdio and registers 14 tools at a22dd23e. This page describes the live Phase 0 surface.
Five-minute orientation
Colibri = a TypeScript MCP server (stdio) that:
- Executes tasks through the formal 8-state β FSM: INIT → GATHER → ANALYZE → PLAN → APPLY → VERIFY → DONE, plus CANCELLED from any state. Enforced in the β state machine (src/domains/tasks/state-machine.ts): task_update accepts a status field and routes it through the FSM; illegal transitions are rejected.
- Runs on Claude only in Phase 0. The δ model-router ships as library-only stubs in Phase 0 per ADR-005 §Decision: constant scoring (always Claude) plus a single-member fallback chain (Claude). Full multi-model scoring, N-member fallback, and circuit breaker land in Phase 1.5.
- Records every decision as a hash-chained ζ audit trail: SHA-256 linked thought_records verified by audit_verify_chain.
- Seals work into a Merkle η proof store: merkle_finalize builds the tree, merkle_root returns the root hash.
- Does not spawn sub-agents via MCP. Sub-agents in Phase 0 are spawned with the Task tool (Claude’s built-in sub-agent dispatch) into .worktrees/claude/<task-slug> feature worktrees. The donor agent_spawn / agent_status / agent_list tools and the entire src/domains/agents/ target are deferred to Phase 1.5 per ADR-005.
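The "constant scoring plus single-member fallback chain" shape of the δ stubs can be pictured with a minimal sketch. Every name below (ModelId, scoreModels, fallbackChain, route) is hypothetical and chosen for illustration; the real library exports may look different.

```typescript
// Hypothetical names throughout; this is not the δ library's actual API.
type ModelId = "claude";

interface ModelScore {
  model: ModelId;
  score: number;
}

// Constant scoring: the task text is ignored and Claude always gets the only score.
function scoreModels(_taskDescription: string): ModelScore[] {
  return [{ model: "claude", score: 1 }];
}

// Single-member fallback chain: in Phase 0 there is nothing to fall back to.
function fallbackChain(): ModelId[] {
  return ["claude"];
}

// Pick the best-scored model that also appears in the fallback chain.
function route(taskDescription: string): ModelId {
  const chain = fallbackChain();
  const ranked = scoreModels(taskDescription)
    .filter((s) => chain.includes(s.model))
    .sort((a, b) => b.score - a.score);
  const best = ranked[0];
  if (!best) throw new Error("no model available");
  return best.model;
}
```

With stubs of this shape, callers can already be written against a router interface, and Phase 1.5 would only need to swap in real scoring and a longer chain.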
For whom? Teams running multi-step agentic workflows who need accountability and memory. Not a chatbot. Highly opinionated orchestration runtime.
What ships in Phase 0? 14 MCP tools over stdio (R74.5 planned 19; 5 were closed/struck/deferred during implementation and 1 was added). See the next section.
What Phase 0 delivers — the 14-tool shipped surface
After Phase 0 Waves A–I (28/28 tasks shipped; 100% on non-deferred work), the server exposes these 14 tools across five concept letters. ADR-004 documents the count: R74.5 originally planned 19, the R75 Wave H amendment reconciles the count to what actually shipped, and Wave I did not add tools because the δ stubs are library-only.
β Task Pipeline — 5 tools
- task_create: Create a task (returns task_id, initial state INIT)
- task_list: List tasks with filters (status, priority, owner, tag) and pagination
- task_get: Get a single task by id with full fields
- task_update: Partial update of mutable fields (description, priority, progress, owner, tags). Accepts status and routes transitions through the β state machine (src/domains/tasks/state-machine.ts); illegal transitions are rejected with ERR_INVALID_TRANSITION. No separate task_transition tool exists in Phase 0 (merged during P0.3.4).
- task_next_actions: Return unblocked tasks in priority order
ζ Decision Trail — 4 tools (axis closed in Wave G)
- audit_session_start: Open a proof-grade session (returns session_id)
- thought_record: Append a hash-chained decision row (thought_type: plan | decision | analysis | reflection | …)
- thought_record_list: Read the thought chain for a session
- audit_verify_chain: Verify the SHA-256 chain from session start to tip (shipped Wave G, P0.7.3)
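To make "hash-chained" concrete, here is a minimal sketch of appending rows and verifying a SHA-256 chain. The row shape, the all-zero genesis value, and the JSON-based hash input are assumptions made for illustration; the server's actual row format and hashing recipe are not described on this page.

```typescript
import { createHash } from "node:crypto";

// Assumed row shape; the real thought_records schema may differ.
interface ThoughtRow {
  thought_type: string;
  content: string;
  prev_hash: string;
  hash: string;
}

// Assumed genesis value for the first row of a session.
const GENESIS = "0".repeat(64);

// Hash the previous hash together with the row payload (assumed recipe).
function hashRow(prevHash: string, thoughtType: string, content: string): string {
  return createHash("sha256")
    .update(JSON.stringify([prevHash, thoughtType, content]))
    .digest("hex");
}

// Return a new chain with one more row linked to the current tip.
function appendThought(chain: ThoughtRow[], thoughtType: string, content: string): ThoughtRow[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const row: ThoughtRow = {
    thought_type: thoughtType,
    content,
    prev_hash: prevHash,
    hash: hashRow(prevHash, thoughtType, content),
  };
  return [...chain, row];
}

// Walk from session start to tip; any edited row or broken link fails the check.
function verifyChain(chain: ThoughtRow[]): boolean {
  let prev = GENESIS;
  for (const row of chain) {
    if (row.prev_hash !== prev) return false;
    if (row.hash !== hashRow(row.prev_hash, row.thought_type, row.content)) return false;
    prev = row.hash;
  }
  return true;
}
```

Because each hash covers the previous one, editing any earlier row invalidates every later link, which is the property a chain verifier relies on.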
η Proof Store — 2 tools (axis complete in Wave F)
- merkle_finalize: Build the Merkle tree over the session’s thought records (also serves as the session-close signal; there is no separate audit_session_end tool in Phase 0)
- merkle_root: Return the finalized root hash + metadata
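As a rough illustration of what building a tree and returning its root involves, the sketch below computes a Merkle root with a hand-rolled pairwise SHA-256 tree. It deliberately does not use merkletreejs, and the leaf encoding and odd-node rule (duplicate the last node) are assumptions, so the roots it produces should not be expected to match the server's.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// Compute a Merkle root over string leaves (e.g. serialized thought records).
// Assumption: an unpaired node at the end of a level is paired with itself.
function merkleRoot(leaves: string[]): string {
  if (leaves.length === 0) throw new Error("cannot finalize an empty session");
  let level = leaves.map(sha256);
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = i + 1 < level.length ? level[i + 1] : left;
      next.push(sha256(left + right));
    }
    level = next;
  }
  return level[0];
}
```

Adding, removing, or changing any leaf changes the root, which is why records written after finalization cannot be covered by an already-published root.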
ε Skill Registry — 1 tool in Phase 0
- skill_list: List the 23 canonical colibri-* skills discovered on disk
skill_get, skill_reload, and the rest of the ε hot-reload surface are deferred to Phase 1. Phase 0 ships a read-only discovery path plus an in-memory capability index (P0.6.3, Wave H, which closes the ε axis).
System Health — 2 tools
- server_ping: Minimal <100 ms stdio round-trip
- server_health: Returns a 6-field payload (status, version, uptime_ms, db_tables, phase, mode) covering liveness + runtime mode + DB schema coverage. Authoritative description in docs/2-plugin/health.md. Absorbs what the R74.5 plan called server_info.
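A client that wants to check the server_health payload before trusting it could use a type guard along these lines. The six field names come from the list above; the field types (for example db_tables as a number) are guesses, and docs/2-plugin/health.md remains the authoritative description.

```typescript
// The four runtime modes named elsewhere on this page.
const RUNTIME_MODES: readonly string[] = ["FULL", "READONLY", "TEST", "MINIMAL"];

// Assumed field types; only the field names are taken from the docs.
interface ServerHealth {
  status: string;
  version: string;
  uptime_ms: number;
  db_tables: number;
  phase: string;
  mode: string;
}

// Narrow an unknown tool result to the assumed 6-field shape.
function isServerHealth(value: unknown): value is ServerHealth {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const mode = v["mode"];
  return (
    typeof v["status"] === "string" &&
    typeof v["version"] === "string" &&
    typeof v["uptime_ms"] === "number" &&
    typeof v["db_tables"] === "number" &&
    typeof v["phase"] === "string" &&
    typeof mode === "string" &&
    RUNTIME_MODES.includes(mode)
  );
}
```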
Not in Phase 0 (donor-era, listed so you don’t look for them):
task_transition (merged into task_update), task_delete, task_depends_on (deferred), audit_session_end (merged into merkle_finalize), server_info / server_shutdown (phantom tools in the R74.5 plan, never implemented, being struck from docs), agent_spawn, agent_status, agent_list, skill_get, skill_reload, task_create_batch, task_deps, task_eisenhower, task_report, task_critical_path, roadmap_* (12 variants), memory_* (12 variants), context_* (7 variants), analysis_rag_*, thought_plan, thought_decide, merkle_proof, merkle_verify. Whether merged, struck, or deferred, none of these are registered in Phase 0.
A typical session (what you will do)
Here is the flow a Claude session follows against the live Phase 0 surface (14 tools):
1. server_ping # stdio is live
2. server_health # DB open, middleware registered, tools registered
3. task_next_actions { limit: 5 } # find the next unblocked task
4. audit_session_start { intent: "..." } # open a proof-grade session
5. task_update { task_id, status: "GATHER" } # move the task forward (FSM-enforced)
6. [executor does audit → contract → packet → implement → verify]
7. thought_record { thought_type: "decision", content: "..." }
8. thought_record { thought_type: "analysis", content: "..." }
9. task_update { task_id, progress: 100, status: "DONE" }
10. thought_record { thought_type: "reflection", content: "task_id / branch / commit / tests / summary / blockers" }
11. audit_verify_chain { session_id }
12. merkle_finalize { session_id } # MUST come after the final reflection; also closes the session
13. merkle_root { session_id } # proof of work
Load-bearing ordering rule: the final thought_record { reflection } MUST precede merkle_finalize. Otherwise the reflection is not anchored in the Merkle root. See CLAUDE.md §7 and writeback-protocol.md.
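The ordering rule is easy to check mechanically over a planned or recorded call sequence. The sketch below is a hypothetical client-side lint, not part of the server; the ToolCall shape is an assumption.

```typescript
// Assumed minimal representation of one MCP tool call in a session log.
interface ToolCall {
  tool: string;
  args?: Record<string, unknown>;
}

// True when a reflection exists and the last one comes before merkle_finalize.
function reflectionPrecedesFinalize(calls: ToolCall[]): boolean {
  const finalizeIdx = calls.findIndex((c) => c.tool === "merkle_finalize");
  if (finalizeIdx === -1) return false; // session was never sealed

  let lastReflectionIdx = -1;
  calls.forEach((c, i) => {
    if (c.tool === "thought_record" && c.args?.["thought_type"] === "reflection") {
      lastReflectionIdx = i;
    }
  });
  return lastReflectionIdx !== -1 && lastReflectionIdx < finalizeIdx;
}
```

Run against the 13-step flow above, the check passes; moving step 10 after step 12 makes it fail.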
State transition rule: you move tasks via task_update, passing a status field. The state machine (src/domains/tasks/state-machine.ts) enforces legal transitions — illegal jumps (e.g. INIT → DONE) return ERR_INVALID_TRANSITION. The R74.5 plan had a separate task_transition tool; during P0.3.4 implementation the two were merged. The 5-step executor chain (audit → contract → packet → implement → verify, CLAUDE.md §6) maps 1:1 onto the β FSM states GATHER → ANALYZE → PLAN → APPLY → VERIFY.
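A transition check of this kind can be sketched in a few lines. The state names and the ERR_INVALID_TRANSITION code come from this page; everything else is an assumption, in particular that only single forward steps are legal and that DONE and CANCELLED are terminal. The real table lives in src/domains/tasks/state-machine.ts and may differ.

```typescript
// Linear happy path of the β FSM as listed on this page.
const LINEAR = ["INIT", "GATHER", "ANALYZE", "PLAN", "APPLY", "VERIFY", "DONE"] as const;
type TaskState = (typeof LINEAR)[number] | "CANCELLED";

class InvalidTransitionError extends Error {
  readonly code = "ERR_INVALID_TRANSITION";
  constructor(from: TaskState, to: TaskState) {
    super(`illegal transition ${from} -> ${to}`);
  }
}

// Assumptions: one forward step at a time; terminal states accept nothing further.
function canTransition(from: TaskState, to: TaskState): boolean {
  if (from === "DONE" || from === "CANCELLED") return false;
  if (to === "CANCELLED") return true;
  return LINEAR[LINEAR.indexOf(from) + 1] === to;
}

// Return the new state, or throw the same error code the docs describe.
function transition(from: TaskState, to: TaskState): TaskState {
  if (!canTransition(from, to)) throw new InvalidTransitionError(from, to);
  return to;
}
```

Under these assumptions, transition("INIT", "GATHER") succeeds while transition("INIT", "DONE") throws with code ERR_INVALID_TRANSITION.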
What makes Colibri different
- Execution is formal. Tasks move through an 8-state FSM enforced by the β state machine (src/domains/tasks/state-machine.ts), not free-form strings on a to-do list. Illegal jumps (e.g. INIT → DONE) are rejected with ERR_INVALID_TRANSITION.
- Decisions are cryptographic. Every thought_record is SHA-256 chained to the previous row in the session; audit_verify_chain walks the chain and fails on any tampering. The final merkle_root is the commitment.
- Sub-agents are contract-bound. Phase 0 dispatches sub-agents via the host Task tool (Claude Code’s built-in Agent/Task dispatch) into isolated .worktrees/claude/<task-slug> worktrees; the MCP agent_spawn family is deferred to Phase 1.5 per ADR-005 §Decision. Writeback ownership depends on the dispatch case: a T3 executor dispatched by PM owns its own writeback (task_update { status: "DONE" }, which routes through the β state machine at src/domains/tasks/state-machine.ts, plus thought_record { reflection }), while a leaf helper that an executor spawns for bounded research/search does NOT call writeback; its parent writes back on its behalf per writeback-protocol.md line 16. DONE is enforced, not a convention: the β pipeline hard-blocks the transition at src/domains/tasks/writeback.ts:97 with ERR_WRITEBACK_REQUIRED when no thought_record exists for the task. No silent failures, no ghost work.
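The writeback gate can be pictured as a guard that runs before a task is allowed to reach DONE. This is a hypothetical sketch: the ERR_WRITEBACK_REQUIRED code is from this page, but the function name, the ThoughtRef shape, and the idea that a thought record carries a task_id are assumptions.

```typescript
class WritebackRequiredError extends Error {
  readonly code = "ERR_WRITEBACK_REQUIRED";
  constructor(taskId: string) {
    super(`task ${taskId} has no thought_record; write back before DONE`);
  }
}

// Assumed link between a thought record and the task it documents.
interface ThoughtRef {
  task_id: string;
  thought_type: string;
}

// Throw unless at least one thought record exists for the task.
function assertWritebackBeforeDone(taskId: string, thoughts: ThoughtRef[]): void {
  const hasRecord = thoughts.some((t) => t.task_id === taskId);
  if (!hasRecord) throw new WritebackRequiredError(taskId);
}
```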
Bottom line: agentic work gets memory, proof, and accountability — not just results.
Three paths into the docs
1. “I want to run the server” (engineers)
Read in order:
- Task Breakdown (Phase 0) — start with P0.1 setup
- Task Prompts — per-task copy-paste prompts
- Extractions Index — algorithm reference (pseudocode)
2. “I want to understand the architecture” (architects)
Read in order:
- World Schema — the organizational spine across all 15 Greek concepts
- α System Core (boot) — entry point (execution axis)
- 2 — Plugin index — how pieces fit together
- 5 — Time: round — round → wave → task orchestration
- ADR-004 Tool Surface — why 14 tools (R74.5 originally planned 19; Wave H amendment reconciled the count to shipped reality)
- ADR-005 Multi-Model Router Phase — why δ Phase 0 ships as library stubs and full routing lands Phase 1.5
3. “I want to see examples” (operators/users)
Read:
- Agent Bootstrap — master bootstrap prompt for cold Claude sessions
- Writeback Protocol — the ordering rule, worked examples
- Glossary — every term explained
Next steps
- Read colibri-system.md: the canonical vision (single source of truth).
- Read CLAUDE.md: the four-tier agent hierarchy, the worktree rule, the writeback protocol.
- Pick one of the three paths above and dig into the details.
- To contribute, open implementation/task-breakdown.md and pick a task. Create a feature worktree (git worktree add .worktrees/claude/<task-slug> -b feature/<task-slug> origin/main) and never edit the main checkout.
FAQs
Q: Can I run this today?
A: Yes — Phase 0 is 100% on non-deferred tasks (28/28). The MCP server boots over stdio at a22dd23e and registers 14 tools. Configure a stdio client (e.g. Claude Desktop, or the .vscode/mcp-settings.example.json) to launch node dist/server.js.
Q: When will Phase 0 be done?
A: It is done. As of R75 Wave I, Phase 0 is 28/28 — P0.6.3 (ε capability index) closed the ε axis in Wave H, and P0.5.1 + P0.5.2 shipped as δ library-only stubs in Wave I per ADR-005 §Decision. Next round opens the Phase 0 seal + Phase 1 planning scope. See implementation/task-breakdown.md.
Q: Do I need TypeScript?
A: Yes. Stack is TypeScript 5.3+ (ESM, NodeNext), @modelcontextprotocol/sdk, better-sqlite3, Zod v3.23, merkletreejs, gray-matter, Jest (ESM). Chevrotain is spec-only for κ (Phase 1+).
Q: What’s the database?
A: SQLite via better-sqlite3. Path: data/colibri.db — created at runtime in WAL mode, single-writer (P0.2.2 shipped). Schema is declared in migrations under src/db/migrations/001_init.sql through 006_eta.sql; src/db/schema.sql is a reference asset (not executed). The legacy data/ams.db is heritage — kept only as the donor task store and writeback target through the Phase 0 bootstrap.
Q: Environment variables?
A: Only the COLIBRI_* namespace is read by Phase 0 code (the AMS_* prefix is rejected). COLIBRI_MODE selects one of the four runtime modes (FULL, READONLY, TEST, MINIMAL). ANTHROPIC_API_KEY (vendor-canonical name) is optional and validated at call-time by the ν Claude API wrappers, not at startup — the server boots cleanly when the key is unset.
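The environment rules above could be applied at startup roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, "rejected" is interpreted as throwing on any AMS_* key, and the FULL default when COLIBRI_MODE is unset is a guess rather than documented behavior.

```typescript
const MODES = ["FULL", "READONLY", "TEST", "MINIMAL"] as const;
type RuntimeMode = (typeof MODES)[number];

interface ColibriEnv {
  mode: RuntimeMode;
  hasApiKey: boolean;
}

function readColibriEnv(env: Record<string, string | undefined>): ColibriEnv {
  // Assumption: any legacy AMS_* variable is a hard error.
  const legacy = Object.keys(env).filter((k) => k.startsWith("AMS_"));
  if (legacy.length > 0) {
    throw new Error(`legacy variables rejected: ${legacy.join(", ")}`);
  }

  // Assumption: FULL is the default when COLIBRI_MODE is unset.
  const raw = env["COLIBRI_MODE"] ?? "FULL";
  if (!(MODES as readonly string[]).includes(raw)) {
    throw new Error(`unknown COLIBRI_MODE: ${raw}`);
  }

  // The key is optional at startup; it is only needed when a Claude call is made.
  return { mode: raw as RuntimeMode, hasApiKey: Boolean(env["ANTHROPIC_API_KEY"]) };
}
```

Passing process.env to a function of this shape keeps the parsing testable, since tests can hand in a plain object instead of mutating the real environment.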
Q: Can I extend it?
A: Yes, with caveats. Custom skills are prose playbooks you drop into .agents/skills/ — see ε Skill Registry. Custom MCP tools and domain extensions must wait for Phase 1 — Phase 0 locks the shipped 14-tool surface.
For a deeper tour, start with colibri-system.md, then walk the Greek concepts via world-schema.md.