Execution: How Work Flows

Execution is the core loop: tasks arrive, are processed, leave traces, and are sealed into proofs. This section describes how work enters the system, transforms through a pipeline, and exits with an auditable trail.

The Core Loop

A task’s journey has four stages:

  1. Arrival (β INIT) — Task enters over an MCP tool call (task_new, task_transition, etc.). The α middleware chain validates it.

  2. Processing (β GATHER → ANALYZE → PLAN → APPLY → VERIFY) — The task moves through the FSM. At each state, a human executor or an AI agent does work, updates the task record, and calls task_transition to advance to the next state. Each transition is itself a tool call, so each transition is audited and hashed.

  3. Recording (ζ Decision Trail) — As work happens, the executor records decisions via thought_record. Each record is hash-chained to the previous one, forming an immutable chain. The hash of the final record is included in the task’s completion record.

  4. Sealing (η Proof Store) — When the task reaches state DONE (or CANCELLED), merkle_finalize is called to seal the entire task trace into the Merkle tree. The final Merkle root is published and can later be proven to include the task.

The entire flow is provable: given the final Merkle root and the stored records, you can show that this task existed, when it happened, who worked on it, what decisions were made, and in what order.

β — Task Pipeline: The 8-State FSM

The task pipeline is a finite state machine with 8 states: seven on the forward path, plus CANCELLED. Every task passes through the forward path in order, or is cancelled.

INIT → GATHER → ANALYZE → PLAN → APPLY → VERIFY → DONE
                                                    ↓
                                              (or CANCELLED)

| State | Who | What | Outcome |
|---|---|---|---|
| INIT | System | Task created, assigned ID, stored in the database. | Ready for human assignment. |
| GATHER | Executor | Collect context: requirements, acceptance criteria, references, dependencies. | A comprehensive brief ready for analysis. |
| ANALYZE | Executor | Break down the problem. Identify risks, unknowns, blockers. Write the analysis audit. | A clear statement of what needs to happen. |
| PLAN | Executor | Design the solution. Write the execution plan (packet). Define the steps. Get sign-off before implementation. | A detailed, approved implementation plan. |
| APPLY | Executor | Execute the plan. Build the thing, write the code, create the artifacts. Commit work to git. | Working implementation in the feature branch. |
| VERIFY | Executor | Test, review, document. Write tests or run manual checks. Confirm that acceptance criteria are met. | Verified, tested, ready to merge. |
| DONE | System | Task completed. Executor submits a final reflection via thought_record. merkle_finalize is called. | Task sealed into the Merkle tree. Proof generated. |
| CANCELLED | Any | Task was abandoned, superseded, or marked as out-of-scope. A reason must be recorded. | Task is closed without completion. |

Five artifacts are produced along the way, each tied to a point in the FSM:

  • AUDIT (after ANALYZE): docs/audits/<name>-audit.md
  • CONTRACT (during PLAN, before the packet): docs/contracts/<name>-contract.md
  • PACKET (after PLAN, before APPLY): docs/packets/<name>-packet.md
  • IMPLEMENTATION (after APPLY): source files, committed to git
  • VERIFICATION (after VERIFY): docs/verification/<name>-verification.md

The executor moves between states by calling task_transition { to: "NEXT_STATE" }. The system rejects invalid transitions (e.g., you cannot jump from INIT to VERIFY; you must go through GATHER, ANALYZE, PLAN, APPLY first).
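The transition rule above can be sketched as a small guard function. This is a minimal illustration, not the code in src/domains/tasks/: the names TaskState, canTransition, and transition are hypothetical, and the sketch assumes that DONE and CANCELLED are terminal and that any non-terminal state may move to CANCELLED.

```typescript
// Hypothetical sketch of the β transition rule; names are illustrative.
type TaskState =
  | "INIT" | "GATHER" | "ANALYZE" | "PLAN"
  | "APPLY" | "VERIFY" | "DONE" | "CANCELLED";

// The forward path through the FSM, in order.
const FORWARD_PATH: readonly TaskState[] = [
  "INIT", "GATHER", "ANALYZE", "PLAN", "APPLY", "VERIFY", "DONE",
];

// Assumption: DONE and CANCELLED accept no further transitions.
function isTerminal(state: TaskState): boolean {
  return state === "DONE" || state === "CANCELLED";
}

function canTransition(from: TaskState, to: TaskState): boolean {
  if (isTerminal(from)) return false;
  // Assumption: cancellation is allowed from any non-terminal state.
  if (to === "CANCELLED") return true;
  // Otherwise only the immediate successor on the forward path is valid.
  const index = FORWARD_PATH.indexOf(from);
  return FORWARD_PATH[index + 1] === to;
}

function transition(from: TaskState, to: TaskState): TaskState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}
```

With this guard, transition("INIT", "GATHER") succeeds, while transition("INIT", "VERIFY") throws because VERIFY is not the immediate successor of INIT.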

Phase 0 Task Pipeline

β is specified and implemented in Phase 0 (src/domains/tasks/). The core state machine, the tool suite (task_new, task_transition, task_update, task_depends_on), and the writeback contract are all in Phase 0.

The only thing not in Phase 0 is task pooling and load balancing — a scalability feature deferred to Phase 2+.

ε — Skill Registry: What the System Can Do

Phase: Partial in Phase 0

What it does: Defines the vocabulary of capabilities. Skills are reusable playbooks — documented procedures that an executor can follow (or that an AI model can invoke).

There are 23 canonical skills documented in .agents/skills/:

  • colibri-alpha-executor — Run the executor 5-step chain
  • colibri-beta-triage — Analyze a task, write audit
  • colibri-beta-packet — Write an execution plan
  • colibri-gamma-deploy — Deploy a change
  • colibri-zeta-reflection — Write a thought record
  • … and 18 more

In Phase 0, the skill_list tool exposes the full registry. An executor or AI agent can call skill_list to see what’s available. A human executor then follows the skill playbook as a guide, while an AI agent reads the skill spec and self-executes it.

What Phase 0 does:

  • skill_list tool — query the registry

What is deferred to Phase 1 or later:

  • skill_get — retrieve a single skill’s full definition
  • skill_reload — hot-reload a skill without restarting the server
  • agent_spawn — programmatically invoke a skill via a sub-agent (deferred to Phase 1.5 per ADR-005)

The skill registry is the bridge between what the system can do and who is doing it. Every executor and every AI model consults the skill registry to understand the available options.

ζ — Decision Trail: Auditable Reasoning

Phase: 0

What it does: Forms a tamper-evident record of every decision, thought, and reasoning step. Decisions are hash-chained: each new record includes the hash of the previous one, so any modification to an earlier record breaks the chain and is detected as soon as the chain is re-verified.

An executor records a decision via thought_record { thought_type, content }. The server:

  1. Hashes the content using SHA-256.
  2. Looks up the previous thought record in the decision trail.
  3. Chains them: current_hash = SHA256(previous_hash + current_content)
  4. Stores the record with timestamp, executor ID, and thought type.
  5. Returns the chain hash to the caller.
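The five steps above can be sketched with Node's built-in crypto module. This is a minimal in-memory illustration, not the code in src/domains/trail/: the ThoughtRecord shape, the function names, and the all-zero genesis value used for the first record are assumptions. The chain hash follows the formula in step 3, and the content hash from step 1 is stored alongside it.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; field names are illustrative.
interface ThoughtRecord {
  thoughtType: string;
  content: string;
  executorId: string;
  timestamp: string;
  contentHash: string;  // step 1: SHA-256 of the content alone
  previousHash: string; // step 2: chain hash of the prior record
  chainHash: string;    // step 3: SHA256(previous_hash + current_content)
}

// Assumption: the first record chains from a fixed all-zero value.
const GENESIS_HASH = "0".repeat(64);

function sha256Hex(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Appends a record to the trail and returns its chain hash (step 5).
function appendThought(
  trail: ThoughtRecord[],
  thoughtType: string,
  content: string,
  executorId: string,
): string {
  const last = trail[trail.length - 1];
  const previousHash = last ? last.chainHash : GENESIS_HASH;
  const chainHash = sha256Hex(previousHash + content);
  trail.push({
    thoughtType,
    content,
    executorId,
    timestamp: new Date().toISOString(),
    contentHash: sha256Hex(content),
    previousHash,
    chainHash,
  });
  return chainHash;
}

// Recomputes every link; returns false if any record was altered.
function verifyTrail(trail: readonly ThoughtRecord[]): boolean {
  let expectedPrevious = GENESIS_HASH;
  for (const record of trail) {
    if (record.previousHash !== expectedPrevious) return false;
    if (record.chainHash !== sha256Hex(expectedPrevious + record.content)) return false;
    expectedPrevious = record.chainHash;
  }
  return true;
}
```

Editing the content of any earlier record changes the hash that verifyTrail recomputes for it, so the check fails at that record even though later records were left untouched.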

At the end of a session, the final thought hash is included in the merkle_finalize call, and the entire decision trail is sealed into the Merkle tree.

Thought types:

  • reflection — An executor’s end-of-task summary (required before task completion).
  • decision — A reasoning step or choice made during work.
  • note — A memo or observation (non-decision).
  • blocker — A risk or obstacle encountered.
  • correction — An error found and fixed.
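The thought types listed above map naturally onto a closed union type. The sketch below is illustrative only: the names THOUGHT_TYPES, isThoughtType, and hasRequiredReflection are hypothetical, and the check assumes that "required before task completion" means at least one reflection has been recorded for the task.

```typescript
// The five thought types, as a closed set.
const THOUGHT_TYPES = ["reflection", "decision", "note", "blocker", "correction"] as const;
type ThoughtType = (typeof THOUGHT_TYPES)[number];

// Runtime guard for untrusted input such as tool-call arguments.
function isThoughtType(value: string): value is ThoughtType {
  return THOUGHT_TYPES.some((known) => known === value);
}

// Assumption: a task may complete only once a reflection has been recorded.
function hasRequiredReflection(recorded: readonly ThoughtType[]): boolean {
  return recorded.some((type) => type === "reflection");
}
```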

ζ is fully specified and implemented in Phase 0 (src/domains/trail/; the directory is named trail, not thought).

How the 5-Step Executor Chain Maps onto the β FSM

When a human executor works on a task, they follow a 5-step chain documented in 5-time/round.md:

  1. Audit → Write an audit document, understand the task. (β ANALYZE state)
  2. Contract → Write a behavioral contract, define success. (β PLAN state, pre-approval)
  3. Packet → Write an execution plan, break into steps. (β PLAN state, post-approval)
  4. Implement → Execute the plan, build the thing. (β APPLY state)
  5. Verify → Test, confirm criteria, document. (β VERIFY state)

Each step is a commit. By step 5, the β FSM has advanced through GATHER → ANALYZE → PLAN → APPLY → VERIFY, and the task is ready for completion.
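The mapping between the two vocabularies can be written down as a lookup table. This is only a restatement of the list above in code, with hypothetical names (ChainStep, WorkState, STEP_TO_STATE, stepsForState); it makes two details visible: PLAN hosts two steps, and GATHER has no chain step of its own.

```typescript
type ChainStep = "Audit" | "Contract" | "Packet" | "Implement" | "Verify";
type WorkState = "GATHER" | "ANALYZE" | "PLAN" | "APPLY" | "VERIFY";

// Each executor step and the β state it is performed in.
const STEP_TO_STATE: Record<ChainStep, WorkState> = {
  Audit: "ANALYZE",
  Contract: "PLAN",   // pre-approval
  Packet: "PLAN",     // post-approval
  Implement: "APPLY",
  Verify: "VERIFY",
};

// Reverse lookup: which steps happen while the task is in a given state.
function stepsForState(state: WorkState): ChainStep[] {
  return (Object.keys(STEP_TO_STATE) as ChainStep[]).filter(
    (step) => STEP_TO_STATE[step] === state,
  );
}
```

Here stepsForState("PLAN") returns both Contract and Packet, and stepsForState("GATHER") returns an empty array.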

Phase 0 enforcement: The tooling guides executors through this chain via the 23 canonical skills. A skill like colibri-alpha-executor tells the executor “here’s the 5-step process.” The executor follows it manually (in Phase 0, humans are T3 executors).

In Phase 1.5 (when δ Model Router and agent_spawn arrive), AI models will be routed to tasks and will self-execute the skills. But the 5-step structure remains the same.

η — Proof Store: Sealing the Trail

Phase: 0

What it does: When a task reaches DONE or CANCELLED, the executor or system calls merkle_finalize to add the task record to the Merkle tree. The tree grows as tasks complete. After a session (or a time window), merkle_root returns the final root hash, which cryptographically summarizes everything that happened.
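To make the idea of a root that summarizes every sealed task concrete, here is a minimal Merkle root computation. It is a sketch under stated assumptions, not the η implementation: task records are treated as already-serialized strings, leaves and interior nodes are hashed with different prefixes ("leaf:" and "node:"), and an unpaired node at the end of a level is promoted unchanged. The document does not specify any of these choices.

```typescript
import { createHash } from "node:crypto";

function sha256Hex(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Assumption: distinct prefixes keep leaf hashes and node hashes from colliding.
function leafHash(record: string): string {
  return sha256Hex("leaf:" + record);
}

function nodeHash(left: string, right: string): string {
  return sha256Hex("node:" + left + right);
}

// Folds the leaf hashes pairwise until a single root remains.
function merkleRoot(records: readonly string[]): string {
  if (records.length === 0) {
    throw new Error("Cannot compute a root for an empty tree");
  }
  let level = records.map(leafHash);
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = level[i + 1];
      // Assumption: an unpaired last node is promoted to the next level as-is.
      next.push(right === undefined ? left : nodeHash(left, right));
    }
    level = next;
  }
  return level[0];
}
```

Changing any one record changes its leaf hash and therefore every hash on the path up to the root, which is why a single root value can stand in for the whole set of sealed tasks.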

Later phases (Phase 2+) add:

  • Consistency proofs: Prove that the tree only grew, never shrank.
  • Inclusion proofs: Prove that a specific task is in the tree without revealing the entire tree.
  • Compression: Reduce the proof size from O(log n) to O(1) using zk-SNARKs (deferred to Phase 2).
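For readers unfamiliar with inclusion proofs, the sketch below shows the general technique: the prover hands over one sibling hash per tree level, and the verifier rehashes upward and compares the result with the root. It illustrates the concept only and says nothing about how the later phases will implement it. The hashing conventions ("leaf:" and "node:" prefixes, promotion of an unpaired node) and all names are assumptions.

```typescript
import { createHash } from "node:crypto";

function sha256Hex(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}
const leafHash = (record: string): string => sha256Hex("leaf:" + record);
const nodeHash = (left: string, right: string): string => sha256Hex("node:" + left + right);

interface ProofStep {
  sibling: string;        // hash of the neighbouring node at this level
  siblingOnLeft: boolean; // whether the sibling is the left operand
}

// Builds every level of the tree, leaves first, root last.
function buildLevels(records: readonly string[]): string[][] {
  if (records.length === 0) throw new Error("Empty tree");
  const levels: string[][] = [records.map(leafHash)];
  while (levels[levels.length - 1].length > 1) {
    const level = levels[levels.length - 1];
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      // Assumption: an unpaired last node is promoted unchanged.
      next.push(i + 1 < level.length ? nodeHash(level[i], level[i + 1]) : level[i]);
    }
    levels.push(next);
  }
  return levels;
}

// Collects the sibling hashes on the path from leaf `index` to the root.
function inclusionProof(levels: readonly string[][], index: number): ProofStep[] {
  if (index < 0 || index >= levels[0].length) throw new RangeError("Leaf index out of range");
  const proof: ProofStep[] = [];
  let position = index;
  for (let depth = 0; depth < levels.length - 1; depth++) {
    const level = levels[depth];
    const siblingIndex = position % 2 === 0 ? position + 1 : position - 1;
    // A promoted node has no sibling at this level, so it adds no step.
    if (siblingIndex < level.length) {
      proof.push({ sibling: level[siblingIndex], siblingOnLeft: siblingIndex < position });
    }
    position = Math.floor(position / 2);
  }
  return proof;
}

// Rehashes from the leaf upward; needs only the record, the proof, and the root.
function verifyInclusion(record: string, proof: readonly ProofStep[], root: string): boolean {
  let hash = leafHash(record);
  for (const step of proof) {
    hash = step.siblingOnLeft ? nodeHash(step.sibling, hash) : nodeHash(hash, step.sibling);
  }
  return hash === root;
}
```

The proof holds at most one hash per level, which is the O(log n) size mentioned in the list above; the verifier never sees the other task records.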

In Phase 0, the proof store is a simple Merkle tree stored in SQLite. It is sufficient to prove what happened, but not yet compressed or parallelized.

Integrations: Connecting to the Outside World

ν — Integrations (Phase 0.9+)

Execution happens inside Colibri, but the artifacts live in the outside world:

  • Repo: Source code is committed to git. Phase 0 has no auto-commit, but skills document the git workflow.
  • Obsidian: Audit, contract, packet, and verification documents are written as .md files and stored in the Obsidian vault mirror. Phase 0 does not auto-sync; the executor or Sigma orchestrator manages sync.
  • GitHub: PRs, issues, and releases are created and tracked. Phase 0 does not auto-create; skills document the manual workflow.
  • Stdio: All tool calls and results flow over stdio (the MCP transport). This is the canonical interface.

ν is partially in Phase 0 (the stdio transport); repo, Obsidian, and GitHub integration is deferred to Phase 0.9.
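As an illustration of what a tool call looks like on the stdio transport, the sketch below builds a JSON-RPC 2.0 tools/call request and frames it as a single line, which is how MCP delimits messages over stdio. The helper names (buildToolCall, frame, parseFrames) are hypothetical, and the argument object simply mirrors the task_transition { to: "GATHER" } call used elsewhere in this section; Colibri's actual argument schema may include more fields.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// One message per line. JSON.stringify without an indent argument never
// emits a raw newline, so the trailing "\n" is the only delimiter.
function frame(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// Splits a buffered chunk back into messages (assumes complete lines).
function parseFrames(chunk: string): ToolCallRequest[] {
  return chunk
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as ToolCallRequest);
}
```

A client would write frame(buildToolCall(1, "task_transition", { to: "GATHER" })) to the server's stdin and read the matching response, identified by the same id, from its stdout.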

From Task to Proof: An Example

Scenario: Executor picks up task P0.2.1 (“Write src/server.ts”). Here’s the execution flow:

1. Executor calls task_transition { to: "GATHER" }
   → β state = GATHER
   → System records the transition in ζ decision trail
   
2. Executor gathers requirements and context
   → Executor calls thought_record { thought_type: "decision", content: "Task requires TypeScript, ESM..." }
   → ζ chains the record: hash_2 = SHA256(hash_1 + content)
   
3. Executor calls task_transition { to: "ANALYZE" }
   → β state = ANALYZE
   → Executor writes the audit doc and saves it to docs/audits/server-audit.md
   
4. Executor calls task_transition { to: "PLAN" }
   → β state = PLAN
   → Executor writes contract doc, then packet doc
   → More thought_record calls, each one chained
   
5. Executor calls task_transition { to: "APPLY" }
   → β state = APPLY
   → Executor writes src/server.ts, commits to git
   → Calls task_update { progress: 100 }
   
6. Executor calls task_transition { to: "VERIFY" }
   → β state = VERIFY
   → Executor writes verification doc, runs tests
   → Calls task_transition { to: "DONE" }
   → β state = DONE
   
7. Executor calls thought_record { thought_type: "reflection", content: "Task complete, server boots..." }
   → ζ chains the final reflection
   
8. System calls merkle_finalize { task_id: "P0.2.1", decision_trail_hash: "0xabc...", root: "..." }
   → η Proof Store adds the task to the Merkle tree
   → Returns the new Merkle root hash
   
9. Executor calls merkle_root
   → Returns final root: "0x12345..."
   → This root commits to everything sealed up to this point
   → Given the root and the stored records, any later verifier can recompute the tree and confirm the task's path

This is what “provable work” means: every transition is recorded, every decision is chained, and the final result is sealed into a tamper-evident tree.

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.