Colibri — Deep Task Breakdown for Agent Execution

Purpose: Fine-grained task definitions for AI agent execution (all axes). Source: docs/colibri-system.md, concept docs (α–π), donor extractions (heritage-only). Stack: TypeScript 5.3+ · @modelcontextprotocol/sdk · Zod 4 · better-sqlite3 · Jest (ESM).

Quick Start for agents: This file describes what to do. For how to do it (copy-paste ready prompts), see task-prompts/ — each Phase 0 group (P0.1 through P0.9) has a corresponding prompt file. For the dependency DAG and critical path, see task-dependency-graph.md. For the round-by-round schedule (R74 → R80), see ../../5-time/roadmap.md.


Phase 0 shape

Phase 0 comprises 28 sub-tasks numbered P0.1.1 – P0.9.3. The numbering is locked; no slice may introduce new sub-tasks into this range. The groups are:

Group   Sub-tasks          Concept
P0.1    P0.1.1 – P0.1.4    Project infrastructure
P0.2    P0.2.1 – P0.2.4    α System Core
P0.3    P0.3.1 – P0.3.4    β Task Pipeline
P0.4    P0.4.1 – P0.4.2    γ Server Lifecycle
P0.5    P0.5.1 – P0.5.2    δ Model Router (Phase 0 stubs; full routing → Phase 1.5)
P0.6    P0.6.1 – P0.6.3    ε Skill Registry
P0.7    P0.7.1 – P0.7.3    ζ Decision Trail
P0.8    P0.8.1 – P0.8.3    η Proof Store
P0.9    P0.9.1 – P0.9.3    ν Integrations

Each sub-task has an acceptance criteria list, an effort estimate, and explicit dependencies. Every sub-task lists target files — none of them exist yet. colibri_code: none holds for all 15 Greek concepts.

Heritage note: Donor algorithm pseudocode for every concept is in docs/reference/extractions/. Those files are references, not copy sources. Colibri is a full rewrite; code is earned by the 5-step executor chain (audit → contract → packet → implement → verify), not transcribed.


Phase 0: Colibri Bootstrap (Execution + Intelligence Axis)

These tasks implement the Execution and Intelligence axes from scratch (TypeScript rewrite):

P0.1 — Project Infrastructure

P0.1.1 — Package Setup

  • Depends on: nothing
  • Output: package.json, tsconfig.json, .eslintrc.json, .prettierrc, .env.example
  • Acceptance criteria:
    • package.json: "type": "module", "engines": {"node": ">=20"}, ESM-first
    • TypeScript 5.3+: strict: true, target: ES2022, module: NodeNext
    • tsx for dev; tsc for production build
    • .env.example documents the Phase 0 COLIBRI_* floor (COLIBRI_DB_PATH, COLIBRI_LOG_LEVEL, COLIBRI_STARTUP_TIMEOUT_MS). Additional variables are earned by later concepts; no AMS_* variables.
    • .gitignore excludes node_modules/, dist/, .env, data/colibri.db, data/ams.db
  • Effort: S

P0.1.2 — Test Runner + Linter

  • Depends on: P0.1.1
  • Input: nothing
  • Output: jest.config.ts, eslint.config.ts, src/__tests__/smoke.test.ts
  • Acceptance criteria:
    • npm test runs Jest with ESM transform
    • npm run lint runs ESLint with zero errors on empty codebase
    • npm run build compiles TypeScript to dist/ with no errors
    • Smoke test: smoke.test.ts asserts 1 + 1 === 2 (verifies test harness works)
    • Code coverage report generated (--coverage flag)
  • Effort: S

P0.1.3 — CI Pipeline

  • Depends on: P0.1.2
  • Input: .github/workflows/ci.yml (existing; may need update)
  • Output: .github/workflows/ci.yml updated for TypeScript
  • Acceptance criteria:
    • Runs on push to any branch and on PR to main
    • Steps: npm ci → npm run lint → npm test → npm run build
    • Node.js 20+ matrix
    • Fails if any step fails
    • Uploads coverage report artifact
  • Effort: S

P0.1.4 — Environment Validation

  • Depends on: P0.1.1
  • Output: src/config.ts, tests/config.test.ts
  • Acceptance criteria:
    • Zod schema validates the Phase 0 COLIBRI_* floor on startup
    • Missing required var → throws with human-readable message listing the missing key
    • Optional vars have typed defaults (COLIBRI_LOG_LEVEL=info, COLIBRI_STARTUP_TIMEOUT_MS=30000)
    • COLIBRI_LOG_LEVEL accepted values: silent | error | warn | info | debug
    • NODE_ENV accepted values: development | test | production
    • Export config object (typed, not raw process.env)
    • Reading any AMS_* variable is a lint/test failure — the donor namespace is not supported
  • Effort: S

P0.2 — α System Core

P0.2.1 — MCP Server Bootstrap

  • Depends on: P0.1.2
  • Output: src/server.ts, tests/server.test.ts
  • Acceptance criteria:
    • McpServer created with name: "colibri", version from package.json
    • StdioServerTransport is the only transport in Phase 0 per S17. No HTTP, no WebSocket.
    • Server exports registerTool(name, schema, handler) helper that composes the five-stage α middleware chain (tool-lock → schema validate → audit enter → dispatch → audit exit)
    • At least 1 registered tool: server/ping → returns { status: "ok", version }
    • npm test passes with MCP handshake integration test
  • Effort: M

P0.2.2 — SQLite Initialization

  • Depends on: P0.1.4
  • Input: docs/architecture/data-model.md §2 (earning rule), docs/reference/extractions/alpha-system-core-extraction.md (donor pseudocode, heritage-only)
  • Output: src/db/index.ts, src/db/schema.sql, tests/db/init.test.ts
  • Acceptance criteria:
    • Uses better-sqlite3 (sync API)
    • schema.sql ships only an empty header + the first migration slot. Tables are added by their owning concept’s P0 sub-task per docs/architecture/data-model.md §2 (β: tasks, task_transitions; ε: skills; ζ: thoughts, actions; η: merkle_nodes, merkle_roots; ν: sync_log). No “78 tables” target.
    • initDb(path) function: creates DB if not exists, applies all numbered migrations in order, returns Database instance
    • Idempotent: calling initDb() twice does not fail or duplicate data
    • WAL mode enabled: PRAGMA journal_mode=WAL
    • Foreign keys enabled: PRAGMA foreign_keys=ON
    • PRAGMA integrity_check runs at startup and fails boot on any error
    • Test: fresh DB passes integrity check
  • Effort: L

P0.2.3 — Two-Phase Startup

  • Depends on: P0.2.1, P0.2.2
  • Input: alpha-system-core-extraction.md (startup section), gamma-server-lifecycle-extraction.md
  • Output: src/startup.ts, tests/startup.test.ts
  • Acceptance criteria:
    • Phase 1 (transport): MCP transport ready, health check responds, DB not yet loaded
    • Phase 2 (heavy init): DB initialized, all tools registered, all domains loaded
    • startup() returns only after Phase 2 completes
    • If Phase 2 fails, server shuts down gracefully (no hanging process)
    • Startup time logged: console.error("Startup complete in {ms}ms")
    • Test: mock Phase 2 failure → verify clean shutdown
  • Effort: M

P0.2.4 — Health Check Tool

  • Depends on: P0.2.3
  • Input: nothing
  • Output: src/tools/health.ts, tests/tools/health.test.ts
  • Acceptance criteria:
    • Tool name: server/health
    • Returns: { status, version, uptime_ms, db_tables, phase, mode }
    • db_tables: count of SQLite tables (verifies schema loaded correctly)
    • phase: "phase1" | "phase2"
    • mode: current runtime mode string
    • Response time < 100ms
  • Effort: S

P0.3 — β Task Pipeline

P0.3.1 — β Task Pipeline State Machine

  • Depends on: P0.2.2
  • Input: docs/colibri-system.md §6.3 (canonical FSM), docs/concepts/β-task-pipeline.md
  • Output: src/domains/tasks/state-machine.ts, tests/domains/tasks/state-machine.test.ts
  • Acceptance criteria:
    • 7 pipeline states defined exactly as in colibri-system.md §6.3: INIT → GATHER → ANALYZE → PLAN → APPLY → VERIFY → DONE, plus CANCELLED as a terminal side-branch reachable from any non-terminal state.
    • Transition map matches the canonical diagram. Unlisted transitions throw InvalidTransitionError with {from, to, taskId}.
    • transition(task, newState) → returns updated task or throws
    • canTransition(from, to) → boolean (no side effects)
    • DONE and CANCELLED are terminal; any transition out of them throws
    • 100% branch coverage (all valid transitions, all invalid transitions, both terminal exits)
  • Effort: S
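
The criteria above can be sketched as a small transition table. This is only an illustration: the canonical diagram in colibri-system.md §6.3 is not reproduced in this file, so the TRANSITIONS map below assumes a strictly linear pipeline with no backward edges, and the Task shape is hypothetical.

```typescript
// Sketch only. ASSUMPTION: forward-only linear flow; the real edges must be
// taken from colibri-system.md §6.3.
type TaskState =
  | "INIT" | "GATHER" | "ANALYZE" | "PLAN" | "APPLY" | "VERIFY" | "DONE" | "CANCELLED";

// Hypothetical minimal task shape, for illustration only.
interface Task {
  id: string;
  state: TaskState;
}

const TRANSITIONS: Record<TaskState, readonly TaskState[]> = {
  INIT: ["GATHER", "CANCELLED"],
  GATHER: ["ANALYZE", "CANCELLED"],
  ANALYZE: ["PLAN", "CANCELLED"],
  PLAN: ["APPLY", "CANCELLED"],
  APPLY: ["VERIFY", "CANCELLED"],
  VERIFY: ["DONE", "CANCELLED"],
  DONE: [], // terminal
  CANCELLED: [], // terminal
};

class InvalidTransitionError extends Error {
  readonly from: TaskState;
  readonly to: TaskState;
  readonly taskId: string;
  constructor(from: TaskState, to: TaskState, taskId: string) {
    super(`Invalid transition ${from} -> ${to} for task ${taskId}`);
    this.name = "InvalidTransitionError";
    this.from = from;
    this.to = to;
    this.taskId = taskId;
  }
}

// Pure lookup, no side effects.
function canTransition(from: TaskState, to: TaskState): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns a new task object; the input task is not mutated.
function transition(task: Task, newState: TaskState): Task {
  if (!canTransition(task.state, newState)) {
    throw new InvalidTransitionError(task.state, newState, task.id);
  }
  return { ...task, state: newState };
}
```

Keeping the map as data makes the 100% branch-coverage criterion easier to meet, since a test can iterate every (from, to) pair and compare against the table.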

Heritage note: The AMS donor task store used a kanban-style lifecycle (backlog | todo | in_progress | blocked | review | done | cancelled). That vocabulary survives at the PM-facing level (see CLAUDE.md §5 — “Only todo tasks are executable”) while data/ams.db remains the task store during Phase 0 bootstrap. The β execution FSM inside Colibri is the canonical INIT..DONE pipeline above, not the donor lifecycle. Mapping between the two belongs to ν Integrations, not β.

P0.3.2 — Task CRUD

  • Depends on: P0.3.1
  • Input: beta-task-pipeline-extraction.md (CRUD section)
  • Output: src/domains/tasks/repository.ts, tests/domains/tasks/repository.test.ts
  • Acceptance criteria:
    • createTask(input): inserts into tasks table, returns task with generated id (UUID v4)
    • getTask(id): returns task or null
    • updateTask(id, patch): partial update, returns updated task
    • deleteTask(id): soft delete (sets deleted_at)
    • listTasks({ status?, project_id?, limit?, offset? }): filtered + paginated
    • All operations use better-sqlite3 prepared statements (no string interpolation)
    • Test: CRUD roundtrip with all fields
  • Effort: M

P0.3.3 — Writeback Contract Enforcement

  • Depends on: P0.3.2
  • Input: beta-task-pipeline-extraction.md (writeback section)
  • Output: src/domains/tasks/writeback.ts, tests/domains/tasks/writeback.test.ts
  • Acceptance criteria:
    • writebackRequired(taskId): returns true if task is done but lacks thought_record
    • enforceWriteback(taskId): throws WritebackRequiredError if writeback not complete
    • Runtime blocking: any tool that moves task to done MUST call enforceWriteback before returning
    • WritebackRequiredError includes taskId, missing_fields[]
    • Test: marking task done without thought_record → error; with thought_record → success
  • Effort: S
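
A minimal sketch of the writeback check follows. The TaskRow shape and the injected TaskLookup getter are assumptions made to keep the example self-contained; the criteria only fix the function names, the error name, and its taskId and missing_fields[] payload.

```typescript
// Hypothetical row shape; the real task lives in the tasks table.
interface TaskRow {
  id: string;
  status: string;
  thought_record: string | null;
}

// ASSUMPTION: tasks are fetched through an injected getter instead of a DB handle.
type TaskLookup = (id: string) => TaskRow | null;

class WritebackRequiredError extends Error {
  readonly taskId: string;
  readonly missing_fields: string[];
  constructor(taskId: string, missing_fields: string[]) {
    super(`Task ${taskId} is missing writeback fields: ${missing_fields.join(", ")}`);
    this.name = "WritebackRequiredError";
    this.taskId = taskId;
    this.missing_fields = missing_fields;
  }
}

// True when the task is done but has no thought_record attached.
function writebackRequired(getTask: TaskLookup, taskId: string): boolean {
  const task = getTask(taskId);
  return task !== null && task.status === "done" && !task.thought_record;
}

// Call this from any tool that moves a task to done, before returning.
function enforceWriteback(getTask: TaskLookup, taskId: string): void {
  if (writebackRequired(getTask, taskId)) {
    throw new WritebackRequiredError(taskId, ["thought_record"]);
  }
}
```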

P0.3.4 — Task Tools (MCP surface)

  • Depends on: P0.3.3, P0.2.1
  • Input: beta-task-pipeline-extraction.md (tools section)
  • Output: src/tools/tasks.ts, tests/tools/tasks.test.ts
  • Acceptance criteria:
    • task_create tool: Zod input schema, calls createTask, returns task
    • task_get tool: returns task or { error: "not_found" }
    • task_update tool: partial update; validates new state via FSM
    • task_list tool: supports status, limit, offset filters
    • task_next_actions tool: returns list of unblocked todo tasks sorted by priority
    • task_update with status: "done" triggers writeback enforcement
    • All tools have Zod input validation (invalid input → structured error, not crash)
  • Effort: M

P0.4 — γ Server Lifecycle

P0.4.1 — Runtime Mode Enum

  • Depends on: P0.1.4
  • Input: docs/concepts/γ-server-lifecycle.md, docs/reference/extractions/gamma-server-lifecycle-extraction.md (heritage-only)
  • Output: src/modes.ts, tests/modes.test.ts
  • Acceptance criteria:
    • 4 modes in Phase 0: FULL | READONLY | TEST | MINIMAL. (The donor WATCH mode is excluded — file watching is not in Phase 0 α scope.)
    • detectMode(): reads COLIBRI_MODE env var, defaults to FULL
    • Each mode has a capability set: { canWrite, canRunTests, heavyInit }
    • READONLY: canWrite=false, all write tools return { ok: false, error: { code: "ERR_READONLY" } }
    • MINIMAL: no heavy init; only unified_vitals is registered
    • Test: all 4 modes have correct capability sets
    • No AMS_MODE fallback — donor namespace is not supported
  • Effort: S
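
One possible shape for the mode table is sketched below. Only READONLY.canWrite=false and "MINIMAL has no heavy init" come from the criteria; every other flag value is a placeholder. Passing the environment in as an argument and throwing on an unknown value are also assumptions (the criteria describe detectMode() with no arguments and do not say what happens on invalid input).

```typescript
const MODES = ["FULL", "READONLY", "TEST", "MINIMAL"] as const;
type RuntimeMode = (typeof MODES)[number];

interface Capabilities {
  canWrite: boolean;
  canRunTests: boolean;
  heavyInit: boolean;
}

// ASSUMPTION: flags other than READONLY.canWrite and MINIMAL.heavyInit are placeholders.
const CAPABILITIES: Record<RuntimeMode, Capabilities> = {
  FULL: { canWrite: true, canRunTests: true, heavyInit: true },
  READONLY: { canWrite: false, canRunTests: false, heavyInit: true },
  TEST: { canWrite: true, canRunTests: true, heavyInit: false },
  MINIMAL: { canWrite: false, canRunTests: false, heavyInit: false },
};

function isRuntimeMode(value: string): value is RuntimeMode {
  return (MODES as readonly string[]).includes(value);
}

// Reads COLIBRI_MODE only; there is deliberately no AMS_MODE fallback.
function detectMode(env: Record<string, string | undefined>): RuntimeMode {
  const raw = env["COLIBRI_MODE"];
  if (raw === undefined || raw === "") return "FULL";
  if (!isRuntimeMode(raw)) {
    // ASSUMPTION: unknown values are rejected rather than silently mapped to FULL.
    throw new Error(`Unknown COLIBRI_MODE: ${raw}`);
  }
  return raw;
}

type ReadonlyRejection = { ok: false; error: { code: "ERR_READONLY" } };

// Returns the rejection payload a write tool should send back, or null if writes are allowed.
function guardWrite(mode: RuntimeMode): ReadonlyRejection | null {
  return CAPABILITIES[mode].canWrite ? null : { ok: false, error: { code: "ERR_READONLY" } };
}
```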

P0.4.2 — Graceful Shutdown

  • Depends on: P0.2.3
  • Input: gamma-server-lifecycle-extraction.md (shutdown section)
  • Output: src/shutdown.ts, tests/shutdown.test.ts
  • Acceptance criteria:
    • registerShutdownHandler(fn): registers a cleanup function
    • On SIGINT / SIGTERM: calls all handlers in reverse registration order
    • DB connection closed before process exit
    • In-flight MCP requests allowed to complete (max 5s timeout then force-exit)
    • Exit code 0 on clean shutdown, 1 on error during shutdown
    • Test: mock SIGTERM → verify DB close + handler called
  • Effort: S
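
The reverse-order handler contract can be illustrated with a small registry. This sketch assumes synchronous handlers and leaves out the 5s in-flight timeout and the actual signal wiring; runShutdownHandlers is a hypothetical helper name, not part of the criteria.

```typescript
// ASSUMPTION: handlers are synchronous (better-sqlite3's close() is sync);
// async cleanup would need an awaited variant.
type ShutdownHandler = () => void;

const shutdownHandlers: ShutdownHandler[] = [];

function registerShutdownHandler(fn: ShutdownHandler): void {
  shutdownHandlers.push(fn);
}

// Runs every handler in reverse registration order (last registered, first run).
// A throwing handler does not stop the remaining ones; it only changes the exit code.
// Returns the process exit code: 0 on clean shutdown, 1 if any handler threw.
function runShutdownHandlers(): 0 | 1 {
  let exitCode: 0 | 1 = 0;
  for (const handler of [...shutdownHandlers].reverse()) {
    try {
      handler();
    } catch {
      exitCode = 1;
    }
  }
  return exitCode;
}
```

In a real entry point the return value would typically be passed to process.exit from SIGINT and SIGTERM listeners. Registering the DB close handler first means it runs last, after every later-registered component has released its statements.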

P0.5 — δ Model Router

Phase 0 note: Phase 0 library stubs shipped in R75 Wave I per ADR-005 §Decision. The router interface is present (scoring + fallback modules, single-row candidate table); scoring returns a constant (claude: 1.0), fallback has one member (Claude), adapter is Anthropic-only. Full multi-model scoring, N-member fallback, and circuit breaker land in Phase 1.5. No δ-facing MCP tools in the Phase 0 14-tool surface — router_* tools are deferred to Phase 1.5.

P0.5.1 — Intent Scoring Matrix

  • Status: Phase 0 stub shipped (R75 Wave I, PR #149)
  • Input: docs/reference/extractions/delta-model-router-extraction.md (heritage-only — algorithm source)
  • Output: src/domains/router/scoring.ts, src/__tests__/domains/router/scoring.test.ts
  • Phase 0 shipped: src/domains/router/scoring.ts returns a constant vector ({ claude: 1.0 }) for every input. The module signature matches the Phase 1.5 target so Phase 1.5 is a formula replacement, not an interface rewrite.
  • Design contract (Phase 1.5):
    • scoreIntent(prompt, context) → { scores: Record<ModelId, number>, winner: ModelId } (Phase 1.5: real scoring factors)
    • Scoring factors: prompt length, complexity keywords, context size, tool requirements — Phase 1.5
    • All scores in range [0, 100] (integer) — Phase 1.5
    • Deterministic: same input always returns same winner (Phase 0 stub is trivially deterministic — always returns Claude)
    • Pure function (no external API calls) — invariant from Phase 0
  • Effort: M

P0.5.2 — Model Fallback Chain

  • Status: Phase 0 stub shipped (R75 Wave I, PR #150)
  • Input: docs/reference/extractions/delta-model-router-extraction.md (heritage-only — algorithm source)
  • Output: src/domains/router/fallback.ts, src/__tests__/domains/router/fallback.test.ts
  • Phase 0 shipped: src/domains/router/fallback.ts implements a single-member chain — if Claude fails, the call fails; no cascade. Matches ADR-005 §Decision (“fallback chain has one member”).
  • Design contract (Phase 1.5):
    • Model slots configured via COLIBRI_MODEL_* env vars (count TBD by Phase 1.5) — Phase 1.5
    • routeRequest(prompt, context): tries models in priority order — Phase 1.5 multi-member
    • On model error / timeout: tries next model in chain — Phase 1.5
    • On exhaustion: throws AllModelsFailedError with per-model error log — Phase 1.5 (Phase 0 raises ModelUnavailable on single-member exhaustion)
    • Circuit breaker: model marked unavailable for 60s after 3 consecutive failures — Phase 1.5
    • No AMS_MODEL_* fallback — donor namespace is not supported (invariant from Phase 0)
  • Effort: M

P0.6 — ε Skill Registry

P0.6.1 — Skill Schema

  • Depends on: P0.2.2
  • Input: epsilon-skill-registry-extraction.md
  • Output: src/domains/skills/schema.ts, tests/domains/skills/schema.test.ts
  • Acceptance criteria:
    • Zod schema: { name, description, version, entrypoint, capabilities[], greekLetter? }
    • name must be kebab-case: /^[a-z][a-z0-9-]+$/
    • capabilities enum: ["read", "write", "spawn", "audit", "admin"]
    • greekLetter optional: must be one of α β γ δ ε ζ η θ ι κ λ μ ν ξ π
    • SKILL.md parser: reads frontmatter + body from existing .agents/skills/*/SKILL.md
    • Test: parse all 22 existing skill files, assert zero schema errors
  • Effort: M

P0.6.2 — Skill CRUD + Discovery

  • Depends on: P0.6.1
  • Input: docs/concepts/ε-skill-registry.md, docs/reference/extractions/epsilon-skill-registry-extraction.md (heritage-only)
  • Output: src/domains/skills/repository.ts, tests/domains/skills/repository.test.ts
  • Acceptance criteria:
    • On startup: scans .agents/skills/*/SKILL.md, parses frontmatter, loads all valid skills into the skills table
    • getSkill(name) → skill or null
    • listSkills({ search?, capability? }) → filtered list
    • skill_list MCP tool (the only ε Phase 0 MCP tool): returns all loaded skills with frontmatter metadata — see S17 §1 Category 3.
    • skill_get, skill_reload, hot-reload — not in Phase 0. Deferred to Phase 1.
  • Effort: M

P0.6.3 — Skill Capability Index

  • Depends on: P0.6.2
  • Input: docs/concepts/ε-skill-registry.md
  • Output: src/domains/skills/capabilities.ts, tests/domains/skills/capabilities.test.ts
  • Acceptance criteria:
    • listByCapability(capability) → skills that declare the capability in frontmatter
    • Capability strings are treated as opaque tags (Phase 0 does not enforce an enum)
    • Startup warning if any SKILL.md declares a capability not used anywhere else (drift detector)
    • Test: seed 3 fake skills with overlapping capabilities; verify filter
  • Effort: S

Heritage note: The donor ε module supported skill_get, skill_reload, and a spawnAgent sub-process helper. None of these are in Phase 0. There is no src/domains/agents/ directory in the Phase 0 target tree (CLAUDE.md §9.1); agent spawning is deferred to Phase 1.5 with δ Model Router per ADR-005. Phase 0 ε ships the SKILL.md parser and one MCP tool (skill_list).


P0.7 — ζ Decision Trail

P0.7.1 — Hash-Chained Record Schema

  • Depends on: P0.2.2
  • Input: zeta-decision-trail-extraction.md
  • Output: src/domains/trail/schema.ts, tests/domains/trail/schema.test.ts
  • Acceptance criteria:
    • Record schema: { id, type, task_id, agent_id, content, timestamp, prev_hash, hash }
    • 4 valid types: plan | analysis | decision | reflection
    • hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))
    • Canonical JSON: sorted keys, no whitespace (deterministic)
    • First record: prev_hash = "0000...0000" (64 zeros)
    • Test: two records with identical inputs produce identical hashes
  • Effort: S
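
The hash rule above can be sketched with Node's built-in crypto module. The field types (all strings) and the helper names canonicalJson and computeRecordHash are assumptions; the hashed field set and the 64-zero genesis value follow the criteria.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: all hashed fields are strings; the real schema may differ.
interface HashedFields {
  id: string;
  type: "plan" | "analysis" | "decision" | "reflection";
  task_id: string;
  content: string;
  timestamp: string;
  prev_hash: string;
}

// prev_hash of the first record in a chain: 64 zeros.
const GENESIS_PREV_HASH = "0".repeat(64);

// Deterministic JSON: object keys sorted, no whitespace.
// Note: undefined values are not handled; callers pass fully populated objects.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalJson(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// SHA-256 over the canonical JSON of the six hashed fields, as lowercase hex.
function computeRecordHash(fields: HashedFields): string {
  return createHash("sha256").update(canonicalJson(fields), "utf8").digest("hex");
}
```

Because keys are sorted before serialization, two callers that build the same record with different property order get the same hash, which is what the "identical inputs produce identical hashes" test relies on.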

P0.7.2 — Thought Record CRUD

  • Depends on: P0.7.1
  • Input: zeta-decision-trail-extraction.md (CRUD section)
  • Output: src/domains/trail/repository.ts, tests/domains/trail/repository.test.ts
  • Acceptance criteria:
    • createThoughtRecord(input): computes hash, links to previous record’s hash
    • getThoughtRecord(id): returns record with hash
    • listThoughtRecords({ task_id?, limit? }): returns chain in insertion order
    • thought_record MCP tool: Zod input, returns record with computed hash
    • thought_record_list MCP tool: returns chain for given task_id
  • Effort: M

P0.7.3 — Chain Verification Tool

  • Depends on: P0.7.2
  • Input: zeta-decision-trail-extraction.md (verification section)
  • Output: src/domains/trail/verifier.ts, tests/domains/trail/verifier.test.ts
  • Acceptance criteria:
    • verifyChain(records[]): iterates chain, recomputes each hash, checks links
    • Returns: { valid: bool, first_broken_at?: id, broken_count: number }
    • audit_verify_chain MCP tool: calls verifyChain on full DB chain
    • Test: tamper with one record’s content → verify valid: false at correct position
    • Test: intact 100-record chain → valid: true in < 500ms
  • Effort: S
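
A sketch of the verification walk follows. To stay self-contained it takes the hash function as a parameter (in practice that would be the SHA-256 canonical-JSON hash from P0.7.1) and uses a reduced, hypothetical ChainRecord shape. It also assumes that broken_count counts records failing either the link check or the hash check.

```typescript
// Hypothetical reduced record shape; the real schema has more fields.
interface ChainRecord {
  id: string;
  content: string;
  prev_hash: string;
  hash: string;
}

type RecordHasher = (fields: Omit<ChainRecord, "hash">) => string;

interface VerifyResult {
  valid: boolean;
  first_broken_at?: string;
  broken_count: number;
}

function verifyChain(
  records: readonly ChainRecord[],
  computeHash: RecordHasher,
  genesisPrevHash: string = "0".repeat(64),
): VerifyResult {
  let expectedPrev = genesisPrevHash;
  let firstBroken: string | undefined;
  let brokenCount = 0;

  for (const record of records) {
    const { hash, ...fields } = record;
    const linkOk = record.prev_hash === expectedPrev;
    const hashOk = computeHash(fields) === hash;
    if (!linkOk || !hashOk) {
      brokenCount += 1;
      if (firstBroken === undefined) firstBroken = record.id;
    }
    // Continue from the stored hash so one tampered record is reported once,
    // not once per successor.
    expectedPrev = hash;
  }

  return firstBroken === undefined
    ? { valid: true, broken_count: 0 }
    : { valid: false, first_broken_at: firstBroken, broken_count: brokenCount };
}
```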

P0.8 — η Proof Store

P0.8.1 — Merkle Tree Construction

  • Depends on: P0.7.2
  • Output: src/domains/proof/merkle.ts, tests/domains/proof/merkle.test.ts
  • Acceptance criteria:
    • Uses merkletreejs package (SHA-256 leaves)
    • buildMerkleTree(recordHashes[]) → { root, tree }
    • generateProof(tree, leafHash) → proof array
    • verifyProof(root, proof, leafHash) → boolean
    • Empty tree: root = SHA-256(“”) (defined constant)
    • Test: 10-leaf tree → root is deterministic; membership proof verifies correctly
  • Effort: S

P0.8.2 — Three-Zone Retention

  • Depends on: P0.8.1, P0.2.2
  • Input: eta-proof-store-extraction.md (retention section)
  • Output: src/domains/proof/retention.ts, tests/domains/proof/retention.test.ts
  • Acceptance criteria:
    • Hot zone: last 100 records — full content in DB
    • Warm zone: records 101–1000 — content compressed (JSON → gzip → base64)
    • Cold zone: records 1001+ — content hash only (full content deleted)
    • archiveRecord(id): moves record to next zone based on age/position
    • retrieveRecord(id): decompresses if Warm, returns hash stub if Cold
    • Test: hot → warm → cold transitions; verify content availability per zone
  • Effort: M
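
The zone boundaries and the Warm-zone encoding can be sketched as below. The example assumes "position" is a 1-based recency rank (1 = newest record), which is one reading of "last 100 records"; zoneForPosition, compressContent and decompressContent are hypothetical helper names.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

type Zone = "hot" | "warm" | "cold";

// ASSUMPTION: position is a 1-based recency rank, 1 being the newest record.
function zoneForPosition(position: number): Zone {
  if (!Number.isInteger(position) || position < 1) {
    throw new RangeError(`position must be a positive integer, got ${position}`);
  }
  if (position <= 100) return "hot"; // full content kept in the DB
  if (position <= 1000) return "warm"; // content stored compressed
  return "cold"; // content hash only
}

// Warm-zone encoding: JSON → gzip → base64.
function compressContent(content: unknown): string {
  const json = JSON.stringify(content);
  return gzipSync(Buffer.from(json, "utf8")).toString("base64");
}

// Inverse of compressContent: base64 → gunzip → JSON.
function decompressContent(encoded: string): unknown {
  const json = gunzipSync(Buffer.from(encoded, "base64")).toString("utf8");
  return JSON.parse(json);
}
```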

P0.8.3 — Merkle Root Finalization Tool

  • Depends on: P0.8.1
  • Input: eta-proof-store-extraction.md (finalization section)
  • Output: src/tools/merkle.ts, tests/tools/merkle.test.ts
  • Acceptance criteria:
    • merkle_finalize MCP tool: builds Merkle tree of last N unfinalized records, stores root
    • merkle_root MCP tool: returns current root hash + record count + timestamp
    • audit_session_start MCP tool: creates audit session record, returns session_id
    • Finalization must happen AFTER final thought record (enforced: errors if no thought_record in session)
    • Test: finalize 5-record session → root matches manual computation
  • Effort: S

P0.9 — ν Integrations

P0.9.1 — MCP Bridge

  • Depends on: P0.2.1
  • Input: nu-integrations-extraction.md (bridge section)
  • Output: src/domains/integrations/mcp-bridge.ts, tests/domains/integrations/mcp-bridge.test.ts
  • Acceptance criteria:
    • McpBridge: wraps outbound MCP client calls to external servers
    • connectToServer(url): creates client, returns connected bridge
    • callTool(bridge, name, args): calls remote tool, returns result
    • Timeout: 30s default, configurable via COLIBRI_MCP_TIMEOUT
    • Retry: 3 attempts with exponential backoff on transient errors
    • Test: mock MCP server → verify roundtrip tool call
  • Effort: M

P0.9.2 — Claude API Wrappers

  • Depends on: P0.1.4
  • Input: nu-integrations-extraction.md (Claude API section)
  • Output: src/domains/integrations/claude.ts, tests/domains/integrations/claude.test.ts
  • Acceptance criteria:
    • createCompletion(prompt, options): calls Anthropic API with configured model
    • createCompletionWithTools(prompt, tools, options): tool-use completion
    • API key from ANTHROPIC_API_KEY env var (declared .optional() in the Zod schema; validated at call-time by createCompletion / createCompletionWithTools, which throw AnthropicConfigError if absent. This is Design Invariant 5 — the server boots cleanly when the key is unset for deployments that don’t use the Claude API integration. R75 Wave H reconciled this acceptance criterion with the shipped code in src/config.ts:79 + src/domains/integrations/claude.ts.)
    • Rate limit handling: 429 → exponential backoff, max 3 retries
    • All API calls logged with: model, prompt_tokens, completion_tokens, latency_ms
    • Test (mock): verify retry logic and logging
  • Effort: M

P0.9.3 — Notification Channels

  • Depends on: P0.2.1
  • Input: nu-integrations-extraction.md (notifications section)
  • Output: src/domains/integrations/notifications.ts, tests/domains/integrations/notifications.test.ts
  • Acceptance criteria:
    • notify(event, payload): dispatches event to configured channels
    • Channels: log (always on), mcp (MCP notification), webhook (optional)
    • COLIBRI_WEBHOOK_URL env var enables webhook channel (Phase 0 uses the COLIBRI_* namespace; AMS_* is not read)
    • Events: task.completed, merkle.finalized, error.critical (no agent.spawned — agent runtime is deferred per ADR-005)
    • Fire-and-forget: notification failures do not block main execution
    • Test: verify each channel receives correct payload for each event type
  • Effort: S

How to Read This Document

Each task has:

  • ID: P{phase}.{group}.{sub-task} (e.g., P1.2.3)
  • Depends on: which tasks must be complete first
  • Input: what the agent reads before starting
  • Output: exact files to create or modify
  • Acceptance criteria: testable conditions (pass/fail)
  • Estimated effort: S (1-2h), M (4-8h), L (1-2d), XL (3-5d)

Phase 1: κ Rule Engine

Phase 1 starts at R81 per docs/5-time/roadmap.md. Ready-to-paste agent prompts for every sub-task below live in task-prompts/p1.1-kappa-rule-engine.md (shipped R76.P1, 2026-04-18).

Structural overview: 5 groups, 20 sub-tasks in total. All output paths target src/domains/rules/... (Phase 0 convention). Concept reference: docs/3-world/physics/laws/rule-engine.md. Algorithm extraction: docs/reference/extractions/kappa-rule-engine-extraction.md.

P1.1 — Integer Math Library (3 sub-tasks)

P1.1.1 — Basis Point Arithmetic

  • Depends on: nothing
  • Input: docs/3-world/physics/laws/rule-engine.md (Integer-only arithmetic section), docs/reference/extractions/kappa-rule-engine-extraction.md §3–4
  • Output: src/domains/rules/integer-math.ts, src/domains/rules/__tests__/integer-math.test.ts
  • Acceptance criteria:
    • All arithmetic uses 64-bit signed integers, no floating point anywhere
    • bps_mul(value, bps) → (value * bps) / 10000 (floor division)
    • bps_div(value, bps) → (value * 10000) / bps (floor division)
    • apply_bps(value, bps) → value - bps_mul(value, bps) (decay variant)
    • decay(value, rate_bps, epochs) → multi-epoch compounded decay with per-step floor
    • Overflow detection: reject inputs where value * bps would exceed 2^63 - 1
    • Underflow: result never goes below 0 for non-negative inputs
    • Division by zero: explicit error, not silent wrap
    • 100% branch coverage in tests
  • Effort: S
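
A minimal sketch of these formulas follows, assuming bigint as the stand-in for 64-bit signed integers (JavaScript numbers cannot represent the full int64 range exactly). IntegerMathError, floorDiv and checkedMul are hypothetical helper names; the four public functions follow the criteria.

```typescript
// ASSUMPTION: bigint models int64; range checks are applied explicitly.
const BPS_DENOMINATOR = 10000n;
const MAX_INT64 = (1n << 63n) - 1n;
const MIN_INT64 = -(1n << 63n);

class IntegerMathError extends Error {
  readonly kind: "overflow" | "divide_by_zero";
  constructor(kind: "overflow" | "divide_by_zero", message: string) {
    super(message);
    this.name = "IntegerMathError";
    this.kind = kind;
  }
}

// bigint "/" truncates toward zero; adjust so negative quotients round down.
function floorDiv(a: bigint, b: bigint): bigint {
  if (b === 0n) throw new IntegerMathError("divide_by_zero", "division by zero");
  const q = a / b;
  return a % b !== 0n && (a < 0n) !== (b < 0n) ? q - 1n : q;
}

// bigint is arbitrary precision, so the product can be range-checked exactly.
function checkedMul(a: bigint, b: bigint): bigint {
  const product = a * b;
  if (product > MAX_INT64 || product < MIN_INT64) {
    throw new IntegerMathError("overflow", `${a} * ${b} exceeds the int64 range`);
  }
  return product;
}

function bps_mul(value: bigint, bps: bigint): bigint {
  return floorDiv(checkedMul(value, bps), BPS_DENOMINATOR);
}

function bps_div(value: bigint, bps: bigint): bigint {
  return floorDiv(checkedMul(value, BPS_DENOMINATOR), bps);
}

function apply_bps(value: bigint, bps: bigint): bigint {
  return value - bps_mul(value, bps);
}

// Compounded decay: the floor is applied after every epoch, not once at the end.
function decay(value: bigint, rate_bps: bigint, epochs: number): bigint {
  let current = value;
  for (let i = 0; i < epochs; i++) current = apply_bps(current, rate_bps);
  return current;
}
```

As a worked case, decaying 1000 at 500 bps for two epochs gives 1000 - 50 = 950, then 950 - floor(47.5) = 950 - 47 = 903. A single uncompounded floor would give a different result, which is why the per-step floor matters for determinism across nodes.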

P1.1.2 — Determinism Verification Harness

  • Depends on: P1.1.1
  • Input: rule-engine.md (Forbidden operations section)
  • Output: src/domains/rules/__tests__/determinism.test.ts
  • Acceptance criteria:
    • Property test: for any two runs with identical inputs, outputs are bit-identical
    • No Math.random(), Date.now(), process.hrtime(), or equivalent
    • No async I/O in computation path
    • Fuzz test: 10,000 random input pairs produce identical results under both call orderings
    • Static analysis: grep check rejects any use of Math.* or Date.* outside tests
  • Effort: S

P1.1.3 — BPS Constants + Overflow Protection

  • Depends on: P1.1.1
  • Input: rule-engine.md (basis-point conventions), extraction §3 (BPS Constants) + §4 (Overflow Protection)
  • Output: src/domains/rules/bps-constants.ts, src/domains/rules/__tests__/bps-constants.test.ts
  • Acceptance criteria:
    • Exported constants: BPS_100_PERCENT=10000, BPS_50_PERCENT=5000, BPS_1_PERCENT=100
    • Domain decay rates: DECAY_EXECUTION=500, DECAY_COMMISSIONING=300, DECAY_ARBITRATION=1000, DECAY_GOVERNANCE=200, DECAY_SOCIAL=100
    • Penalty constants: DAMAGE_MINOR=1500, DAMAGE_MODERATE=3000, DAMAGE_SEVERE=5000, DAMAGE_CRITICAL=8000, DAMAGE_FRAUD=10000
    • safe_mul(a, b) returns typed overflow error when |a| > MAX_INT64 / |b|
    • safe_div(a, b) returns typed divide-by-zero error when b == 0
    • Constants are as const, not let or mutable
  • Effort: S
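
The "typed error" requirement could be met with a discriminated-union result, sketched below. The MathResult type, the use of bigint, and truncating division in safe_div are assumptions; the overflow condition and the three BPS constant values are taken from the criteria.

```typescript
// Constant values from the criteria; representing them as bigint is an assumption.
const BPS = {
  BPS_100_PERCENT: 10000n,
  BPS_50_PERCENT: 5000n,
  BPS_1_PERCENT: 100n,
} as const;

const INT64_MAX = (1n << 63n) - 1n;

// Hypothetical result type: callers must check ok before reading value.
type MathResult =
  | { ok: true; value: bigint }
  | { ok: false; error: "overflow" | "divide_by_zero" };

function absBig(x: bigint): bigint {
  return x < 0n ? -x : x;
}

// Overflow rule from the criteria: |a| > MAX_INT64 / |b|.
// The b !== 0n guard avoids dividing by zero inside the check itself.
function safe_mul(a: bigint, b: bigint): MathResult {
  if (b !== 0n && absBig(a) > INT64_MAX / absBig(b)) {
    return { ok: false, error: "overflow" };
  }
  return { ok: true, value: a * b };
}

// ASSUMPTION: the quotient truncates toward zero (bigint default).
function safe_div(a: bigint, b: bigint): MathResult {
  if (b === 0n) return { ok: false, error: "divide_by_zero" };
  return { ok: true, value: a / b };
}
```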

P1.2 — DSL Parser (4 sub-tasks)

P1.2.1 — Lexer / Tokenizer

  • Depends on: nothing
  • Input: rule-engine.md (DSL grammar section), extraction §1 (Full EBNF Grammar), ADR-006-dsl-grammar.md
  • Output: src/domains/rules/lexer.ts, src/domains/rules/__tests__/lexer.test.ts
  • Acceptance criteria:
    • Uses Chevrotain library (pinned in package.json per ADR-006)
    • Token types: KEYWORD, IDENTIFIER, INTEGER, STRING, OPERATOR, DELIMITER, EOF
    • Keywords: rule, guards, effects, when, then, if, else, and, or, not, true, false, admit, reject, admission, transition, consequence, promotion
    • Operators: ==, !=, >, <, >=, <=, +, -, *, /, %
    • Variables: start with $, dot-path dereference $actor.reputation.execution
    • Line/column tracking for error messages
    • Rejects floating-point literals (e.g., 3.14 is a syntax error)
    • Rejects underscore-separated integer literals (1_000_000 invalid)
    • Unicode identifiers supported (for future i18n)
  • Effort: M

P1.2.2 — Parser (Tokens → AST)

  • Depends on: P1.2.1
  • Input: rule-engine.md (DSL grammar), extraction §1 (EBNF) + §2 (AST Node Types), ADR-006
  • Output: src/domains/rules/parser.ts, src/domains/rules/__tests__/parser.test.ts
  • Acceptance criteria:
    • Chevrotain parser built on top of P1.2.1 lexer
    • Parses the 4 rule types: Admission, StateTransition, Consequence, Promotion
    • AST node types match extraction §2: RuleNode, GuardClause, EffectCall, BinaryOp, UnaryOp, LogicalOp, IntLiteral, BoolLiteral, StringLiteral, VarRef, FuncCall
    • Operator precedence: NOT > AND > OR; *, /, % > +, -; comparison > logical
    • Guard blocks: guards { <clauses> }; each clause (Expression | else) -> (admit | reject STRING)
    • Effect blocks: effects { <calls> }; each call IDENTIFIER ( ArgList )
    • Error recovery: recoveryEnabled: true; reports first 5 errors, doesn’t crash on malformed input
    • Round-trip test: parse → serialize → parse produces identical AST
    • AST cap enforcement: rejects any single rule with > 10,000 AST nodes at parse time
  • Effort: L

P1.2.3 — AST Validator

  • Depends on: P1.2.2
  • Input: rule-engine.md (Forbidden operations table)
  • Output: src/domains/rules/validator.ts, src/domains/rules/__tests__/validator.test.ts
  • Acceptance criteria:
    • Rejects rules that read local state (clock, filesystem, network, process)
    • Rejects rules with randomness (except VRF-input references — $vrf_output)
    • Rejects rules with side effects (HTTP calls, file writes, stdout)
    • Rejects rules that mutate input events
    • Type checking: operands compatible with operators (no int + string)
    • Scope checking: variables defined before use
    • Cycle detection: no infinite recursion in rule references
    • Axiom pre-check: rejects rules that violate AX-01–AX-07 at load time (constitutional axioms)
  • Effort: M

P1.2.4 — Rule Loader / Registry

  • Depends on: P1.2.3
  • Input: rule-engine.md (Rule application algorithm), extraction §8 (process_action)
  • Output: src/domains/rules/registry.ts, src/domains/rules/__tests__/registry.test.ts
  • Acceptance criteria:
    • loadRuleset(source: string): RuleRegistry — parses + validates + indexes a source file
    • Registry sorts rules by specificity: (a) guard term count descending, (b) declaration order
    • Specificity ties at load time → explicit AmbiguousRulesetError — refuse boot
    • getRule(name: string): RuleNode | null — named lookup
    • getByTransitionType(type: TransitionType): RuleNode[] — indexed lookup by one of the 13 transition types from extraction §7
    • computeVersionHash(): string — delegates to P1.5.1 canonical serializer
    • Load-time error aggregation: reports all validator errors in one pass, not just first
  • Effort: M

P1.3 — Deterministic Interpreter (4 sub-tasks)

P1.3.1 — Core Evaluation Loop

  • Depends on: P1.2.2, P1.1.1
  • Input: rule-engine.md (Rule application algorithm, Evaluation budget), extraction §5 (Rule Execution Flow)
  • Output: src/domains/rules/engine.ts, src/domains/rules/__tests__/engine.test.ts
  • Acceptance criteria:
    • Evaluates AST nodes recursively with an immutable context
    • Rule execution order: Admission → StateTransition → Consequence → Promotion (fixed)
    • Within each category: alphabetical by rule name (stable ordering)
    • First-match-wins: once a guard matches, remaining guards in the same rule are skipped
    • Context contains: event, current_state (read-only snapshot), rule_version, epoch, actor binding
    • Returns: list of {type, target, field, old_value, new_value} mutations
    • No mutations applied during evaluation (collect-then-apply pattern)
    • Op budget: MAX_INTEGER_OPS=10_000 — abort with RuleBudgetExceeded("integer_ops")
    • Depth cap: MAX_CALL_DEPTH=16 — abort with RuleBudgetExceeded("call_depth")
    • Arg cap: MAX_ARG_COUNT=8 — abort with RuleBudgetExceeded("arg_count")
  • Effort: L
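
The three caps above can be tracked by one small counter object that the evaluator consults as it walks the AST. EvalBudget and its method names are hypothetical, and the sketch assumes the limits are inclusive (exactly 10,000 ops is allowed, 10,001 is not).

```typescript
const MAX_INTEGER_OPS = 10_000;
const MAX_CALL_DEPTH = 16;
const MAX_ARG_COUNT = 8;

type BudgetKind = "integer_ops" | "call_depth" | "arg_count";

class RuleBudgetExceeded extends Error {
  readonly kind: BudgetKind;
  constructor(kind: BudgetKind) {
    super(`Rule evaluation budget exceeded: ${kind}`);
    this.name = "RuleBudgetExceeded";
    this.kind = kind;
  }
}

// One instance per rule evaluation. ASSUMPTION: limits are inclusive.
class EvalBudget {
  private ops = 0;
  private depth = 0;

  // Charge n integer operations against the budget.
  chargeOps(n: number = 1): void {
    this.ops += n;
    if (this.ops > MAX_INTEGER_OPS) throw new RuleBudgetExceeded("integer_ops");
  }

  // Call before evaluating a FuncCall node; pair with exitCall().
  enterCall(argCount: number): void {
    if (argCount > MAX_ARG_COUNT) throw new RuleBudgetExceeded("arg_count");
    if (this.depth + 1 > MAX_CALL_DEPTH) throw new RuleBudgetExceeded("call_depth");
    this.depth += 1;
  }

  exitCall(): void {
    this.depth -= 1;
  }
}
```

Counting operations instead of measuring wall-clock time keeps the abort point identical on every machine, which fits the determinism rules in P1.1.2.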

P1.3.2 — Built-in Functions

  • Depends on: P1.3.1, P1.1.1
  • Input: rule-engine.md (Built-in functions table), extraction §3 (8 Built-in Functions)
  • Output: src/domains/rules/builtins.ts, src/domains/rules/__tests__/builtins.test.ts
  • Acceptance criteria:
    • min(a, b), max(a, b), abs(a), cap(v, m) — integer only
    • clamp(v, lo, hi) → max(lo, min(v, hi))
    • isqrt(n) — Newton’s method integer square root (from extraction §3 pseudocode)
    • ilog2(n) — integer floor of log base 2
    • decay(v, rate_bps) — single-epoch decay; delegates to integer-math library
    • diminishing(v, k) → (v * k) / (k + v) diminishing-returns transform
    • bps_mul(v, b), bps_div(v, b) — delegate to P1.1.1
    • hash(data) — SHA-256 hex string
    • vrf_verify(pk, proof, input) — VRF proof verification per ADR-002
    • All functions are pure (no side effects, same input = same output)
    • Each function counts as 1 or more integer ops against the evaluation budget (documented table)
  • Effort: M
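
A few of the built-ins above, sketched under stated assumptions: values are represented as JavaScript numbers restricted to non-negative safe integers (the real integer-math library from P1.1.1 may use a different representation), and the isqrt loop is the textbook Newton iteration rather than a transcription of the extraction §3 pseudocode.

```typescript
function assertNonNegativeInt(n: number, label: string): void {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError(`${label} expects a non-negative safe integer`);
  }
}

function clamp(v: number, lo: number, hi: number): number {
  return Math.max(lo, Math.min(v, hi));
}

// Newton's method: converges to floor(sqrt(n)) using only integer division.
function isqrt(n: number): number {
  assertNonNegativeInt(n, "isqrt");
  if (n < 2) return n;
  let x = n;
  let y = Math.floor((x + 1) / 2);
  while (y < x) {
    x = y;
    y = Math.floor((x + Math.floor(n / x)) / 2);
  }
  return x;
}

// Integer floor of log base 2, computed by repeated halving.
function ilog2(n: number): number {
  assertNonNegativeInt(n, "ilog2");
  if (n < 1) throw new RangeError("ilog2 is undefined for 0");
  let result = 0;
  let rest = n;
  while (rest > 1) {
    rest = Math.floor(rest / 2);
    result += 1;
  }
  return result;
}

// Diminishing-returns transform (v * k) / (k + v), rounded toward zero.
function diminishing(v: number, k: number): number {
  assertNonNegativeInt(v, "diminishing");
  assertNonNegativeInt(k, "diminishing");
  if (k + v === 0) return 0; // assumption: define the 0/0 case as 0
  return Math.floor((v * k) / (k + v));
}
```

For example, isqrt(10) is 3, ilog2(9) is 3, and diminishing(100, 100) is 50.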

P1.3.3 — State Access Layer

  • Depends on: P1.3.1
  • Input: rule-engine.md (State Access Pattern), extraction §10 (ReadOnlyState Interface)
  • Output: src/domains/rules/state-access.ts, src/domains/rules/__tests__/state-access.test.ts
  • Acceptance criteria:
    • Read-only state snapshot provided to rules (frozen object or copy-on-write proxy)
    • State keys: reputation[node][domain], tokens[node], stake[node], epoch, event_count, fork_id, rule_version
    • with_binding(name, value) returns new context; original unchanged
    • No direct database access from rules — state is pre-loaded by the host
    • State diff output: {key, old_value, new_value} for each mutation
    • Merkle proof generation for state reads (verifiable by other nodes) — hooks into η
    • Mutation attempts throw ReadOnlyStateError immediately (fail-fast)
  • Effort: M
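
One way to get the fail-fast behaviour described above is a recursive Proxy that rejects every write. The sketch below assumes that approach (the criteria also allow a frozen object); readOnlySnapshot, withBinding, and RuleContext are hypothetical names, and Merkle proof generation is out of scope here.

```typescript
class ReadOnlyStateError extends Error {
  constructor(key: string) {
    super(`attempted write to read-only state key "${key}"`);
    this.name = "ReadOnlyStateError";
  }
}

// Wraps a plain (unfrozen) object; nested objects are wrapped lazily on read.
function readOnlySnapshot<T extends object>(source: T): T {
  return new Proxy(source, {
    get(target, prop, receiver) {
      const value: unknown = Reflect.get(target, prop, receiver);
      return typeof value === "object" && value !== null ? readOnlySnapshot(value) : value;
    },
    set(_target, prop): boolean {
      throw new ReadOnlyStateError(String(prop));
    },
    deleteProperty(_target, prop): boolean {
      throw new ReadOnlyStateError(String(prop));
    },
  });
}

interface RuleContext {
  readonly bindings: { readonly [name: string]: number };
}

// Returns a new context; the original object is left unchanged.
function withBinding(ctx: RuleContext, name: string, value: number): RuleContext {
  return { ...ctx, bindings: { ...ctx.bindings, [name]: value } };
}
```

The host would build the snapshot once per evaluation and hand only the proxy to rule code, so a stray assignment surfaces as ReadOnlyStateError at the exact call site.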

P1.3.4 — Policy Gating / Pre-guards

  • Depends on: P1.3.1
  • Input: extraction §9 (Policy Gating: check_policy)
  • Output: src/domains/rules/policy-gate.ts, src/domains/rules/__tests__/policy-gate.test.ts
  • Acceptance criteria:
    • Policy enum P1–P13 per extraction §9
    • Each policy is a pure DSL expression (reuses P1.2.2 parser + P1.3.1 evaluator)
    • check_policy(id, actor, context) → {admitted, reason?}
    • check_all_policies(action, actor, context) → short-circuits on first failure
    • Policies run BEFORE named rule evaluation (pre-guards)
    • Policies share evaluation budget with named rules (same 10k op cap)
    • Each policy has rejection reason pre-registered (no dynamic strings)
  • Effort: M

P1.4 — Admission Layer (4 sub-tasks)

P1.4.1 — Admission Evaluator

  • Depends on: P1.3.1, P1.3.4, P1.2.4
  • Input: rule-engine.md (Admission layer), docs/spec/s10-admission.md
  • Output: src/domains/rules/admission.ts, src/domains/rules/__tests__/admission.test.ts
  • Acceptance criteria:
    • evaluateAdmission({caller, tool, mode, rep_snapshot, rule_version}): AdmissionResult
    • Returns {admitted: true, effect_mutations: [...]} | {admitted: false, reason: DenialReason}
    • Runs policy pre-guards first (P1.3.4), then named rules (P1.3.1)
    • Rule version stamp on every admission record
    • Pure function — no DB writes, no network calls
    • Timing independent of input values (constant-time comparison for sensitive fields)
    • Integration test: ≥20 representative (caller, tool, mode) tuples with expected verdicts
  • Effort: L

P1.4.2 — Denial Reason Taxonomy

  • Depends on: P1.4.1
  • Input: rule-engine.md (Rule application algorithm return values), extraction §5 (Rule Execution Flow)
  • Output: src/domains/rules/denial-reasons.ts, src/domains/rules/__tests__/denial-reasons.test.ts
  • Acceptance criteria:
    • Typed discriminated union: no_rule_matched, budget:integer_ops, budget:call_depth, budget:arg_count, effect_invariant_violated, axiom_violation:AX-01..AX-07, policy:P1..P13, rule_version_mismatch, ambiguous_ruleset
    • Each reason carries a structured details payload (no freeform strings)
    • Reason codes stable across upgrades (additive-only changes; no renumbering)
    • toString(reason) produces operator-readable rendering
    • JSON serialization preserves discriminant tag
  • Effort: S
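
A reduced sketch of the discriminated union, covering three of the listed codes. The details payload fields (tool, expected, actual, limit, used) are illustrative assumptions; only the code strings come from the criteria above.

```typescript
type BudgetCounter = "integer_ops" | "call_depth" | "arg_count";

// The `code` field is the discriminant; each variant carries a structured payload.
type DenialReason =
  | { code: "no_rule_matched"; details: { tool: string } }
  | { code: "rule_version_mismatch"; details: { expected: string; actual: string } }
  | { code: `budget:${BudgetCounter}`; details: { limit: number; used: number } };

// Operator-readable rendering; the switch narrows `reason` per variant.
function renderDenial(reason: DenialReason): string {
  switch (reason.code) {
    case "no_rule_matched":
      return `no rule matched tool ${reason.details.tool}`;
    case "rule_version_mismatch":
      return `rule version mismatch: expected ${reason.details.expected}, got ${reason.details.actual}`;
    default:
      return `${reason.code} exceeded: used ${reason.details.used} of ${reason.details.limit}`;
  }
}
```

Because the discriminant is an ordinary string property, JSON.stringify followed by JSON.parse returns an object with the same code, which is what the serialization criterion asks for.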

P1.4.3 — Admission Budgets

  • Depends on: P1.3.1
  • Input: rule-engine.md (Evaluation budget, Default budget constants)
  • Output: src/domains/rules/budget.ts, src/domains/rules/__tests__/budget.test.ts
  • Acceptance criteria:
    • Budget tracker class with counters: integer_ops, call_depth, current_arg_count
    • Limits: MAX_INTEGER_OPS=10_000, MAX_CALL_DEPTH=16, MAX_ARG_COUNT=8 (constants from rule-engine.md)
    • On exceed: throw RuleBudgetExceeded with which-counter-fired field
    • Instrumentation hooks: emit budget.tick events for α’s audit layer (count only, no payload)
    • Budget state is reset per-rule (no leaking across rules in a ruleset)
    • Limits part of the rule version hash (P1.5.1) — changing them forces a new version
  • Effort: M
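
The tracker described above might look roughly like this. The limits and the RuleBudgetExceeded name come from the criteria; the method names (chargeOps, enterCall, exitCall, reset) are assumptions, and the budget.tick instrumentation hook is omitted.

```typescript
type BudgetCounterName = "integer_ops" | "call_depth" | "arg_count";

const BUDGET_LIMITS = { MAX_INTEGER_OPS: 10_000, MAX_CALL_DEPTH: 16, MAX_ARG_COUNT: 8 } as const;

class RuleBudgetExceeded extends Error {
  readonly counter: BudgetCounterName; // which counter fired
  constructor(counter: BudgetCounterName) {
    super(`rule budget exceeded: ${counter}`);
    this.name = "RuleBudgetExceeded";
    this.counter = counter;
  }
}

class BudgetTracker {
  private integerOps = 0;
  private callDepth = 0;

  // Charge one or more integer ops; throws once the running total passes the cap.
  chargeOps(count: number = 1): void {
    this.integerOps += count;
    if (this.integerOps > BUDGET_LIMITS.MAX_INTEGER_OPS) throw new RuleBudgetExceeded("integer_ops");
  }

  // Called on function entry: checks the argument count, then the nesting depth.
  enterCall(argCount: number): void {
    if (argCount > BUDGET_LIMITS.MAX_ARG_COUNT) throw new RuleBudgetExceeded("arg_count");
    this.callDepth += 1;
    if (this.callDepth > BUDGET_LIMITS.MAX_CALL_DEPTH) throw new RuleBudgetExceeded("call_depth");
  }

  exitCall(): void {
    this.callDepth = Math.max(0, this.callDepth - 1);
  }

  // Reset between rules so one rule's usage never leaks into the next.
  reset(): void {
    this.integerOps = 0;
    this.callDepth = 0;
  }
}
```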

P1.4.4 — Tool-Lock Integration Spec

  • Depends on: P1.4.1, P1.4.2, P1.4.3
  • Input: rule-engine.md (Admission layer), docs/2-plugin/middleware.md (5-stage wrapper at α), docs/spec/s10-admission.md
  • Output: src/domains/rules/tool-lock-adapter.ts, src/domains/rules/__tests__/tool-lock-adapter.test.ts
  • Acceptance criteria:
    • createToolLockAdapter(ruleRegistry): MiddlewareStage — factory
    • Output is a stage-1 middleware function signature matching α’s 5-stage wrapper contract (tool-lock → schema-validate → audit-enter → dispatch → audit-exit)
    • Admission denials short-circuit the middleware chain (stages 2–5 skipped)
    • Denials emit structured event to audit layer before returning
    • Integration test: wire a test ruleset into a test server; verify admission decisions end-to-end
    • No registration at server boot in R76: this sub-task lands the adapter only; wiring into α’s src/server.ts is a separate R81+ PR
  • Effort: M

P1.5 — Governance / Rule Versioning (5 sub-tasks)

P1.5.1 — Version Hash Computation

  • Depends on: P1.2.2, P1.5.4
  • Input: rule-engine.md (Rule versioning section)
  • Output: src/domains/rules/versioning.ts, src/domains/rules/__tests__/versioning.test.ts
  • Acceptance criteria:
    • computeVersionHash(ruleset, engine_version): string — returns hex SHA-256
    • Hash input: canonical_serialization(all_rules) || engine_version
    • Canonical serialization via P1.5.4 (sorted-key deterministic JSON)
    • Version stored in event metadata: {rule_version: "sha256:abc..."}
    • Version mismatch detection: events with wrong rule_version are rejected via P1.4.2 taxonomy code rule_version_mismatch
    • Test: two logically-equivalent but differently-ordered rulesets produce identical hash (canonical property)
  • Effort: S

P1.5.2 — Rule Migration

  • Depends on: P1.5.1, P1.3.1, P1.5.5
  • Input: rule-engine.md (Test corpus parity requirement), docs/3-world/physics/enforcement/governance.md (π versioning section, when landed)
  • Output: src/domains/rules/migration.ts, src/domains/rules/__tests__/migration.test.ts
  • Acceptance criteria:
    • migrateRuleset(old, new, corpus): MigrationResult — runs parity harness (P1.5.5)
    • Test corpus: ≥100 representative events
    • Activation epoch: new rules take effect at epoch N+1 (not immediately)
    • Parity requirement: h_old == h_new for every corpus event both versions admit
    • Divergence set must match proposal’s declared scope or migration is rejected
    • Rollback: if parity fails, old ruleset remains active and proposal is marked rejected:parity
    • Fork trigger: nodes that reject migration automatically fork (link to ι — deferred to Phase 5 wiring)
  • Effort: L

P1.5.3 — Activation Epoch + Rollback

  • Depends on: P1.5.1, P1.5.2
  • Input: rule-engine.md (Rule versioning), docs/spec/s11-rule-engine.md
  • Output: src/domains/rules/activation.ts, src/domains/rules/__tests__/activation.test.ts
  • Acceptance criteria:
    • scheduleActivation(new_version, target_epoch): ActivationToken — target_epoch must be current_epoch + 1 minimum
    • applyActivation(token, current_epoch): void — applies only when current_epoch >= target_epoch
    • rollback(version) — reinstates prior version; emits rollback event
    • Activation journal: append-only log of (epoch, version_hash, cause) tuples
    • Rollback does not retroactively invalidate events admitted under rolled-back version — those stand
    • Rollback during dispute window triggers π governance review hook (hook name only — π not implemented)
  • Effort: M
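
A compact sketch of the schedule, apply, and rollback flow above. It departs from the listed signatures in ways that are assumptions for illustration: the current epoch is passed explicitly to scheduleActivation and rollback, applyActivation returns a boolean instead of void, and rollback reinstates the most recent prior version rather than taking a version argument. The π review hook is not modelled.

```typescript
interface ActivationToken {
  readonly version: string;
  readonly targetEpoch: number;
}

interface JournalEntry {
  readonly epoch: number;
  readonly versionHash: string;
  readonly cause: "activation" | "rollback";
}

class ActivationSchedule {
  private active: string;
  private readonly priorVersions: string[] = [];
  readonly journal: JournalEntry[] = []; // append-only by convention

  constructor(initialVersion: string) {
    this.active = initialVersion;
  }

  get activeVersion(): string {
    return this.active;
  }

  // New rules may take effect at the next epoch at the earliest.
  scheduleActivation(newVersion: string, targetEpoch: number, currentEpoch: number): ActivationToken {
    if (targetEpoch < currentEpoch + 1) throw new RangeError("target_epoch must be at least current_epoch + 1");
    return { version: newVersion, targetEpoch };
  }

  // Applies the token only once the target epoch has been reached.
  applyActivation(token: ActivationToken, currentEpoch: number): boolean {
    if (currentEpoch < token.targetEpoch) return false;
    this.priorVersions.push(this.active);
    this.active = token.version;
    this.journal.push({ epoch: currentEpoch, versionHash: token.version, cause: "activation" });
    return true;
  }

  // Reinstates the prior version; earlier journal entries are left as they are.
  rollback(currentEpoch: number): void {
    const prior = this.priorVersions.pop();
    if (prior === undefined) throw new Error("no prior version to roll back to");
    this.active = prior;
    this.journal.push({ epoch: currentEpoch, versionHash: prior, cause: "rollback" });
  }
}
```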

P1.5.4 — Canonical Serialization

  • Depends on: P1.2.2
  • Input: rule-engine.md (Rule versioning: “canonical serialization of the rule bodies”)
  • Output: src/domains/rules/canonical.ts, src/domains/rules/__tests__/canonical.test.ts
  • Acceptance criteria:
    • canonicalize(ast_or_ruleset): string — produces byte-identical output on any platform
    • Keys sorted alphabetically at every object level
    • No whitespace (single-line JSON)
    • Integer literals preserved exactly (no 1e3 normalization, no leading zeros)
    • String escapes use canonical JSON form (\", \n, \u00XX)
    • Property test: canonicalize(parse(canonicalize(parse(x)))) == canonicalize(parse(x)) — idempotent round-trip
    • No locale dependence (sort uses codepoint order, not locale-aware collation)
  • Effort: M
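
The criteria above can be illustrated with a small recursive serializer. This is a sketch under two assumptions: numbers are limited to safe integers, and keys are sorted with the default Array sort, which compares UTF-16 code units and therefore matches codepoint order only for keys inside the Basic Multilingual Plane. A full implementation would need a codepoint-aware comparator.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    if (typeof value === "number" && !Number.isSafeInteger(value)) {
      throw new TypeError("canonical form only accepts safe integers");
    }
    return JSON.stringify(value); // canonical JSON escapes for strings, plain digits for integers
  }
  if (Array.isArray(value)) {
    // Array order is significant and is preserved.
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  // Default sort is locale-independent (UTF-16 code unit order).
  const keys = Object.keys(value).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json)).join(",") + "}";
}
```

Usage: canonicalize({ b: 1, a: [2, { d: null, c: "x" }] }) and canonicalize({ a: [2, { c: "x", d: null }], b: 1 }) both yield the single-line string {"a":[2,{"c":"x","d":null}],"b":1}, and feeding that string through JSON.parse and canonicalize again returns it unchanged.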

P1.5.5 — Test Corpus Parity Harness

  • Depends on: P1.3.1, P1.5.1
  • Input: rule-engine.md (Test corpus parity requirement)
  • Output: src/domains/rules/parity-harness.ts, src/domains/rules/__tests__/parity-harness.test.ts
  • Acceptance criteria:
    • runParity({old_ruleset, new_ruleset, corpus}): ParityReport
    • Per event: compute effect-set hash h = SHA-256(canonical(effects)) under both versions
    • Report categorizes events: both_admit_same, both_admit_diverge, old_admit_new_reject, old_reject_new_admit, both_reject
    • Pass condition: both_admit_diverge set is empty AND (old_admit_new_reject ∪ old_reject_new_admit) ⊆ declared scope
    • Default corpus of ≥100 hand-curated events shipped with the harness
    • Deterministic: identical inputs → identical report bytes
    • Performance: runs 10k corpus events in < 5 seconds (for CI feedback speed)
  • Effort: L
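
The categorization and pass condition above reduce to a few comparisons per event. In this sketch the effect-set hash is assumed to be precomputed, and the declared scope is assumed to be a set of event IDs; both choices, along with the type names, are illustrative.

```typescript
type ParityBucket =
  | "both_admit_same"
  | "both_admit_diverge"
  | "old_admit_new_reject"
  | "old_reject_new_admit"
  | "both_reject";

interface RuleVerdict {
  admitted: boolean;
  effectHash?: string; // assumed to be SHA-256(canonical(effects)), present when admitted
}

interface ParityCase {
  eventId: string;
  oldVerdict: RuleVerdict;
  newVerdict: RuleVerdict;
}

function classify(c: ParityCase): ParityBucket {
  const o = c.oldVerdict;
  const n = c.newVerdict;
  if (o.admitted && n.admitted) {
    return o.effectHash === n.effectHash ? "both_admit_same" : "both_admit_diverge";
  }
  if (o.admitted) return "old_admit_new_reject";
  if (n.admitted) return "old_reject_new_admit";
  return "both_reject";
}

// Pass: no diverging effects, and every admission change lies inside the declared scope.
function parityPasses(cases: ParityCase[], declaredScope: ReadonlySet<string>): boolean {
  for (const c of cases) {
    const bucket = classify(c);
    if (bucket === "both_admit_diverge") return false;
    const admissionChanged = bucket === "old_admit_new_reject" || bucket === "old_reject_new_admit";
    if (admissionChanged && !declaredScope.has(c.eventId)) return false;
  }
  return true;
}
```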

Phase 2: λ Reputation

P2.1 — Domain Structure

P2.1.1 — Reputation Record Schema

  • Depends on: P1.1.1
  • Input: docs/concepts/λ-reputation.md, docs/guides/implementation/lambda-reputation.md
  • Output: src/domains/reputation/schema.{ext}, database migration
  • Acceptance criteria:
    • 5 domains: execution, commissioning, arbitration, governance, social
    • Per record: node_id, domain, score (integer bps 0-10000), scars (bitmask), ban_until_epoch, last_activity_epoch
    • History table: node_id, domain, epoch, delta, reason, event_id
    • Indexes on (node_id, domain) and (domain, score DESC)
  • Effort: S

P2.1.2 — Score Computation

  • Depends on: P2.1.1, P1.3.1
  • Input: λ-reputation.md (Computation section)
  • Output: src/domains/reputation/compute.{ext}, tests/domains/reputation/compute.test.{ext}
  • Acceptance criteria:
    • compute_score(node_id, domain, events[]) → integer score
    • Score = Σ(acknowledgement_weight × event_outcome) for all events in domain
    • Uses integer-math library for all arithmetic
    • Score capped at 10000 bps (100%) minus scar penalties
    • Property test: score is monotonically non-decreasing with only positive events
  • Effort: M

P2.2 — Decay and Penalties

P2.2.1 — Exponential Decay

  • Depends on: P2.1.1, P1.1.1
  • Input: λ-reputation.md (Decay section), implementation guide
  • Output: src/domains/reputation/decay.{ext}, tests/domains/reputation/decay.test.{ext}
  • Acceptance criteria:
    • Decay applied per-epoch for inactive nodes
    • Rate per domain: execution=500bps, commissioning=300bps, arbitration=1000bps, governance=200bps, social=100bps
    • Formula: new_score = score - apply_bps(score, decay_rate) per inactive epoch
    • Activity in domain resets that domain’s decay counter
    • Batch processing: efficient for 10,000+ nodes per epoch
    • Floor: score cannot go below 0
  • Effort: M
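
A worked instance of the decay formula, using the per-domain rates listed above. The sketch assumes apply_bps rounds down and represents scores as safe-integer numbers; applyBps and decayScore are stand-in names for whatever the integer-math library exposes.

```typescript
const DECAY_RATE_BPS = {
  execution: 500,
  commissioning: 300,
  arbitration: 1000,
  governance: 200,
  social: 100,
} as const;

type ReputationDomain = keyof typeof DECAY_RATE_BPS;

// value * bps / 10000, rounded down (assumed rounding mode).
function applyBps(value: number, bps: number): number {
  return Math.floor((value * bps) / 10_000);
}

// Applies new_score = score - apply_bps(score, rate) once per inactive epoch, floored at 0.
function decayScore(score: number, domain: ReputationDomain, inactiveEpochs: number): number {
  let current = score;
  for (let i = 0; i < inactiveEpochs; i++) {
    current = Math.max(0, current - applyBps(current, DECAY_RATE_BPS[domain]));
  }
  return current;
}
```

For example, an execution score of 10000 decays to 9500 after one inactive epoch and to 9025 after two. With round-down arithmetic, very small scores stop decaying (a social score of 10 loses floor(10 * 100 / 10000) = 0 per epoch), so the rounding mode is worth pinning down in the spec.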

P2.2.2 — Offense Penalties

  • Depends on: P2.1.1, P1.1.1
  • Input: λ-reputation.md (Offense Penalties section)
  • Output: src/domains/reputation/penalties.{ext}, tests/domains/reputation/penalties.test.{ext}
  • Acceptance criteria:
    • Penalty table: minor=1500bps, moderate=3000bps, severe=5000bps, critical=8000bps, fraud=10000bps
    • Scar mechanism: fraud adds permanent cap reduction (score can never exceed 10000 - scar_bps)
    • Ban mechanism: critical+ offense bans from arbitration for N epochs
    • Double jeopardy protection: same event cannot trigger same penalty twice
    • Recovery path: after ban expires, node starts at scar-limited maximum
  • Effort: M
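
A sketch of the penalty table, scar cap, and double-jeopardy check. Several points are assumptions rather than spec: penalties are subtracted as absolute bps from the score, the scar size for fraud is passed in by the caller (the criteria do not fix it), and deduplication keys on event ID plus offense class. Bans are not modelled.

```typescript
const PENALTY_BPS = { minor: 1500, moderate: 3000, severe: 5000, critical: 8000, fraud: 10000 } as const;
type Offense = keyof typeof PENALTY_BPS;

interface ReputationState {
  readonly score: number; // integer bps, 0..10000
  readonly scarBps: number; // permanent cap reduction
  readonly applied: ReadonlySet<string>; // "eventId:offense" keys already penalized
}

// Score can never exceed 10000 - scar_bps.
function capScore(score: number, scarBps: number): number {
  return Math.min(score, 10_000 - scarBps);
}

function applyPenalty(
  state: ReputationState,
  offense: Offense,
  eventId: string,
  fraudScarBps: number, // hypothetical parameter: scar added when offense is fraud
): ReputationState {
  const key = `${eventId}:${offense}`;
  if (state.applied.has(key)) return state; // same event, same penalty: no second hit
  const scarBps = offense === "fraud" ? Math.min(10_000, state.scarBps + fraudScarBps) : state.scarBps;
  const reduced = Math.max(0, state.score - PENALTY_BPS[offense]);
  return {
    score: capScore(reduced, scarBps),
    scarBps,
    applied: new Set(state.applied).add(key),
  };
}
```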

P2.3 — Experience Tokens

P2.3.1 — Token Levels and Minting

  • Depends on: P2.1.1
  • Input: λ-reputation.md (Experience Tokens section), implementation guide
  • Output: src/domains/reputation/tokens.{ext}, tests/domains/reputation/tokens.test.{ext}
  • Acceptance criteria:
    • 5 levels: L0 (Raw), L1 (Episode), L1.5 (Witness), L2a (Correlation), L2b (Proto-causal)
    • L0: auto-minted on event completion
    • L1: requires interaction cycle (commit → deliver → confirm)
    • L1.5: requires witnessing 3+ disputes as observer
    • L2a: requires 5+ repetitions of same interaction pattern
    • L2b: requires L2a + context diversity (3+ categories) + path diversity (3+ counterparties)
    • Tokens are non-transferable, bound to node identity
    • Token count per node queryable by domain
  • Effort: L

P2.4 — Derived Limits

P2.4.1 — Capability Gates

  • Depends on: P2.1.2, P1.3.2
  • Input: λ-reputation.md (Derived Limits section)
  • Output: src/domains/reputation/limits.{ext}, tests/domains/reputation/limits.test.{ext}
  • Acceptance criteria:
    • max_parallel_tasks(rep) = min(sqrt_floor(rep), 20)
    • rate_limit_bonus(rep) = base_rate * log2_floor(max(rep, 1))
    • stake_discount(rep) = required_stake * 10000 / max(rep, 1000) (bps math)
    • can_arbitrate(rep) = rep.arbitration >= 5000 AND rep.execution >= 3000
    • can_govern(rep) = rep.governance >= 4000
    • All computations use integer-math library
  • Effort: M

P2.5 — MCP Tool Surface

P2.5.1 — Reputation Query Tools

  • Depends on: P2.1.2, P2.2.1, P2.2.2, P2.3.1, P2.4.1, P0.3.4
  • Input: λ-reputation.md, docs/guides/implementation/lambda-reputation.md
  • Output: src/domains/reputation/tools.{ext}, tests/domains/reputation/tools.test.{ext}
  • Acceptance criteria:
    • reputation_get(node_id, domain?) → score, scars, ban_until, last_activity per domain
    • reputation_history(node_id, domain, limit) → paginated history events
    • reputation_leaderboard(domain, limit) → top N nodes by score
    • reputation_check_gates(node_id) → capability gate results (can_arbitrate, can_govern, max_parallel_tasks)
    • All tools registered as MCP tools via tool registry (ε)
    • Integration test: create node → apply events → verify score matches hand-calculation
  • Effort: S

Phase 3: θ Consensus

P3.1 — BFT Voting

P3.1.1 — Vote Message Types

  • Depends on: P1.4.1
  • Input: docs/concepts/θ-consensus.md, docs/guides/implementation/theta-consensus.md
  • Output: src/consensus/messages.{ext}, tests/consensus/messages.test.{ext}
  • Acceptance criteria:
    • Message types: PROPOSE, VOTE, COMMIT, VIEW_CHANGE, CHECKPOINT
    • Vote types: ACCEPT, REJECT, ABSTAIN
    • All messages signed with Ed25519
    • Message fields: sender, type, round, payload, signature, timestamp
    • Serialization: canonical JSON (deterministic key order)
    • Deserialization validates all required fields
  • Effort: M

P3.1.2 — Quorum Computation

  • Depends on: P3.1.1
  • Input: θ-consensus.md (quorum math), BFT extraction
  • Output: src/consensus/quorum.{ext}, tests/consensus/quorum.test.{ext}
  • Acceptance criteria:
    • quorum_threshold(n) = floor(2 * n / 3) + 1
    • max_faulty(n) = floor((n - 1) / 3)
    • has_quorum(votes, n) = count(votes.accept) >= quorum_threshold(n)
    • Equivocation detection: same node signs contradicting votes → generate proof
    • Proof format: {node_id, vote_1, vote_2, round} — cryptographic evidence
    • Property test: for n >= 4, any two quorums of size quorum_threshold(n) overlap in at least max_faulty(n) + 1 nodes, so they always share at least one honest node
  • Effort: M
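
The quorum formulas above translate directly into integer arithmetic. The sketch below also includes a simple equivocation scan; it assumes signatures have already been verified and treats two different choices from the same node in the same round as contradicting votes. SignedVote and findEquivocation are illustrative names.

```typescript
function quorumThreshold(n: number): number {
  return Math.floor((2 * n) / 3) + 1;
}

function maxFaulty(n: number): number {
  return Math.floor((n - 1) / 3);
}

function hasQuorum(acceptCount: number, n: number): boolean {
  return acceptCount >= quorumThreshold(n);
}

// Two quorums drawn from n nodes share at least 2q - n members.
function minQuorumOverlap(n: number): number {
  return 2 * quorumThreshold(n) - n;
}

interface SignedVote {
  nodeId: string;
  round: number;
  choice: "ACCEPT" | "REJECT" | "ABSTAIN";
}

interface EquivocationProof {
  node_id: string;
  vote_1: SignedVote;
  vote_2: SignedVote;
  round: number;
}

// Returns the first pair of contradicting votes from one node in one round, if any.
function findEquivocation(votes: SignedVote[]): EquivocationProof | null {
  const firstSeen = new Map<string, SignedVote>();
  for (const vote of votes) {
    const key = `${vote.nodeId}:${vote.round}`;
    const earlier = firstSeen.get(key);
    if (earlier === undefined) {
      firstSeen.set(key, vote);
    } else if (earlier.choice !== vote.choice) {
      return { node_id: vote.nodeId, vote_1: earlier, vote_2: vote, round: vote.round };
    }
  }
  return null;
}
```

For n = 4 the threshold is 3 and one fault is tolerated; for n = 7 the threshold is 5 with two faults tolerated. In both cases the minimum overlap (2 and 3) exceeds the fault bound, which is the property the test should assert for every n.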

P3.1.3 — View Change Protocol

  • Depends on: P3.1.2
  • Input: θ-consensus.md (view change section), theta extraction
  • Output: src/consensus/view-change.{ext}, tests/consensus/view-change.test.{ext}
  • Acceptance criteria:
    • Trigger: primary unresponsive for 2× expected round duration
    • New primary selection: deterministic rotation (round % n)
    • View change message carries highest committed state
    • New primary must prove it has the latest committed state
    • Timeout doubles each failed view change (exponential backoff)
    • Anti-thrashing: minimum 3 rounds before another view change
  • Effort: L
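
The rotation, backoff, and anti-thrashing rules above are small enough to state as pure functions. The function names are hypothetical, nodes are assumed to be indexed 0 to n-1, and "minimum 3 rounds" is read here as at least 3 rounds elapsed since the previous view change.

```typescript
// Deterministic rotation: the primary for a round is round % n.
function primaryForRound(round: number, n: number): number {
  return round % n;
}

// Base trigger is 2x the expected round duration; it doubles after each failed view change.
function viewChangeTimeoutMs(expectedRoundMs: number, failedViewChanges: number): number {
  return 2 * expectedRoundMs * Math.pow(2, failedViewChanges);
}

// Anti-thrashing: require at least 3 rounds since the previous view change.
function mayStartViewChange(currentRound: number, lastViewChangeRound: number): boolean {
  return currentRound - lastViewChangeRound >= 3;
}
```

With a 1000 ms expected round, the first timeout is 2000 ms and the timeout after three failed view changes is 16000 ms.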

P3.2 — Finality Levels

P3.2.1 — Finality State Machine

  • Depends on: P3.1.2
  • Input: θ-consensus.md (Finality Levels), implementation guide
  • Output: src/consensus/finality.{ext}, tests/consensus/finality.test.{ext}
  • Acceptance criteria:
    • States: PENDING → SOFT → QUORUM → HARD → ABSOLUTE
    • PENDING → SOFT: first vote received
    • SOFT → QUORUM: votes >= quorum_threshold(n)
    • QUORUM → HARD: dispute window (100 epochs) elapsed without challenge
    • HARD → ABSOLUTE: appeal window elapsed, fully irreversible
    • No external side effects (payments, exports) before HARD
    • State transitions are monotonic: cannot go backward
    • Each transition recorded with epoch and evidence
  • Effort: L
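
The monotonic state machine above can be enforced by comparing positions in an ordered list. This sketch assumes transitions advance exactly one level at a time (the criteria list each step individually) and leaves the trigger conditions, such as vote counts and window lengths, to the caller.

```typescript
const FINALITY_ORDER = ["PENDING", "SOFT", "QUORUM", "HARD", "ABSOLUTE"] as const;
type FinalityState = (typeof FINALITY_ORDER)[number];

interface FinalityTransition {
  from: FinalityState;
  to: FinalityState;
  epoch: number;
  evidence: string; // opaque reference to votes or window expiry
}

// Only single forward steps are legal; anything else throws.
function advanceFinality(from: FinalityState, to: FinalityState, epoch: number, evidence: string): FinalityTransition {
  const step = FINALITY_ORDER.indexOf(to) - FINALITY_ORDER.indexOf(from);
  if (step !== 1) throw new Error(`illegal finality transition ${from} -> ${to}`);
  return { from, to, epoch, evidence };
}

// External side effects (payments, exports) are allowed from HARD onward.
function allowsExternalEffects(state: FinalityState): boolean {
  return FINALITY_ORDER.indexOf(state) >= FINALITY_ORDER.indexOf("HARD");
}
```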

P3.3 — Gossip Protocol

P3.3.1 — IHAVE/IWANT Messages

  • Depends on: P3.1.1
  • Input: θ-consensus.md (Gossip), implementation guide, theta extraction
  • Output: src/consensus/gossip.{ext}, tests/consensus/gossip.test.{ext}
  • Acceptance criteria:
    • IHAVE: {event_ids[], state_root, rule_version, fork_id}
    • IWANT: {event_ids[]} — request specific events
    • Bloom filter for deduplication (false positive rate < 1%)
    • Adaptive fanout: well-connected nodes gossip to fewer peers
    • Triple-Anchor validation: reject messages where rule_version, state_root, or fork_id don’t match
    • Bandwidth budget: max N bytes/second per peer connection
  • Effort: L

P3.4 — Time Anchors

P3.4.1 — Signed Timestamps

  • Depends on: P3.1.1, P2.1.1
  • Input: θ-consensus.md (Time Anchors), implementation guide
  • Output: src/consensus/time-anchors.{ext}, tests/consensus/time-anchors.test.{ext}
  • Acceptance criteria:
    • Eligible publishers: top N arbiters by arbitration reputation
    • Anchor format: {publisher, timestamp_ms, epoch, signature}
    • Median computation: collect anchors from last K epochs, take median
    • Drift detection: |local_clock - median| > 30_000ms → deprioritize proposals
    • Monotonicity: anchors from same publisher must be non-decreasing
    • Replay protection: anchors with epoch < current_epoch - 10 are rejected
  • Effort: M
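
The median, drift, and replay checks above, as a minimal sketch. Signature verification and publisher eligibility are left out, and for an even number of anchors the median is assumed to be the rounded-down mean of the two middle values; the criteria do not specify that case.

```typescript
const MAX_DRIFT_MS = 30_000;
const REPLAY_WINDOW_EPOCHS = 10;

function medianTimestamp(timestampsMs: number[]): number {
  if (timestampsMs.length === 0) throw new RangeError("no anchors to take a median of");
  const sorted = [...timestampsMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] as number;
  if (sorted.length % 2 === 1) return upper;
  const lower = sorted[mid - 1] as number;
  return Math.floor((lower + upper) / 2); // assumed tie rule for even counts
}

// True when the local clock is more than 30 s away from the anchor median.
function isDrifting(localClockMs: number, medianMs: number): boolean {
  return Math.abs(localClockMs - medianMs) > MAX_DRIFT_MS;
}

// Anchors older than current_epoch - 10 are rejected as replays.
function isReplay(anchorEpoch: number, currentEpoch: number): boolean {
  return anchorEpoch < currentEpoch - REPLAY_WINDOW_EPOCHS;
}

// Anchors from one publisher must be non-decreasing in time.
function isMonotonic(previousMs: number, nextMs: number): boolean {
  return nextMs >= previousMs;
}
```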

P3.5 — Slashing

P3.5.1 — Equivocation Enforcement

  • Depends on: P3.1.2, P2.2.2
  • Input: θ-consensus.md (Slashing Conditions), theta extraction, ADR-003
  • Output: src/consensus/slashing.{ext}, tests/consensus/slashing.test.{ext}
  • Acceptance criteria:
    • apply_equivocation_slash(proof) → calls reputation penalty (P2.2.2) for double-signing node
    • Proof verification: check both contradicting votes carry valid signatures from same node
    • Slash amount: maps to critical offense (8000bps loss) in reputation penalty table
    • Idempotent: same equivocation proof applied twice must not slash twice (proof hash dedup)
    • Slashing recorded in reputation history table with event_id = proof hash
    • Integration test: create equivocation → verify slash applied → verify idempotency
  • Effort: M

Phase 4: μ Integrity Monitor

P4.1.1 — Coercion Trap Detection

  • Depends on: P1.3.1, P2.1.2
  • Input: docs/concepts/μ-integrity-monitor.md
  • Output: src/domains/integrity/coercion-detection.ts, tests/domains/integrity/coercion-detection.test.ts
  • Acceptance criteria:
    • Enumerate all legal actions for a participant given current state
    • For each action, compute outcome via rule engine
    • Flag if: all outcomes negative, or action space is empty
    • Severity levels: INFO, WARNING, CRITICAL
    • Advisory output: {check, result, severity, details, evidence, reasoning_trace}
    • No veto power: detection is advisory only, cannot block actions
  • Effort: L

P4.2.1 — Three Advisory Roles

  • Depends on: P4.1.1
  • Input: μ-integrity-monitor.md (advisory roles section)
  • Output: src/domains/integrity/advisory-roles.ts, tests/domains/integrity/advisory-roles.test.ts
  • Acceptance criteria:
    • Translator: sanitize natural language → structured commands
    • Sentinel: scan events for injection, coercion, axiom drift
    • Guide: explain reputation scores, available actions, consequences
    • All three are strictly read-only
    • Standard output format across all roles
  • Effort: L

Phase 5: ι Fork Protocol

P5.1.1 — Fork ID and Creation

  • Depends on: P3.1.2, P1.4.1
  • Input: docs/concepts/ι-state-fork.md, theta extraction (fork sections)
  • Output: src/domains/fork/index.{ext}, tests/domains/fork/index.test.{ext}
  • Acceptance criteria:
    • Fork ID = SHA-256(parent_fork_id || divergence_event_id || rule_hash || reason)
    • Auto triggers: rule conflict, invariant violation, constitutional violation
    • Manual triggers: voluntary exit, governance rejection
    • Isolation modes: ISOLATED (no data flow), READ_ONLY_PARENT (read parent, can’t write), BRIDGED (selective sync)
    • Fork-scoped state: event log, reputation, tokens, BFT state — all copied at fork point
  • Effort: XL

P5.2.1 — Checkpoint Protocol

  • Depends on: P5.1.1
  • Input: ι-state-fork.md (checkpoint section)
  • Output: src/domains/fork/checkpoints.{ext}, tests/domains/fork/checkpoints.test.{ext}
  • Acceptance criteria:
    • Frequency: every 1000 events OR 100 epochs (whichever first)
    • Signers: top 10 arbiters by reputation, threshold 7/10
    • Content: fork_id, epoch, event_count, state_root, reputation_snapshot, rule_version_hash
    • Fast sync: new nodes download checkpoint + post-checkpoint events
    • Checkpoint chain: each checkpoint references previous checkpoint hash
  • Effort: L

P5.3.1 — Fork Merge

  • Depends on: P5.1.1, P5.2.1
  • Input: ι-state-fork.md (merge section), theta extraction (fork merge)
  • Output: src/domains/fork/merge.{ext}, tests/domains/fork/merge.test.{ext}
  • Acceptance criteria:
    • Find common ancestor fork point
    • Compute event diff between forks
    • Conflict detection: same state key modified in both forks
    • Resolution strategies: timestamp ordering (default), reputation-weighted voting, governance vote
    • Rule conflicts: cannot auto-merge, require governance vote
    • Reputation discount on transition: 50% of source fork reputation (configurable)
    • Both forks must reach quorum agreement for merge to finalize
  • Effort: XL

Phase 6: π Governance

P6.1.1 — Proposal Lifecycle

  • Depends on: P1.3.1, P2.1.1, P3.1.2
  • Output: src/domains/governance/proposals.{ext}, tests/domains/governance/proposals.test.{ext}
  • Acceptance criteria:
    • Proposal types: AX (constitutional), PR (protected rule), GOV (protocol rule)
    • Voting: AX/PR require >80% supermajority, GOV requires >66% quorum
    • AX changes require 3-stage time-locked votes (30-day intervals)
    • Automatic activation at activation_epoch after vote passes
    • Appeal mechanism with cooldown period
  • Effort: XL

P6.2.1 — Governance Limits

  • Depends on: P6.1.1
  • Input: MASTER-TASKS.md P6.2 section
  • Output: src/domains/governance/limits.{ext}, tests/domains/governance/limits.test.{ext}
  • Acceptance criteria:
    • Max delta: ±10% per 6 months for any numeric parameter
    • Constitutional pegs: ±30% from genesis requires 3-stage supermajority
    • Cooldown: 1 epoch between changes to same parameter
    • Stability: max 2 parameters per domain changed simultaneously
    • Entropy injection: for >5% delta, 10% of votes (VRF-selected) count as equal-weight
  • Effort: L

P6.3.1 — Axiom Enforcement

  • Depends on: P6.1.1, P4.1.1
  • Input: promises.md (system guarantees), S01 spec (7 constitutional axioms)
  • Output: src/governance/axiom-enforcement.{ext}, tests/governance/axiom-enforcement.test.{ext}
  • Acceptance criteria:
    • AX-01 (Append-only): all delete operations rejected; corrections via new events
    • AX-02 (Derived reputation): no admin reset; only rule engine can change reputation
    • AX-03 (No absolute authority): all roles subject to consequences
    • AX-04 (Consequence windows): sanction intent → admission → voluntary → automated
    • AX-05 (Subjective finality): per local rule engine, not global consensus
    • AX-06 (Right to exit): fork allowed; penalty capped at 10%
    • AX-07 (Technical sovereignty): row-level security, no cross-workspace leakage
    • Each axiom has a guard function: check_axiom_N(proposed_action) → {pass, violation_details}
  • Effort: L

Phase 7: ξ Identity — Digital Soul Vector

Status: Spec-only. No implementation tasks until Phase 6 is complete. Concept: docs/3-world/social/identity.md. Extraction: docs/reference/extractions/xi-identity-extraction.md.

Phase 7 implements the Digital Soul Vector — a persistent identity fabric for agents and nodes.

P7.1.1 — Identity Schema

  • Depends on: P0.2.2, P2.1.1, P3.1.2
  • Input: docs/concepts/ξ-identity.md, xi-identity-extraction.md
  • Output: src/domains/identity/schema.ts, tests/domains/identity/schema.test.ts
  • Acceptance criteria:
    • 8 identity domains: contribution, governance, reputation, behavior, skills, relationships, history, sovereignty
    • 7 character traits (immutable at genesis): curiosity, reliability, fairness, courage, wisdom, creativity, empathy
    • Identity hash: SHA-256 of canonical genesis record (immutable after creation)
    • Soul-Bound Token (SBT) concept: identity cannot be transferred or sold
    • Schema stored in identities table with Ed25519 public key as primary identifier
  • Effort: L

P7.1.2 — Identity Binding (VRF + Ed25519)

  • Depends on: P7.1.1, ADR-002 decision
  • Input: xi-identity-extraction.md (binding section), ADR-002-vrf-implementation.md
  • Output: src/domains/identity/binding.ts, tests/domains/identity/binding.test.ts
  • Acceptance criteria:
    • Ed25519 keypair generation: generateIdentityKeyPair() → { publicKey, privateKey }
    • Identity proof: proveIdentity(privateKey, challenge) → Ed25519 signature
    • VRF integration: generateUnpredictableEntropy(privateKey, epoch) → VRF output
    • Binding is permanent: once public key registered, cannot be reassigned
    • Test: sign + verify roundtrip; VRF output deterministic for same inputs
  • Effort: L

P7.1.3 — Soul Vector Accumulation

  • Depends on: P7.1.1, P2.1.1, P3.1.2
  • Input: xi-identity-extraction.md (accumulation section)
  • Output: src/domains/identity/accumulator.ts, tests/domains/identity/accumulator.test.ts
  • Acceptance criteria:
    • Each completed task updates identity’s contribution domain
    • Each governance vote updates governance domain
    • Reputation scores flow from λ into identity’s reputation domain
    • getSoulVector(identityId) → 8-domain snapshot at current epoch
    • Soul vector is read-only via API; only rule engine can modify it (AX-02 protection)
  • Effort: L

Task Summary

| Phase | Tasks | Effort | Depends on |
|---|---|---|---|
| P0 Bootstrap | 28 tasks | 4-6 weeks | |
| P1 κ Rule Engine | 10 tasks | 3-4 weeks | P0 |
| P2 λ Reputation | 7 tasks | 2-3 weeks | P0, P1 |
| P3 θ Consensus | 7 tasks | 5-6 weeks | P0, P1, P2 |
| P4 μ Integrity | 2 tasks | 2 weeks | P1, P2, P3 |
| P5 ι Fork | 3 tasks | 3-4 weeks | P0, P3 |
| P6 π Governance | 3 tasks | 3-4 weeks | All |
| P7 ξ Identity | 3 tasks | 3-4 weeks | P0, P2, P3, P6 |
| Total | 63 tasks | 25-32 weeks | |

R57 expansion: Phase 0 grew from 9 high-level bullets → 28 granular tasks with acceptance criteria. Total task count: 32 (pre-R57) → 63 tasks (post-R57). Phase 7 ξ Identity defined for the first time.

Dependency Graph (Mermaid)

graph TD
    P1.1.1[P1.1.1 Integer Math] --> P1.1.2[P1.1.2 Determinism Harness]
    P1.2.1[P1.2.1 Lexer] --> P1.2.2[P1.2.2 Parser]
    P1.2.2 --> P1.2.3[P1.2.3 Validator]
    P1.2.2 --> P1.3.1[P1.3.1 Eval Loop]
    P1.1.1 --> P1.3.1
    P1.3.1 --> P1.3.2[P1.3.2 Builtins]
    P1.3.1 --> P1.3.3[P1.3.3 State Access]
    P1.2.2 --> P1.5.1[P1.5.1 Version Hash]
    P1.5.1 --> P1.5.2[P1.5.2 Migration]
    P1.3.1 --> P1.5.2

    P1.1.1 --> P2.1.1[P2.1.1 Rep Schema]
    P2.1.1 --> P2.1.2[P2.1.2 Score Compute]
    P1.3.1 --> P2.1.2
    P2.1.1 --> P2.2.1[P2.2.1 Decay]
    P1.1.1 --> P2.2.1
    P2.1.1 --> P2.2.2[P2.2.2 Penalties]
    P2.1.1 --> P2.3.1[P2.3.1 Tokens]
    P2.1.2 --> P2.4.1[P2.4.1 Limits]
    P1.3.2 --> P2.4.1

    P1.4.1 --> P3.1.1[P3.1.1 Vote Messages]
    P3.1.1 --> P3.1.2[P3.1.2 Quorum]
    P3.1.2 --> P3.1.3[P3.1.3 View Change]
    P3.1.2 --> P3.2.1[P3.2.1 Finality SM]
    P3.1.1 --> P3.3.1[P3.3.1 Gossip]
    P3.1.1 --> P3.4.1[P3.4.1 Time Anchors]
    P2.1.1 --> P3.4.1

    P1.3.1 --> P4.1.1[P4.1.1 Coercion]
    P2.1.2 --> P4.1.1
    P4.1.1 --> P4.2.1[P4.2.1 Advisory Roles]

    P3.1.2 --> P5.1.1[P5.1.1 Fork Create]
    P1.4.1 --> P5.1.1
    P5.1.1 --> P5.2.1[P5.2.1 Checkpoints]
    P5.1.1 --> P5.3.1[P5.3.1 Fork Merge]
    P5.2.1 --> P5.3.1

    P1.3.1 --> P6.1.1[P6.1.1 Proposals]
    P2.1.1 --> P6.1.1
    P3.1.2 --> P6.1.1
    P6.1.1 --> P6.2.1[P6.2.1 Gov Limits]
    P6.1.1 --> P6.3.1[P6.3.1 Axiom Guards]
    P4.1.1 --> P6.3.1

Agent Execution Protocol

When an AI agent picks up a task from this list:

  1. Read the Input files listed for the task
  2. Create the Output files at the specified paths (adjust extension for chosen stack)
  3. Run all acceptance criteria as tests
  4. Record completion via task_update and thought_record
  5. Do not start a task whose dependencies are incomplete
  6. Do not use floating-point arithmetic in any κ/λ/θ computation path
  7. Do not introduce non-deterministic operations (random, clock, I/O) in rule evaluation


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
