Colibri — Deep Task Breakdown for Agent Execution

Purpose: Fine-grained task definitions for AI agent execution (all axes). Source: docs/colibri-system.md, concept docs (α–π), donor extractions (heritage-only). Stack: TypeScript 5.3+ · @modelcontextprotocol/sdk · Zod 4 · better-sqlite3 · Jest (ESM).

Quick Start for agents: This file describes what to do. For how to do it (copy-paste ready prompts), see task-prompts/ — each Phase 0 group (P0.1 through P0.9) has a corresponding prompt file. For the dependency DAG and critical path, see task-dependency-graph.md. For the round-by-round schedule (R74 → R80), see ../../5-time/roadmap.md.

Phase 0 shape

Phase 0 is 28 sub-tasks numbered P0.1.1 – P0.9.3. The numbering is locked; no slice may introduce new sub-tasks into this range. Numbering:

Group	Sub-tasks	Concept
P0.1	P0.1.1 – P0.1.4	Project infrastructure
P0.2	P0.2.1 – P0.2.4	α System Core
P0.3	P0.3.1 – P0.3.4	β Task Pipeline
P0.4	P0.4.1 – P0.4.2	γ Server Lifecycle
P0.5	P0.5.1 – P0.5.2	δ Model Router (Phase 0 stubs; full routing → Phase 1.5)
P0.6	P0.6.1 – P0.6.3	ε Skill Registry
P0.7	P0.7.1 – P0.7.3	ζ Decision Trail
P0.8	P0.8.1 – P0.8.3	η Proof Store
P0.9	P0.9.1 – P0.9.3	ν Integrations

Each sub-task has an acceptance criteria list, an effort estimate, and explicit dependencies. Every sub-task lists target files — none of them exist yet. colibri_code: none holds for all 15 Greek concepts.

Heritage note: Donor algorithm pseudocode for every concept is in docs/reference/extractions/. Those files are references, not copy sources. Colibri is a full rewrite; code is earned by the 5-step executor chain (audit → contract → packet → implement → verify), not transcribed.

Phase 0: Colibri Bootstrap (Execution + Intelligence Axis)

These tasks implement the Execution and Intelligence axes from scratch (TypeScript rewrite):

P0.1 — Project Infrastructure

P0.1.1 — Package Setup

Depends on: nothing
Output: package.json, tsconfig.json, .eslintrc.json, .prettierrc, .env.example
Acceptance criteria:
- package.json: "type": "module", "engines": {"node": ">=20"}, ESM-first
- TypeScript 5.3+: strict: true, target: ES2022, module: NodeNext
- tsx for dev; tsc for production build
- .env.example documents the Phase 0 COLIBRI_* floor (COLIBRI_DB_PATH, COLIBRI_LOG_LEVEL, COLIBRI_STARTUP_TIMEOUT_MS). Additional variables are earned by later concepts; no AMS_* variables.
- .gitignore excludes node_modules/, dist/, .env, data/colibri.db, data/ams.db
Effort: S

P0.1.2 — Test Runner + Linter

Depends on: P0.1.1
Input: nothing
Output: jest.config.ts, eslint.config.ts, src/__tests__/smoke.test.ts
Acceptance criteria:
- npm test runs Jest with ESM transform
- npm run lint runs ESLint with zero errors on empty codebase
- npm run build compiles TypeScript to dist/ with no errors
- Smoke test: smoke.test.ts asserts 1 + 1 === 2 (verifies test harness works)
- Code coverage report generated (--coverage flag)
Effort: S

P0.1.3 — CI Pipeline

Depends on: P0.1.2
Input: .github/workflows/ci.yml (existing; may need update)
Output: .github/workflows/ci.yml updated for TypeScript
Acceptance criteria:
- Runs on push to any branch and on PR to main
- Steps: npm ci → npm run lint → npm test → npm run build
- Node.js 20+ matrix
- Fails if any step fails
- Uploads coverage report artifact
Effort: S

P0.1.4 — Environment Validation

Depends on: P0.1.1
Output: src/config.ts, tests/config.test.ts
Acceptance criteria:
- Zod schema validates the Phase 0 COLIBRI_* floor on startup
- Missing required var → throws with human-readable message listing the missing key
- Optional vars have typed defaults (COLIBRI_LOG_LEVEL=info, COLIBRI_STARTUP_TIMEOUT_MS=30000)
- COLIBRI_LOG_LEVEL accepted values: silent | error | warn | info | debug
- NODE_ENV accepted values: development | test | production
- Export config object (typed, not raw process.env)
- Reading any AMS_* variable is a lint/test failure — the donor namespace is not supported
Effort: S
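
Illustrative sketch: the validation behaviour in the P0.1.4 criteria can be approximated as follows. This is a hand-rolled stand-in so the example stays dependency-free; the task itself specifies a Zod schema. The names loadConfig, Config, and Env are hypothetical, and the sketch assumes COLIBRI_DB_PATH is the only required key, since the criteria give defaults for the other two.

```typescript
// Hypothetical stand-in for the Zod schema; names are illustrative only.
const LOG_LEVELS = ["silent", "error", "warn", "info", "debug"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface Config {
  dbPath: string;
  logLevel: LogLevel;
  startupTimeoutMs: number;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): Config {
  // Assumption: COLIBRI_DB_PATH is the only required key in the Phase 0 floor.
  const dbPath = env["COLIBRI_DB_PATH"];
  if (dbPath === undefined || dbPath === "") {
    throw new Error("Missing required environment variable: COLIBRI_DB_PATH");
  }

  // Optional keys fall back to the defaults named in the criteria.
  const rawLevel = env["COLIBRI_LOG_LEVEL"] ?? "info";
  if (!(LOG_LEVELS as readonly string[]).includes(rawLevel)) {
    throw new Error(
      `COLIBRI_LOG_LEVEL must be one of ${LOG_LEVELS.join(" | ")}, got "${rawLevel}"`,
    );
  }

  const rawTimeout = env["COLIBRI_STARTUP_TIMEOUT_MS"] ?? "30000";
  const startupTimeoutMs = Number(rawTimeout);
  if (!Number.isInteger(startupTimeoutMs) || startupTimeoutMs <= 0) {
    throw new Error(
      `COLIBRI_STARTUP_TIMEOUT_MS must be a positive integer, got "${rawTimeout}"`,
    );
  }

  // Callers receive a typed object, never raw process.env.
  return { dbPath, logLevel: rawLevel as LogLevel, startupTimeoutMs };
}
```

A caller would typically run loadConfig(process.env) once at startup and pass the typed result down.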

P0.2 — α System Core

P0.2.1 — MCP Server Bootstrap

Depends on: P0.1.2
Output: src/server.ts, tests/server.test.ts
Acceptance criteria:
- McpServer created with name: "colibri", version from package.json
- StdioServerTransport is the only transport in Phase 0 per S17. No HTTP, no WebSocket.
- Server exports registerTool(name, schema, handler) helper that composes the five-stage α middleware chain (tool-lock → schema validate → audit enter → dispatch → audit exit)
- At least 1 registered tool: server/ping → returns { status: "ok", version }
- npm test passes with MCP handshake integration test
Effort: M
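
Illustrative sketch: one way to read the five-stage chain in P0.2.1 is as a wrapper that registerTool builds around each handler. The example below is synchronous and uses a plain validator function in place of a Zod schema; the lock set, the in-memory audit log, the callTool helper, and the "0.0.0" version string are all assumptions made for illustration, not the MCP SDK's API.

```typescript
// Illustrative only: stage internals (lock set, audit sink, validator shape) are assumptions.
type Handler<I, O> = (input: I) => O;
type Validator<I> = (raw: unknown) => I; // expected to throw on invalid input

const lockedTools = new Set<string>(); // hypothetical tool-lock state
const auditLog: string[] = []; // hypothetical in-memory audit sink
const registry = new Map<string, (raw: unknown) => unknown>();

function registerTool<I, O>(
  name: string,
  validate: Validator<I>,
  handler: Handler<I, O>,
): void {
  registry.set(name, (raw: unknown) => {
    // Stage 1: tool-lock
    if (lockedTools.has(name)) throw new Error(`tool locked: ${name}`);
    // Stage 2: schema validate
    const input = validate(raw);
    // Stage 3: audit enter
    auditLog.push(`enter ${name}`);
    try {
      // Stage 4: dispatch
      return handler(input);
    } finally {
      // Stage 5: audit exit (runs even if the handler throws)
      auditLog.push(`exit ${name}`);
    }
  });
}

function callTool(name: string, raw: unknown): unknown {
  const tool = registry.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool(raw);
}

// Placeholder version string; the task reads the real one from package.json.
registerTool("server/ping", () => ({}), () => ({ status: "ok", version: "0.0.0" }));
```

Putting the audit exit in a finally block means a failed dispatch still leaves a matching enter/exit pair in the log.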

P0.2.2 — SQLite Initialization

Depends on: P0.1.4
Input: docs/architecture/data-model.md §2 (earning rule), docs/reference/extractions/alpha-system-core-extraction.md (donor pseudocode, heritage-only)
Output: src/db/index.ts, src/db/schema.sql, tests/db/init.test.ts
Acceptance criteria:
- Uses better-sqlite3 (sync API)
- schema.sql ships only an empty header + the first migration slot. Tables are added by their owning concept’s P0 sub-task per docs/architecture/data-model.md §2 (β: tasks, task_transitions; ε: skills; ζ: thoughts, actions; η: merkle_nodes, merkle_roots; ν: sync_log). No “78 tables” target.
- initDb(path) function: creates DB if not exists, applies all numbered migrations in order, returns Database instance
- Idempotent: calling initDb() twice does not fail or duplicate data
- WAL mode enabled: PRAGMA journal_mode=WAL
- Foreign keys enabled: PRAGMA foreign_keys=ON
- PRAGMA integrity_check runs at startup and fails boot on any error
- Test: fresh DB passes integrity check
Effort: L

P0.2.3 — Two-Phase Startup

Depends on: P0.2.1, P0.2.2
Input: alpha-system-core-extraction.md (startup section), gamma-server-lifecycle-extraction.md
Output: src/startup.ts, tests/startup.test.ts
Acceptance criteria:
- Phase 1 (transport): MCP transport ready, health check responds, DB not yet loaded
- Phase 2 (heavy init): DB initialized, all tools registered, all domains loaded
- startup() returns only after Phase 2 completes
- If Phase 2 fails, server shuts down gracefully (no hanging process)
- Startup time logged: console.error("Startup complete in {ms}ms")
- Test: mock Phase 2 failure → verify clean shutdown
Effort: M

P0.2.4 — Health Check Tool

Depends on: P0.2.3
Input: nothing
Output: src/tools/health.ts, tests/tools/health.test.ts
Acceptance criteria:
- Tool name: server/health
- Returns: { status, version, uptime_ms, db_tables, phase, mode }
- db_tables: count of SQLite tables (verifies schema loaded correctly)
- phase: "phase1" | "phase2"
- mode: current runtime mode string
- Response time < 100ms
Effort: S

P0.3 — β Task Pipeline

P0.3.1 — β Task Pipeline State Machine

Depends on: P0.2.2
Input: docs/colibri-system.md §6.3 (canonical FSM), docs/concepts/β-task-pipeline.md
Output: src/domains/tasks/state-machine.ts, tests/domains/tasks/state-machine.test.ts
Acceptance criteria:
- 7 pipeline states defined exactly as in colibri-system.md §6.3: INIT → GATHER → ANALYZE → PLAN → APPLY → VERIFY → DONE, plus CANCELLED as a terminal side-branch reachable from any non-terminal state.
- Transition map matches the canonical diagram. Unlisted transitions throw InvalidTransitionError with {from, to, taskId}.
- transition(task, newState) → returns updated task or throws
- canTransition(from, to) → boolean (no side effects)
- DONE and CANCELLED are terminal; any transition out of them throws
- 100% branch coverage (all valid transitions, all invalid transitions, both terminal exits)
Effort: S
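
Illustrative sketch of the transition map and the two functions named in P0.3.1. It assumes the canonical diagram is the strictly linear chain plus CANCELLED from every non-terminal state; colibri-system.md §6.3 remains the authority, so any extra edges there (for example retries from VERIFY) would need to be added to the map.

```typescript
// Assumption: linear chain plus CANCELLED; verify against colibri-system.md §6.3.
type State =
  | "INIT" | "GATHER" | "ANALYZE" | "PLAN" | "APPLY" | "VERIFY" | "DONE" | "CANCELLED";

const TRANSITIONS: Record<State, readonly State[]> = {
  INIT: ["GATHER", "CANCELLED"],
  GATHER: ["ANALYZE", "CANCELLED"],
  ANALYZE: ["PLAN", "CANCELLED"],
  PLAN: ["APPLY", "CANCELLED"],
  APPLY: ["VERIFY", "CANCELLED"],
  VERIFY: ["DONE", "CANCELLED"],
  DONE: [], // terminal
  CANCELLED: [], // terminal
};

class InvalidTransitionError extends Error {
  from: State;
  to: State;
  taskId: string;
  constructor(from: State, to: State, taskId: string) {
    super(`Invalid transition ${from} -> ${to} for task ${taskId}`);
    this.name = "InvalidTransitionError";
    this.from = from;
    this.to = to;
    this.taskId = taskId;
  }
}

interface Task {
  id: string;
  state: State;
}

// Pure lookup with no side effects.
function canTransition(from: State, to: State): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns a new task object; the input task is left unchanged.
function transition(task: Task, newState: State): Task {
  if (!canTransition(task.state, newState)) {
    throw new InvalidTransitionError(task.state, newState, task.id);
  }
  return { ...task, state: newState };
}
```

Keeping the map as data makes the 100% branch-coverage criterion straightforward: a test can iterate every (from, to) pair and compare against the table.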

Heritage note: The AMS donor task store used a kanban-style lifecycle (backlog | todo | in_progress | blocked | review | done | cancelled). That vocabulary survives at the PM-facing level (see CLAUDE.md §5 — “Only todo tasks are executable”) while data/ams.db remains the task store during Phase 0 bootstrap. The β execution FSM inside Colibri is the canonical INIT..DONE pipeline above, not the donor lifecycle. Mapping between the two belongs to ν Integrations, not β.

P0.3.2 — Task CRUD

Depends on: P0.3.1
Input: beta-task-pipeline-extraction.md (CRUD section)
Output: src/domains/tasks/repository.ts, tests/domains/tasks/repository.test.ts
Acceptance criteria:
- createTask(input): inserts into tasks table, returns task with generated id (UUID v4)
- getTask(id): returns task or null
- updateTask(id, patch): partial update, returns updated task
- deleteTask(id): soft delete (sets deleted_at)
- listTasks({ status?, project_id?, limit?, offset? }): filtered + paginated
- All operations use better-sqlite3 prepared statements (no string interpolation)
- Test: CRUD roundtrip with all fields
Effort: M

P0.3.3 — Writeback Contract Enforcement

Depends on: P0.3.2
Input: beta-task-pipeline-extraction.md (writeback section)
Output: src/domains/tasks/writeback.ts, tests/domains/tasks/writeback.test.ts
Acceptance criteria:
- writebackRequired(taskId): returns true if task is done but lacks thought_record
- enforceWriteback(taskId): throws WritebackRequiredError if writeback not complete
- Runtime blocking: any tool that moves task to done MUST call enforceWriteback before returning
- WritebackRequiredError includes taskId, missing_fields[]
- Test: marking task done without thought_record → error; with thought_record → success
Effort: S
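
Illustrative sketch of the writeback check in P0.3.3. An in-memory Map stands in for the tasks table, and the TaskRow shape is an assumption; only the function names, the error name, and the taskId / missing_fields payload come from the criteria.

```typescript
// Hypothetical row shape; the real schema is owned by the tasks migration.
interface TaskRow {
  id: string;
  status: string;
  thought_record?: string | null;
}

class WritebackRequiredError extends Error {
  taskId: string;
  missing_fields: string[];
  constructor(taskId: string, missingFields: string[]) {
    super(`Writeback required for task ${taskId}: missing ${missingFields.join(", ")}`);
    this.name = "WritebackRequiredError";
    this.taskId = taskId;
    this.missing_fields = missingFields;
  }
}

// In-memory stand-in for the tasks table (assumption for the example).
const taskTable = new Map<string, TaskRow>();

function writebackRequired(taskId: string): boolean {
  const task = taskTable.get(taskId);
  if (!task) throw new Error(`task not found: ${taskId}`);
  // True only when the task is done and no thought_record is attached.
  return task.status === "done" && !task.thought_record;
}

function enforceWriteback(taskId: string): void {
  if (writebackRequired(taskId)) {
    throw new WritebackRequiredError(taskId, ["thought_record"]);
  }
}
```

Any tool that moves a task to done would call enforceWriteback(taskId) as its last step before returning.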

P0.3.4 — Task Tools (MCP surface)

Depends on: P0.3.3, P0.2.1
Input: beta-task-pipeline-extraction.md (tools section)
Output: src/tools/tasks.ts, tests/tools/tasks.test.ts
Acceptance criteria:
- task_create tool: Zod input schema, calls createTask, returns task
- task_get tool: returns task or { error: "not_found" }
- task_update tool: partial update; validates new state via FSM
- task_list tool: supports status, limit, offset filters
- task_next_actions tool: returns list of unblocked todo tasks sorted by priority
- task_update with status: "done" triggers writeback enforcement
- All tools have Zod input validation (invalid input → structured error, not crash)
Effort: M

P0.4 — γ Server Lifecycle

P0.4.1 — Runtime Mode Enum

Depends on: P0.1.4
Input: docs/concepts/γ-server-lifecycle.md, docs/reference/extractions/gamma-server-lifecycle-extraction.md (heritage-only)
Output: src/modes.ts, tests/modes.test.ts
Acceptance criteria:
- 4 modes in Phase 0: FULL | READONLY | TEST | MINIMAL. (The donor WATCH mode is excluded — file watching is not in Phase 0 α scope.)
- detectMode(): reads COLIBRI_MODE env var, defaults to FULL
- Each mode has a capability set: { canWrite, canRunTests, heavyInit }
- READONLY: canWrite=false, all write tools return { ok: false, error: { code: "ERR_READONLY" } }
- MINIMAL: no heavy init; only unified_vitals is registered
- Test: all 4 modes have correct capability sets
- No AMS_MODE fallback — donor namespace is not supported
Effort: S
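
Illustrative sketch of the mode table and detectMode from P0.4.1. Only two flag values are stated in the criteria (READONLY has canWrite=false, MINIMAL has no heavy init); every other flag below is an assumption, as is the choice to throw on an unrecognised COLIBRI_MODE value and the helper name rejectIfReadOnly.

```typescript
type Mode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

interface Capabilities {
  canWrite: boolean;
  canRunTests: boolean;
  heavyInit: boolean;
}

// Only READONLY.canWrite and MINIMAL.heavyInit come from the criteria; the rest are assumed.
const CAPABILITIES: Record<Mode, Capabilities> = {
  FULL: { canWrite: true, canRunTests: true, heavyInit: true },
  READONLY: { canWrite: false, canRunTests: false, heavyInit: true },
  TEST: { canWrite: true, canRunTests: true, heavyInit: true },
  MINIMAL: { canWrite: false, canRunTests: false, heavyInit: false },
};

function detectMode(env: Record<string, string | undefined>): Mode {
  const raw = env["COLIBRI_MODE"];
  if (raw === undefined || raw === "") return "FULL"; // documented default
  if (Object.prototype.hasOwnProperty.call(CAPABILITIES, raw)) return raw as Mode;
  // Assumption: unknown values are rejected rather than silently mapped to FULL.
  throw new Error(`Unknown COLIBRI_MODE: ${raw}`);
}

type ReadOnlyRejection = { ok: false; error: { code: "ERR_READONLY" } };

// Hypothetical guard a write tool could call first; returns null when writing is allowed.
function rejectIfReadOnly(mode: Mode): ReadOnlyRejection | null {
  return CAPABILITIES[mode].canWrite
    ? null
    : { ok: false, error: { code: "ERR_READONLY" } };
}
```

AMS_MODE is deliberately never read, matching the last criterion.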

P0.4.2 — Graceful Shutdown

Depends on: P0.2.3
Input: gamma-server-lifecycle-extraction.md (shutdown section)
Output: src/shutdown.ts, tests/shutdown.test.ts
Acceptance criteria:
- registerShutdownHandler(fn): registers a cleanup function
- On SIGINT / SIGTERM: calls all handlers in reverse registration order
- DB connection closed before process exit
- In-flight MCP requests allowed to complete (max 5s timeout then force-exit)
- Exit code 0 on clean shutdown, 1 on error during shutdown
- Test: mock SIGTERM → verify DB close + handler called
Effort: S
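
Illustrative sketch of the reverse-order handler registry from P0.4.2. To keep the example short and testable the handlers are synchronous and the 5s in-flight timeout is omitted; a real implementation would most likely accept async handlers. The function name runShutdownHandlers is an assumption.

```typescript
type Cleanup = () => void;

const shutdownHandlers: Cleanup[] = [];

function registerShutdownHandler(fn: Cleanup): void {
  shutdownHandlers.push(fn);
}

// Runs handlers in reverse registration order and returns the process exit code:
// 0 when every handler succeeded, 1 if any handler threw.
function runShutdownHandlers(): 0 | 1 {
  let exitCode: 0 | 1 = 0;
  for (const handler of [...shutdownHandlers].reverse()) {
    try {
      handler();
    } catch (err) {
      // Keep going so later cleanups (such as closing the DB) still run.
      console.error("shutdown handler failed:", err);
      exitCode = 1;
    }
  }
  return exitCode;
}
```

Wiring this to signals would be a matter of process.on("SIGINT", ...) and process.on("SIGTERM", ...) callbacks that call process.exit(runShutdownHandlers()). Registering the DB close first means it runs last, after everything that depends on the connection.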

P0.5 — δ Model Router

Phase 0 note: Phase 0 library stubs shipped in R75 Wave I per ADR-005 §Decision. The router interface is present (scoring + fallback modules, single-row candidate table); scoring returns a constant (claude: 1.0), fallback has one member (Claude), adapter is Anthropic-only. Full multi-model scoring, N-member fallback, and circuit breaker land in Phase 1.5. No δ-facing MCP tools in the Phase 0 14-tool surface — router_* tools are deferred to Phase 1.5.

P0.5.1 — Intent Scoring Matrix

Status: Phase 0 stub shipped (R75 Wave I, PR #149)
Input: docs/reference/extractions/delta-model-router-extraction.md (heritage-only — algorithm source)
Output: src/domains/router/scoring.ts, src/__tests__/domains/router/scoring.test.ts
Phase 0 shipped: src/domains/router/scoring.ts returns a constant vector ({ claude: 1.0 }) for every input. The module signature matches the Phase 1.5 target so Phase 1.5 is a formula replacement, not an interface rewrite.
Design contract (Phase 1.5):
- scoreIntent(prompt, context) → { scores: Record<ModelId, number>, winner: ModelId } — Phase 1.5: real scoring factors
- Scoring factors: prompt length, complexity keywords, context size, tool requirements — Phase 1.5
- All scores in range [0, 100] (integer) — Phase 1.5
- Deterministic: same input always returns same winner (Phase 0 stub is trivially deterministic — always returns Claude)
- Pure function (no external API calls) — invariant from Phase 0
Effort: M

P0.5.2 — Model Fallback Chain

Status: Phase 0 stub shipped (R75 Wave I, PR #150)
Input: docs/reference/extractions/delta-model-router-extraction.md (heritage-only — algorithm source)
Output: src/domains/router/fallback.ts, src/__tests__/domains/router/fallback.test.ts
Phase 0 shipped: src/domains/router/fallback.ts implements a single-member chain — if Claude fails, the call fails; no cascade. Matches ADR-005 §Decision (“fallback chain has one member”).
Design contract (Phase 1.5):
- Model slots configured via COLIBRI_MODEL_* env vars (count TBD by Phase 1.5) — Phase 1.5
- routeRequest(prompt, context): tries models in priority order — Phase 1.5 multi-member
- On model error / timeout: tries next model in chain — Phase 1.5
- On exhaustion: throws AllModelsFailedError with per-model error log — Phase 1.5 (Phase 0 raises ModelUnavailable on single-member exhaustion)
- Circuit breaker: model marked unavailable for 60s after 3 consecutive failures — Phase 1.5
- No AMS_MODEL_* fallback — donor namespace is not supported (invariant from Phase 0)
Effort: M

P0.6 — ε Skill Registry

P0.6.1 — Skill Schema

Depends on: P0.2.2
Input: epsilon-skill-registry-extraction.md
Output: src/domains/skills/schema.ts, tests/domains/skills/schema.test.ts
Acceptance criteria:
- Zod schema: { name, description, version, entrypoint, capabilities[], greekLetter? }
- name must be kebab-case: /^[a-z][a-z0-9-]+$/
- capabilities enum: ["read", "write", "spawn", "audit", "admin"]
- greekLetter optional: must be one of α β γ δ ε ζ η θ ι κ λ μ ν ξ π
- SKILL.md parser: reads frontmatter + body from existing .agents/skills/*/SKILL.md
- Test: parse all 22 existing skill files, assert zero schema errors
Effort: M

P0.6.2 — Skill CRUD + Discovery

Depends on: P0.6.1
Input: docs/concepts/ε-skill-registry.md, docs/reference/extractions/epsilon-skill-registry-extraction.md (heritage-only)
Output: src/domains/skills/repository.ts, tests/domains/skills/repository.test.ts
Acceptance criteria:
- On startup: scans .agents/skills/*/SKILL.md, parses frontmatter, loads all valid skills into the skills table
- getSkill(name) → skill or null
- listSkills({ search?, capability? }) → filtered list
- skill_list MCP tool (the only ε Phase 0 MCP tool): returns all loaded skills with frontmatter metadata — see S17 §1 Category 3.
- skill_get, skill_reload, hot-reload — not in Phase 0. Deferred to Phase 1.
Effort: M

P0.6.3 — Skill Capability Index

Depends on: P0.6.2
Input: docs/concepts/ε-skill-registry.md
Output: src/domains/skills/capabilities.ts, tests/domains/skills/capabilities.test.ts
Acceptance criteria:
- listByCapability(capability) → skills that declare the capability in frontmatter
- Capability strings are treated as opaque tags (Phase 0 does not enforce an enum)
- Startup warning if any SKILL.md declares a capability not used anywhere else (drift detector)
- Test: seed 3 fake skills with overlapping capabilities; verify filter
Effort: S

Heritage note: The donor ε module supported skill_get, skill_reload, and a spawnAgent sub-process helper. None of these are in Phase 0. There is no src/domains/agents/ directory in the Phase 0 target tree (CLAUDE.md §9.1); agent spawning is deferred to Phase 1.5 with δ Model Router per ADR-005. Phase 0 ε ships the SKILL.md parser and one MCP tool (skill_list).

P0.7 — ζ Decision Trail

P0.7.1 — Hash-Chained Record Schema

Depends on: P0.2.2
Input: zeta-decision-trail-extraction.md
Output: src/domains/trail/schema.ts, tests/domains/trail/schema.test.ts
Acceptance criteria:
- Record schema: { id, type, task_id, agent_id, content, timestamp, prev_hash, hash }
- 4 valid types: plan | analysis | decision | reflection
- hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))
- Canonical JSON: sorted keys, no whitespace (deterministic)
- First record: prev_hash = "0000...0000" (64 zeros)
- Test: two records with identical inputs produce identical hashes
Effort: S
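
Illustrative sketch of the hashing rule in P0.7.1, using Node's built-in crypto module. The recursive canonicalJson helper and the HashedFields type are assumptions about how "sorted keys, no whitespace" might be realised; the field list and the 64-zero genesis value come from the criteria. Timestamps are treated as strings here, which is also an assumption.

```typescript
import { createHash } from "node:crypto";

// prev_hash of the first record in a chain: 64 zeros.
const GENESIS_PREV_HASH = "0".repeat(64);

// The six fields that feed the hash, per the acceptance criteria.
interface HashedFields {
  id: string;
  type: "plan" | "analysis" | "decision" | "reflection";
  task_id: string;
  content: string;
  timestamp: string; // assumption: ISO-8601 string
  prev_hash: string;
}

// Canonical JSON: object keys sorted at every level, no whitespace.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const members = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

function computeRecordHash(fields: HashedFields): string {
  return createHash("sha256").update(canonicalJson(fields), "utf8").digest("hex");
}
```

Because keys are sorted before serialisation, two callers that build the same record with different property order still arrive at the same hash.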

P0.7.2 — Thought Record CRUD

Depends on: P0.7.1
Input: zeta-decision-trail-extraction.md (CRUD section)
Output: src/domains/trail/repository.ts, tests/domains/trail/repository.test.ts
Acceptance criteria:
- createThoughtRecord(input): computes hash, links to previous record’s hash
- getThoughtRecord(id): returns record with hash
- listThoughtRecords({ task_id?, limit? }): returns chain in insertion order
- thought_record MCP tool: Zod input, returns record with computed hash
- thought_record_list MCP tool: returns chain for given task_id
Effort: M

P0.7.3 — Chain Verification Tool

Depends on: P0.7.2
Input: zeta-decision-trail-extraction.md (verification section)
Output: src/domains/trail/verifier.ts, tests/domains/trail/verifier.test.ts
Acceptance criteria:
- verifyChain(records[]): iterates chain, recomputes each hash, checks links
- Returns: { valid: bool, first_broken_at?: id, broken_count: number }
- audit_verify_chain MCP tool: calls verifyChain on full DB chain
- Test: tamper with one record’s content → verify valid: false at correct position
- Test: intact 100-record chain → valid: true in < 500ms
Effort: S
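
Illustrative sketch of verifyChain from P0.7.3. It recomputes each record's hash and checks the prev_hash link, continuing from the stored hash so that a single tampered record is counted once. The appendRecord helper, the string timestamp, and the use of a JSON.stringify key list for canonicalisation (adequate only because the record is flat) are assumptions.

```typescript
import { createHash } from "node:crypto";

interface TrailRecord {
  id: string;
  type: string;
  task_id: string;
  agent_id: string;
  content: string;
  timestamp: string; // assumption: ISO-8601 string
  prev_hash: string;
  hash: string;
}

const GENESIS = "0".repeat(64);
// Hashed fields in sorted order; agent_id is not part of the hash input.
const HASHED_KEYS = ["content", "id", "prev_hash", "task_id", "timestamp", "type"];

function hashRecord(record: Omit<TrailRecord, "hash">): string {
  // An array replacer selects and orders the serialized keys of a flat object.
  const canonical = JSON.stringify(record, HASHED_KEYS);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Hypothetical helper for building a chain in tests.
function appendRecord(
  chain: TrailRecord[],
  fields: Omit<TrailRecord, "hash" | "prev_hash">,
): TrailRecord {
  const last = chain[chain.length - 1];
  const unsigned = { ...fields, prev_hash: last ? last.hash : GENESIS };
  const record: TrailRecord = { ...unsigned, hash: hashRecord(unsigned) };
  chain.push(record);
  return record;
}

function verifyChain(records: TrailRecord[]): {
  valid: boolean;
  first_broken_at?: string;
  broken_count: number;
} {
  let expectedPrev = GENESIS;
  let firstBroken: string | undefined;
  let brokenCount = 0;
  for (const record of records) {
    const { hash, ...unsigned } = record;
    const intact = record.prev_hash === expectedPrev && hashRecord(unsigned) === hash;
    if (!intact) {
      brokenCount += 1;
      if (firstBroken === undefined) firstBroken = record.id;
    }
    expectedPrev = hash; // follow the stored hash to the next link
  }
  return { valid: brokenCount === 0, first_broken_at: firstBroken, broken_count: brokenCount };
}
```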

P0.8 — η Proof Store

P0.8.1 — Merkle Tree Construction

Depends on: P0.7.2
Output: src/domains/proof/merkle.ts, tests/domains/proof/merkle.test.ts
Acceptance criteria:
- Uses merkletreejs package (SHA-256 leaves)
- buildMerkleTree(recordHashes[]) → { root, tree }
- generateProof(tree, leafHash) → proof array
- verifyProof(root, proof, leafHash) → boolean
- Empty tree: root = SHA-256(“”) (defined constant)
- Test: 10-leaf tree → root is deterministic; membership proof verifies correctly
Effort: S

P0.8.2 — Three-Zone Retention

Depends on: P0.8.1, P0.2.2
Input: eta-proof-store-extraction.md (retention section)
Output: src/domains/proof/retention.ts, tests/domains/proof/retention.test.ts
Acceptance criteria:
- Hot zone: last 100 records — full content in DB
- Warm zone: records 101–1000 — content compressed (JSON → gzip → base64)
- Cold zone: records 1001+ — content hash only (full content deleted)
- archiveRecord(id): moves record to next zone based on age/position
- retrieveRecord(id): decompresses if Warm, returns hash stub if Cold
- Test: hot → warm → cold transitions; verify content availability per zone
Effort: M
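
Illustrative sketch of the zone boundaries and the warm-zone encoding (JSON → gzip → base64) from P0.8.2, using Node's built-in zlib. It assumes positions are 1-based and counted from the newest record; the function names zoneFor, compressContent, and decompressContent are hypothetical.

```typescript
import { gunzipSync, gzipSync } from "node:zlib";

type Zone = "hot" | "warm" | "cold";

// Assumption: position 1 is the most recent record.
function zoneFor(position: number): Zone {
  if (!Number.isInteger(position) || position < 1) {
    throw new RangeError(`position must be a positive integer, got ${position}`);
  }
  if (position <= 100) return "hot"; // full content kept in the DB
  if (position <= 1000) return "warm"; // content stored compressed
  return "cold"; // content hash only
}

// Warm-zone encoding: JSON -> gzip -> base64.
function compressContent(content: unknown): string {
  const json = JSON.stringify(content);
  return gzipSync(Buffer.from(json, "utf8")).toString("base64");
}

// Inverse of compressContent: base64 -> gunzip -> JSON.
function decompressContent(encoded: string): unknown {
  const json = gunzipSync(Buffer.from(encoded, "base64")).toString("utf8");
  return JSON.parse(json);
}
```

Cold-zone records cannot be restored from this encoding at all, which is why retrieveRecord is specified to return a hash stub for them.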

P0.8.3 — Merkle Root Finalization Tool

Depends on: P0.8.1
Input: eta-proof-store-extraction.md (finalization section)
Output: src/tools/merkle.ts, tests/tools/merkle.test.ts
Acceptance criteria:
- merkle_finalize MCP tool: builds Merkle tree of last N unfinalized records, stores root
- merkle_root MCP tool: returns current root hash + record count + timestamp
- audit_session_start MCP tool: creates audit session record, returns session_id
- Finalization must happen AFTER final thought record (enforced: errors if no thought_record in session)
- Test: finalize 5-record session → root matches manual computation
Effort: S

P0.9 — ν Integrations

P0.9.1 — MCP Bridge

Depends on: P0.2.1
Input: nu-integrations-extraction.md (bridge section)
Output: src/domains/integrations/mcp-bridge.ts, tests/domains/integrations/mcp-bridge.test.ts
Acceptance criteria:
- McpBridge: wraps outbound MCP client calls to external servers
- connectToServer(url): creates client, returns connected bridge
- callTool(bridge, name, args): calls remote tool, returns result
- Timeout: 30s default, configurable via COLIBRI_MCP_TIMEOUT
- Retry: 3 attempts with exponential backoff on transient errors
- Test: mock MCP server → verify roundtrip tool call
Effort: M

P0.9.2 — Claude API Wrappers

Depends on: P0.1.4
Input: nu-integrations-extraction.md (Claude API section)
Output: src/domains/integrations/claude.ts, tests/domains/integrations/claude.test.ts
Acceptance criteria:
- createCompletion(prompt, options): calls Anthropic API with configured model
- createCompletionWithTools(prompt, tools, options): tool-use completion
- API key from ANTHROPIC_API_KEY env var (declared .optional() in the Zod schema; validated at call-time by createCompletion / createCompletionWithTools, which throw AnthropicConfigError if absent. This is Design Invariant 5 — the server boots cleanly when the key is unset for deployments that don’t use the Claude API integration. R75 Wave H reconciled this acceptance criterion with the shipped code in src/config.ts:79 + src/domains/integrations/claude.ts.)
- Rate limit handling: 429 → exponential backoff, max 3 retries
- All API calls logged with: model, prompt_tokens, completion_tokens, latency_ms
- Test (mock): verify retry logic and logging
Effort: M

P0.9.3 — Notification Channels

Depends on: P0.2.1
Input: nu-integrations-extraction.md (notifications section)
Output: src/domains/integrations/notifications.ts, tests/domains/integrations/notifications.test.ts
Acceptance criteria:
- notify(event, payload): dispatches event to configured channels
- Channels: log (always on), mcp (MCP notification), webhook (optional)
- COLIBRI_WEBHOOK_URL env var enables webhook channel (Phase 0 uses the COLIBRI_* namespace; AMS_* is not read)
- Events: task.completed, merkle.finalized, error.critical (no agent.spawned — agent runtime is deferred per ADR-005)
- Fire-and-forget: notification failures do not block main execution
- Test: verify each channel receives correct payload for each event type
Effort: S

How to Read This Document

Each task has:

ID: P{phase}.{group}.{sub-task} (e.g., P1.2.3)
Depends on: which tasks must be complete first
Input: what the agent reads before starting
Output: exact files to create or modify
Acceptance criteria: testable conditions (pass/fail)
Effort: S (1-2h), M (4-8h), L (1-2d), XL (3-5d)

Phase 1: κ Rule Engine

Phase 1 starts at R81 per docs/5-time/roadmap.md. Ready-to-paste agent prompts for every sub-task below live in task-prompts/p1.1-kappa-rule-engine.md (shipped R76.P1, 2026-04-18).

Structural overview: 5 groups, 20 sub-tasks in total. All output paths target src/domains/rules/... (Phase 0 convention). Concept reference: docs/3-world/physics/laws/rule-engine.md. Algorithm extraction: docs/reference/extractions/kappa-rule-engine-extraction.md.

P1.1 — Integer Math Library (3 sub-tasks)

P1.1.1 — Basis Point Arithmetic

Depends on: nothing
Input: docs/3-world/physics/laws/rule-engine.md (Integer-only arithmetic section), docs/reference/extractions/kappa-rule-engine-extraction.md §3–4
Output: src/domains/rules/integer-math.ts, src/domains/rules/__tests__/integer-math.test.ts
Acceptance criteria:
- All arithmetic uses 64-bit signed integers, no floating point anywhere
- bps_mul(value, bps) → (value * bps) / 10000 (floor division)
- bps_div(value, bps) → (value * 10000) / bps (floor division)
- apply_bps(value, bps) → value - bps_mul(value, bps) (decay variant)
- decay(value, rate_bps, epochs) → multi-epoch compounded decay with per-step floor
- Overflow detection: reject inputs where value * bps would exceed 2^63 - 1
- Underflow: result never goes below 0 for non-negative inputs
- Division by zero: explicit error, not silent wrap
- 100% branch coverage in tests
Effort: S
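
Illustrative sketch of the basis-point helpers in P1.1.1, using BigInt to model 64-bit signed integers. BigInt is arbitrary-precision, so the int64 bound has to be enforced explicitly, and BigInt division truncates toward zero, so a small floorDiv helper is needed for true floor semantics. The helper names checkedMul and floorDiv, the use of RangeError, and the restriction of apply_bps to rates in [0, 10000] are assumptions.

```typescript
const BPS_DENOMINATOR = 10000n;
const MAX_INT64 = (1n << 63n) - 1n;
const MIN_INT64 = -(1n << 63n);

// Multiplication that rejects results outside the signed 64-bit range.
function checkedMul(a: bigint, b: bigint): bigint {
  const product = a * b;
  if (product > MAX_INT64 || product < MIN_INT64) {
    throw new RangeError("int64 overflow");
  }
  return product;
}

// Floor division; BigInt `/` truncates toward zero, so adjust when signs differ.
function floorDiv(a: bigint, b: bigint): bigint {
  if (b === 0n) throw new RangeError("division by zero");
  const quotient = a / b;
  const inexact = a % b !== 0n;
  return inexact && (a < 0n) !== (b < 0n) ? quotient - 1n : quotient;
}

function bps_mul(value: bigint, bps: bigint): bigint {
  return floorDiv(checkedMul(value, bps), BPS_DENOMINATOR);
}

function bps_div(value: bigint, bps: bigint): bigint {
  return floorDiv(checkedMul(value, BPS_DENOMINATOR), bps);
}

// Decay variant: removes `bps` basis points from `value`.
function apply_bps(value: bigint, bps: bigint): bigint {
  // Assumption: rates outside [0, 10000] are rejected so the result cannot go negative.
  if (bps < 0n || bps > BPS_DENOMINATOR) throw new RangeError("bps out of range");
  return value - bps_mul(value, bps);
}

// Compounded decay with a floor applied at every epoch.
function decay(value: bigint, rate_bps: bigint, epochs: number): bigint {
  let current = value;
  for (let i = 0; i < epochs; i++) {
    current = apply_bps(current, rate_bps);
  }
  return current;
}
```

For example, decaying 1000 at 500 bps gives 950 after one epoch and 903 after two, because the second step floors 47.5 down to 47 before subtracting.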

P1.1.2 — Determinism Verification Harness

Depends on: P1.1.1
Input: rule-engine.md (Forbidden operations section)
Output: src/domains/rules/__tests__/determinism.test.ts
Acceptance criteria:
- Property test: for any two runs with identical inputs, outputs are bit-identical
- No Math.random(), Date.now(), process.hrtime(), or equivalent
- No async I/O in computation path
- Fuzz test: 10,000 random input pairs produce identical results under both call orderings
- Static analysis: grep check rejects any use of Math.* or Date.* outside tests
Effort: S

P1.1.3 — BPS Constants + Overflow Protection

Depends on: P1.1.1
Input: rule-engine.md (basis-point conventions), extraction §3 (BPS Constants) + §4 (Overflow Protection)
Output: src/domains/rules/bps-constants.ts, src/domains/rules/__tests__/bps-constants.test.ts
Acceptance criteria:
- Exported constants: BPS_100_PERCENT=10000, BPS_50_PERCENT=5000, BPS_1_PERCENT=100
- Domain decay rates: DECAY_EXECUTION=500, DECAY_COMMISSIONING=300, DECAY_ARBITRATION=1000, DECAY_GOVERNANCE=200, DECAY_SOCIAL=100
- Penalty constants: DAMAGE_MINOR=1500, DAMAGE_MODERATE=3000, DAMAGE_SEVERE=5000, DAMAGE_CRITICAL=8000, DAMAGE_FRAUD=10000
- safe_mul(a, b) returns typed overflow error when |a| > MAX_INT64 / |b|
- safe_div(a, b) returns typed divide-by-zero error when b == 0
- Constants are as const, not let or mutable
Effort: S
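
Illustrative sketch of safe_mul and safe_div from P1.1.3, returning typed errors instead of throwing. The MathResult union and its error tags are assumptions about what "typed overflow error" could look like; the overflow condition is the one stated in the criteria, and the constant values are copied from them. Values are modelled as BigInt here.

```typescript
// Values copied from the acceptance criteria; `as const` keeps the object readonly.
const BPS_CONSTANTS = {
  BPS_100_PERCENT: 10000n,
  BPS_50_PERCENT: 5000n,
  BPS_1_PERCENT: 100n,
} as const;

const INT64_MAX = (1n << 63n) - 1n;

// Hypothetical result type: a discriminated union instead of exceptions.
type MathResult =
  | { ok: true; value: bigint }
  | { ok: false; error: "overflow" | "divide_by_zero" };

function absBig(x: bigint): bigint {
  return x < 0n ? -x : x;
}

function safe_mul(a: bigint, b: bigint): MathResult {
  // Criterion: overflow when |a| > MAX_INT64 / |b| (b = 0 can never overflow).
  if (b !== 0n && absBig(a) > INT64_MAX / absBig(b)) {
    return { ok: false, error: "overflow" };
  }
  return { ok: true, value: a * b };
}

function safe_div(a: bigint, b: bigint): MathResult {
  if (b === 0n) return { ok: false, error: "divide_by_zero" };
  // BigInt division truncates toward zero, which equals floor for non-negative operands.
  return { ok: true, value: a / b };
}
```

Callers narrow on the ok flag before touching value, so an unchecked overflow cannot flow into later arithmetic.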

P1.2 — DSL Parser (4 sub-tasks)

P1.2.1 — Lexer / Tokenizer

Depends on: nothing
Input: rule-engine.md (DSL grammar section), extraction §1 (Full EBNF Grammar), ADR-006-dsl-grammar.md
Output: src/domains/rules/lexer.ts, src/domains/rules/__tests__/lexer.test.ts
Acceptance criteria:
- Uses Chevrotain library (pinned in package.json per ADR-006)
- Token types: KEYWORD, IDENTIFIER, INTEGER, STRING, OPERATOR, DELIMITER, EOF
- Keywords: rule, guards, effects, when, then, if, else, and, or, not, true, false, admit, reject, admission, transition, consequence, promotion
- Operators: ==, !=, >, <, >=, <=, +, -, *, /, %
- Variables: start with $, dot-path dereference $actor.reputation.execution
- Line/column tracking for error messages
- Rejects floating-point literals (e.g., 3.14 is a syntax error)
- Rejects underscore-separated integer literals (1_000_000 invalid)
- Unicode identifiers supported (for future i18n)
Effort: M

P1.2.2 — Parser (Tokens → AST)

Depends on: P1.2.1
Input: rule-engine.md (DSL grammar), extraction §1 (EBNF) + §2 (AST Node Types), ADR-006
Output: src/domains/rules/parser.ts, src/domains/rules/__tests__/parser.test.ts
Acceptance criteria:
- Chevrotain parser built on top of P1.2.1 lexer
- Parses the 4 rule types: Admission, StateTransition, Consequence, Promotion
- AST node types match extraction §2: RuleNode, GuardClause, EffectCall, BinaryOp, UnaryOp, LogicalOp, IntLiteral, BoolLiteral, StringLiteral, VarRef, FuncCall
- Operator precedence: NOT > AND > OR; *, /, % > +, -; comparison > logical
- Guard blocks: guards { <clauses> }; each clause (Expression | else) -> (admit | reject STRING)
- Effect blocks: effects { <calls> }; each call IDENTIFIER ( ArgList )
- Error recovery: recoveryEnabled: true; reports first 5 errors, doesn’t crash on malformed input
- Round-trip test: parse → serialize → parse produces identical AST
- AST cap enforcement: rejects any single rule with > 10,000 AST nodes at parse time
Effort: L

P1.2.3 — AST Validator

Depends on: P1.2.2
Input: rule-engine.md (Forbidden operations table)
Output: src/domains/rules/validator.ts, src/domains/rules/__tests__/validator.test.ts
Acceptance criteria:
- Rejects rules that read local state (clock, filesystem, network, process)
- Rejects rules with randomness (except VRF-input references — $vrf_output)
- Rejects rules with side effects (HTTP calls, file writes, stdout)
- Rejects rules that mutate input events
- Type checking: operands compatible with operators (no int + string)
- Scope checking: variables defined before use
- Cycle detection: no infinite recursion in rule references
- Axiom pre-check: rejects rules that violate AX-01–AX-07 at load time (constitutional axioms)
Effort: M

P1.2.4 — Rule Loader / Registry

Depends on: P1.2.3
Input: rule-engine.md (Rule application algorithm), extraction §8 (process_action)
Output: src/domains/rules/registry.ts, src/domains/rules/__tests__/registry.test.ts
Acceptance criteria:
- loadRuleset(source: string): RuleRegistry — parses + validates + indexes a source file
- Registry sorts rules by specificity: (a) guard term count descending, (b) declaration order
- Specificity ties at load time → explicit AmbiguousRulesetError — refuse boot
- getRule(name: string): RuleNode | null — named lookup
- getByTransitionType(type: TransitionType): RuleNode[] — indexed lookup by one of the 13 transition types from extraction §7
- computeVersionHash(): string — delegates to P1.5.1 canonical serializer
- Load-time error aggregation: reports all validator errors in one pass, not just first
Effort: M

P1.3 — Deterministic Interpreter (4 sub-tasks)

P1.3.1 — Core Evaluation Loop

Depends on: P1.2.2, P1.1.1
Input: rule-engine.md (Rule application algorithm, Evaluation budget), extraction §5 (Rule Execution Flow)
Output: src/domains/rules/engine.ts, src/domains/rules/__tests__/engine.test.ts
Acceptance criteria:
- Evaluates AST nodes recursively with an immutable context
- Rule execution order: Admission → StateTransition → Consequence → Promotion (fixed)
- Within each category: alphabetical by rule name (stable ordering)
- First-match-wins: once a guard matches, remaining guards in the same rule are skipped
- Context contains: event, current_state (read-only snapshot), rule_version, epoch, actor binding
- Returns: list of {type, target, field, old_value, new_value} mutations
- No mutations applied during evaluation (collect-then-apply pattern)
- Op cap: MAX_INTEGER_OPS=10_000, abort with RuleBudgetExceeded("integer_ops")
- Depth cap: MAX_CALL_DEPTH=16 — abort with RuleBudgetExceeded("call_depth")
- Arg cap: MAX_ARG_COUNT=8 — abort with RuleBudgetExceeded("arg_count")
Effort: L
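
Illustrative sketch of how the three evaluation caps in P1.3.1 might be tracked. The cap values and the RuleBudgetExceeded kinds come from the criteria; the EvalBudget class and its method names are assumptions.

```typescript
const MAX_INTEGER_OPS = 10_000;
const MAX_CALL_DEPTH = 16;
const MAX_ARG_COUNT = 8;

type BudgetKind = "integer_ops" | "call_depth" | "arg_count";

class RuleBudgetExceeded extends Error {
  kind: BudgetKind;
  constructor(kind: BudgetKind) {
    super(`Rule budget exceeded: ${kind}`);
    this.name = "RuleBudgetExceeded";
    this.kind = kind;
  }
}

// Hypothetical tracker; one instance per rule evaluation.
class EvalBudget {
  private integerOps = 0;
  private callDepth = 0;

  // Charge `count` integer operations against the op cap.
  chargeOps(count = 1): void {
    this.integerOps += count;
    if (this.integerOps > MAX_INTEGER_OPS) throw new RuleBudgetExceeded("integer_ops");
  }

  // Called when the evaluator enters a function call node.
  enterCall(argCount: number): void {
    if (argCount > MAX_ARG_COUNT) throw new RuleBudgetExceeded("arg_count");
    if (this.callDepth + 1 > MAX_CALL_DEPTH) throw new RuleBudgetExceeded("call_depth");
    this.callDepth += 1;
  }

  // Called when the evaluator leaves a function call node.
  exitCall(): void {
    if (this.callDepth > 0) this.callDepth -= 1;
  }
}
```

The evaluator would pair every enterCall with an exitCall (for instance in a try/finally) so the depth counter reflects the live call stack.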

P1.3.2 — Built-in Functions

Depends on: P1.3.1, P1.1.1
Input: rule-engine.md (Built-in functions table), extraction §3 (8 Built-in Functions)
Output: src/domains/rules/builtins.ts, src/domains/rules/__tests__/builtins.test.ts
Acceptance criteria:
- min(a, b), max(a, b), abs(a), cap(v, m) — integer only
- clamp(v, lo, hi) — max(lo, min(v, hi))
- isqrt(n) — Newton’s method integer square root (from extraction §3 pseudocode)
- ilog2(n) — integer floor of log base 2
- decay(v, rate_bps) — single-epoch decay; delegates to integer-math library
- diminishing(v, k) — (v * k) / (k + v) diminishing-returns transform
- bps_mul(v, b), bps_div(v, b) — delegate to P1.1.1
- hash(data) — SHA-256 hex string
- vrf_verify(pk, proof, input) — VRF proof verification per ADR-002
- All functions are pure (no side effects, same input = same output)
- Each function counts as 1 or more integer ops against the evaluation budget (documented table)
Effort: M
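
Illustrative sketch of four of the integer built-ins from P1.3.2, written over BigInt. The isqrt below is a standard Newton iteration and is not claimed to match the extraction §3 pseudocode line for line; the error behaviour for invalid inputs (RangeError) is an assumption.

```typescript
// clamp(v, lo, hi) = max(lo, min(v, hi)), as stated in the criteria.
function clamp(v: bigint, lo: bigint, hi: bigint): bigint {
  const upperBounded = v < hi ? v : hi;
  return upperBounded > lo ? upperBounded : lo;
}

// Integer square root: largest x with x * x <= n, via Newton's method.
function isqrt(n: bigint): bigint {
  if (n < 0n) throw new RangeError("isqrt of negative value");
  if (n < 2n) return n;
  let x = n;
  let y = (x + 1n) / 2n;
  while (y < x) {
    x = y;
    y = (x + n / x) / 2n;
  }
  return x;
}

// Integer floor of log base 2, defined for n >= 1.
function ilog2(n: bigint): bigint {
  if (n <= 0n) throw new RangeError("ilog2 requires a positive value");
  let result = 0n;
  let rest = n;
  while (rest > 1n) {
    rest >>= 1n;
    result += 1n;
  }
  return result;
}

// Diminishing-returns transform: (v * k) / (k + v).
function diminishing(v: bigint, k: bigint): bigint {
  const denominator = k + v;
  if (denominator === 0n) throw new RangeError("division by zero");
  return (v * k) / denominator;
}
```

With k = 100, diminishing maps 100 to 50 and 10000 to 99, so the output approaches k but never reaches it for finite v.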

P1.3.3 — State Access Layer

Depends on: P1.3.1
Input: rule-engine.md (State Access Pattern), extraction §10 (ReadOnlyState Interface)
Output: src/domains/rules/state-access.ts, src/domains/rules/__tests__/state-access.test.ts
Acceptance criteria:
- Read-only state snapshot provided to rules (frozen object or copy-on-write proxy)
- State keys: reputation[node][domain], tokens[node], stake[node], epoch, event_count, fork_id, rule_version
- with_binding(name, value) returns new context; original unchanged
- No direct database access from rules — state is pre-loaded by the host
- State diff output: {key, old_value, new_value} for each mutation
- Merkle proof generation for state reads (verifiable by other nodes) — hooks into η
- Mutation attempts throw ReadOnlyStateError immediately (fail-fast)
Effort: M

P1.3.4 — Policy Gating / Pre-guards

Depends on: P1.3.1
Input: extraction §9 (Policy Gating: check_policy)
Output: src/domains/rules/policy-gate.ts, src/domains/rules/__tests__/policy-gate.test.ts
Acceptance criteria:
- Policy enum P1–P13 per extraction §9
- Each policy is a pure DSL expression (reuses P1.2.2 parser + P1.3.1 evaluator)
- check_policy(id, actor, context) → {admitted, reason?}
- check_all_policies(action, actor, context) → short-circuits on first failure
- Policies run BEFORE named rule evaluation (pre-guards)
- Policies share evaluation budget with named rules (same 10k op cap)
- Each policy has rejection reason pre-registered (no dynamic strings)
Effort: M

P1.4 — Admission Layer (4 sub-tasks)

P1.4.1 — Admission Evaluator

Depends on: P1.3.1, P1.3.4, P1.2.4
Input: rule-engine.md (Admission layer), docs/spec/s10-admission.md
Output: src/domains/rules/admission.ts, src/domains/rules/__tests__/admission.test.ts
Acceptance criteria:
- evaluateAdmission({caller, tool, mode, rep_snapshot, rule_version}): AdmissionResult
- Returns {admitted: true, effect_mutations: [...]} | {admitted: false, reason: DenialReason}
- Runs policy pre-guards first (P1.3.4), then named rules (P1.3.1)
- Rule version stamp on every admission record
- Pure function — no DB writes, no network calls
- Timing independent of input values (constant-time comparison for sensitive fields)
- Integration test: ≥20 representative (caller, tool, mode) tuples with expected verdicts
Effort: L

P1.4.2 — Denial Reason Taxonomy

Depends on: P1.4.1
Input: rule-engine.md (Rule application algorithm return values), extraction §5 (Rule Execution Flow)
Output: src/domains/rules/denial-reasons.ts, src/domains/rules/__tests__/denial-reasons.test.ts
Acceptance criteria:
- Typed discriminated union: no_rule_matched, budget:integer_ops, budget:call_depth, budget:arg_count, effect_invariant_violated, axiom_violation:AX-01..AX-07, policy:P1..P13, rule_version_mismatch, ambiguous_ruleset
- Each reason carries a structured details payload (no freeform strings)
- Reason codes stable across upgrades (additive-only changes; no renumbering)
- toString(reason) produces operator-readable rendering
- JSON serialization preserves discriminant tag
Effort: S
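
The taxonomy above could be modelled as a discriminated union along these lines. This is a sketch covering only a subset of the codes; the `kind`/`details` field names, the payload shapes, and the `renderReason` name (standing in for toString(reason)) are assumptions, not the settled contract.

```typescript
type DenialReason =
  | { kind: "no_rule_matched"; details: { tool: string } }
  | { kind: "budget"; details: { counter: "integer_ops" | "call_depth" | "arg_count"; limit: number } }
  | { kind: "axiom_violation"; details: { axiom: 1 | 2 | 3 | 4 | 5 | 6 | 7 } }
  | { kind: "policy"; details: { policy: number } }
  | { kind: "rule_version_mismatch"; details: { expected: string; actual: string } };

// Operator-readable rendering. The switch is exhaustive over `kind`, so adding
// a new variant without a case becomes a compile-time error.
function renderReason(r: DenialReason): string {
  switch (r.kind) {
    case "no_rule_matched":
      return `no_rule_matched (tool=${r.details.tool})`;
    case "budget":
      return `budget:${r.details.counter} (limit=${r.details.limit})`;
    case "axiom_violation":
      return `axiom_violation:AX-0${r.details.axiom}`;
    case "policy":
      return `policy:P${r.details.policy}`;
    case "rule_version_mismatch":
      return `rule_version_mismatch (expected=${r.details.expected}, actual=${r.details.actual})`;
  }
}
```

Because the discriminant is an ordinary string property, `JSON.stringify` followed by `JSON.parse` preserves it without custom serialization.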

P1.4.3 — Admission Budgets

Depends on: P1.3.1
Input: rule-engine.md (Evaluation budget, Default budget constants)
Output: src/domains/rules/budget.ts, src/domains/rules/__tests__/budget.test.ts
Acceptance criteria:
- Budget tracker class with counters: integer_ops, call_depth, current_arg_count
- Limits: MAX_INTEGER_OPS=10_000, MAX_CALL_DEPTH=16, MAX_ARG_COUNT=8 (constants from rule-engine.md)
- On exceed: throw RuleBudgetExceeded with which-counter-fired field
- Instrumentation hooks: emit budget.tick events for α’s audit layer (count only, no payload)
- Budget state is reset per-rule (no leaking across rules in a ruleset)
- Limits part of the rule version hash (P1.5.1) — changing them forces a new version
Effort: M
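
A minimal sketch of the budget tracker described above, using the three limits from the criteria. The method names (`chargeOps`, `enterCall`, `exitCall`, `reset`) and the `counter` field name are assumptions; the budget.tick instrumentation hook is omitted.

```typescript
const MAX_INTEGER_OPS = 10_000;
const MAX_CALL_DEPTH = 16;
const MAX_ARG_COUNT = 8;

type BudgetCounter = "integer_ops" | "call_depth" | "arg_count";

class RuleBudgetExceeded extends Error {
  readonly counter: BudgetCounter; // which counter fired
  constructor(counter: BudgetCounter) {
    super(`rule budget exceeded: ${counter}`);
    this.name = "RuleBudgetExceeded";
    this.counter = counter;
  }
}

class BudgetTracker {
  private integerOps = 0;
  private callDepth = 0;

  // Charge n integer ops; throws once the running total passes the cap.
  chargeOps(n: number): void {
    this.integerOps += n;
    if (this.integerOps > MAX_INTEGER_OPS) throw new RuleBudgetExceeded("integer_ops");
  }

  // Called on entry to a builtin or rule call with its argument count.
  enterCall(argCount: number): void {
    if (argCount > MAX_ARG_COUNT) throw new RuleBudgetExceeded("arg_count");
    this.callDepth += 1;
    if (this.callDepth > MAX_CALL_DEPTH) throw new RuleBudgetExceeded("call_depth");
  }

  exitCall(): void {
    this.callDepth -= 1;
  }

  // Called between rules so that no budget state leaks across a ruleset.
  reset(): void {
    this.integerOps = 0;
    this.callDepth = 0;
  }
}
```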

P1.4.4 — Tool-Lock Integration Spec

Depends on: P1.4.1, P1.4.2, P1.4.3
Input: rule-engine.md (Admission layer), docs/2-plugin/middleware.md (5-stage wrapper at α), docs/spec/s10-admission.md
Output: src/domains/rules/tool-lock-adapter.ts, src/domains/rules/__tests__/tool-lock-adapter.test.ts
Acceptance criteria:
- createToolLockAdapter(ruleRegistry): MiddlewareStage — factory
- Output is a stage-1 middleware function signature matching α’s 5-stage wrapper contract (tool-lock → schema-validate → audit-enter → dispatch → audit-exit)
- Admission denials short-circuit the middleware chain (stages 2–5 skipped)
- Denials emit structured event to audit layer before returning
- Integration test: wire a test ruleset into a test server; verify admission decisions end-to-end
- Zero registration with server boot in R76 — this sub-task lands the adapter; α’s src/server.ts wiring is a separate R81+ PR
Effort: M

P1.5 — Governance / Rule Versioning (5 sub-tasks)

P1.5.1 — Version Hash Computation

Depends on: P1.2.2, P1.5.4
Input: rule-engine.md (Rule versioning section)
Output: src/domains/rules/versioning.ts, src/domains/rules/__tests__/versioning.test.ts
Acceptance criteria:
- computeVersionHash(ruleset, engine_version): string — returns hex SHA-256
- Hash input: canonical_serialization(all_rules) || engine_version
- Canonical serialization via P1.5.4 (sorted-key deterministic JSON)
- Version stored in event metadata: {rule_version: "sha256:abc..."}
- Version mismatch detection: events with wrong rule_version are rejected via P1.4.2 taxonomy code rule_version_mismatch
- Test: two logically-equivalent but differently-ordered rulesets produce identical hash (canonical property)
Effort: S

P1.5.2 — Rule Migration

Depends on: P1.5.1, P1.3.1, P1.5.5
Input: rule-engine.md (Test corpus parity requirement), docs/3-world/physics/enforcement/governance.md (π versioning section, when landed)
Output: src/domains/rules/migration.ts, src/domains/rules/__tests__/migration.test.ts
Acceptance criteria:
- migrateRuleset(old, new, corpus): MigrationResult — runs parity harness (P1.5.5)
- Test corpus: ≥100 representative events
- Activation epoch: new rules take effect at epoch N+1 (not immediately)
- Parity requirement: h_old == h_new for every corpus event both versions admit
- Divergence set must match proposal’s declared scope or migration is rejected
- Rollback: if parity fails, old ruleset remains active and proposal is marked rejected:parity
- Fork trigger: nodes that reject migration automatically fork (link to ι — deferred to Phase 5 wiring)
Effort: L

P1.5.3 — Activation Epoch + Rollback

Depends on: P1.5.1, P1.5.2
Input: rule-engine.md (Rule versioning), docs/spec/s11-rule-engine.md
Output: src/domains/rules/activation.ts, src/domains/rules/__tests__/activation.test.ts
Acceptance criteria:
- scheduleActivation(new_version, target_epoch): ActivationToken — target_epoch must be current_epoch + 1 minimum
- applyActivation(token, current_epoch): void — applies only when current_epoch >= target_epoch
- rollback(version) — reinstates prior version; emits rollback event
- Activation journal: append-only log of (epoch, version_hash, cause) tuples
- Rollback does not retroactively invalidate events admitted under rolled-back version — those stand
- Rollback during dispute window triggers π governance review hook (hook name only — π not implemented)
Effort: M

P1.5.4 — Canonical Serialization

Depends on: P1.2.2
Input: rule-engine.md (Rule versioning: “canonical serialization of the rule bodies”)
Output: src/domains/rules/canonical.ts, src/domains/rules/__tests__/canonical.test.ts
Acceptance criteria:
- canonicalize(ast_or_ruleset): string — produces byte-identical output on any platform
- Keys sorted alphabetically at every object level
- No whitespace (single-line JSON)
- Integer literals preserved exactly (no 1e3 normalization, no leading zeros)
- String escapes use canonical JSON form (\", \n, \u00XX)
- Property test: canonicalize(parse(canonicalize(parse(x)))) == canonicalize(parse(x)) — idempotent round-trip
- No locale dependence (sort uses codepoint order, not locale-aware collation)
Effort: M
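
The criteria above can be illustrated with a small canonicalizer over JSON-like values. This is a sketch under stated assumptions: the `Json` type, the restriction to safe integers, and the `compareCodePoints` helper are illustrative, and the real input would be the rule AST from P1.2.2.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Orders strings by Unicode code point, independent of locale. Array.from
// splits a string into code points rather than UTF-16 code units.
function compareCodePoints(a: string, b: string): number {
  const xs = Array.from(a);
  const ys = Array.from(b);
  const n = Math.min(xs.length, ys.length);
  for (let i = 0; i < n; i++) {
    const d = (xs[i].codePointAt(0) ?? 0) - (ys[i].codePointAt(0) ?? 0);
    if (d !== 0) return d;
  }
  return xs.length - ys.length;
}

function canonicalize(value: Json): string {
  if (value === null || typeof value === "boolean" || typeof value === "string") {
    return JSON.stringify(value); // standard JSON escapes for strings
  }
  if (typeof value === "number") {
    // Safe integers always print as plain decimal digits, never as 1e3.
    if (!Number.isSafeInteger(value)) throw new TypeError("only safe integers are supported");
    return String(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((v) => canonicalize(v)).join(",") + "]";
  }
  const keys = Object.keys(value).sort(compareCodePoints);
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json)).join(",") + "}";
}
```

Because the output contains no whitespace and keys are sorted at every level, re-parsing and re-canonicalizing the output yields the same string, which is the idempotence property the test in the criteria asks for.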

P1.5.5 — Test Corpus Parity Harness

Depends on: P1.3.1, P1.5.1
Input: rule-engine.md (Test corpus parity requirement)
Output: src/domains/rules/parity-harness.ts, src/domains/rules/__tests__/parity-harness.test.ts
Acceptance criteria:
- runParity({old_ruleset, new_ruleset, corpus}): ParityReport
- Per event: compute effect-set hash h = SHA-256(canonical(effects)) under both versions
- Report categorizes events: both_admit_same, both_admit_diverge, old_admit_new_reject, old_reject_new_admit, both_reject
- Pass condition: both_admit_diverge set is empty AND (old_admit_new_reject ∪ old_reject_new_admit) ⊆ declared scope
- Default corpus of ≥100 hand-curated events shipped with the harness
- Deterministic: identical inputs → identical report bytes
- Performance: runs 10k corpus events in < 5 seconds (for CI feedback speed)
Effort: L
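
The five report categories and the pass condition above can be sketched as follows. The `Verdict` and `CorpusResult` shapes are assumptions, as is representing the declared scope as a set of event IDs; the effect-set hashes are taken as precomputed strings.

```typescript
type Verdict = { admitted: true; effectHash: string } | { admitted: false };

type ParityCategory =
  | "both_admit_same"
  | "both_admit_diverge"
  | "old_admit_new_reject"
  | "old_reject_new_admit"
  | "both_reject";

function categorize(oldV: Verdict, newV: Verdict): ParityCategory {
  if (oldV.admitted && newV.admitted) {
    return oldV.effectHash === newV.effectHash ? "both_admit_same" : "both_admit_diverge";
  }
  if (oldV.admitted) return "old_admit_new_reject";
  if (newV.admitted) return "old_reject_new_admit";
  return "both_reject";
}

interface CorpusResult {
  eventId: string;
  oldVerdict: Verdict;
  newVerdict: Verdict;
}

// Pass condition: no admitted-by-both event diverges, and every event whose
// admission flipped lies inside the declared scope.
function parityPasses(results: CorpusResult[], declaredScope: Set<string>): boolean {
  return results.every((r) => {
    const c = categorize(r.oldVerdict, r.newVerdict);
    if (c === "both_admit_diverge") return false;
    if (c === "old_admit_new_reject" || c === "old_reject_new_admit") {
      return declaredScope.has(r.eventId);
    }
    return true;
  });
}
```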

Phase 2: λ Reputation

P2.1 — Domain Structure

P2.1.1 — Reputation Record Schema

Depends on: P1.1.1
Input: docs/concepts/λ-reputation.md, docs/guides/implementation/lambda-reputation.md
Output: src/domains/reputation/schema.{ext}, database migration
Acceptance criteria:
- 5 domains: execution, commissioning, arbitration, governance, social
- Per record: node_id, domain, score (integer bps 0-10000), scars (bitmask), ban_until_epoch, last_activity_epoch
- History table: node_id, domain, epoch, delta, reason, event_id
- Indexes on (node_id, domain) and (domain, score DESC)
Effort: S

P2.1.2 — Score Computation

Depends on: P2.1.1, P1.3.1
Input: λ-reputation.md (Computation section)
Output: src/domains/reputation/compute.{ext}, tests/domains/reputation/compute.test.{ext}
Acceptance criteria:
- compute_score(node_id, domain, events[]) → integer score
- Score = Σ(acknowledgement_weight × event_outcome) for all events in domain
- Uses integer-math library for all arithmetic
- Score capped at 10000 bps (100%) minus scar penalties
- Property test: score is monotonically non-decreasing with only positive events
Effort: M

P2.2 — Decay and Penalties

P2.2.1 — Exponential Decay

Depends on: P2.1.1, P1.1.1
Input: λ-reputation.md (Decay section), implementation guide
Output: src/domains/reputation/decay.{ext}, tests/domains/reputation/decay.test.{ext}
Acceptance criteria:
- Decay applied per-epoch for inactive nodes
- Rate per domain: execution=500bps, commissioning=300bps, arbitration=1000bps, governance=200bps, social=100bps
- Formula: new_score = score - apply_bps(score, decay_rate) per inactive epoch
- Activity in domain resets that domain’s decay counter
- Batch processing: efficient for 10,000+ nodes per epoch
- Floor: score cannot go below 0
Effort: M
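
The decay formula above, worked through in code. This is a sketch: `applyBps` here is a stand-in for the integer-math library's apply_bps and is assumed to round toward zero, and `decayScore` is a hypothetical name.

```typescript
const BPS_DENOMINATOR = 10_000;

// Per-domain decay rates from the criteria, in basis points per inactive epoch.
const DECAY_RATE_BPS = {
  execution: 500,
  commissioning: 300,
  arbitration: 1000,
  governance: 200,
  social: 100,
} as const;

type Domain = keyof typeof DECAY_RATE_BPS;

// Stand-in for apply_bps: value * bps / 10000, rounded toward zero (assumption).
function applyBps(value: number, bps: number): number {
  return Math.floor((value * bps) / BPS_DENOMINATOR);
}

// Applies new_score = score - apply_bps(score, rate) once per inactive epoch,
// never letting the score drop below 0.
function decayScore(score: number, domain: Domain, inactiveEpochs: number): number {
  let s = score;
  for (let i = 0; i < inactiveEpochs; i++) {
    s = Math.max(0, s - applyBps(s, DECAY_RATE_BPS[domain]));
  }
  return s;
}
```

For example, an arbitration score of 10000 decays to 9000 after one inactive epoch and to 8100 after two. Under the round-toward-zero assumption, very small scores stop decaying (a score of 19 at 500 bps loses 0 per epoch), which the real library may or may not want.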

P2.2.2 — Offense Penalties

Depends on: P2.1.1, P1.1.1
Input: λ-reputation.md (Offense Penalties section)
Output: src/domains/reputation/penalties.{ext}, tests/domains/reputation/penalties.test.{ext}
Acceptance criteria:
- Penalty table: minor=1500bps, moderate=3000bps, severe=5000bps, critical=8000bps, fraud=10000bps
- Scar mechanism: fraud adds permanent cap reduction (score can never exceed 10000 - scar_bps)
- Ban mechanism: critical+ offense bans from arbitration for N epochs
- Double jeopardy protection: same event cannot trigger same penalty twice
- Recovery path: after ban expires, node starts at scar-limited maximum
Effort: M

P2.3 — Experience Tokens

P2.3.1 — Token Levels and Minting

Depends on: P2.1.1
Input: λ-reputation.md (Experience Tokens section), implementation guide
Output: src/domains/reputation/tokens.{ext}, tests/domains/reputation/tokens.test.{ext}
Acceptance criteria:
- 5 levels: L0 (Raw), L1 (Episode), L1.5 (Witness), L2a (Correlation), L2b (Proto-causal)
- L0: auto-minted on event completion
- L1: requires interaction cycle (commit → deliver → confirm)
- L1.5: requires witnessing 3+ disputes as observer
- L2a: requires 5+ repetitions of same interaction pattern
- L2b: requires L2a + context diversity (3+ categories) + path diversity (3+ counterparties)
- Tokens are non-transferable, bound to node identity
- Token count per node queryable by domain
Effort: L

P2.4 — Derived Limits

P2.4.1 — Capability Gates

Depends on: P2.1.2, P1.3.2
Input: λ-reputation.md (Derived Limits section)
Output: src/domains/reputation/limits.{ext}, tests/domains/reputation/limits.test.{ext}
Acceptance criteria:
- max_parallel_tasks(rep) = min(sqrt_floor(rep), 20)
- rate_limit_bonus(rep) = base_rate * log2_floor(max(rep, 1))
- stake_discount(rep) = required_stake * 10000 / max(rep, 1000) (bps math)
- can_arbitrate(rep) = rep.arbitration >= 5000 AND rep.execution >= 3000
- can_govern(rep) = rep.governance >= 4000
- All computations use integer-math library
Effort: M

P2.5 — MCP Tool Surface

P2.5.1 — Reputation Query Tools

Depends on: P2.1.2, P2.2.1, P2.2.2, P2.3.1, P2.4.1, P0.3.4
Input: λ-reputation.md, docs/guides/implementation/lambda-reputation.md
Output: src/domains/reputation/tools.{ext}, tests/domains/reputation/tools.test.{ext}
Acceptance criteria:
- reputation_get(node_id, domain?) → score, scars, ban_until, last_activity per domain
- reputation_history(node_id, domain, limit) → paginated history events
- reputation_leaderboard(domain, limit) → top N nodes by score
- reputation_check_gates(node_id) → capability gate results (can_arbitrate, can_govern, max_parallel_tasks)
- All tools registered as MCP tools via tool registry (ε)
- Integration test: create node → apply events → verify score matches hand-calculation
Effort: S

Phase 3: θ Consensus

P3.1 — BFT Voting

P3.1.1 — Vote Message Types

Depends on: P1.4.1
Input: docs/concepts/θ-consensus.md, docs/guides/implementation/theta-consensus.md
Output: src/consensus/messages.{ext}, tests/consensus/messages.test.{ext}
Acceptance criteria:
- Message types: PROPOSE, VOTE, COMMIT, VIEW_CHANGE, CHECKPOINT
- Vote types: ACCEPT, REJECT, ABSTAIN
- All messages signed with Ed25519
- Message fields: sender, type, round, payload, signature, timestamp
- Serialization: canonical JSON (deterministic key order)
- Deserialization validates all required fields
Effort: M

P3.1.2 — Quorum Computation

Depends on: P3.1.1
Input: θ-consensus.md (quorum math), BFT extraction
Output: src/consensus/quorum.{ext}, tests/consensus/quorum.test.{ext}
Acceptance criteria:
- quorum_threshold(n) = floor(2 * n / 3) + 1
- max_faulty(n) = floor((n - 1) / 3)
- has_quorum(votes, n) = count(votes.accept) >= quorum_threshold(n)
- Equivocation detection: same node signs contradicting votes → generate proof
- Proof format: {node_id, vote_1, vote_2, round} — cryptographic evidence
- Property test: for n >= 4, any two quorums of size quorum_threshold(n) intersect in at least max_faulty(n) + 1 nodes, so they always share at least one honest node
Effort: M
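
The quorum formulas above translate directly into integer arithmetic. In this sketch, `minQuorumOverlap` is a hypothetical helper that makes the overlap property checkable, and `hasQuorum` assumes the vote list already holds at most one vote per node.

```typescript
type Vote = "ACCEPT" | "REJECT" | "ABSTAIN";

// quorum_threshold(n) = floor(2 * n / 3) + 1
function quorumThreshold(n: number): number {
  return Math.floor((2 * n) / 3) + 1;
}

// max_faulty(n) = floor((n - 1) / 3)
function maxFaulty(n: number): number {
  return Math.floor((n - 1) / 3);
}

// Assumes `votes` is already deduplicated to one vote per node.
function hasQuorum(votes: Vote[], n: number): boolean {
  return votes.filter((v) => v === "ACCEPT").length >= quorumThreshold(n);
}

// Smallest possible intersection of two quorums drawn from n nodes.
function minQuorumOverlap(n: number): number {
  return 2 * quorumThreshold(n) - n;
}
```

For n = 7 the threshold is 5 and at most 2 nodes may be faulty; two quorums then share at least 3 nodes, so at least one shared node is honest.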

P3.1.3 — View Change Protocol

Depends on: P3.1.2
Input: θ-consensus.md (view change section), theta extraction
Output: src/consensus/view-change.{ext}, tests/consensus/view-change.test.{ext}
Acceptance criteria:
- Trigger: primary unresponsive for 2× expected round duration
- New primary selection: deterministic rotation (round % n)
- View change message carries highest committed state
- New primary must prove it has the latest committed state
- Timeout doubles each failed view change (exponential backoff)
- Anti-thrashing: minimum 3 rounds before another view change
Effort: L

P3.2 — Finality Levels

P3.2.1 — Finality State Machine

Depends on: P3.1.2
Input: θ-consensus.md (Finality Levels), implementation guide
Output: src/consensus/finality.{ext}, tests/consensus/finality.test.{ext}
Acceptance criteria:
- States: PENDING → SOFT → QUORUM → HARD → ABSOLUTE
- PENDING → SOFT: first vote received
- SOFT → QUORUM: votes >= quorum_threshold(n)
- QUORUM → HARD: dispute window (100 epochs) elapsed without challenge
- HARD → ABSOLUTE: appeal window elapsed, fully irreversible
- No external side effects (payments, exports) before HARD
- State transitions are monotonic: cannot go backward
- Each transition recorded with epoch and evidence
Effort: L
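
A sketch of the monotonic finality ladder above. It assumes transitions advance exactly one level at a time, which is stricter than the stated "cannot go backward" rule; the `Transition` shape and function names are illustrative.

```typescript
const FINALITY_ORDER = ["PENDING", "SOFT", "QUORUM", "HARD", "ABSOLUTE"] as const;
type Finality = (typeof FINALITY_ORDER)[number];

// Record kept for each transition: the epoch and the supporting evidence.
interface Transition {
  from: Finality;
  to: Finality;
  epoch: number;
  evidence: string;
}

// Permits only the next state in the ladder (single-step is an assumption).
function advance(current: Finality, to: Finality, epoch: number, evidence: string): Transition {
  const i = FINALITY_ORDER.indexOf(current);
  const j = FINALITY_ORDER.indexOf(to);
  if (j !== i + 1) throw new Error(`illegal finality transition ${current} -> ${to}`);
  return { from: current, to, epoch, evidence };
}

// External side effects (payments, exports) are allowed only at HARD or later.
function allowsExternalSideEffects(state: Finality): boolean {
  return FINALITY_ORDER.indexOf(state) >= FINALITY_ORDER.indexOf("HARD");
}
```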

P3.3 — Gossip Protocol

P3.3.1 — IHAVE/IWANT Messages

Depends on: P3.1.1
Input: θ-consensus.md (Gossip), implementation guide, theta extraction
Output: src/consensus/gossip.{ext}, tests/consensus/gossip.test.{ext}
Acceptance criteria:
- IHAVE: {event_ids[], state_root, rule_version, fork_id}
- IWANT: {event_ids[]} — request specific events
- Bloom filter for deduplication (false positive rate < 1%)
- Adaptive fanout: well-connected nodes gossip to fewer peers
- Triple-Anchor validation: reject messages where rule_version, state_root, or fork_id don’t match
- Bandwidth budget: max N bytes/second per peer connection
Effort: L

P3.4 — Time Anchors

P3.4.1 — Signed Timestamps

Depends on: P3.1.1, P2.1.1
Input: θ-consensus.md (Time Anchors), implementation guide
Output: src/consensus/time-anchors.{ext}, tests/consensus/time-anchors.test.{ext}
Acceptance criteria:
- Eligible publishers: top N arbiters by arbitration reputation
- Anchor format: {publisher, timestamp_ms, epoch, signature}
- Median computation: collect anchors from last K epochs, take median
- Drift detection: |local_clock - median| > 30_000ms → deprioritize proposals
- Monotonicity: anchors from same publisher must be non-decreasing
- Replay protection: anchors with epoch < current_epoch - 10 are rejected
Effort: M
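
The median, drift, and replay rules above could be sketched like this. Taking the floored mean of the two middle values for an even number of anchors is an assumption (the criteria only say "take median"), and the function names are hypothetical.

```typescript
const MAX_DRIFT_MS = 30_000;
const MAX_ANCHOR_AGE_EPOCHS = 10;

// Median of anchor timestamps in milliseconds.
function medianTimestamp(timestamps: number[]): number {
  if (timestamps.length === 0) throw new Error("no anchors");
  const sorted = [...timestamps].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : Math.floor((sorted[mid - 1] + sorted[mid]) / 2);
}

// Drift detection: |local_clock - median| > 30_000 ms.
function isDrifting(localClockMs: number, medianMs: number): boolean {
  return Math.abs(localClockMs - medianMs) > MAX_DRIFT_MS;
}

// Replay protection: anchors with epoch < current_epoch - 10 are rejected.
function isReplayed(anchorEpoch: number, currentEpoch: number): boolean {
  return anchorEpoch < currentEpoch - MAX_ANCHOR_AGE_EPOCHS;
}
```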

P3.5 — Slashing

P3.5.1 — Equivocation Enforcement

Depends on: P3.1.2, P2.2.2
Input: θ-consensus.md (Slashing Conditions), theta extraction, ADR-003
Output: src/consensus/slashing.{ext}, tests/consensus/slashing.test.{ext}
Acceptance criteria:
- apply_equivocation_slash(proof) → calls reputation penalty (P2.2.2) for double-signing node
- Proof verification: check both contradicting votes carry valid signatures from same node
- Slash amount: maps to critical offense (8000bps loss) in reputation penalty table
- Idempotent: same equivocation proof applied twice must not slash twice (proof hash dedup)
- Slashing recorded in reputation history table with event_id = proof hash
- Integration test: create equivocation → verify slash applied → verify idempotency
Effort: M

Phase 4: μ Integrity Monitor

P4.1.1 — Advisory Record Schema + Envelope

Depends on: P1.5.4 (κ canonical serializer), P1.5.1 (κ version hash)
Input: docs/3-world/physics/enforcement/integrity.md §Advisory record schema (L129-146), docs/spec/s14-integrity-monitor.md §Output
Output: src/domains/integrity/schema.ts, src/domains/integrity/__tests__/schema.test.ts
Acceptance criteria:
- Zod schema for the 8-field envelope: role, check, result, severity, evidence, recommendation, decision_hash, timestamp_logical
- role: "Translator" | "Sentinel" | "Guide"
- check: "circular_logic" | "coercion_trap" | "axiom_drift" | "axiom_regression"
- result: "PASS" | "WARN" | "BLOCK"
- severity: "LOW" | "MED" | "HIGH" (matches λ severity bands; supersedes legacy INFO/WARNING/CRITICAL per R91 audit Q6)
- decision_hash: SHA-256(role || check || canonical(input) || result) per R91 audit Q7 (uses κ P1.5.4 canonical serializer)
- timestamp_logical: uint64 Lamport clock (not wall-clock; inherits θ design invariant)
- Dedup: identical inputs produce identical advisories; decision_hash is the unique key
Effort: S

P4.2.1 — Circular Logic Detector (DFS)

Depends on: P4.1.1, P0.7.1 (ζ thought_records substrate), P1.2.4 (κ registry for rule-dep edges)
Input: docs/3-world/physics/enforcement/integrity.md §1 Circular logic (L22-55) — DFS pseudocode
Output: src/domains/integrity/detectors/circular.ts, src/domains/integrity/detectors/__tests__/circular.test.ts
Acceptance criteria:
- DFS cycle detection on thought_records.cites graph; edge source = parent_hash OR refs[]
- find_cycles(records): Cycle[] returns all cycles (not just first)
- IN_PROGRESS / DONE coloring; correctly handles diamonds (DAG, no cycle) vs. true back-edges (cycle)
- Cross-rule cycle support: rule A depends on rule B which depends on rule A via different parameters (uses κ P1.2.4 rule-dep edges)
- Threshold: any cycle → emit advisory with severity=HIGH, result=WARN
- FP profile: <1% on test corpus; verified in P4.7.1
Effort: M
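
A sketch of DFS cycle detection with IN_PROGRESS / DONE coloring. It assumes the cites graph has already been flattened into an adjacency map of record IDs (a hypothetical `CitesGraph` type), and it reports one cycle per back edge found, which is weaker than enumerating every elementary cycle.

```typescript
type CitesGraph = Map<string, string[]>;

function findCycles(graph: CitesGraph): string[][] {
  const color = new Map<string, "IN_PROGRESS" | "DONE">();
  const stack: string[] = [];
  const cycles: string[][] = [];

  const visit = (node: string): void => {
    color.set(node, "IN_PROGRESS");
    stack.push(node);
    for (const next of graph.get(node) ?? []) {
      const c = color.get(next);
      if (c === "IN_PROGRESS") {
        // Back edge: the cycle is the part of the stack starting at `next`.
        cycles.push(stack.slice(stack.indexOf(next)));
      } else if (c === undefined) {
        visit(next);
      }
      // c === "DONE": already fully explored (for example the bottom of a
      // diamond), so reaching it again is not a cycle.
    }
    stack.pop();
    color.set(node, "DONE");
  };

  graph.forEach((_edges, node) => {
    if (!color.has(node)) visit(node);
  });
  return cycles;
}
```

The recursion depth grows with the longest citation chain, so a production detector over large thought_records graphs would probably use an explicit stack instead.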

P4.2.2 — Coercion Trap Detector (option-set)

Depends on: P4.1.1, P1.4.1 (κ admission evaluator), P1.3.1 (κ rule engine), P2.1.2 (λ score compute)
Input: docs/3-world/physics/enforcement/integrity.md §2 Coercion trap (L57-85) — option-set enumeration pseudocode
Output: src/domains/integrity/detectors/coercion.ts, src/domains/integrity/detectors/__tests__/coercion.test.ts
Acceptance criteria:
- Enumerate all legal actions for a participant given current state (via κ admission evaluator P1.4.1)
- For each action, compute outcome via κ rule engine P1.3.1 (collect reputation_delta, obligation_beyond_capacity)
- Flag if: every available option produces reputation_delta < 0 (uses λ P2.1.2 score-compute signature)
- Flag if: every available option produces obligation_beyond_capacity
- Flag if: action space is empty
- On flag: emit advisory check=coercion_trap, severity=HIGH, evidence=[presented, available, outcomes]
- No veto power: advisory only; cannot block the decision
Effort: M
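
The three flag conditions above reduce to simple checks over the enumerated option set. In this sketch the `Outcome` shape and the flag names are assumptions, and the outcomes are taken as already computed by the admission evaluator and rule engine.

```typescript
interface Outcome {
  action: string;
  reputationDelta: number;
  obligationBeyondCapacity: boolean;
}

type CoercionFlag = "empty_action_space" | "all_negative_reputation" | "all_beyond_capacity";

// Returns every flag that applies; an empty array means no coercion trap.
function detectCoercion(outcomes: Outcome[]): CoercionFlag[] {
  if (outcomes.length === 0) return ["empty_action_space"];
  const flags: CoercionFlag[] = [];
  if (outcomes.every((o) => o.reputationDelta < 0)) flags.push("all_negative_reputation");
  if (outcomes.every((o) => o.obligationBeyondCapacity)) flags.push("all_beyond_capacity");
  return flags;
}
```

The empty case is handled first because `every` is vacuously true on an empty array, which would otherwise raise the other two flags as well.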

P4.2.3 — Axiom Drift Tracker (sliding window)

Depends on: P4.1.1, P2.2.2 (λ penalties — parameter-change events surface), P1.1.1 (κ BPS arith)
Input: docs/3-world/physics/enforcement/integrity.md §3 Axiom drift (L87-115) — sliding-window pseudocode + AX-01..AX-07 regression check
Output: src/domains/integrity/detectors/drift.ts, src/domains/integrity/detectors/__tests__/drift.test.ts
Acceptance criteria:
- Sliding-window aggregation: 6-month window over parameter-change events per domain
- Sum abs(c.delta_bps) across the window for each domain
- Threshold 1: ≥800 bps (8%) → emit advisory severity=MED, result=WARN
- Threshold 2: ≥1000 bps (10%, AX-06 cap) → emit advisory severity=HIGH, result=BLOCK (denies new proposals in domain)
- AX-invariant regression check: for each staged proposal, simulate against AX-01..AX-07; if any invariant would regress → emit check=axiom_regression, severity=HIGH, result=HARD BLOCK
- BPS arithmetic via P1.1.1 (no floats; integer-only)
- FP profile: high without long history; verified in P4.7.1 with synthetic 12-month corpus
Effort: L
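
The two drift thresholds above can be sketched as a windowed sum. The `ParamChange` shape is an assumption, and the window length is passed in the same (unspecified) unit as the event timestamps, since the mapping from "6 months" to epochs is not stated here.

```typescript
interface ParamChange {
  domain: string;
  at: number; // timestamp of the change, in the caller's time unit
  deltaBps: number; // signed change in basis points
}

const WARN_THRESHOLD_BPS = 800; // 8%
const BLOCK_THRESHOLD_BPS = 1000; // 10%, the AX-06 cap

// Sums abs(delta_bps) for one domain over the window (now - window, now].
function windowDriftBps(changes: ParamChange[], domain: string, now: number, window: number): number {
  return changes
    .filter((c) => c.domain === domain && c.at > now - window && c.at <= now)
    .reduce((sum, c) => sum + Math.abs(c.deltaBps), 0);
}

function driftResult(changes: ParamChange[], domain: string, now: number, window: number): "PASS" | "WARN" | "BLOCK" {
  const total = windowDriftBps(changes, domain, now, window);
  if (total >= BLOCK_THRESHOLD_BPS) return "BLOCK";
  if (total >= WARN_THRESHOLD_BPS) return "WARN";
  return "PASS";
}
```

Summing absolute values means a +500 bps change followed by a -300 bps change counts as 800 bps of drift, not 200.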

P4.3.1 — Three Advisory Roles (Translator/Sentinel/Guide)

Depends on: P4.1.1
Input: docs/3-world/physics/enforcement/integrity.md §Three advisory roles (L117-127), docs/spec/s14-integrity-monitor.md §Advisory roles
Output: src/domains/integrity/roles.ts, src/domains/integrity/__tests__/roles.test.ts
Acceptance criteria:
- Translator: read-only; summarize advisory reports for a human operator; no recommendations of its own (per integrity.md L123)
- Sentinel: read-only; flag advisory reports that meet a severity threshold; may escalate to π (per integrity.md L124)
- Guide: read-only; suggest corrective actions for human review; the human decides whether to act (per integrity.md L125)
- All three roles are strictly read-only — no mutation API exposed
- Standard output format across all roles (the P4.1.1 envelope)
- No “Mutator” role exists (per integrity.md §Three advisory roles L127)
Effort: M

P4.4.1 — Escalation FSM (4-result + 3 invariant mappings)

Depends on: P4.1.1, P4.2.1, P4.2.2, P4.2.3, P1.4.1 (κ admission), P1.2.4 (κ registry)
Input: docs/3-world/physics/enforcement/integrity.md §Escalation mapping (L148-157), docs/spec/s14-integrity-monitor.md §When advisory becomes enforcement
Output: src/domains/integrity/escalation.ts, src/domains/integrity/__tests__/escalation.test.ts
Acceptance criteria:
- 4-result FSM: PASS / WARN / BLOCK / HARD BLOCK
- PASS → log to ζ at thought_type=advisory; no further effect
- WARN → log + surface in operator console; no rule change
- BLOCK → record denial event into ζ at thought_type=advisory; π integration is out of scope for Phase 4 (per R91 audit Q5)
- HARD BLOCK → α tool-lock admission denies; downstream κ evaluation never runs
- Invariant mapping 1: circular-logic-in-rule-update → rule rejected at κ rule loader P1.2.4
- Invariant mapping 2: coercion-in-admission → event rejected at κ admission P1.4.1
- Invariant mapping 3: axiom-drift-beyond-limits → governance proposal rejected (records BLOCK event into ζ; π consumes when π ships)
Effort: M

P4.5.1 — Advisory Persistence (`mcp_advisories` migration)

Depends on: P4.1.1, P0.2.2 (SQLite migration runner)
Input: docs/3-world/physics/enforcement/integrity.md §Phase 0 posture (L165-170) — names the mcp_advisories table stub; Phase 4 activates it
Output: src/db/migrations/*-advisories.sql, src/domains/integrity/repository.ts, src/domains/integrity/__tests__/repository.test.ts
Acceptance criteria:
- SQLite migration creates mcp_advisories table with the 8 envelope fields from P4.1.1
- decision_hash TEXT NOT NULL UNIQUE enforces dedup
- timestamp_logical INTEGER NOT NULL (uint64 Lamport)
- Indexes on (check, severity) and (role) for typical advisory queries
- Repository: insertAdvisory(record), getAdvisory(decision_hash), listAdvisories(filter)
- Idempotent insert: same decision_hash returns existing row, does not throw
Effort: S

P4.6.1 — μ MCP Tool Surface (≥4 tools)

Depends on: P4.3.1, P4.4.1, P4.5.1
Input: docs/3-world/physics/enforcement/integrity.md §Phase 4 scope (L172-178), MCP tool registration pattern in existing β/ε/ζ/η/λ/θ surfaces
Output: src/tools/integrity.ts, src/tools/__tests__/integrity.test.ts
Acceptance criteria:
- ≥4 MCP tools registered, growing the surface 23 → 27+
- integrity_check_circular: trigger D1 detector over current thought_records; returns advisory list
- integrity_check_coercion: trigger D2 detector over a decision_record; returns advisory or PASS
- integrity_check_drift: trigger D3 detector for a domain; returns advisory or PASS
- integrity_query: list / fetch advisories from mcp_advisories (uses P4.5.1 repository)
- All tools use Zod v3.23 schemas + registerTool pattern as in existing θ/λ surfaces
- Each tool’s response shape matches the P4.1.1 envelope
Effort: M

P4.7.1 — Test Corpus + Parity Harness

Depends on: P4.2.1, P4.2.2, P4.2.3, P4.4.1, P4.5.1
Input: Precedent: κ P1.5.5 parity harness (src/domains/rules/__tests__/parity-harness.test.ts, R87 #214); θ P3.8.1 4-scenario harness (R89 Phase B #246)
Output: src/__tests__/integrity/parity.test.ts, fixture corpus in src/__tests__/integrity/corpus/
Acceptance criteria:
- D1 cycle-detection corpus: ≥10 fixtures spanning true cycles, diamonds (no cycle), self-loops, length-2 cycles, length-N cycles
- D2 coercion corpus: ≥10 fixtures spanning all-negative, empty-set, mixed-outcome (no flag), reputation-loss-only, obligation-only
- D3 drift corpus: ≥10 fixtures spanning under-threshold, 8% WARN, 10% BLOCK, AX-01..AX-07 regression cases
- FP rate measurement: D1 <1%, D2 <5%, D3 <10% on the corpus (FP rates per integrity.md §1, §2, §3 declared profiles)
- Parity harness: run each detector against its corpus; assert advisory output matches expected envelope byte-for-byte
Effort: L

P4.8.1 — Fork Hook Subscriber (post-fork invariant sweep)

Depends on: P4.2.3, P3.9.1 (θ ForkHookRegistry)
Input: docs/3-world/physics/enforcement/integrity.md §Phase 4 scope (L177 — fork-hook awareness), docs/3-world/physics/laws/consensus.md (θ fork-hook surface)
Output: src/domains/integrity/fork-hook-subscriber.ts, src/domains/integrity/__tests__/fork-hook-subscriber.test.ts
Acceptance criteria:
- Subscribes to θ ForkHookRegistry (from P3.9.1) for POST_FORK events
- On POST_FORK event: triggers a D3 axiom-drift sweep across all domains in the forked sub-tree
- Records sweep results to mcp_advisories via P4.5.1 repository
- Idempotent: same fork event triggers at most one sweep (uses fork event id as dedup key)
- Sweep is bounded: caps detectors at a configurable budget (default 100 advisories per fork event)
- Stages the surface for ι Phase 5 (state-fork) activation; ι integration out of scope for Phase 4
Effort: S

Phase 5: ι Fork Protocol

P5.1.1 — Fork ID and Creation

Depends on: P3.1.2, P1.4.1
Input: docs/concepts/ι-state-fork.md, theta extraction (fork sections)
Output: src/domains/fork/index.{ext}, tests/domains/fork/index.test.{ext}

Acceptance criteria:
- Fork ID = SHA-256(parent_fork_id || divergence_event_id || rule_hash || reason)
- Auto triggers: rule conflict, invariant violation, constitutional violation
- Manual triggers: voluntary exit, governance rejection
- Isolation modes: ISOLATED (no data flow), READ_ONLY_PARENT (read parent, can’t write), BRIDGED (selective sync)
- Fork-scoped state: event log, reputation, tokens, BFT state — all copied at fork point
Effort: XL

P5.2.1 — Checkpoint Protocol

Depends on: P5.1.1
Input: ι-state-fork.md (checkpoint section)
Output: src/domains/fork/checkpoints.{ext}, tests/domains/fork/checkpoints.test.{ext}
Acceptance criteria:
- Frequency: every 1000 events OR 100 epochs (whichever first)
- Signers: top 10 arbiters by reputation, threshold 7/10
- Content: fork_id, epoch, event_count, state_root, reputation_snapshot, rule_version_hash
- Fast sync: new nodes download checkpoint + post-checkpoint events
- Checkpoint chain: each checkpoint references previous checkpoint hash
Effort: L

P5.3.1 — Fork Merge

Depends on: P5.1.1, P5.2.1
Input: ι-state-fork.md (merge section), theta extraction (fork merge)
Output: src/domains/fork/merge.{ext}, tests/domains/fork/merge.test.{ext}
Acceptance criteria:
- Find common ancestor fork point
- Compute event diff between forks
- Conflict detection: same state key modified in both forks
- Resolution strategies: timestamp ordering (default), reputation-weighted voting, governance vote
- Rule conflicts: cannot auto-merge, require governance vote
- Reputation discount on transition: 50% of source fork reputation (configurable)
- Both forks must reach quorum agreement for merge to finalize
Effort: XL

Phase 6: π Governance

P6.1.1 — Proposal Lifecycle

Depends on: P1.3.1, P2.1.1, P3.1.2
Output: src/domains/governance/proposals.{ext}, tests/domains/governance/proposals.test.{ext}
Acceptance criteria:
- Proposal types: AX (constitutional), PR (protected rule), GOV (protocol rule)
- Voting: AX/PR require >80% supermajority, GOV requires >66% quorum
- AX changes require 3-stage time-locked votes (30-day intervals)
- Automatic activation at activation_epoch after vote passes
- Appeal mechanism with cooldown period
Effort: XL

P6.2.1 — Governance Limits

Depends on: P6.1.1
Input: MASTER-TASKS.md P6.2 section
Output: src/domains/governance/limits.{ext}, tests/domains/governance/limits.test.{ext}
Acceptance criteria:
- Max delta: ±10% per 6 months for any numeric parameter
- Constitutional pegs: ±30% from genesis requires 3-stage supermajority
- Cooldown: 1 epoch between changes to same parameter
- Stability: max 2 parameters per domain changed simultaneously
- Entropy injection: for >5% delta, 10% of votes (VRF-selected) count as equal-weight
Effort: L

P6.3.1 — Axiom Enforcement

Depends on: P6.1.1, P4.4.1
Input: promises.md (system guarantees), S01 spec (7 constitutional axioms)
Output: src/governance/axiom-enforcement.{ext}, tests/governance/axiom-enforcement.test.{ext}
Acceptance criteria:
- AX-01 (Append-only): all delete operations rejected; corrections via new events
- AX-02 (Derived reputation): no admin reset; only rule engine can change reputation
- AX-03 (No absolute authority): all roles subject to consequences
- AX-04 (Consequence windows): sanction intent → admission → voluntary → automated
- AX-05 (Subjective finality): per local rule engine, not global consensus
- AX-06 (Right to exit): fork allowed; penalty capped at 10%
- AX-07 (Technical sovereignty): row-level security, no cross-workspace leakage
- Each axiom has a guard function: check_axiom_N(proposed_action) → {pass, violation_details}
Effort: L

Phase 7: ξ Identity — Digital Soul Vector

Status: Spec-only. No implementation tasks until Phase 6 is complete.
Concept: docs/3-world/social/identity.md
Extraction: docs/reference/extractions/xi-identity-extraction.md

Phase 7 implements the Digital Soul Vector — a persistent identity fabric for agents and nodes.

P7.1.1 — Identity Schema

Depends on: P0.2.2, P2.1.1, P3.1.2
Input: docs/concepts/ξ-identity.md, xi-identity-extraction.md
Output: src/domains/identity/schema.ts, tests/domains/identity/schema.test.ts
Acceptance criteria:
- 8 identity domains: contribution, governance, reputation, behavior, skills, relationships, history, sovereignty
- 7 character traits (immutable at genesis): curiosity, reliability, fairness, courage, wisdom, creativity, empathy
- Identity hash: SHA-256 of canonical genesis record (immutable after creation)
- Soul-Bound Token (SBT) concept: identity cannot be transferred or sold
- Schema stored in identities table with Ed25519 public key as primary identifier
Effort: L

P7.1.2 — Identity Binding (VRF + Ed25519)

Depends on: P7.1.1, ADR-002 decision
Input: xi-identity-extraction.md (binding section), ADR-002-vrf-implementation.md
Output: src/domains/identity/binding.ts, tests/domains/identity/binding.test.ts
Acceptance criteria:
- Ed25519 keypair generation: generateIdentityKeyPair() → { publicKey, privateKey }
- Identity proof: proveIdentity(privateKey, challenge) → Ed25519 signature
- VRF integration: generateUnpredictableEntropy(privateKey, epoch) → VRF output
- Binding is permanent: once public key registered, cannot be reassigned
- Test: sign + verify roundtrip; VRF output deterministic for same inputs
Effort: L

P7.1.3 — Soul Vector Accumulation

Depends on: P7.1.1, P2.1.1, P3.1.2
Input: xi-identity-extraction.md (accumulation section)
Output: src/domains/identity/accumulator.ts, tests/domains/identity/accumulator.test.ts
Acceptance criteria:
- Each completed task updates identity’s contribution domain
- Each governance vote updates governance domain
- Reputation scores flow from λ into identity’s reputation domain
- getSoulVector(identityId) → 8-domain snapshot at current epoch
- Soul vector is read-only via API; only rule engine can modify it (AX-02 protection)
Effort: L
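
A minimal in-memory sketch of the accumulation and read-only snapshot criteria follows. The SoulVectorStore class, the applyDelta method, and the use of plain integer counters per domain are assumptions for illustration; the source specifies only the 8 domains and getSoulVector.

```typescript
const DOMAINS = [
  "contribution", "governance", "reputation", "behavior",
  "skills", "relationships", "history", "sovereignty",
] as const;

type Domain = (typeof DOMAINS)[number];
type SoulVector = Readonly<Record<Domain, number>>;

function emptyVector(): Record<Domain, number> {
  const vector = {} as Record<Domain, number>;
  for (const domain of DOMAINS) vector[domain] = 0;
  return vector;
}

class SoulVectorStore {
  private readonly vectors = new Map<string, Record<Domain, number>>();

  // Write path: intended to be reachable only from the rule engine (hypothetical wiring).
  applyDelta(identityId: string, domain: Domain, delta: number): void {
    if (!Number.isSafeInteger(delta)) throw new RangeError("delta must be an integer");
    const vector = this.vectors.get(identityId) ?? emptyVector();
    vector[domain] += delta;
    this.vectors.set(identityId, vector);
  }

  // Read path: returns a frozen copy, so callers cannot mutate stored state.
  getSoulVector(identityId: string): SoulVector {
    return Object.freeze({ ...(this.vectors.get(identityId) ?? emptyVector()) });
  }
}

const store = new SoulVectorStore();
store.applyDelta("id-1", "contribution", 1); // e.g. one completed task
store.applyDelta("id-1", "governance", 1);   // e.g. one governance vote
const snapshot = store.getSoulVector("id-1");
```

Returning a frozen copy, not the stored object, is one simple way to keep the API read-only while leaving the write path to a single caller.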

Task Summary

Phase	Tasks	Effort	Depends on
P0 Bootstrap	28 tasks	4-6 weeks	—
P1 κ Rule Engine	10 tasks	3-4 weeks	P0
P2 λ Reputation	7 tasks	2-3 weeks	P0, P1
P3 θ Consensus	7 tasks	5-6 weeks	P0, P1, P2
P4 μ Integrity	10 tasks	3-4 weeks	P1, P2, P3
P5 ι Fork	3 tasks	3-4 weeks	P0, P3
P6 π Governance	3 tasks	3-4 weeks	All
P7 ξ Identity	3 tasks	3-4 weeks	P0, P2, P3, P6
Total	71 tasks	26-35 weeks	—

R57 expansion: Phase 0 grew from 9 high-level bullets → 28 granular tasks with acceptance criteria. Total task count: 32 (pre-R57) → 63 (post-R57); the Task Summary table above now lists 71. Phase 7 ξ Identity was defined for the first time.

Dependency Graph (Mermaid)

graph TD
    P1.1.1[P1.1.1 Integer Math] --> P1.1.2[P1.1.2 Determinism Harness]
    P1.2.1[P1.2.1 Lexer] --> P1.2.2[P1.2.2 Parser]
    P1.2.2 --> P1.2.3[P1.2.3 Validator]
    P1.2.2 --> P1.3.1[P1.3.1 Eval Loop]
    P1.1.1 --> P1.3.1
    P1.3.1 --> P1.3.2[P1.3.2 Builtins]
    P1.3.1 --> P1.3.3[P1.3.3 State Access]
    P1.2.2 --> P1.4.1[P1.4.1 Version Hash]
    P1.4.1 --> P1.4.2[P1.4.2 Migration]
    P1.3.1 --> P1.4.2

    P1.1.1 --> P2.1.1[P2.1.1 Rep Schema]
    P2.1.1 --> P2.1.2[P2.1.2 Score Compute]
    P1.3.1 --> P2.1.2
    P2.1.1 --> P2.2.1[P2.2.1 Decay]
    P1.1.1 --> P2.2.1
    P2.1.1 --> P2.2.2[P2.2.2 Penalties]
    P2.1.1 --> P2.3.1[P2.3.1 Tokens]
    P2.1.2 --> P2.4.1[P2.4.1 Limits]
    P1.3.2 --> P2.4.1

    P1.4.1 --> P3.1.1[P3.1.1 Vote Messages]
    P3.1.1 --> P3.1.2[P3.1.2 Quorum]
    P3.1.2 --> P3.1.3[P3.1.3 View Change]
    P3.1.2 --> P3.2.1[P3.2.1 Finality SM]
    P3.1.1 --> P3.3.1[P3.3.1 Gossip]
    P3.1.1 --> P3.4.1[P3.4.1 Time Anchors]
    P2.1.1 --> P3.4.1

    P1.5.4 --> P4.1.1[P4.1.1 Envelope]
    P1.5.1 --> P4.1.1
    P4.1.1 --> P4.2.1[P4.2.1 Circular]
    P4.1.1 --> P4.2.2[P4.2.2 Coercion]
    P4.1.1 --> P4.2.3[P4.2.3 Drift]
    P0.7.1 --> P4.2.1
    P1.2.4 --> P4.2.1
    P1.4.1 --> P4.2.2
    P1.3.1 --> P4.2.2
    P2.1.2 --> P4.2.2
    P2.2.2 --> P4.2.3
    P1.1.1 --> P4.2.3
    P4.1.1 --> P4.3.1[P4.3.1 Roles]
    P4.2.1 --> P4.4.1[P4.4.1 Escalation]
    P4.2.2 --> P4.4.1
    P4.2.3 --> P4.4.1
    P1.4.1 --> P4.4.1
    P1.2.4 --> P4.4.1
    P4.1.1 --> P4.5.1[P4.5.1 Persistence]
    P4.3.1 --> P4.6.1[P4.6.1 MCP Tools]
    P4.4.1 --> P4.6.1
    P4.5.1 --> P4.6.1
    P4.2.1 --> P4.7.1[P4.7.1 Parity]
    P4.2.2 --> P4.7.1
    P4.2.3 --> P4.7.1
    P4.4.1 --> P4.7.1
    P4.5.1 --> P4.7.1
    P4.2.3 --> P4.8.1[P4.8.1 Fork Hook]
    P3.9.1 --> P4.8.1

    P3.1.2 --> P5.1.1[P5.1.1 Fork Create]
    P1.4.1 --> P5.1.1
    P5.1.1 --> P5.2.1[P5.2.1 Checkpoints]
    P5.1.1 --> P5.3.1[P5.3.1 Fork Merge]
    P5.2.1 --> P5.3.1

    P1.3.1 --> P6.1.1[P6.1.1 Proposals]
    P2.1.1 --> P6.1.1
    P3.1.2 --> P6.1.1
    P6.1.1 --> P6.2.1[P6.2.1 Gov Limits]
    P6.1.1 --> P6.3.1[P6.3.1 Axiom Guards]
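    P4.4.1 --> P6.3.1

    P0.2.2 --> P7.1.1[P7.1.1 Identity Schema]
    P2.1.1 --> P7.1.1
    P3.1.2 --> P7.1.1
    P7.1.1 --> P7.1.2[P7.1.2 Identity Binding]
    P7.1.1 --> P7.1.3[P7.1.3 Soul Vector]
    P2.1.1 --> P7.1.3
    P3.1.2 --> P7.1.3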
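
The graph above can also be held as an edge list and queried for tasks whose dependencies are all complete. The sketch below copies a few Phase 1 edges from the graph; the readyTasks helper and the Edge type are hypothetical and are not part of any Colibri module.

```typescript
type Edge = readonly [from: string, to: string];

// A subset of the Phase 1 edges from the graph above.
const edges: Edge[] = [
  ["P1.1.1", "P1.1.2"],
  ["P1.2.1", "P1.2.2"],
  ["P1.2.2", "P1.2.3"],
  ["P1.2.2", "P1.3.1"],
  ["P1.1.1", "P1.3.1"],
  ["P1.3.1", "P1.3.2"],
];

// Return the tasks that are not done yet and whose dependencies are all done.
function readyTasks(graph: readonly Edge[], done: ReadonlySet<string>): string[] {
  const deps = new Map<string, Set<string>>();
  for (const [from, to] of graph) {
    if (!deps.has(from)) deps.set(from, new Set()); // roots have no dependencies
    const needs = deps.get(to) ?? new Set<string>();
    needs.add(from);
    deps.set(to, needs);
  }
  const ready: string[] = [];
  for (const [task, needs] of deps) {
    if (done.has(task)) continue;
    if ([...needs].every((dep) => done.has(dep))) ready.push(task);
  }
  return ready.sort();
}

const atStart = readyTasks(edges, new Set());
const afterRoots = readyTasks(edges, new Set(["P1.1.1", "P1.2.1"]));
```

With nothing done, only the two roots are ready; P1.3.1 stays blocked until both P1.1.1 and P1.2.2 are complete.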

Agent Execution Protocol

When an AI agent picks up a task from this list:

1. Read the Input files listed for the task
2. Create the Output files at the specified paths (adjust extension for chosen stack)
3. Run all acceptance criteria as tests
4. Record completion via task_update and thought_record
5. Do not start a task whose dependencies are incomplete
6. Do not use floating-point arithmetic in any κ/λ/θ computation path
7. Do not introduce non-deterministic operations (random, clock, I/O) in rule evaluation
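
To illustrate the floating-point rule, here is a sketch of integer-only scaling with bigint. The basis-point scale and the mulBps and decay helpers are hypothetical; the actual κ integer-math API is defined in its own task and may differ.

```typescript
// Floating point is not exact: this comparison is false in IEEE 754 doubles.
const floatSumIsExact = 0.1 + 0.2 === 0.3;

// Assumed fixed-point scale: rates expressed in basis points (1/10,000).
const BPS = 10_000n;

// Multiply an integer value by a basis-point rate.
// bigint division truncates toward zero, so the result is identical on every node.
function mulBps(value: bigint, rateBps: bigint): bigint {
  return (value * rateBps) / BPS;
}

// Apply the same rate once per epoch; the loop count is a plain integer.
function decay(score: bigint, rateBps: bigint, epochs: number): bigint {
  let current = score;
  for (let i = 0; i < epochs; i++) current = mulBps(current, rateBps);
  return current;
}

const afterOne = mulBps(1_000n, 9_500n); // 95% of 1000
const afterTwo = decay(1_000n, 9_500n, 2); // 95% applied twice, truncating each step
```

Truncating at every step is a design choice in this sketch; whichever rounding rule is chosen, it has to be applied identically everywhere for results to stay deterministic.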