Colibri — Deep Task Breakdown for Agent Execution
Purpose: Fine-grained task definitions for AI agent execution (all axes).
Source: docs/colibri-system.md, concept docs (α–π), donor extractions (heritage-only).
Stack: TypeScript 5.3+ · @modelcontextprotocol/sdk · Zod 4 · better-sqlite3 · Jest (ESM).
Quick Start for agents: This file describes what to do. For how to do it (copy-paste ready prompts), see task-prompts/; each Phase 0 group (P0.1 through P0.9) has a corresponding prompt file. For the dependency DAG and critical path, see task-dependency-graph.md. For the round-by-round schedule (R74 → R80), see ../../5-time/roadmap.md.
Phase 0 shape
Phase 0 is 28 sub-tasks numbered P0.1.1 – P0.9.3. The numbering is locked; no slice may introduce new sub-tasks into this range. Numbering:
| Group | Sub-tasks | Concept |
|---|---|---|
| P0.1 | P0.1.1 – P0.1.4 | Project infrastructure |
| P0.2 | P0.2.1 – P0.2.4 | α System Core |
| P0.3 | P0.3.1 – P0.3.4 | β Task Pipeline |
| P0.4 | P0.4.1 – P0.4.2 | γ Server Lifecycle |
| P0.5 | P0.5.1 – P0.5.2 | δ Model Router (Phase 0 stubs; full routing → Phase 1.5) |
| P0.6 | P0.6.1 – P0.6.3 | ε Skill Registry |
| P0.7 | P0.7.1 – P0.7.3 | ζ Decision Trail |
| P0.8 | P0.8.1 – P0.8.3 | η Proof Store |
| P0.9 | P0.9.1 – P0.9.3 | ν Integrations |
Each sub-task has an acceptance criteria list, an effort estimate, and explicit dependencies. Every sub-task lists target files — none of them exist yet. colibri_code: none holds for all 15 Greek concepts.
Heritage note: Donor algorithm pseudocode for every concept is in docs/reference/extractions/. Those files are references, not copy sources. Colibri is a full rewrite; code is earned by the 5-step executor chain (audit → contract → packet → implement → verify), not transcribed.
Phase 0: Colibri Bootstrap (Execution + Intelligence Axis)
These tasks implement the Execution and Intelligence axes from scratch (TypeScript rewrite):
P0.1 — Project Infrastructure
P0.1.1 — Package Setup
- Depends on: nothing
- Output: package.json, tsconfig.json, .eslintrc.json, .prettierrc, .env.example
- Acceptance criteria:
  - package.json: "type": "module", "engines": {"node": ">=20"}, ESM-first
  - TypeScript 5.3+: strict: true, target: ES2022, module: NodeNext
  - tsx for dev; tsc for production build
  - .env.example documents the Phase 0 COLIBRI_* floor (COLIBRI_DB_PATH, COLIBRI_LOG_LEVEL, COLIBRI_STARTUP_TIMEOUT_MS). Additional variables are earned by later concepts; no AMS_* variables.
  - .gitignore excludes node_modules/, dist/, .env, data/colibri.db, data/ams.db
- Effort: S
P0.1.2 — Test Runner + Linter
- Depends on: P0.1.1
- Input: nothing
- Output: jest.config.ts, eslint.config.ts, src/__tests__/smoke.test.ts
- Acceptance criteria:
  - npm test runs Jest with ESM transform
  - npm run lint runs ESLint with zero errors on empty codebase
  - npm run build compiles TypeScript to dist/ with no errors
  - Smoke test: smoke.test.ts asserts 1 + 1 === 2 (verifies test harness works)
  - Code coverage report generated (--coverage flag)
- Effort: S
P0.1.3 — CI Pipeline
- Depends on: P0.1.2
- Input: .github/workflows/ci.yml (existing; may need update)
- Output: .github/workflows/ci.yml updated for TypeScript
- Acceptance criteria:
  - Runs on push to any branch and on PR to main
  - Steps: npm ci → npm run lint → npm test → npm run build
  - Node.js 20+ matrix
  - Fails if any step fails
  - Uploads coverage report artifact
- Effort: S
P0.1.4 — Environment Validation
- Depends on: P0.1.1
- Output: src/config.ts, tests/config.test.ts
- Acceptance criteria:
  - Zod schema validates the Phase 0 COLIBRI_* floor on startup
  - Missing required var → throws with human-readable message listing the missing key
  - Optional vars have typed defaults (COLIBRI_LOG_LEVEL=info, COLIBRI_STARTUP_TIMEOUT_MS=30000)
  - COLIBRI_LOG_LEVEL accepted values: silent | error | warn | info | debug
  - NODE_ENV accepted values: development | test | production
  - Export config object (typed, not raw process.env)
  - Reading any AMS_* variable is a lint/test failure; the donor namespace is not supported
- Effort: S
P0.2 — α System Core
P0.2.1 — MCP Server Bootstrap
- Depends on: P0.1.2
- Output: src/server.ts, tests/server.test.ts
- Acceptance criteria:
  - McpServer created with name: "colibri", version from package.json
  - StdioServerTransport is the only transport in Phase 0 per S17. No HTTP, no WebSocket.
  - Server exports registerTool(name, schema, handler) helper that composes the five-stage α middleware chain (tool-lock → schema validate → audit enter → dispatch → audit exit)
  - At least 1 registered tool: server/ping → returns { status: "ok", version }
  - npm test passes with MCP handshake integration test
- Effort: M
P0.2.2 — SQLite Initialization
- Depends on: P0.1.4
- Input: docs/architecture/data-model.md §2 (earning rule), docs/reference/extractions/alpha-system-core-extraction.md (donor pseudocode, heritage-only)
- Output: src/db/index.ts, src/db/schema.sql, tests/db/init.test.ts
- Acceptance criteria:
  - Uses better-sqlite3 (sync API)
  - schema.sql ships only an empty header + the first migration slot. Tables are added by their owning concept’s P0 sub-task per docs/architecture/data-model.md §2 (β: tasks, task_transitions; ε: skills; ζ: thoughts, actions; η: merkle_nodes, merkle_roots; ν: sync_log). No “78 tables” target.
  - initDb(path) function: creates DB if not exists, applies all numbered migrations in order, returns Database instance
  - Idempotent: calling initDb() twice does not fail or duplicate data
  - WAL mode enabled: PRAGMA journal_mode=WAL
  - Foreign keys enabled: PRAGMA foreign_keys=ON
  - PRAGMA integrity_check runs at startup and fails boot on any error
  - Test: fresh DB passes integrity check
- Effort: L
P0.2.3 — Two-Phase Startup
- Depends on: P0.2.1, P0.2.2
- Input: alpha-system-core-extraction.md (startup section), gamma-server-lifecycle-extraction.md
- Output: src/startup.ts, tests/startup.test.ts
- Acceptance criteria:
  - Phase 1 (transport): MCP transport ready, health check responds, DB not yet loaded
  - Phase 2 (heavy init): DB initialized, all tools registered, all domains loaded
  - startup() returns only after Phase 2 completes
  - If Phase 2 fails, server shuts down gracefully (no hanging process)
  - Startup time logged: console.error("Startup complete in {ms}ms")
  - Test: mock Phase 2 failure → verify clean shutdown
- Effort: M
P0.2.4 — Health Check Tool
- Depends on: P0.2.3
- Input: nothing
- Output: src/tools/health.ts, tests/tools/health.test.ts
- Acceptance criteria:
  - Tool name: server/health
  - Returns: { status, version, uptime_ms, db_tables, phase, mode }
  - db_tables: count of SQLite tables (verifies schema loaded correctly)
  - phase: "phase1" | "phase2"
  - mode: current runtime mode string
  - Response time < 100ms
- Effort: S
P0.3 — β Task Pipeline
P0.3.1 — β Task Pipeline State Machine
- Depends on: P0.2.2
- Input: docs/colibri-system.md §6.3 (canonical FSM), docs/concepts/β-task-pipeline.md
- Output: src/domains/tasks/state-machine.ts, tests/domains/tasks/state-machine.test.ts
- Acceptance criteria:
  - 7 pipeline states defined exactly as in colibri-system.md §6.3: INIT → GATHER → ANALYZE → PLAN → APPLY → VERIFY → DONE, plus CANCELLED as a terminal side-branch reachable from any non-terminal state.
  - Transition map matches the canonical diagram. Unlisted transitions throw InvalidTransitionError with {from, to, taskId}.
  - transition(task, newState) → returns updated task or throws
  - canTransition(from, to) → boolean (no side effects)
  - DONE and CANCELLED are terminal; any transition out of them throws
  - 100% branch coverage (all valid transitions, all invalid transitions, both terminal exits)
- Effort: S
Heritage note: The AMS donor task store used a kanban-style lifecycle (backlog | todo | in_progress | blocked | review | done | cancelled). That vocabulary survives at the PM-facing level (see CLAUDE.md §5: “Only todo tasks are executable”) while data/ams.db remains the task store during Phase 0 bootstrap. The β execution FSM inside Colibri is the canonical INIT..DONE pipeline above, not the donor lifecycle. Mapping between the two belongs to ν Integrations, not β.
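The P0.3.1 criteria can be sketched as a pure lookup over the pipeline order. The transition map below is an assumption (strictly linear forward steps, plus CANCELLED from any non-terminal state); the canonical diagram in colibri-system.md §6.3 is authoritative and may list more edges. The Task shape is hypothetical.

```typescript
type TaskState =
  | "INIT" | "GATHER" | "ANALYZE" | "PLAN"
  | "APPLY" | "VERIFY" | "DONE" | "CANCELLED";

// Hypothetical minimal task shape; the real task row will carry more fields.
interface Task {
  id: string;
  state: TaskState;
}

class InvalidTransitionError extends Error {
  from: TaskState;
  to: TaskState;
  taskId: string;
  constructor(from: TaskState, to: TaskState, taskId: string) {
    super(`Invalid transition ${from} -> ${to} for task ${taskId}`);
    this.name = "InvalidTransitionError";
    this.from = from;
    this.to = to;
    this.taskId = taskId;
  }
}

// Assumed order: each state may only advance to the next one in this list.
const PIPELINE: readonly TaskState[] = [
  "INIT", "GATHER", "ANALYZE", "PLAN", "APPLY", "VERIFY", "DONE",
];

function isTerminal(state: TaskState): boolean {
  return state === "DONE" || state === "CANCELLED";
}

// Pure check with no side effects.
function canTransition(from: TaskState, to: TaskState): boolean {
  if (isTerminal(from)) return false;
  if (to === "CANCELLED") return true;
  return PIPELINE.indexOf(to) === PIPELINE.indexOf(from) + 1;
}

// Returns a new task object; the input task is left unchanged.
function transition(task: Task, newState: TaskState): Task {
  if (!canTransition(task.state, newState)) {
    throw new InvalidTransitionError(task.state, newState, task.id);
  }
  return { ...task, state: newState };
}
```

Keeping canTransition separate from transition lets the tool layer pre-check a requested state change without catching exceptions.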
P0.3.2 — Task CRUD
- Depends on: P0.3.1
- Input: beta-task-pipeline-extraction.md (CRUD section)
- Output: src/domains/tasks/repository.ts, tests/domains/tasks/repository.test.ts
- Acceptance criteria:
  - createTask(input): inserts into tasks table, returns task with generated id (UUID v4)
  - getTask(id): returns task or null
  - updateTask(id, patch): partial update, returns updated task
  - deleteTask(id): soft delete (sets deleted_at)
  - listTasks({ status?, project_id?, limit?, offset? }): filtered + paginated
  - All operations use better-sqlite3 prepared statements (no string interpolation)
  - Test: CRUD roundtrip with all fields
- Effort: M
P0.3.3 — Writeback Contract Enforcement
- Depends on: P0.3.2
- Input: beta-task-pipeline-extraction.md (writeback section)
- Output: src/domains/tasks/writeback.ts, tests/domains/tasks/writeback.test.ts
- Acceptance criteria:
  - writebackRequired(taskId): returns true if task is done but lacks thought_record
  - enforceWriteback(taskId): throws WritebackRequiredError if writeback not complete
  - Runtime blocking: any tool that moves task to done MUST call enforceWriteback before returning
  - WritebackRequiredError includes taskId, missing_fields[]
  - Test: marking task done without thought_record → error; with thought_record → success
- Effort: S
P0.3.4 — Task Tools (MCP surface)
- Depends on: P0.3.3, P0.2.1
- Input: beta-task-pipeline-extraction.md (tools section)
- Output: src/tools/tasks.ts, tests/tools/tasks.test.ts
- Acceptance criteria:
  - task_create tool: Zod input schema, calls createTask, returns task
  - task_get tool: returns task or { error: "not_found" }
  - task_update tool: partial update; validates new state via FSM
  - task_list tool: supports status, limit, offset filters
  - task_next_actions tool: returns list of unblocked todo tasks sorted by priority
  - task_update with status: "done" triggers writeback enforcement
  - All tools have Zod input validation (invalid input → structured error, not crash)
- Effort: M
P0.4 — γ Server Lifecycle
P0.4.1 — Runtime Mode Enum
- Depends on: P0.1.4
- Input: docs/concepts/γ-server-lifecycle.md, docs/reference/extractions/gamma-server-lifecycle-extraction.md (heritage-only)
- Output: src/modes.ts, tests/modes.test.ts
- Acceptance criteria:
  - 4 modes in Phase 0: FULL | READONLY | TEST | MINIMAL. (The donor WATCH mode is excluded; file watching is not in Phase 0 α scope.)
  - detectMode(): reads COLIBRI_MODE env var, defaults to FULL
  - Each mode has a capability set: { canWrite, canRunTests, heavyInit }
  - READONLY: canWrite=false, all write tools return { ok: false, error: { code: "ERR_READONLY" } }
  - MINIMAL: no heavy init; only unified_vitals is registered
  - Test: all 4 modes have correct capability sets
  - No AMS_MODE fallback; donor namespace is not supported
- Effort: S
P0.4.2 — Graceful Shutdown
- Depends on: P0.2.3
- Input: gamma-server-lifecycle-extraction.md (shutdown section)
- Output: src/shutdown.ts, tests/shutdown.test.ts
- Acceptance criteria:
  - registerShutdownHandler(fn): registers a cleanup function
  - On SIGINT / SIGTERM: calls all handlers in reverse registration order
  - DB connection closed before process exit
  - In-flight MCP requests allowed to complete (max 5s timeout then force-exit)
  - Exit code 0 on clean shutdown, 1 on error during shutdown
  - Test: mock SIGTERM → verify DB close + handler called
- Effort: S
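A minimal sketch of the handler registry described in P0.4.2, assuming synchronous cleanup functions (better-sqlite3 closes its connection synchronously). The name runShutdownHandlers is hypothetical, and the 5s in-flight request timeout is deliberately left out of this sketch.

```typescript
type ShutdownHandler = () => void;

const handlers: ShutdownHandler[] = [];

function registerShutdownHandler(fn: ShutdownHandler): void {
  handlers.push(fn);
}

// Runs handlers newest-first and returns the process exit code:
// 0 when every handler succeeds, 1 if any handler throws.
// A failing handler does not stop the remaining ones from running.
function runShutdownHandlers(): number {
  let exitCode = 0;
  for (let i = handlers.length - 1; i >= 0; i--) {
    try {
      handlers[i]();
    } catch {
      exitCode = 1;
    }
  }
  return exitCode;
}

// Signal wiring, shown as comments so the sketch has no side effects:
//   process.once("SIGTERM", () => process.exit(runShutdownHandlers()));
//   process.once("SIGINT", () => process.exit(runShutdownHandlers()));
```

Reverse order means the resource registered first (typically the DB) is released last, after everything that depends on it.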
P0.5 — δ Model Router
Phase 0 note: Phase 0 library stubs shipped in R75 Wave I per ADR-005 §Decision. The router interface is present (scoring + fallback modules, single-row candidate table); scoring returns a constant (claude: 1.0), fallback has one member (Claude), adapter is Anthropic-only. Full multi-model scoring, N-member fallback, and circuit breaker land in Phase 1.5. No δ-facing MCP tools in the Phase 0 14-tool surface; router_* tools are deferred to Phase 1.5.
P0.5.1 — Intent Scoring Matrix
- Status: Phase 0 stub shipped (R75 Wave I, PR #149)
- Input: docs/reference/extractions/delta-model-router-extraction.md (heritage-only; algorithm source)
- Output: src/domains/router/scoring.ts, src/__tests__/domains/router/scoring.test.ts
- Phase 0 shipped: src/domains/router/scoring.ts returns a constant vector ({ claude: 1.0 }) for every input. The module signature matches the Phase 1.5 target so Phase 1.5 is a formula replacement, not an interface rewrite.
- Design contract (Phase 1.5):
  - scoreIntent(prompt, context) → { scores: Record<ModelId, number>, winner: ModelId } (Phase 1.5: real scoring factors)
  - Scoring factors: prompt length, complexity keywords, context size, tool requirements (Phase 1.5)
  - All scores are integers in range [0, 100] (Phase 1.5)
  - Deterministic: same input always returns same winner (the Phase 0 stub is trivially deterministic: it always returns Claude)
  - Pure function (no external API calls); invariant from Phase 0
- Effort: M
P0.5.2 — Model Fallback Chain
- Status: Phase 0 stub shipped (R75 Wave I, PR #150)
- Input: docs/reference/extractions/delta-model-router-extraction.md (heritage-only; algorithm source)
- Output: src/domains/router/fallback.ts, src/__tests__/domains/router/fallback.test.ts
- Phase 0 shipped: src/domains/router/fallback.ts implements a single-member chain: if Claude fails, the call fails; no cascade. Matches ADR-005 §Decision (“fallback chain has one member”).
- Design contract (Phase 1.5):
  - Model slots configured via COLIBRI_MODEL_* env vars (Phase 1.5; count TBD)
  - routeRequest(prompt, context): tries models in priority order (Phase 1.5 multi-member)
  - On model error / timeout: tries next model in chain (Phase 1.5)
  - On exhaustion: throws AllModelsFailedError with per-model error log (Phase 1.5; Phase 0 raises ModelUnavailable on single-member exhaustion)
  - Circuit breaker: model marked unavailable for 60s after 3 consecutive failures (Phase 1.5)
  - No AMS_MODEL_* fallback; donor namespace is not supported (invariant from Phase 0)
- Effort: M
P0.6 — ε Skill Registry
P0.6.1 — Skill Schema
- Depends on: P0.2.2
- Input: epsilon-skill-registry-extraction.md
- Output: src/domains/skills/schema.ts, tests/domains/skills/schema.test.ts
- Acceptance criteria:
  - Zod schema: { name, description, version, entrypoint, capabilities[], greekLetter? }
  - name must be kebab-case: /^[a-z][a-z0-9-]+$/
  - capabilities enum: ["read", "write", "spawn", "audit", "admin"]
  - greekLetter optional: must be one of α β γ δ ε ζ η θ ι κ λ μ ν ξ π
  - SKILL.md parser: reads frontmatter + body from existing .agents/skills/*/SKILL.md
  - Test: parse all 22 existing skill files, assert zero schema errors
- Effort: M
P0.6.2 — Skill CRUD + Discovery
- Depends on: P0.6.1
- Input: docs/concepts/ε-skill-registry.md, docs/reference/extractions/epsilon-skill-registry-extraction.md (heritage-only)
- Output: src/domains/skills/repository.ts, tests/domains/skills/repository.test.ts
- Acceptance criteria:
  - On startup: scans .agents/skills/*/SKILL.md, parses frontmatter, loads all valid skills into the skills table
  - getSkill(name) → skill or null
  - listSkills({ search?, capability? }) → filtered list
  - skill_list MCP tool (the only ε Phase 0 MCP tool): returns all loaded skills with frontmatter metadata; see S17 §1 Category 3.
  - skill_get, skill_reload, hot-reload: not in Phase 0. Deferred to Phase 1.
- Effort: M
P0.6.3 — Skill Capability Index
- Depends on: P0.6.2
- Input: docs/concepts/ε-skill-registry.md
- Output: src/domains/skills/capabilities.ts, tests/domains/skills/capabilities.test.ts
- Acceptance criteria:
  - listByCapability(capability) → skills that declare the capability in frontmatter
  - Capability strings are treated as opaque tags (Phase 0 does not enforce an enum)
  - Startup warning if any SKILL.md declares a capability not used anywhere else (drift detector)
  - Test: seed 3 fake skills with overlapping capabilities; verify filter
- Effort: S
Heritage note: The donor ε module supported skill_get, skill_reload, and a spawnAgent sub-process helper. None of these are in Phase 0. There is no src/domains/agents/ directory in the Phase 0 target tree (CLAUDE.md §9.1); agent spawning is deferred to Phase 1.5 with δ Model Router per ADR-005. Phase 0 ε ships the SKILL.md parser and one MCP tool (skill_list).
P0.7 — ζ Decision Trail
P0.7.1 — Hash-Chained Record Schema
- Depends on: P0.2.2
- Input: zeta-decision-trail-extraction.md
- Output: src/domains/trail/schema.ts, tests/domains/trail/schema.test.ts
- Acceptance criteria:
  - Record schema: { id, type, task_id, agent_id, content, timestamp, prev_hash, hash }
  - 4 valid types: plan | analysis | decision | reflection
  - hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))
  - Canonical JSON: sorted keys, no whitespace (deterministic)
  - First record: prev_hash = "0000...0000" (64 zeros)
  - Test: two records with identical inputs produce identical hashes
- Effort: S
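The hash rule in P0.7.1 can be sketched with Node's built-in crypto module. The helper names (canonicalJson, computeRecordHash, GENESIS_PREV_HASH) and the choice of string-typed fields, including an ISO-8601 timestamp, are assumptions made for illustration; the source only fixes the field list, the SHA-256 digest, and the sorted-key, whitespace-free encoding.

```typescript
import { createHash } from "node:crypto";

type RecordType = "plan" | "analysis" | "decision" | "reflection";

// Fields covered by the hash. String types are an assumption of this sketch.
interface HashedFields {
  id: string;
  type: RecordType;
  task_id: string;
  content: string;
  timestamp: string;
  prev_hash: string;
}

// prev_hash of the first record in a chain: 64 zeros.
const GENESIS_PREV_HASH = "0".repeat(64);

// Serializes with recursively sorted object keys and no whitespace.
function canonicalJson(value: unknown): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalJson(item)).join(",") + "]";
  }
  const obj = value as Record<string, unknown>;
  const members = Object.keys(obj)
    .sort()
    .map((key) => JSON.stringify(key) + ":" + canonicalJson(obj[key]));
  return "{" + members.join(",") + "}";
}

// Picks exactly the six hashed fields so extra properties on the caller's
// object (agent_id, hash) never leak into the digest.
function computeRecordHash(fields: HashedFields): string {
  const { id, type, task_id, content, timestamp, prev_hash } = fields;
  const payload = canonicalJson({ id, type, task_id, content, timestamp, prev_hash });
  return createHash("sha256").update(payload, "utf8").digest("hex");
}
```

Because keys are sorted before serialization, the property order of the input object has no effect on the resulting hash.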
P0.7.2 — Thought Record CRUD
- Depends on: P0.7.1
- Input: zeta-decision-trail-extraction.md (CRUD section)
- Output: src/domains/trail/repository.ts, tests/domains/trail/repository.test.ts
- Acceptance criteria:
  - createThoughtRecord(input): computes hash, links to previous record’s hash
  - getThoughtRecord(id): returns record with hash
  - listThoughtRecords({ task_id?, limit? }): returns chain in insertion order
  - thought_record MCP tool: Zod input, returns record with computed hash
  - thought_record_list MCP tool: returns chain for given task_id
- Effort: M
P0.7.3 — Chain Verification Tool
- Depends on: P0.7.2
- Input: zeta-decision-trail-extraction.md (verification section)
- Output: src/domains/trail/verifier.ts, tests/domains/trail/verifier.test.ts
- Acceptance criteria:
  - verifyChain(records[]): iterates chain, recomputes each hash, checks links
  - Returns: { valid: bool, first_broken_at?: id, broken_count: number }
  - audit_verify_chain MCP tool: calls verifyChain on full DB chain
  - Test: tamper with one record’s content → verify valid: false at correct position
  - Test: intact 100-record chain → valid: true in < 500ms
- Effort: S
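One way to sketch the verification walk from P0.7.3 is shown below. To stay self-contained it takes the hash recomputation as a second parameter, which is an assumption of this sketch (the criteria give verifyChain a single argument); ChainRecord is likewise a trimmed, hypothetical record shape.

```typescript
const GENESIS = "0".repeat(64);

// Trimmed, hypothetical record shape: only what the walk needs.
interface ChainRecord {
  id: string;
  content: string;
  prev_hash: string;
  hash: string;
}

interface ChainVerdict {
  valid: boolean;
  first_broken_at?: string;
  broken_count: number;
}

function verifyChain(
  records: readonly ChainRecord[],
  recompute: (record: ChainRecord) => string,
): ChainVerdict {
  let expectedPrev = GENESIS;
  let firstBroken: string | undefined;
  let brokenCount = 0;
  for (const record of records) {
    const linkOk = record.prev_hash === expectedPrev;
    const hashOk = recompute(record) === record.hash;
    if (!linkOk || !hashOk) {
      brokenCount += 1;
      if (firstBroken === undefined) firstBroken = record.id;
    }
    // Follow the stored hash so a single tampered record is counted once
    // instead of flagging every record after it.
    expectedPrev = record.hash;
  }
  const verdict: ChainVerdict = { valid: brokenCount === 0, broken_count: brokenCount };
  if (firstBroken !== undefined) verdict.first_broken_at = firstBroken;
  return verdict;
}
```

The walk is a single pass over the records, so its cost grows linearly with chain length.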
P0.8 — η Proof Store
P0.8.1 — Merkle Tree Construction
- Depends on: P0.7.2
- Output: src/domains/proof/merkle.ts, tests/domains/proof/merkle.test.ts
- Acceptance criteria:
  - Uses merkletreejs package (SHA-256 leaves)
  - buildMerkleTree(recordHashes[]) → { root, tree }
  - generateProof(tree, leafHash) → proof array
  - verifyProof(root, proof, leafHash) → boolean
  - Empty tree: root = SHA-256(“”) (defined constant)
  - Test: 10-leaf tree → root is deterministic; membership proof verifies correctly
- Effort: S
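To make the three function contracts in P0.8.1 concrete, here is a hand-rolled illustration built on node:crypto. It is not the merkletreejs API, which the task mandates for the real module. Several choices are assumptions of this sketch: an unpaired node is promoted unchanged to the next level, parent hashes are computed over concatenated hex strings, the tree is exposed as a levels array, and a proof is a list of { sibling, siblingOnLeft } steps.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: string): string =>
  createHash("sha256").update(data, "utf8").digest("hex");

// Root of an empty tree: SHA-256 of the empty string.
const EMPTY_ROOT = sha256("");

interface ProofStep {
  sibling: string;
  siblingOnLeft: boolean;
}

interface MerkleTree {
  root: string;
  levels: string[][]; // levels[0] holds the leaves
}

function buildMerkleTree(recordHashes: readonly string[]): MerkleTree {
  if (recordHashes.length === 0) return { root: EMPTY_ROOT, levels: [] };
  let current: string[] = [...recordHashes];
  const levels: string[][] = [current];
  while (current.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < current.length; i += 2) {
      // Pair neighbours; promote an unpaired last node unchanged.
      next.push(i + 1 < current.length ? sha256(current[i] + current[i + 1]) : current[i]);
    }
    levels.push(next);
    current = next;
  }
  return { root: current[0], levels };
}

// Returns null when the leaf is not in the tree.
function generateProof(tree: MerkleTree, leafHash: string): ProofStep[] | null {
  if (tree.levels.length === 0) return null;
  let index = tree.levels[0].indexOf(leafHash);
  if (index < 0) return null;
  const proof: ProofStep[] = [];
  for (let depth = 0; depth < tree.levels.length - 1; depth++) {
    const level = tree.levels[depth];
    const siblingIndex = index ^ 1;
    if (siblingIndex < level.length) {
      proof.push({ sibling: level[siblingIndex], siblingOnLeft: siblingIndex < index });
    }
    index = index >> 1;
  }
  return proof;
}

function verifyProof(root: string, proof: readonly ProofStep[], leafHash: string): boolean {
  let node = leafHash;
  for (const step of proof) {
    node = step.siblingOnLeft ? sha256(step.sibling + node) : sha256(node + step.sibling);
  }
  return node === root;
}
```

A proof for a 10-leaf tree carries at most four steps, one per level below the root, and fewer for leaves that were promoted.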
P0.8.2 — Three-Zone Retention
- Depends on: P0.8.1, P0.2.2
- Input: eta-proof-store-extraction.md (retention section)
- Output: src/domains/proof/retention.ts, tests/domains/proof/retention.test.ts
- Acceptance criteria:
  - Hot zone: last 100 records; full content in DB
  - Warm zone: records 101–1000; content compressed (JSON → gzip → base64)
  - Cold zone: records 1001+; content hash only (full content deleted)
  - archiveRecord(id): moves record to next zone based on age/position
  - retrieveRecord(id): decompresses if Warm, returns hash stub if Cold
  - Test: hot → warm → cold transitions; verify content availability per zone
- Effort: M
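The zone boundaries and the Warm-zone encoding (JSON → gzip → base64) from P0.8.2 can be sketched with Node's zlib module. The function names and the 1-based, newest-first position convention are assumptions of this sketch; the source only fixes the three ranges and the encoding pipeline.

```typescript
import { Buffer } from "node:buffer";
import { gzipSync, gunzipSync } from "node:zlib";

type Zone = "hot" | "warm" | "cold";

// position is 1-based and counted from the newest record (assumption).
function zoneForPosition(position: number): Zone {
  if (position <= 100) return "hot";
  if (position <= 1000) return "warm";
  return "cold";
}

// Warm-zone encoding: JSON -> gzip -> base64.
function compressContent(content: unknown): string {
  const json = JSON.stringify(content);
  return gzipSync(Buffer.from(json, "utf8")).toString("base64");
}

// Inverse of compressContent: base64 -> gunzip -> JSON.
function decompressContent(encoded: string): unknown {
  const json = gunzipSync(Buffer.from(encoded, "base64")).toString("utf8");
  return JSON.parse(json);
}
```

Base64 expands the gzip output by about a third, so very small records may end up larger in the Warm zone than in the Hot zone.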
P0.8.3 — Merkle Root Finalization Tool
- Depends on: P0.8.1
- Input: eta-proof-store-extraction.md (finalization section)
- Output: src/tools/merkle.ts, tests/tools/merkle.test.ts
- Acceptance criteria:
  - merkle_finalize MCP tool: builds Merkle tree of last N unfinalized records, stores root
  - merkle_root MCP tool: returns current root hash + record count + timestamp
  - audit_session_start MCP tool: creates audit session record, returns session_id
  - Finalization must happen AFTER final thought record (enforced: errors if no thought_record in session)
  - Test: finalize 5-record session → root matches manual computation
- Effort: S
P0.9 — ν Integrations
P0.9.1 — MCP Bridge
- Depends on: P0.2.1
- Input: nu-integrations-extraction.md (bridge section)
- Output: src/domains/integrations/mcp-bridge.ts, tests/domains/integrations/mcp-bridge.test.ts
- Acceptance criteria:
  - McpBridge: wraps outbound MCP client calls to external servers
  - connectToServer(url): creates client, returns connected bridge
  - callTool(bridge, name, args): calls remote tool, returns result
  - Timeout: 30s default, configurable via COLIBRI_MCP_TIMEOUT
  - Retry: 3 attempts with exponential backoff on transient errors
  - Test: mock MCP server → verify roundtrip tool call
- Effort: M
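The retry criterion in P0.9.1 (3 attempts, exponential backoff, transient errors only) could look roughly like the helper below. The 200 ms base delay, the doubling schedule, and the isTransient callback are assumptions of this sketch; the source does not specify them.

```typescript
// Delay before retry number `retry` (1-based): base, 2*base, 4*base, ...
function backoffDelayMs(retry: number, baseMs: number): number {
  return baseMs * 2 ** (retry - 1);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `operation` up to maxAttempts times. Non-transient errors and the
// final failed attempt are rethrown to the caller unchanged.
async function withRetry<T>(
  operation: () => Promise<T>,
  isTransient: (error: unknown) => boolean,
  maxAttempts = 3,
  baseMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (!isTransient(error) || attempt === maxAttempts) break;
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}
```

With the assumed defaults, a call that fails three times waits 200 ms and then 400 ms before giving up.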
P0.9.2 — Claude API Wrappers
- Depends on: P0.1.4
- Input: nu-integrations-extraction.md (Claude API section)
- Output: src/domains/integrations/claude.ts, tests/domains/integrations/claude.test.ts
- Acceptance criteria:
  - createCompletion(prompt, options): calls Anthropic API with configured model
  - createCompletionWithTools(prompt, tools, options): tool-use completion
  - API key from ANTHROPIC_API_KEY env var (declared .optional() in the Zod schema; validated at call-time by createCompletion / createCompletionWithTools, which throw AnthropicConfigError if absent). This is Design Invariant 5: the server boots cleanly when the key is unset for deployments that don’t use the Claude API integration. R75 Wave H reconciled this acceptance criterion with the shipped code in src/config.ts:79 + src/domains/integrations/claude.ts.
  - Rate limit handling: 429 → exponential backoff, max 3 retries
  - All API calls logged with: model, prompt_tokens, completion_tokens, latency_ms
  - Test (mock): verify retry logic and logging
- Effort: M
P0.9.3 — Notification Channels
- Depends on: P0.2.1
- Input: nu-integrations-extraction.md (notifications section)
- Output: src/domains/integrations/notifications.ts, tests/domains/integrations/notifications.test.ts
- Acceptance criteria:
  - notify(event, payload): dispatches event to configured channels
  - Channels: log (always on), mcp (MCP notification), webhook (optional)
  - COLIBRI_WEBHOOK_URL env var enables webhook channel (Phase 0 uses the COLIBRI_* namespace; AMS_* is not read)
  - Events: task.completed, merkle.finalized, error.critical (no agent.spawned; agent runtime is deferred per ADR-005)
  - Fire-and-forget: notification failures do not block main execution
  - Test: verify each channel receives correct payload for each event type
- Effort: S
How to Read This Document
Each task has:
- ID: P{phase}.{subtask}.{step} (e.g., P1.2.3)
- Depends on: which tasks must be complete first
- Input: what the agent reads before starting
- Output: exact files to create or modify
- Acceptance criteria: testable conditions (pass/fail)
- Estimated effort: S (1-2h), M (4-8h), L (1-2d), XL (3-5d)
Phase 1: κ Rule Engine
Phase 1 starts at R81 per docs/5-time/roadmap.md. Ready-to-paste agent prompts for every sub-task below live in task-prompts/p1.1-kappa-rule-engine.md (shipped R76.P1, 2026-04-18).
Structural overview: 5 groups, 20 sub-tasks in total. All output paths target src/domains/rules/... (Phase 0 convention). Concept reference: docs/3-world/physics/laws/rule-engine.md. Algorithm extraction: docs/reference/extractions/kappa-rule-engine-extraction.md.
P1.1 — Integer Math Library (3 sub-tasks)
P1.1.1 — Basis Point Arithmetic
- Depends on: nothing
- Input: docs/3-world/physics/laws/rule-engine.md (Integer-only arithmetic section), docs/reference/extractions/kappa-rule-engine-extraction.md §3–4
- Output: src/domains/rules/integer-math.ts, src/domains/rules/__tests__/integer-math.test.ts
- Acceptance criteria:
  - All arithmetic uses 64-bit signed integers, no floating point anywhere
  - bps_mul(value, bps) → (value * bps) / 10000 (floor division)
  - bps_div(value, bps) → (value * 10000) / bps (floor division)
  - apply_bps(value, bps) → value - bps_mul(value, bps) (decay variant)
  - decay(value, rate_bps, epochs) → multi-epoch compounded decay with per-step floor
  - Overflow detection: reject inputs where value * bps would exceed 2^63 - 1
  - Underflow: result never goes below 0 for non-negative inputs
  - Division by zero: explicit error, not silent wrap
  - 100% branch coverage in tests
- Effort: S
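A minimal sketch of the basis-point helpers in P1.1.1, using bigint to stand in for 64-bit signed integers. Rejecting negative inputs, throwing RangeError, and clamping apply_bps at zero are assumptions of this sketch; the source requires explicit errors and a non-negative result but does not name the error types.

```typescript
const BPS_DENOMINATOR = 10000n;
const MAX_INT64 = (1n << 63n) - 1n;

function assertNonNegative(name: string, v: bigint): void {
  if (v < 0n) throw new RangeError(`${name} must be non-negative`);
}

// (value * bps) / 10000. bigint division truncates, which equals floor
// division for the non-negative inputs accepted here.
function bps_mul(value: bigint, bps: bigint): bigint {
  assertNonNegative("value", value);
  assertNonNegative("bps", bps);
  const product = value * bps;
  if (product > MAX_INT64) throw new RangeError("bps_mul: value * bps exceeds 2^63 - 1");
  return product / BPS_DENOMINATOR;
}

// (value * 10000) / bps, with an explicit divide-by-zero error.
function bps_div(value: bigint, bps: bigint): bigint {
  assertNonNegative("value", value);
  assertNonNegative("bps", bps);
  if (bps === 0n) throw new RangeError("bps_div: division by zero");
  const scaled = value * BPS_DENOMINATOR;
  if (scaled > MAX_INT64) throw new RangeError("bps_div: value * 10000 exceeds 2^63 - 1");
  return scaled / bps;
}

// value minus its bps share, clamped so the result never drops below 0.
function apply_bps(value: bigint, bps: bigint): bigint {
  const cut = bps_mul(value, bps);
  return cut > value ? 0n : value - cut;
}

// Compounded decay: the floor is taken after every epoch, not once at the end.
function decay(value: bigint, rate_bps: bigint, epochs: number): bigint {
  let current = value;
  for (let i = 0; i < epochs; i++) {
    current = apply_bps(current, rate_bps);
  }
  return current;
}
```

For example, decaying 1000 at 500 bps for two epochs gives 950 after the first epoch and 903 after the second, because the second cut of 47.5 is floored to 47.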
P1.1.2 — Determinism Verification Harness
- Depends on: P1.1.1
- Input: rule-engine.md (Forbidden operations section)
- Output: src/domains/rules/__tests__/determinism.test.ts
- Acceptance criteria:
  - Property test: for any two runs with identical inputs, outputs are bit-identical
  - No Math.random(), Date.now(), process.hrtime(), or equivalent
  - No async I/O in computation path
  - Fuzz test: 10,000 random input pairs produce identical results under both call orderings
  - Static analysis: grep check rejects any use of Math.* or Date.* outside tests
- Effort: S
P1.1.3 — BPS Constants + Overflow Protection
- Depends on: P1.1.1
- Input: rule-engine.md (basis-point conventions), extraction §3 (BPS Constants) + §4 (Overflow Protection)
- Output: src/domains/rules/bps-constants.ts, src/domains/rules/__tests__/bps-constants.test.ts
- Acceptance criteria:
  - Exported constants: BPS_100_PERCENT=10000, BPS_50_PERCENT=5000, BPS_1_PERCENT=100
  - Domain decay rates: DECAY_EXECUTION=500, DECAY_COMMISSIONING=300, DECAY_ARBITRATION=1000, DECAY_GOVERNANCE=200, DECAY_SOCIAL=100
  - Penalty constants: DAMAGE_MINOR=1500, DAMAGE_MODERATE=3000, DAMAGE_SEVERE=5000, DAMAGE_CRITICAL=8000, DAMAGE_FRAUD=10000
  - safe_mul(a, b) returns typed overflow error when |a| > MAX_INT64 / |b|
  - safe_div(a, b) returns typed divide-by-zero error when b == 0
  - Constants are as const, not let or mutable
- Effort: S
P1.2 — DSL Parser (4 sub-tasks)
P1.2.1 — Lexer / Tokenizer
- Depends on: nothing
- Input: rule-engine.md (DSL grammar section), extraction §1 (Full EBNF Grammar), ADR-006-dsl-grammar.md
- Output: src/domains/rules/lexer.ts, src/domains/rules/__tests__/lexer.test.ts
- Acceptance criteria:
  - Uses Chevrotain library (pinned in package.json per ADR-006)
  - Token types: KEYWORD, IDENTIFIER, INTEGER, STRING, OPERATOR, DELIMITER, EOF
  - Keywords: rule, guards, effects, when, then, if, else, and, or, not, true, false, admit, reject, admission, transition, consequence, promotion
  - Operators: ==, !=, >, <, >=, <=, +, -, *, /, %
  - Variables: start with $, dot-path dereference $actor.reputation.execution
  - Line/column tracking for error messages
  - Rejects floating-point literals (e.g., 3.14 is a syntax error)
  - Rejects underscore-separated integer literals (1_000_000 invalid)
  - Unicode identifiers supported (for future i18n)
- Effort: M
P1.2.2 — Parser (Tokens → AST)
- Depends on: P1.2.1
- Input: rule-engine.md (DSL grammar), extraction §1 (EBNF) + §2 (AST Node Types), ADR-006
- Output: src/domains/rules/parser.ts, src/domains/rules/__tests__/parser.test.ts
- Acceptance criteria:
  - Chevrotain parser built on top of P1.2.1 lexer
  - Parses the 4 rule types: Admission, StateTransition, Consequence, Promotion
  - AST node types match extraction §2: RuleNode, GuardClause, EffectCall, BinaryOp, UnaryOp, LogicalOp, IntLiteral, BoolLiteral, StringLiteral, VarRef, FuncCall
  - Operator precedence: NOT > AND > OR; *, /, % > +, -; comparison > logical
  - Guard blocks: guards { <clauses> }; each clause (Expression | else) -> (admit | reject STRING)
  - Effect blocks: effects { <calls> }; each call IDENTIFIER ( ArgList )
  - Error recovery: recoveryEnabled: true; reports first 5 errors, doesn’t crash on malformed input
  - Round-trip test: parse → serialize → parse produces identical AST
  - AST cap enforcement: rejects any single rule with > 10,000 AST nodes at parse time
- Effort: L
P1.2.3 — AST Validator
- Depends on: P1.2.2
- Input: rule-engine.md (Forbidden operations table)
- Output: src/domains/rules/validator.ts, src/domains/rules/__tests__/validator.test.ts
- Acceptance criteria:
  - Rejects rules that read local state (clock, filesystem, network, process)
  - Rejects rules with randomness (except VRF-input references: $vrf_output)
  - Rejects rules with side effects (HTTP calls, file writes, stdout)
  - Rejects rules that mutate input events
  - Type checking: operands compatible with operators (no int + string)
  - Scope checking: variables defined before use
  - Cycle detection: no infinite recursion in rule references
  - Axiom pre-check: rejects rules that violate AX-01–AX-07 at load time (constitutional axioms)
- Effort: M
P1.2.4 — Rule Loader / Registry
- Depends on: P1.2.3
- Input: rule-engine.md (Rule application algorithm), extraction §8 (process_action)
- Output: src/domains/rules/registry.ts, src/domains/rules/__tests__/registry.test.ts
- Acceptance criteria:
  - loadRuleset(source: string): RuleRegistry (parses + validates + indexes a source file)
  - Registry sorts rules by specificity: (a) guard term count descending, (b) declaration order
  - Specificity ties at load time → explicit AmbiguousRulesetError (refuse boot)
  - getRule(name: string): RuleNode | null (named lookup)
  - getByTransitionType(type: TransitionType): RuleNode[] (indexed lookup by one of the 13 transition types from extraction §7)
  - computeVersionHash(): string (delegates to P1.5.1 canonical serializer)
  - Load-time error aggregation: reports all validator errors in one pass, not just first
- Effort: M
P1.3 — Deterministic Interpreter (4 sub-tasks)
P1.3.1 — Core Evaluation Loop
- Depends on: P1.2.2, P1.1.1
- Input: rule-engine.md (Rule application algorithm, Evaluation budget), extraction §5 (Rule Execution Flow)
- Output: src/domains/rules/engine.ts, src/domains/rules/__tests__/engine.test.ts
- Acceptance criteria:
  - Evaluates AST nodes recursively with an immutable context
  - Rule execution order: Admission → StateTransition → Consequence → Promotion (fixed)
  - Within each category: alphabetical by rule name (stable ordering)
  - First-match-wins: once a guard matches, remaining guards in the same rule are skipped
  - Context contains: event, current_state (read-only snapshot), rule_version, epoch, actor binding
  - Returns: list of {type, target, field, old_value, new_value} mutations
  - No mutations applied during evaluation (collect-then-apply pattern)
  - Op cap: MAX_INTEGER_OPS=10_000; abort with RuleBudgetExceeded("integer_ops")
  - Depth cap: MAX_CALL_DEPTH=16; abort with RuleBudgetExceeded("call_depth")
  - Arg cap: MAX_ARG_COUNT=8; abort with RuleBudgetExceeded("arg_count")
- Effort: L
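The three evaluation caps in P1.3.1 can be enforced by a small counter object that the interpreter consults as it walks the AST. The class and method names below (EvaluationBudget, chargeOps, enterCall, exitCall) are hypothetical; only the three constants and the RuleBudgetExceeded tags come from the criteria.

```typescript
type BudgetKind = "integer_ops" | "call_depth" | "arg_count";

class RuleBudgetExceeded extends Error {
  kind: BudgetKind;
  constructor(kind: BudgetKind) {
    super(`Rule evaluation budget exceeded: ${kind}`);
    this.name = "RuleBudgetExceeded";
    this.kind = kind;
  }
}

const MAX_INTEGER_OPS = 10_000;
const MAX_CALL_DEPTH = 16;
const MAX_ARG_COUNT = 8;

// One instance per rule evaluation; counters are never reset mid-run.
class EvaluationBudget {
  private ops = 0;
  private depth = 0;

  // Charge `count` integer ops; aborts once the running total passes the cap.
  chargeOps(count = 1): void {
    this.ops += count;
    if (this.ops > MAX_INTEGER_OPS) throw new RuleBudgetExceeded("integer_ops");
  }

  // Called when the interpreter enters a function call.
  enterCall(argCount: number): void {
    if (argCount > MAX_ARG_COUNT) throw new RuleBudgetExceeded("arg_count");
    if (this.depth + 1 > MAX_CALL_DEPTH) throw new RuleBudgetExceeded("call_depth");
    this.depth += 1;
  }

  // Called when the interpreter returns from a function call.
  exitCall(): void {
    this.depth -= 1;
  }
}
```

Because the counters are plain integers with no clock involved, the same rule and input always hit the same cap at the same point.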
P1.3.2 — Built-in Functions
- Depends on: P1.3.1, P1.1.1
- Input: rule-engine.md (Built-in functions table), extraction §3 (8 Built-in Functions)
- Output: src/domains/rules/builtins.ts, src/domains/rules/__tests__/builtins.test.ts
- Acceptance criteria:
  - min(a, b), max(a, b), abs(a), cap(v, m): integer only
  - clamp(v, lo, hi): max(lo, min(v, hi))
  - isqrt(n): Newton’s method integer square root (from extraction §3 pseudocode)
  - ilog2(n): integer floor of log base 2
  - decay(v, rate_bps): single-epoch decay; delegates to integer-math library
  - diminishing(v, k): (v * k) / (k + v) diminishing-returns transform
  - bps_mul(v, b), bps_div(v, b): delegate to P1.1.1
  - hash(data): SHA-256 hex string
  - vrf_verify(pk, proof, input): VRF proof verification per ADR-002
  - All functions are pure (no side effects, same input = same output)
  - Each function counts as 1 or more integer ops against the evaluation budget (documented table)
- Effort: M
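Four of the P1.3.2 built-ins are small enough to sketch directly. This uses bigint for integer-only arithmetic and a textbook Newton iteration for isqrt; it is not taken from the extraction §3 pseudocode. The input-domain checks (RangeError on negative or zero arguments) are assumptions of this sketch.

```typescript
// clamp(v, lo, hi) = max(lo, min(v, hi))
function clamp(v: bigint, lo: bigint, hi: bigint): bigint {
  const capped = v < hi ? v : hi;
  return capped > lo ? capped : lo;
}

// Integer square root: largest r with r * r <= n, via Newton's method.
function isqrt(n: bigint): bigint {
  if (n < 0n) throw new RangeError("isqrt: negative input");
  if (n < 2n) return n;
  let x = n;
  let y = (x + 1n) / 2n;
  while (y < x) {
    x = y;
    y = (x + n / x) / 2n;
  }
  return x;
}

// Floor of log base 2, computed by counting right shifts.
function ilog2(n: bigint): bigint {
  if (n <= 0n) throw new RangeError("ilog2: input must be positive");
  let remaining = n;
  let result = 0n;
  while (remaining > 1n) {
    remaining >>= 1n;
    result += 1n;
  }
  return result;
}

// Diminishing-returns transform (v * k) / (k + v), floored.
// The result approaches k from below as v grows.
function diminishing(v: bigint, k: bigint): bigint {
  if (v < 0n || k <= 0n) throw new RangeError("diminishing: v >= 0 and k > 0 required");
  return (v * k) / (k + v);
}
```

All four functions are pure and use no floating point, so they fit the determinism rules of P1.1.2.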
P1.3.3 — State Access Layer
- Depends on: P1.3.1
- Input: rule-engine.md (State Access Pattern), extraction §10 (ReadOnlyState Interface)
- Output: src/domains/rules/state-access.ts, src/domains/rules/__tests__/state-access.test.ts
- Acceptance criteria:
  - Read-only state snapshot provided to rules (frozen object or copy-on-write proxy)
  - State keys: reputation[node][domain], tokens[node], stake[node], epoch, event_count, fork_id, rule_version
  - with_binding(name, value) returns new context; original unchanged
  - No direct database access from rules; state is pre-loaded by the host
  - State diff output: {key, old_value, new_value} for each mutation
  - Merkle proof generation for state reads (verifiable by other nodes); hooks into η
  - Mutation attempts throw ReadOnlyStateError immediately (fail-fast)
- Effort: M
P1.3.4 — Policy Gating / Pre-guards
- Depends on: P1.3.1
- Input: extraction §9 (Policy Gating: check_policy)
- Output: src/domains/rules/policy-gate.ts, src/domains/rules/__tests__/policy-gate.test.ts
- Acceptance criteria:
  - Policy enum P1–P13 per extraction §9
  - Each policy is a pure DSL expression (reuses P1.2.2 parser + P1.3.1 evaluator)
  - check_policy(id, actor, context) → {admitted, reason?}
  - check_all_policies(action, actor, context) → short-circuits on first failure
  - Policies run BEFORE named rule evaluation (pre-guards)
  - Policies share evaluation budget with named rules (same 10k op cap)
  - Each policy has rejection reason pre-registered (no dynamic strings)
- Effort: M
P1.4 — Admission Layer (4 sub-tasks)
P1.4.1 — Admission Evaluator
- Depends on: P1.3.1, P1.3.4, P1.2.4
- Input: rule-engine.md (Admission layer), docs/spec/s10-admission.md
- Output: `src/domains/rules/admission.ts`, `src/domains/rules/__tests__/admission.test.ts`
- Acceptance criteria:
- `evaluateAdmission({caller, tool, mode, rep_snapshot, rule_version}): AdmissionResult`
- Returns `{admitted: true, effect_mutations: [...]} | {admitted: false, reason: DenialReason}`
- Runs policy pre-guards first (P1.3.4), then named rules (P1.3.1)
- Rule version stamp on every admission record
- Pure function — no DB writes, no network calls
- Timing independent of input values (constant-time comparison for sensitive fields)
- Integration test: ≥20 representative (caller, tool, mode) tuples with expected verdicts
- Effort: L
P1.4.2 — Denial Reason Taxonomy
- Depends on: P1.4.1
- Input: rule-engine.md (Rule application algorithm return values), extraction §5 (Rule Execution Flow)
- Output: `src/domains/rules/denial-reasons.ts`, `src/domains/rules/__tests__/denial-reasons.test.ts`
- Acceptance criteria:
- Typed discriminated union: `no_rule_matched`, `budget:integer_ops`, `budget:call_depth`, `budget:arg_count`, `effect_invariant_violated`, `axiom_violation:AX-01..AX-07`, `policy:P1..P13`, `rule_version_mismatch`, `ambiguous_ruleset`
- Each reason carries a structured `details` payload (no freeform strings)
- Reason codes stable across upgrades (additive-only changes; no renumbering)
- `toString(reason)` produces operator-readable rendering
- JSON serialization preserves discriminant tag
- Effort: S
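
The taxonomy above maps naturally onto a TypeScript discriminated union. The sketch below is one possible shape, not the contract: the `kind` tag, the fields inside each `details` payload, and the `reasonCode` helper are all hypothetical.

```typescript
// Sketch only: the `details` shapes and `reasonCode` are illustrative assumptions.
type AxiomId = "AX-01" | "AX-02" | "AX-03" | "AX-04" | "AX-05" | "AX-06" | "AX-07";
type PolicyId =
  | "P1" | "P2" | "P3" | "P4" | "P5" | "P6" | "P7"
  | "P8" | "P9" | "P10" | "P11" | "P12" | "P13";
type BudgetKind = "integer_ops" | "call_depth" | "arg_count";

type DenialReason =
  | { kind: "no_rule_matched"; details: { rule_count: number } }
  | { kind: "budget"; details: { counter: BudgetKind; limit: number } }
  | { kind: "effect_invariant_violated"; details: { invariant_id: string } }
  | { kind: "axiom_violation"; details: { axiom: AxiomId } }
  | { kind: "policy"; details: { policy: PolicyId } }
  | { kind: "rule_version_mismatch"; details: { expected: string; actual: string } }
  | { kind: "ambiguous_ruleset"; details: { rule_ids: string[] } };

// Render the stable reason code, e.g. "budget:call_depth" or "policy:P3".
function reasonCode(reason: DenialReason): string {
  switch (reason.kind) {
    case "budget":
      return `budget:${reason.details.counter}`;
    case "axiom_violation":
      return `axiom_violation:${reason.details.axiom}`;
    case "policy":
      return `policy:${reason.details.policy}`;
    case "no_rule_matched":
    case "effect_invariant_violated":
    case "rule_version_mismatch":
    case "ambiguous_ruleset":
      return reason.kind;
    default: {
      // Compile-time exhaustiveness check: adding a variant without a case fails here.
      const unreachable: never = reason;
      return unreachable;
    }
  }
}
```

Because the tag is a plain string property, `JSON.stringify` followed by `JSON.parse` preserves it without custom serializers, and the `never` check keeps additive changes from being silently ignored.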
P1.4.3 — Admission Budgets
- Depends on: P1.3.1
- Input: rule-engine.md (Evaluation budget, Default budget constants)
- Output: `src/domains/rules/budget.ts`, `src/domains/rules/__tests__/budget.test.ts`
- Acceptance criteria:
- Budget tracker class with counters: `integer_ops`, `call_depth`, `current_arg_count`
- Limits: `MAX_INTEGER_OPS=10_000`, `MAX_CALL_DEPTH=16`, `MAX_ARG_COUNT=8` (constants from rule-engine.md)
- On exceed: throw `RuleBudgetExceeded` with which-counter-fired field
- Instrumentation hooks: emit `budget.tick` events for α’s audit layer (count only, no payload)
- Budget state is reset per-rule (no leaking across rules in a ruleset)
- Limits part of the rule version hash (P1.5.1) — changing them forces a new version
- Effort: M
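
The tracker described above might look roughly like the following. The limits are the constants named in the criteria; the method names (`chargeOps`, `enterCall`, `exitCall`, `reset`) and the `onTick` callback standing in for the `budget.tick` hook are assumptions made for illustration.

```typescript
// Sketch only: method names and the onTick callback shape are hypothetical.
const MAX_INTEGER_OPS = 10000;
const MAX_CALL_DEPTH = 16;
const MAX_ARG_COUNT = 8;

type BudgetCounter = "integer_ops" | "call_depth" | "current_arg_count";

class RuleBudgetExceeded extends Error {
  readonly counter: BudgetCounter; // which counter fired
  constructor(counter: BudgetCounter) {
    super(`rule budget exceeded: ${counter}`);
    this.name = "RuleBudgetExceeded";
    this.counter = counter;
  }
}

class BudgetTracker {
  private integerOps = 0;
  private callDepth = 0;
  private readonly onTick: (counter: BudgetCounter, count: number) => void;

  // onTick receives counts only, never rule payloads.
  constructor(onTick: (counter: BudgetCounter, count: number) => void = () => {}) {
    this.onTick = onTick;
  }

  chargeOps(ops: number = 1): void {
    this.integerOps += ops;
    this.onTick("integer_ops", this.integerOps);
    if (this.integerOps > MAX_INTEGER_OPS) throw new RuleBudgetExceeded("integer_ops");
  }

  enterCall(argCount: number): void {
    this.onTick("current_arg_count", argCount);
    if (argCount > MAX_ARG_COUNT) throw new RuleBudgetExceeded("current_arg_count");
    this.callDepth += 1;
    this.onTick("call_depth", this.callDepth);
    if (this.callDepth > MAX_CALL_DEPTH) throw new RuleBudgetExceeded("call_depth");
  }

  exitCall(): void {
    this.callDepth -= 1;
  }

  // Called before each rule so that no budget leaks across rules in a ruleset.
  reset(): void {
    this.integerOps = 0;
    this.callDepth = 0;
  }
}
```

An evaluator would call `reset()` before each rule, `chargeOps()` per builtin according to the documented cost table, and wrap nested calls in `enterCall` / `exitCall`.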
P1.4.4 — Tool-Lock Integration Spec
- Depends on: P1.4.1, P1.4.2, P1.4.3
- Input: rule-engine.md (Admission layer), docs/2-plugin/middleware.md (5-stage wrapper at α), docs/spec/s10-admission.md
- Output: `src/domains/rules/tool-lock-adapter.ts`, `src/domains/rules/__tests__/tool-lock-adapter.test.ts`
- Acceptance criteria:
- `createToolLockAdapter(ruleRegistry): MiddlewareStage` — factory
- Output is a stage-1 middleware function signature matching α’s 5-stage wrapper contract (`tool-lock → schema-validate → audit-enter → dispatch → audit-exit`)
- Admission denials short-circuit the middleware chain (stages 2–5 skipped)
- Denials emit structured event to audit layer before returning
- Integration test: wire a test ruleset into a test server; verify admission decisions end-to-end
- Zero registration with server boot in R76 — this sub-task lands the adapter; α’s `src/server.ts` wiring is a separate R81+ PR
- Effort: M
P1.5 — Governance / Rule Versioning (5 sub-tasks)
P1.5.1 — Version Hash Computation
- Depends on: P1.2.2, P1.5.4
- Input: rule-engine.md (Rule versioning section)
- Output: `src/domains/rules/versioning.ts`, `src/domains/rules/__tests__/versioning.test.ts`
- Acceptance criteria:
- `computeVersionHash(ruleset, engine_version): string` — returns hex SHA-256
- Hash input: `canonical_serialization(all_rules) || engine_version`
- Canonical serialization via P1.5.4 (sorted-key deterministic JSON)
- Version stored in event metadata: `{rule_version: "sha256:abc..."}`
- Version mismatch detection: events with wrong `rule_version` are rejected via P1.4.2 taxonomy code `rule_version_mismatch`
- Test: two logically-equivalent but differently-ordered rulesets produce identical hash (canonical property)
- Effort: S
P1.5.2 — Rule Migration
- Depends on: P1.5.1, P1.3.1, P1.5.5
- Input: rule-engine.md (Test corpus parity requirement), docs/3-world/physics/enforcement/governance.md (π versioning section, when landed)
- Output: `src/domains/rules/migration.ts`, `src/domains/rules/__tests__/migration.test.ts`
- Acceptance criteria:
- `migrateRuleset(old, new, corpus): MigrationResult` — runs parity harness (P1.5.5)
- Test corpus: ≥100 representative events
- Activation epoch: new rules take effect at epoch N+1 (not immediately)
- Parity requirement: `h_old == h_new` for every corpus event both versions admit
- Divergence set must match proposal’s declared scope or migration is rejected
- Rollback: if parity fails, old ruleset remains active and proposal is marked `rejected:parity`
- Fork trigger: nodes that reject migration automatically fork (link to ι — deferred to Phase 5 wiring)
- Effort: L
P1.5.3 — Activation Epoch + Rollback
- Depends on: P1.5.1, P1.5.2
- Input: rule-engine.md (Rule versioning), docs/spec/s11-rule-engine.md
- Output: `src/domains/rules/activation.ts`, `src/domains/rules/__tests__/activation.test.ts`
- Acceptance criteria:
- `scheduleActivation(new_version, target_epoch): ActivationToken` — target_epoch must be `current_epoch + 1` minimum
- `applyActivation(token, current_epoch): void` — applies only when `current_epoch >= target_epoch`
- `rollback(version)` — reinstates prior version; emits rollback event
- Activation journal: append-only log of `(epoch, version_hash, cause)` tuples
- Rollback does not retroactively invalidate events admitted under rolled-back version — those stand
- Rollback during dispute window triggers π governance review hook (hook name only — π not implemented)
- Effort: M
P1.5.4 — Canonical Serialization
- Depends on: P1.2.2
- Input: rule-engine.md (Rule versioning: “canonical serialization of the rule bodies”)
- Output: `src/domains/rules/canonical.ts`, `src/domains/rules/__tests__/canonical.test.ts`
- Acceptance criteria:
- `canonicalize(ast_or_ruleset): string` — produces byte-identical output on any platform
- Keys sorted alphabetically at every object level
- No whitespace (single-line JSON)
- Integer literals preserved exactly (no `1e3` normalization, no leading zeros)
- String escapes use canonical JSON form (`\"`, `\n`, `\u00XX`)
- Property test: `canonicalize(parse(canonicalize(parse(x)))) == canonicalize(parse(x))` — idempotent round-trip
- No locale dependence (sort uses codepoint order, not locale-aware collation)
- Effort: M
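
A minimal sketch of the canonicalization rules above, assuming the input has already been reduced to plain JSON values (the `Json` type and `compareByCodePoint` helper are hypothetical). It sorts keys by Unicode code point rather than by JavaScript's default UTF-16 code-unit order, since the two differ for characters outside the Basic Multilingual Plane.

```typescript
// Sketch only: `Json` and `compareByCodePoint` are illustrative names.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Compare two strings by Unicode code point, independent of locale.
function compareByCodePoint(a: string, b: string): number {
  const pa = Array.from(a); // Array.from splits a string into code points
  const pb = Array.from(b);
  const shared = Math.min(pa.length, pb.length);
  for (let i = 0; i < shared; i++) {
    const diff = (pa[i] as string).codePointAt(0)! - (pb[i] as string).codePointAt(0)!;
    if (diff !== 0) return diff;
  }
  return pa.length - pb.length;
}

function canonicalize(value: Json): string {
  if (value === null || typeof value === "boolean" || typeof value === "string") {
    return JSON.stringify(value); // JSON.stringify emits standard JSON string escapes
  }
  if (typeof value === "number") {
    // Integers only: safe integers print without exponent or leading zeros.
    if (!Number.isSafeInteger(value)) throw new RangeError("only safe integers are allowed");
    return String(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const keys = Object.keys(value).sort(compareByCodePoint);
  const members = keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json));
  return "{" + members.join(",") + "}";
}
```

For example, `canonicalize({ b: 1, a: [true, null] })` yields `{"a":[true,null],"b":1}` regardless of the order in which the keys were inserted.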
P1.5.5 — Test Corpus Parity Harness
- Depends on: P1.3.1, P1.5.1
- Input: rule-engine.md (Test corpus parity requirement)
- Output: `src/domains/rules/parity-harness.ts`, `src/domains/rules/__tests__/parity-harness.test.ts`
- Acceptance criteria:
- `runParity({old_ruleset, new_ruleset, corpus}): ParityReport`
- Per event: compute effect-set hash `h = SHA-256(canonical(effects))` under both versions
- Report categorizes events: `both_admit_same`, `both_admit_diverge`, `old_admit_new_reject`, `old_reject_new_admit`, `both_reject`
- Pass condition: `both_admit_diverge` set is empty AND (`old_admit_new_reject` ∪ `old_reject_new_admit`) ⊆ declared scope
- Default corpus of ≥100 hand-curated events shipped with the harness
- Deterministic: identical inputs → identical report bytes
- Performance: runs 10k corpus events in < 5 seconds (for CI feedback speed)
- Effort: L
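
The categorization and pass condition above can be sketched as follows. This illustration takes the per-event effect hashes as precomputed strings (with `null` marking a rejection) and assumes the declared scope is a set of event ids; `ParityOutcome`, `categorize`, and `parityPasses` are hypothetical names.

```typescript
// Sketch only: hashes are precomputed; `null` means the ruleset rejected the event.
type ParityCategory =
  | "both_admit_same"
  | "both_admit_diverge"
  | "old_admit_new_reject"
  | "old_reject_new_admit"
  | "both_reject";

interface ParityOutcome {
  eventId: string;
  oldHash: string | null;
  newHash: string | null;
}

function categorize(outcome: ParityOutcome): ParityCategory {
  if (outcome.oldHash === null && outcome.newHash === null) return "both_reject";
  if (outcome.oldHash === null) return "old_reject_new_admit";
  if (outcome.newHash === null) return "old_admit_new_reject";
  return outcome.oldHash === outcome.newHash ? "both_admit_same" : "both_admit_diverge";
}

// Pass condition: no divergent admits, and every admit/reject flip is in the declared scope.
function parityPasses(outcomes: ParityOutcome[], declaredScope: ReadonlySet<string>): boolean {
  return outcomes.every((outcome) => {
    const category = categorize(outcome);
    if (category === "both_admit_diverge") return false;
    if (category === "old_admit_new_reject" || category === "old_reject_new_admit") {
      return declaredScope.has(outcome.eventId);
    }
    return true;
  });
}
```

Both functions are pure and iterate in input order, which keeps the report deterministic for identical inputs.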
Phase 2: λ Reputation
P2.1 — Domain Structure
P2.1.1 — Reputation Record Schema
- Depends on: P1.1.1
- Input: docs/concepts/λ-reputation.md, docs/guides/implementation/lambda-reputation.md
- Output: `src/domains/reputation/schema.{ext}`, database migration
- Acceptance criteria:
- 5 domains: execution, commissioning, arbitration, governance, social
- Per record: node_id, domain, score (integer bps 0-10000), scars (bitmask), ban_until_epoch, last_activity_epoch
- History table: node_id, domain, epoch, delta, reason, event_id
- Indexes on (node_id, domain) and (domain, score DESC)
- Effort: S
P2.1.2 — Score Computation
- Depends on: P2.1.1, P1.3.1
- Input: λ-reputation.md (Computation section)
- Output: `src/domains/reputation/compute.{ext}`, `tests/domains/reputation/compute.test.{ext}`
- Acceptance criteria:
- `compute_score(node_id, domain, events[])` → integer score
- Score = Σ(acknowledgement_weight × event_outcome) for all events in domain
- Uses integer-math library for all arithmetic
- Score capped at 10000 bps (100%) minus scar penalties
- Property test: score is monotonically non-decreasing with only positive events
- Effort: M
P2.2 — Decay and Penalties
P2.2.1 — Exponential Decay
- Depends on: P2.1.1, P1.1.1
- Input: λ-reputation.md (Decay section), implementation guide
- Output: `src/domains/reputation/decay.{ext}`, `tests/domains/reputation/decay.test.{ext}`
- Acceptance criteria:
- Decay applied per-epoch for inactive nodes
- Rate per domain: execution=500bps, commissioning=300bps, arbitration=1000bps, governance=200bps, social=100bps
- Formula: `new_score = score - apply_bps(score, decay_rate)` per inactive epoch
- Activity in domain resets that domain’s decay counter
- Batch processing: efficient for 10,000+ nodes per epoch
- Floor: score cannot go below 0
- Effort: M
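
A worked sketch of the decay formula, assuming `apply_bps(v, bps)` means `floor(v * bps / 10000)` (the real definition lives in the integer-math library) and using safe-integer `number` values as a stand-in for its integer type. `decayScore` is a hypothetical name.

```typescript
// Sketch only: apply_bps is assumed to be floor(v * bps / 10000).
const DECAY_RATE_BPS = {
  execution: 500,
  commissioning: 300,
  arbitration: 1000,
  governance: 200,
  social: 100,
} as const;

type ReputationDomain = keyof typeof DECAY_RATE_BPS;

function applyBps(value: number, bps: number): number {
  return Math.floor((value * bps) / 10000);
}

// Apply new_score = score - apply_bps(score, decay_rate) once per inactive epoch.
function decayScore(score: number, domain: ReputationDomain, inactiveEpochs: number): number {
  let current = score;
  for (let epoch = 0; epoch < inactiveEpochs; epoch++) {
    current = Math.max(0, current - applyBps(current, DECAY_RATE_BPS[domain]));
  }
  return current;
}
```

With these assumptions an execution score of 8000 decays to 7600 after one inactive epoch and 7220 after two. Under floor rounding the decrement becomes zero once `score * rate < 10000` (below 20 for execution), so scores stall just above the floor instead of reaching 0; whether that is acceptable is a design decision for the integer-math contract.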
P2.2.2 — Offense Penalties
- Depends on: P2.1.1, P1.1.1
- Input: λ-reputation.md (Offense Penalties section)
- Output: `src/domains/reputation/penalties.{ext}`, `tests/domains/reputation/penalties.test.{ext}`
- Acceptance criteria:
- Penalty table: minor=1500bps, moderate=3000bps, severe=5000bps, critical=8000bps, fraud=10000bps
- Scar mechanism: fraud adds permanent cap reduction (score can never exceed 10000 - scar_bps)
- Ban mechanism: critical+ offense bans from arbitration for N epochs
- Double jeopardy protection: same event cannot trigger same penalty twice
- Recovery path: after ban expires, node starts at scar-limited maximum
- Effort: M
P2.3 — Experience Tokens
P2.3.1 — Token Levels and Minting
- Depends on: P2.1.1
- Input: λ-reputation.md (Experience Tokens section), implementation guide
- Output: `src/domains/reputation/tokens.{ext}`, `tests/domains/reputation/tokens.test.{ext}`
- Acceptance criteria:
- 5 levels: L0 (Raw), L1 (Episode), L1.5 (Witness), L2a (Correlation), L2b (Proto-causal)
- L0: auto-minted on event completion
- L1: requires interaction cycle (commit → deliver → confirm)
- L1.5: requires witnessing 3+ disputes as observer
- L2a: requires 5+ repetitions of same interaction pattern
- L2b: requires L2a + context diversity (3+ categories) + path diversity (3+ counterparties)
- Tokens are non-transferable, bound to node identity
- Token count per node queryable by domain
- Effort: L
P2.4 — Derived Limits
P2.4.1 — Capability Gates
- Depends on: P2.1.2, P1.3.2
- Input: λ-reputation.md (Derived Limits section)
- Output: `src/domains/reputation/limits.{ext}`, `tests/domains/reputation/limits.test.{ext}`
- Acceptance criteria:
- `max_parallel_tasks(rep)` = `min(sqrt_floor(rep), 20)`
- `rate_limit_bonus(rep)` = `base_rate * log2_floor(max(rep, 1))`
- `stake_discount(rep)` = `required_stake * 10000 / max(rep, 1000)` (bps math)
- `can_arbitrate(rep)` = `rep.arbitration >= 5000 AND rep.execution >= 3000`
- `can_govern(rep)` = `rep.governance >= 4000`
- All computations use integer-math library
- Effort: M
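
To make the gate formulas concrete, here is a small sketch with a few evaluated values. It assumes scores are bps integers in 0..10000, passes `base_rate` and `required_stake` as explicit parameters (the criteria leave them implicit), and uses simple loop-based stand-ins `sqrtFloor` and `log2Floor` in place of the rule-engine builtins.

```typescript
// Sketch only: sqrtFloor / log2Floor are simple stand-ins for the integer builtins.
function sqrtFloor(n: number): number {
  let root = 0;
  while ((root + 1) * (root + 1) <= n) root += 1;
  return root;
}

function log2Floor(n: number): number {
  let result = 0;
  for (let rest = n; rest >= 2; rest = Math.floor(rest / 2)) result += 1;
  return result;
}

interface DomainScores {
  execution: number;
  arbitration: number;
  governance: number;
}

function maxParallelTasks(rep: number): number {
  return Math.min(sqrtFloor(rep), 20);
}

function rateLimitBonus(rep: number, baseRate: number): number {
  return baseRate * log2Floor(Math.max(rep, 1));
}

function stakeDiscount(rep: number, requiredStake: number): number {
  return Math.floor((requiredStake * 10000) / Math.max(rep, 1000));
}

function canArbitrate(rep: DomainScores): boolean {
  return rep.arbitration >= 5000 && rep.execution >= 3000;
}

function canGovern(rep: DomainScores): boolean {
  return rep.governance >= 4000;
}
```

Under these formulas `maxParallelTasks` reaches its cap of 20 at a score of 400, and `stakeDiscount` returns exactly `requiredStake` at a score of 10000, twice that at 5000, and ten times that for any score at or below 1000.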
P2.5 — MCP Tool Surface
P2.5.1 — Reputation Query Tools
- Depends on: P2.1.2, P2.2.1, P2.2.2, P2.3.1, P2.4.1, P0.3.4
- Input: λ-reputation.md, docs/guides/implementation/lambda-reputation.md
- Output: `src/domains/reputation/tools.{ext}`, `tests/domains/reputation/tools.test.{ext}`
- Acceptance criteria:
- `reputation_get(node_id, domain?)` → score, scars, ban_until, last_activity per domain
- `reputation_history(node_id, domain, limit)` → paginated history events
- `reputation_leaderboard(domain, limit)` → top N nodes by score
- `reputation_check_gates(node_id)` → capability gate results (can_arbitrate, can_govern, max_parallel_tasks)
- All tools registered as MCP tools via tool registry (ε)
- Integration test: create node → apply events → verify score matches hand-calculation
- Effort: S
Phase 3: θ Consensus
P3.1 — BFT Voting
P3.1.1 — Vote Message Types
- Depends on: P1.4.1
- Input: docs/concepts/θ-consensus.md, docs/guides/implementation/theta-consensus.md
- Output: `src/consensus/messages.{ext}`, `tests/consensus/messages.test.{ext}`
- Acceptance criteria:
- Message types: PROPOSE, VOTE, COMMIT, VIEW_CHANGE, CHECKPOINT
- Vote types: ACCEPT, REJECT, ABSTAIN
- All messages signed with Ed25519
- Message fields: sender, type, round, payload, signature, timestamp
- Serialization: canonical JSON (deterministic key order)
- Deserialization validates all required fields
- Effort: M
P3.1.2 — Quorum Computation
- Depends on: P3.1.1
- Input: θ-consensus.md (quorum math), BFT extraction
- Output: `src/consensus/quorum.{ext}`, `tests/consensus/quorum.test.{ext}`
- Acceptance criteria:
- `quorum_threshold(n)` = `floor(2 * n / 3) + 1`
- `max_faulty(n)` = `floor((n - 1) / 3)`
- `has_quorum(votes, n)` = `count(votes.accept) >= quorum_threshold(n)`
- Equivocation detection: same node signs contradicting votes → generate proof
- Proof format: `{node_id, vote_1, vote_2, round}` — cryptographic evidence
- Property test: for n >= 4, any two quorums overlap in at least one honest node
- Effort: M
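
The quorum arithmetic and equivocation check can be sketched as below. Signature verification is omitted, and the sketch assumes that ACCEPT votes are counted once per distinct sender; `RoundVote` and `findEquivocation` are hypothetical names, while the proof fields follow the format given in the criteria.

```typescript
// Sketch only: signatures are omitted; votes are counted per distinct sender (assumption).
type VoteChoice = "ACCEPT" | "REJECT" | "ABSTAIN";

interface RoundVote {
  sender: string;
  round: number;
  choice: VoteChoice;
}

interface EquivocationProof {
  node_id: string;
  vote_1: RoundVote;
  vote_2: RoundVote;
  round: number;
}

function quorumThreshold(n: number): number {
  return Math.floor((2 * n) / 3) + 1;
}

function maxFaulty(n: number): number {
  return Math.floor((n - 1) / 3);
}

function hasQuorum(votes: RoundVote[], n: number): boolean {
  const accepting = new Set<string>();
  for (const vote of votes) {
    if (vote.choice === "ACCEPT") accepting.add(vote.sender);
  }
  return accepting.size >= quorumThreshold(n);
}

// Return the first pair of contradicting votes from one sender in one round, if any.
function findEquivocation(votes: RoundVote[]): EquivocationProof | null {
  const firstSeen = new Map<string, RoundVote>();
  for (const vote of votes) {
    const key = JSON.stringify([vote.sender, vote.round]);
    const earlier = firstSeen.get(key);
    if (earlier === undefined) {
      firstSeen.set(key, vote);
    } else if (earlier.choice !== vote.choice) {
      return { node_id: vote.sender, vote_1: earlier, vote_2: vote, round: vote.round };
    }
  }
  return null;
}
```

Two quorums of size `q` drawn from `n` nodes share at least `2q - n` members, and with these definitions `2q - n` exceeds `maxFaulty(n)` for every `n`, so the intersection always contains a non-faulty node. For `n = 4` the threshold is 3 and one fault is tolerated.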
P3.1.3 — View Change Protocol
- Depends on: P3.1.2
- Input: θ-consensus.md (view change section), theta extraction
- Output: `src/consensus/view-change.{ext}`, `tests/consensus/view-change.test.{ext}`
- Acceptance criteria:
- Trigger: primary unresponsive for 2× expected round duration
- New primary selection: deterministic rotation (round % n)
- View change message carries highest committed state
- New primary must prove it has the latest committed state
- Timeout doubles each failed view change (exponential backoff)
- Anti-thrashing: minimum 3 rounds before another view change
- Effort: L
P3.2 — Finality Levels
P3.2.1 — Finality State Machine
- Depends on: P3.1.2
- Input: θ-consensus.md (Finality Levels), implementation guide
- Output: `src/consensus/finality.{ext}`, `tests/consensus/finality.test.{ext}`
- Acceptance criteria:
- States: PENDING → SOFT → QUORUM → HARD → ABSOLUTE
- PENDING → SOFT: first vote received
- SOFT → QUORUM: votes >= quorum_threshold(n)
- QUORUM → HARD: dispute window (100 epochs) elapsed without challenge
- HARD → ABSOLUTE: appeal window elapsed, fully irreversible
- No external side effects (payments, exports) before HARD
- State transitions are monotonic: cannot go backward
- Each transition recorded with epoch and evidence
- Effort: L
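
A minimal sketch of the finality state machine, advancing at most one level per call and never moving backward. The evidence fields (`votes`, `epochsSinceQuorum`, `challenged`, `appealWindowElapsed`) and the function names are assumptions for illustration; recording each transition with epoch and evidence is left out.

```typescript
// Sketch only: evidence field names are hypothetical; transition recording is omitted.
const FINALITY_ORDER = ["PENDING", "SOFT", "QUORUM", "HARD", "ABSOLUTE"] as const;
type FinalityState = (typeof FINALITY_ORDER)[number];

const DISPUTE_WINDOW_EPOCHS = 100;

interface FinalityEvidence {
  votes: number;
  n: number;
  epochsSinceQuorum: number;
  challenged: boolean;
  appealWindowElapsed: boolean;
}

function quorumThreshold(n: number): number {
  return Math.floor((2 * n) / 3) + 1;
}

// Returns the next state, or the current one if its exit condition is not yet met.
function advance(current: FinalityState, ev: FinalityEvidence): FinalityState {
  switch (current) {
    case "PENDING":
      return ev.votes >= 1 ? "SOFT" : current;
    case "SOFT":
      return ev.votes >= quorumThreshold(ev.n) ? "QUORUM" : current;
    case "QUORUM":
      return ev.epochsSinceQuorum >= DISPUTE_WINDOW_EPOCHS && !ev.challenged ? "HARD" : current;
    case "HARD":
      return ev.appealWindowElapsed ? "ABSOLUTE" : current;
    default:
      return current; // ABSOLUTE is terminal
  }
}

// External side effects (payments, exports) are allowed only from HARD onward.
function allowsSideEffects(state: FinalityState): boolean {
  return FINALITY_ORDER.indexOf(state) >= FINALITY_ORDER.indexOf("HARD");
}
```

Because every branch returns either the current state or its immediate successor, monotonicity holds by construction rather than by a runtime check.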
P3.3 — Gossip Protocol
P3.3.1 — IHAVE/IWANT Messages
- Depends on: P3.1.1
- Input: θ-consensus.md (Gossip), implementation guide, theta extraction
- Output: `src/consensus/gossip.{ext}`, `tests/consensus/gossip.test.{ext}`
- Acceptance criteria:
- IHAVE: `{event_ids[], state_root, rule_version, fork_id}`
- IWANT: `{event_ids[]}` — request specific events
- Bloom filter for deduplication (false positive rate < 1%)
- Adaptive fanout: well-connected nodes gossip to fewer peers
- Triple-Anchor validation: reject messages where rule_version, state_root, or fork_id don’t match
- Bandwidth budget: max N bytes/second per peer connection
- Effort: L
P3.4 — Time Anchors
P3.4.1 — Signed Timestamps
- Depends on: P3.1.1, P2.1.1
- Input: θ-consensus.md (Time Anchors), implementation guide
- Output: `src/consensus/time-anchors.{ext}`, `tests/consensus/time-anchors.test.{ext}`
- Acceptance criteria:
- Eligible publishers: top N arbiters by arbitration reputation
- Anchor format: `{publisher, timestamp_ms, epoch, signature}`
- Median computation: collect anchors from last K epochs, take median
- Drift detection: `|local_clock - median| > 30_000ms` → deprioritize proposals
- Monotonicity: anchors from same publisher must be non-decreasing
- Replay protection: anchors with epoch < current_epoch - 10 are rejected
- Effort: M
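
The median, drift, and replay checks can be sketched as follows. Signatures and publisher eligibility are omitted, and the sketch assumes the lower median for an even number of anchors so the result stays an integer that some publisher actually reported; the function names are hypothetical.

```typescript
// Sketch only: signature checks omitted; lower median for even counts is an assumption.
interface TimeAnchor {
  publisher: string;
  timestamp_ms: number;
  epoch: number;
}

const MAX_DRIFT_MS = 30000;
const REPLAY_WINDOW_EPOCHS = 10;

// Replay protection: anchors older than current_epoch - 10 are rejected.
function isReplay(anchor: TimeAnchor, currentEpoch: number): boolean {
  return anchor.epoch < currentEpoch - REPLAY_WINDOW_EPOCHS;
}

function medianTimestamp(anchors: TimeAnchor[]): number {
  if (anchors.length === 0) throw new RangeError("median of zero anchors is undefined");
  const sorted = anchors.map((a) => a.timestamp_ms).sort((x, y) => x - y);
  return sorted[Math.floor((sorted.length - 1) / 2)] as number;
}

// Drift detection: |local_clock - median| > 30_000 ms.
function hasDrift(localClockMs: number, medianMs: number): boolean {
  return Math.abs(localClockMs - medianMs) > MAX_DRIFT_MS;
}
```

Note that `Array.prototype.sort` needs the explicit numeric comparator here; the default comparator sorts numbers as strings.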
P3.5 — Slashing
P3.5.1 — Equivocation Enforcement
- Depends on: P3.1.2, P2.2.2
- Input: θ-consensus.md (Slashing Conditions), theta extraction, ADR-003
- Output: `src/consensus/slashing.{ext}`, `tests/consensus/slashing.test.{ext}`
- Acceptance criteria:
- `apply_equivocation_slash(proof)` → calls reputation penalty (P2.2.2) for double-signing node
- Proof verification: check both contradicting votes carry valid signatures from same node
- Slash amount: maps to `critical` offense (8000bps loss) in reputation penalty table
- Idempotent: same equivocation proof applied twice must not slash twice (proof hash dedup)
- Slashing recorded in reputation history table with event_id = proof hash
- Integration test: create equivocation → verify slash applied → verify idempotency
- Effort: M
Phase 4: μ Integrity Monitor
P4.1.1 — Coercion Trap Detection
- Depends on: P1.3.1, P2.1.2
- Input: docs/concepts/μ-integrity-monitor.md
- Output: `src/domains/integrity/coercion-detection.ts`, `tests/domains/integrity/coercion-detection.test.ts`
- Acceptance criteria:
- Enumerate all legal actions for a participant given current state
- For each action, compute outcome via rule engine
- Flag if: all outcomes negative, or action space is empty
- Severity levels: INFO, WARNING, CRITICAL
- Advisory output: `{check, result, severity, details, evidence, reasoning_trace}`
- No veto power: detection is advisory only, cannot block actions
- Effort: L
P4.2.1 — Three Advisory Roles
- Depends on: P4.1.1
- Input: μ-integrity-monitor.md (advisory roles section)
- Output: `src/domains/integrity/advisory-roles.ts`, `tests/domains/integrity/advisory-roles.test.ts`
- Acceptance criteria:
- Translator: sanitize natural language → structured commands
- Sentinel: scan events for injection, coercion, axiom drift
- Guide: explain reputation scores, available actions, consequences
- All three are strictly read-only
- Standard output format across all roles
- Effort: L
Phase 5: ι Fork Protocol
P5.1.1 — Fork ID and Creation
- Depends on: P3.1.2, P1.4.1
- Input: docs/concepts/ι-state-fork.md, theta extraction (fork sections)
- Output: `src/domains/fork/index.{ext}`, `tests/domains/fork/index.test.{ext}`
- Acceptance criteria:
- Fork ID = `SHA-256(parent_fork_id || divergence_event_id || rule_hash || reason)`
- Auto triggers: rule conflict, invariant violation, constitutional violation
- Manual triggers: voluntary exit, governance rejection
- Isolation modes: ISOLATED (no data flow), READ_ONLY_PARENT (read parent, can’t write), BRIDGED (selective sync)
- Fork-scoped state: event log, reputation, tokens, BFT state — all copied at fork point
- Effort: XL
P5.2.1 — Checkpoint Protocol
- Depends on: P5.1.1
- Input: ι-state-fork.md (checkpoint section)
- Output: `src/domains/fork/checkpoints.{ext}`, `tests/domains/fork/checkpoints.test.{ext}`
- Acceptance criteria:
- Frequency: every 1000 events OR 100 epochs (whichever first)
- Signers: top 10 arbiters by reputation, threshold 7/10
- Content: fork_id, epoch, event_count, state_root, reputation_snapshot, rule_version_hash
- Fast sync: new nodes download checkpoint + post-checkpoint events
- Checkpoint chain: each checkpoint references previous checkpoint hash
- Effort: L
P5.3.1 — Fork Merge
- Depends on: P5.1.1, P5.2.1
- Input: ι-state-fork.md (merge section), theta extraction (fork merge)
- Output: `src/domains/fork/merge.{ext}`, `tests/domains/fork/merge.test.{ext}`
- Acceptance criteria:
- Find common ancestor fork point
- Compute event diff between forks
- Conflict detection: same state key modified in both forks
- Resolution strategies: timestamp ordering (default), reputation-weighted voting, governance vote
- Rule conflicts: cannot auto-merge, require governance vote
- Reputation discount on transition: 50% of source fork reputation (configurable)
- Both forks must reach quorum agreement for merge to finalize
- Effort: XL
Phase 6: π Governance
P6.1.1 — Proposal Lifecycle
- Depends on: P1.3.1, P2.1.1, P3.1.2
- Output: `src/domains/governance/proposals.{ext}`, `tests/domains/governance/proposals.test.{ext}`
- Acceptance criteria:
- Proposal types: AX (constitutional), PR (protected rule), GOV (protocol rule)
- Voting: AX/PR require >80% supermajority, GOV requires >66% quorum
- AX changes require 3-stage time-locked votes (30-day intervals)
- Automatic activation at activation_epoch after vote passes
- Appeal mechanism with cooldown period
- Effort: XL
P6.2.1 — Governance Limits
- Depends on: P6.1.1
- Input: MASTER-TASKS.md P6.2 section
- Output: `src/domains/governance/limits.{ext}`, `tests/domains/governance/limits.test.{ext}`
- Acceptance criteria:
- Max delta: ±10% per 6 months for any numeric parameter
- Constitutional pegs: ±30% from genesis requires 3-stage supermajority
- Cooldown: 1 epoch between changes to same parameter
- Stability: max 2 parameters per domain changed simultaneously
- Entropy injection: for >5% delta, 10% of votes (VRF-selected) count as equal-weight
- Effort: L
P6.3.1 — Axiom Enforcement
- Depends on: P6.1.1, P4.1.1
- Input: promises.md (system guarantees), S01 spec (7 constitutional axioms)
- Output: `src/governance/axiom-enforcement.{ext}`, `tests/governance/axiom-enforcement.test.{ext}`
- Acceptance criteria:
- AX-01 (Append-only): all delete operations rejected; corrections via new events
- AX-02 (Derived reputation): no admin reset; only rule engine can change reputation
- AX-03 (No absolute authority): all roles subject to consequences
- AX-04 (Consequence windows): sanction intent → admission → voluntary → automated
- AX-05 (Subjective finality): per local rule engine, not global consensus
- AX-06 (Right to exit): fork allowed; penalty capped at 10%
- AX-07 (Technical sovereignty): row-level security, no cross-workspace leakage
- Each axiom has a guard function: `check_axiom_N(proposed_action) → {pass, violation_details}`
- Effort: L
Phase 7: ξ Identity — Digital Soul Vector
Status: Spec-only. No implementation tasks until Phase 6 is complete. Concept: `docs/3-world/social/identity.md` · Extraction: `docs/reference/extractions/xi-identity-extraction.md`
Phase 7 implements the Digital Soul Vector — a persistent identity fabric for agents and nodes.
P7.1.1 — Identity Schema
- Depends on: P0.2.2, P2.1.1, P3.1.2
- Input: docs/concepts/ξ-identity.md, xi-identity-extraction.md
- Output: `src/domains/identity/schema.ts`, `tests/domains/identity/schema.test.ts`
- Acceptance criteria:
- 8 identity domains: contribution, governance, reputation, behavior, skills, relationships, history, sovereignty
- 7 character traits (immutable at genesis): curiosity, reliability, fairness, courage, wisdom, creativity, empathy
- Identity hash: SHA-256 of canonical genesis record (immutable after creation)
- Soul-Bound Token (SBT) concept: identity cannot be transferred or sold
- Schema stored in `identities` table with Ed25519 public key as primary identifier
- Effort: L
P7.1.2 — Identity Binding (VRF + Ed25519)
- Depends on: P7.1.1, ADR-002 decision
- Input: xi-identity-extraction.md (binding section), ADR-002-vrf-implementation.md
- Output: `src/domains/identity/binding.ts`, `tests/domains/identity/binding.test.ts`
- Acceptance criteria:
- Ed25519 keypair generation: `generateIdentityKeyPair()` → `{ publicKey, privateKey }`
- Identity proof: `proveIdentity(privateKey, challenge)` → Ed25519 signature
- VRF integration: `generateUnpredictableEntropy(privateKey, epoch)` → VRF output
- Binding is permanent: once public key registered, cannot be reassigned
- Test: sign + verify roundtrip; VRF output deterministic for same inputs
- Effort: L
P7.1.3 — Soul Vector Accumulation
- Depends on: P7.1.1, P2.1.1, P3.1.2
- Input: xi-identity-extraction.md (accumulation section)
- Output: `src/domains/identity/accumulator.ts`, `tests/domains/identity/accumulator.test.ts`
- Acceptance criteria:
- Each completed task updates identity’s contribution domain
- Each governance vote updates governance domain
- Reputation scores flow from λ into identity’s reputation domain
- `getSoulVector(identityId)` → 8-domain snapshot at current epoch
- Soul vector is read-only via API; only rule engine can modify it (AX-02 protection)
- Effort: L
Task Summary
| Phase | Tasks | Effort | Depends on |
|---|---|---|---|
| P0 Bootstrap | 28 tasks | 4-6 weeks | — |
| P1 κ Rule Engine | 10 tasks | 3-4 weeks | P0 |
| P2 λ Reputation | 7 tasks | 2-3 weeks | P0, P1 |
| P3 θ Consensus | 7 tasks | 5-6 weeks | P0, P1, P2 |
| P4 μ Integrity | 2 tasks | 2 weeks | P1, P2, P3 |
| P5 ι Fork | 3 tasks | 3-4 weeks | P0, P3 |
| P6 π Governance | 3 tasks | 3-4 weeks | All |
| P7 ξ Identity | 3 tasks | 3-4 weeks | P0, P2, P3, P6 |
| Total | 63 tasks | 25-32 weeks | |
R57 expansion: Phase 0 grew from 9 high-level bullets → 28 granular tasks with acceptance criteria. Total task count: 32 (pre-R57) → 63 tasks (post-R57). Phase 7 ξ Identity defined for the first time.
Dependency Graph (Mermaid)
```mermaid
graph TD
    P1.1.1[P1.1.1 Integer Math] --> P1.1.2[P1.1.2 Determinism Harness]
    P1.2.1[P1.2.1 Lexer] --> P1.2.2[P1.2.2 Parser]
    P1.2.2 --> P1.2.3[P1.2.3 Validator]
    P1.2.2 --> P1.3.1[P1.3.1 Eval Loop]
    P1.1.1 --> P1.3.1
    P1.3.1 --> P1.3.2[P1.3.2 Builtins]
    P1.3.1 --> P1.3.3[P1.3.3 State Access]
    P1.2.2 --> P1.5.1[P1.5.1 Version Hash]
    P1.5.1 --> P1.5.2[P1.5.2 Migration]
    P1.3.1 --> P1.5.2
    P1.1.1 --> P2.1.1[P2.1.1 Rep Schema]
    P2.1.1 --> P2.1.2[P2.1.2 Score Compute]
    P1.3.1 --> P2.1.2
    P2.1.1 --> P2.2.1[P2.2.1 Decay]
    P1.1.1 --> P2.2.1
    P2.1.1 --> P2.2.2[P2.2.2 Penalties]
    P2.1.1 --> P2.3.1[P2.3.1 Tokens]
    P2.1.2 --> P2.4.1[P2.4.1 Limits]
    P1.3.2 --> P2.4.1
    P1.4.1[P1.4.1 Admission Evaluator] --> P3.1.1[P3.1.1 Vote Messages]
    P3.1.1 --> P3.1.2[P3.1.2 Quorum]
    P3.1.2 --> P3.1.3[P3.1.3 View Change]
    P3.1.2 --> P3.2.1[P3.2.1 Finality SM]
    P3.1.1 --> P3.3.1[P3.3.1 Gossip]
    P3.1.1 --> P3.4.1[P3.4.1 Time Anchors]
    P2.1.1 --> P3.4.1
    P1.3.1 --> P4.1.1[P4.1.1 Coercion]
    P2.1.2 --> P4.1.1
    P4.1.1 --> P4.2.1[P4.2.1 Advisory Roles]
    P3.1.2 --> P5.1.1[P5.1.1 Fork Create]
    P1.4.1 --> P5.1.1
    P5.1.1 --> P5.2.1[P5.2.1 Checkpoints]
    P5.1.1 --> P5.3.1[P5.3.1 Fork Merge]
    P5.2.1 --> P5.3.1
    P1.3.1 --> P6.1.1[P6.1.1 Proposals]
    P2.1.1 --> P6.1.1
    P3.1.2 --> P6.1.1
    P6.1.1 --> P6.2.1[P6.2.1 Gov Limits]
    P6.1.1 --> P6.3.1[P6.3.1 Axiom Guards]
    P4.1.1 --> P6.3.1
```
Agent Execution Protocol
When an AI agent picks up a task from this list:
- Read the Input files listed for the task
- Create the Output files at the specified paths (adjust extension for chosen stack)
- Run all acceptance criteria as tests
- Record completion via `task_update` and `thought_record`
- Do not start a task whose dependencies are incomplete
- Do not use floating-point arithmetic in any κ/λ/θ computation path
- Do not introduce non-deterministic operations (random, clock, I/O) in rule evaluation