P2 — λ Reputation — Agent Prompts

Copy-paste-ready prompts for agents tackling each of the 7 sub-tasks in Phase 2 λ Reputation.

Phase 2 starts at R91 (next round after the R89 staging round and R90 Phase 1 sealing round) per docs/5-time/roadmap.md. These prompts are staged during R89.B (2026-05-12) so that when R91 opens, every sub-task is a zero-friction T3 dispatch.

Canonical spec: task-breakdown.md §Phase 2
Concept doc: docs/3-world/social/reputation.md
Reputation spec: docs/spec/s04-reputation.md
Experience tokens spec: docs/spec/s05-experience-tokens.md
Arbitration spec (λ-coupling only): docs/spec/s09-arbitration.md
Master bootstrap prompt: agent-bootstrap.md
Executor rules: CLAUDE.md — §3 worktree, §5 gate, §6 5-step chain, §7 writeback

Design invariants preserved in every sub-task:

  1. 64-bit signed integer arithmetic — no floats anywhere (inherits κ P1.1.1).
  2. Deterministic execution — no RNG / clock / async I/O / network / filesystem in λ computation paths (inherits κ P1.1.2). SHA-256 via crypto.createHash is the only crypto primitive permitted, and only for pattern-matching feature hashes (s05 §Pattern-matching).
  3. Reputation is non-transferable (s04, s05). No code path may copy a reputation row between node_id values.
  4. Token log is append-only (s05 §Append-only vs garbage collection, AX-01). Derived caches may be pruned; the event log is immutable.
  5. Score floor is 0; score ceiling is 10000 - scar_bps per domain. Neither is bypassable.
  6. Scars are permanent (s04 §Permanent scars). A scar row in the schema is never deleted; its weight may be clamped to 0 by arbitration outcome (s05 §Scars supersede; s09 reverses scars only via arbitration appeal).
  7. All penalty applications are rule evaluations, not administrative writes — they flow through the κ rule engine and carry a version hash (reputation.md §Penalty schedule).
  8. Five domains: execution, commissioning, arbitration, governance, social. No sixth domain may be added without a roadmap amendment.
  9. Five token levels in Phase 2: L0, L1, L1.5, L2a, L2b. L3 is deferred to Phase 6 π governance (aggregation across L2b sets — see audit R89.B §3 row 3).
  10. All λ computation lives under src/domains/reputation/. The determinism scanner from κ P1.1.2 must be extended to cover this path.

Scope bound: do not graduate the prompts to dependency-less parallel PRs. The sub-task graph has hard prerequisites; respect the Depends-on field in each section.

Axis goals

λ Reputation is the per-domain, non-transferable history of an agent’s behavior. It is not a single number; it is five independent scalars (one per domain of action), each with its own decay rate and penalty schedule. Phase 2 ships the reputation arithmetic (compute, decay, penalties, scars), the experience-token issuance pipeline (L0 → L2b), the derived capability gates that downstream axes (θ consensus, π governance) consume, and the read-only MCP tool surface that exposes all of the above. Phase 2 does not ship voting credits’ write-side, L3 aggregation, or arbitration write-side — those are Phase 3/6 surfaces.

Dependency map (upstream closure)

λ Phase 2 has zero unmet upstream dependencies:

Upstream | Provided by | Path | Shipped in
---------|-------------|------|-----------
BPS integer math | bps_mul, bps_div, apply_bps, decay, safe_mul, safe_div | src/domains/rules/integer-math.ts | R83 (P1.1.1)
BPS named constants | DECAY_*, DAMAGE_*, BPS_* | src/domains/rules/bps-constants.ts | R83 (P1.1.3)
Determinism scanner | | src/__tests__/domains/rules/determinism.test.ts (extend to reputation/**) | R83 (P1.1.2)
Rule evaluator | first-match-wins, collect-then-apply | src/domains/rules/engine.ts | R85 (P1.3.1)
Built-in functions | sqrt_floor, log2_floor, etc. | src/domains/rules/builtins.ts | R86 (P1.3.2)
State access (frozen-Map) | read-only with_binding wrapper | src/domains/rules/state-access.ts | R86 (P1.3.3)
Policy gating | P1–P13 pre-guards | src/domains/rules/policy.ts | R86 (P1.3.4)
Rule loader / registry | specificity-ordered | src/domains/rules/registry.ts | R86 (P1.2.4)
Admission evaluator | tool-call gates | src/domains/rules/admission.ts | R87 (P1.4.1)
Admission budgets | per-actor budget tracker | src/domains/rules/budgets.ts | R87 (P1.4.3)
Denial reason taxonomy | extensible enum | src/domains/rules/denial-reasons.ts | R87 (P1.4.2)
Version hash | SHA-256 over canonical JSON | src/domains/rules/versioning.ts | R86 P1.5.1 / R88.A wire
Tool-lock middleware | admission adapter | src/middleware/tool-lock.ts | R87 (P1.4.4)
β Task pipeline (CRUD + writeback hard-block) | | src/domains/tasks/ | R75 (P0.3.x)
ζ Decision trail | | src/domains/trail/ | R75 (P0.7.x)
η Proof store | | src/domains/proof/ | R75 (P0.8.x)
ε Skill registry | | src/domains/skills/ | R75 (P0.6.x)

P0 closed at 28/28 in R75 Wave I (d5f6a1ff, 2026-04-18). κ Phase 1 closed at 20/20 in R87 (f327936b, 2026-05-07). λ Phase 2 inherits all of the above as load-bearing dependencies.

Ordering rationale (4 waves, 7 tasks)

Sub-tasks form a strict DAG anchored on P2.1.1 (schema):

                        ┌── P2.1.1 (schema; S; foundation; depends on P1.1.1)
                        │
              ┌─────────┼──────────┬──────────┬──────────┐
              ▼         ▼          ▼          ▼          │
       P2.1.2 (M)  P2.2.1 (M)  P2.2.2 (M)  P2.3.1 (L)    │
       score-compute decay     penalties   tokens         │
              │                                           │
              ▼                                           │
       P2.4.1 (M) ◀──────────────── P1.3.2 (κ builtins)
       derived-limits
              │
              └──────────────────────────────────────────▶┐
                                                          │
                                                          ▼
                                                   P2.5.1 (S)
                                                   MCP tools
                                                   ◀─── P0.3.4 (tool binding)
  • Wave 1 — P2.1.1 alone. Foundation gate. Cannot parallelize.
  • Wave 2 — P2.1.2 + P2.2.1 + P2.2.2 in parallel (all depend only on P2.1.1, zero file overlap).
  • Wave 3 — P2.3.1 + P2.4.1 in parallel (P2.3.1 is L; P2.4.1 depends on P2.1.2 from Wave 2).
  • Wave 4 — P2.5.1 alone. Closes λ Phase 2.

R87 shipped 7 κ tasks across 3 waves in ~1h45m. λ adds the foundation gate (Wave 1) and one L-effort task (P2.3.1), so the expected wallclock is ~3–4 hours at R87 pace. The roadmap budget at task-breakdown.md §Task Summary (line ~1136) is 2–3 weeks human-paced; AI-paced compresses substantially.

Roadmap budget reference

task-breakdown.md §Task Summary (around line 1136) declares Phase 2 = 7 tasks / 2–3 weeks / depends on P0 + P1. This prompt file’s seven sub-task entries match those seven roadmap rows 1:1 with no renumbering or scope drift.

Group summary

Task ID | Title | Depends on | Effort | Unblocks
--------|-------|------------|--------|---------
P2.1.1 | Reputation Record Schema | P1.1.1 | S | P2.1.2, P2.2.1, P2.2.2, P2.3.1
P2.1.2 | Score Computation | P2.1.1, P1.3.1 | M | P2.4.1, P2.5.1
P2.2.1 | Exponential Decay | P2.1.1, P1.1.1 | M | P2.5.1
P2.2.2 | Offense Penalties | P2.1.1, P1.1.1 | M | P2.5.1
P2.3.1 | Experience Tokens (L0–L2b) | P2.1.1 | L | P2.5.1
P2.4.1 | Capability Gates (derived limits) | P2.1.2, P1.3.2 | M | P2.5.1
P2.5.1 | Reputation Query MCP Tools | P2.1.2, P2.2.1, P2.2.2, P2.3.1, P2.4.1, P0.3.4 | S | — (closes λ Phase 2)
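
The Depends-on column above can be read as a dispatch rule: a sub-task is eligible once every one of its prerequisites is done. The following is a minimal sketch of that rule, assuming upstream P0/P1 prerequisites have already shipped and can be left out. The names DEPENDS_ON and dispatchable are hypothetical and are not part of the Colibri codebase.

```typescript
// Depends-on column of the group summary, restricted to Phase 2 task IDs.
// ASSUMPTION: upstream P0/P1 prerequisites are already done and are omitted here.
const DEPENDS_ON: Record<string, readonly string[]> = {
  "P2.1.1": [],
  "P2.1.2": ["P2.1.1"],
  "P2.2.1": ["P2.1.1"],
  "P2.2.2": ["P2.1.1"],
  "P2.3.1": ["P2.1.1"],
  "P2.4.1": ["P2.1.2"],
  "P2.5.1": ["P2.1.2", "P2.2.1", "P2.2.2", "P2.3.1", "P2.4.1"],
};

// Hypothetical helper: tasks that are not done yet and whose prerequisites are all done.
function dispatchable(done: ReadonlySet<string>): string[] {
  return Object.keys(DEPENDS_ON)
    .filter((id) => !done.has(id))
    .filter((id) => (DEPENDS_ON[id] ?? []).every((dep) => done.has(dep)))
    .sort();
}
```

With nothing done, only P2.1.1 is eligible; with everything except P2.5.1 done, only P2.5.1 is. In pure graph terms P2.3.1 becomes eligible as soon as P2.1.1 lands; placing it in Wave 3 is a scheduling choice in the wave plan, not a dependency.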

Out-of-scope (do not build in Phase 2)

These surfaces are explicitly OUT OF SCOPE for λ Phase 2 and must not appear in any of the seven sub-task implementations:

  • L3 aggregate tokens — defer to Phase 6 π governance (s05 §L3 namespace; aggregation across L2b sets is a governance concern).
  • Quadratic voting math — defer to Phase 6 π governance; λ exposes only the credit-balance read.
  • θ commit-reveal arbiter voting — defer to Phase 3 θ consensus (s09 §Voting); λ exposes only the read-side for the VRF selector.
  • ξ Soul Vector binding — defer to Phase 7 ξ identity; Phase 2 uses raw node_id as the reputation primary key.
  • Cross-fork L3 recomputation — defer to Phase 5 ι fork (s05 §L3 namespace, “non-transferable across forks”).
  • Arbitration write-side (panel selection, escalation, slashing) — defer to Phase 3 θ + Phase 6 π.
  • Reputation transfer or staking — constitutional violation (AX-01 append-only + s04 §Non-transferability). Never implemented.

P2.1.1 — Reputation Record Schema

Spec source: task-breakdown.md §P2.1.1
Concept reference: docs/3-world/social/reputation.md §The five domains
Spec docs: s04-reputation.md §Three domains + §Computation (record fields)
Worktree: feature/p2-1-1-rep-schema
Branch command: git worktree add .worktrees/claude/p2-1-1-rep-schema -b feature/p2-1-1-rep-schema origin/main
Estimated effort: S (Small — 1–2 hours)
Depends on: P1.1.1 (BPS integer math — schema columns use bigint encoded as INTEGER)
Unblocks: P2.1.2 (compute), P2.2.1 (decay), P2.2.2 (penalties), P2.3.1 (tokens)

Files to create

  • src/domains/reputation/schema.ts — TypeScript schema (Zod or hand-rolled) + domain enum + row types
  • src/db/migrations/<NN>-reputation.sql — SQLite migration for reputations + reputation_history tables (next migration number after the most recent file in src/db/migrations/)
  • src/__tests__/domains/reputation/schema.test.ts — Zod validators + migration smoke test

Acceptance criteria

  • 5 domain enum: execution | commissioning | arbitration | governance | social (no other values; sixth-domain attempts must fail validation).
  • reputations table schema columns: node_id TEXT NOT NULL, domain TEXT NOT NULL, score INTEGER NOT NULL DEFAULT 0 (0–10000 bps), scar_bps INTEGER NOT NULL DEFAULT 0 (cumulative scar reduction), ban_until_epoch INTEGER (nullable), last_activity_epoch INTEGER NOT NULL. Primary key (node_id, domain).
  • reputation_history table schema columns: id INTEGER PRIMARY KEY AUTOINCREMENT, node_id TEXT NOT NULL, domain TEXT NOT NULL, epoch INTEGER NOT NULL, delta INTEGER NOT NULL (signed bps), reason TEXT NOT NULL, event_id TEXT NOT NULL. Append-only (no UPDATE or DELETE statements anywhere in src/domains/reputation/**).
  • Indexes: idx_reputations_lookup ON (node_id, domain), idx_reputations_leaderboard ON (domain, score DESC), idx_history_node ON (node_id, domain, epoch DESC).
  • Score constraints: score >= 0 AND score <= 10000 SQL CHECK (per design invariant 5).
  • TypeScript types (Domain, ReputationRow, ReputationHistoryRow) exported from schema.ts with full property typings.
  • Migration applies cleanly to a fresh data/colibri.db; idempotent (running twice does not error).
  • No mutation methods — only selectReputation, selectHistory, and an insertHistoryEvent helper that is the only allowed write path.
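
The row-type and validation criteria above can be sketched with a hand-rolled validator, which the file list allows as an alternative to Zod. This is an illustration only: the helper names isDomain, isBps, isEpoch and parseReputationRow are hypothetical, and the real schema.ts may structure its validators differently.

```typescript
type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social";
const DOMAINS: readonly Domain[] = ["execution", "commissioning", "arbitration", "governance", "social"];

interface ReputationRow {
  node_id: string;
  domain: Domain;
  score: number; // 0..10000 bps
  scar_bps: number;
  ban_until_epoch: number | null;
  last_activity_epoch: number;
}

// Rejects any value outside the five-domain enum, so a sixth domain fails validation.
function isDomain(value: unknown): value is Domain {
  return typeof value === "string" && DOMAINS.some((d) => d === value);
}

// Integer basis points in [0, 10000]; floats such as 1.5 are rejected.
function isBps(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value >= 0 && value <= 10000;
}

// ASSUMPTION: epochs are non-negative safe integers.
function isEpoch(value: unknown): value is number {
  return typeof value === "number" && Number.isSafeInteger(value) && value >= 0;
}

// Hypothetical parser: returns a typed row or throws a TypeError naming the bad field.
function parseReputationRow(input: unknown): ReputationRow {
  if (typeof input !== "object" || input === null) throw new TypeError("row must be an object");
  const { node_id, domain, score, scar_bps, ban_until_epoch, last_activity_epoch } =
    input as Record<string, unknown>;
  if (typeof node_id !== "string" || node_id === "") throw new TypeError("node_id");
  if (!isDomain(domain)) throw new TypeError("domain");
  if (!isBps(score)) throw new TypeError("score");
  if (!isBps(scar_bps)) throw new TypeError("scar_bps");
  if (!isEpoch(last_activity_epoch)) throw new TypeError("last_activity_epoch");
  let ban: number | null = null;
  if (ban_until_epoch !== null) {
    if (!isEpoch(ban_until_epoch)) throw new TypeError("ban_until_epoch");
    ban = ban_until_epoch;
  }
  return { node_id, domain, score, scar_bps, ban_until_epoch: ban, last_activity_epoch };
}
```

A Zod schema would express the same constraints declaratively; the hand-rolled form is shown here only because it needs no dependency.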

Pre-flight reading

  • CLAUDE.md — §3 worktree, §5 gate, §6 5-step, §7 writeback, §13 git auth
  • task-breakdown.md §P2.1.1
  • reputation.md §The five domains + §Penalty schedule + §Phase 0 posture
  • s04-reputation.md §Three domains + §Computation
  • src/domains/rules/bps-constants.ts — DECAY_* / DAMAGE_* / BPS_* constants (read but do not duplicate)
  • src/db/index.ts — better-sqlite3 wrapper + migration runner

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.1.1 — Reputation Record Schema
Ship the 5-domain reputation schema (reputations + reputation_history) as the foundation for all of Phase 2 λ.

FILES TO READ FIRST:
1. CLAUDE.md (worktree §3, gate §5, 5-step chain §6, writeback §7, git auth §13)
2. docs/guides/implementation/task-breakdown.md §P2.1.1
3. docs/3-world/social/reputation.md §The five domains + §Phase 0 posture
4. docs/spec/s04-reputation.md §Three domains + §Computation
5. src/domains/rules/bps-constants.ts (consume DECAY_*, DAMAGE_*, BPS_*)
6. src/db/index.ts (migration runner)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-1-1-rep-schema -b feature/p2-1-1-rep-schema origin/main
cd .worktrees/claude/p2-1-1-rep-schema

FILES TO CREATE:
- src/domains/reputation/schema.ts
  * export type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social"
  * export const DOMAINS: readonly Domain[] = [...]
  * export interface ReputationRow { node_id: string; domain: Domain; score: number /* 0..10000 bps */; scar_bps: number; ban_until_epoch: number | null; last_activity_epoch: number }
  * export interface ReputationHistoryRow { id: number; node_id: string; domain: Domain; epoch: number; delta: number; reason: string; event_id: string }
  * Zod validators for both row types
  * selectReputation(node_id, domain?) -> ReputationRow | ReputationRow[]
  * selectHistory(node_id, domain, opts: {limit?: number; offset?: number; before_epoch?: number}) -> ReputationHistoryRow[]
  * insertHistoryEvent(row: Omit<ReputationHistoryRow, "id">) -> void   // ONLY allowed write path
  * NO updateReputation / deleteReputation / deleteHistory — those would violate AX-01

- src/db/migrations/<NN>-reputation.sql
  * Pick NN as the next number after the most recent file in src/db/migrations/
  * CREATE TABLE reputations (...) with PRIMARY KEY (node_id, domain) and CHECK (score >= 0 AND score <= 10000)
  * CREATE TABLE reputation_history (...) with id AUTOINCREMENT
  * CREATE INDEX idx_reputations_lookup, idx_reputations_leaderboard, idx_history_node

- src/__tests__/domains/reputation/schema.test.ts
  * Zod validators reject sixth domain ("foo")
  * Zod validators reject score < 0 or > 10000
  * Migration applies cleanly to a fresh in-memory better-sqlite3 db
  * Inserting via the helper appends a row; SELECT verifies presence
  * Confirm no UPDATE / DELETE exports

ACCEPTANCE CRITERIA (headline):
✓ 5-domain enum, no sixth permitted
✓ reputations + reputation_history schemas with PK and CHECK constraints
✓ Three indexes
✓ Migration idempotent
✓ Append-only write helper (no updateReputation / deleteReputation)

SUCCESS CHECK:
cd .worktrees/claude/p2-1-1-rep-schema && npm run build && npm run lint && npm test

WRITEBACK (after success, per CLAUDE.md §7):
task_update(id="P2.1.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.1.1
branch: feature/p2-1-1-rep-schema
worktree: .worktrees/claude/p2-1-1-rep-schema
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Shipped 5-domain reputation schema (reputations + reputation_history) with PK, CHECK constraint, 3 indexes. Append-only write helper; no update / delete exports.
blockers: none")

FORBIDDENS:
✗ No updateReputation / deleteReputation / deleteHistory exports
✗ No floats — score is integer bps
✗ Do not edit main checkout (CLAUDE.md §3)
✗ Do not skip any of build / lint / test (CLAUDE.md §5)

NEXT:
P2.1.2 — Score Computation (consumes selectHistory + insertHistoryEvent, computes domain scores)

Verification checklist (for reviewer agent)

  • Domain enum has exactly 5 members; no extra
  • reputations.score CHECK constraint present
  • No UPDATE reputation* or DELETE FROM reputation* SQL anywhere in src/domains/reputation/**
  • All three indexes created
  • Migration is idempotent (test runs it twice)
  • npm run build && npm run lint && npm test green

Writeback template

task_update:
  task_id: P2.1.1
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.1.1
    branch: feature/p2-1-1-rep-schema
    worktree: .worktrees/claude/p2-1-1-rep-schema
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: Shipped 5-domain reputation schema with append-only history table. CHECK constraint enforces score bounds; primary key (node_id, domain) prevents duplicate domain rows.
    blockers: none

Common gotchas

  • score INTEGER not REAL — SQLite will silently accept floats unless the CHECK explicitly forbids them. Use typeof(score) = 'integer' in the CHECK, or rely on the Zod validator + integer-only inserts at the TS layer.
  • Migration numbering collisions — list src/db/migrations/ first and pick the next number; do not assume “00X-reputation.sql” is free. If a number is taken, jump to the next.
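  • Index on (domain, score DESC) — SQLite has understood descending indexes since 3.3.0 and has used the descending-index schema format by default since 3.7.10, so any SQLite bundled with a current better-sqlite3 honors DESC. No dependency-floor pin is needed for this.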
  • Five-domain enum drift — if a downstream test hardcodes 4 domains, it must be updated alongside the migration. Grep for "execution" | "commissioning" patterns and update.
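
The CHECK, idempotency and index points above can be sketched as migration DDL. It is held in a TypeScript string here only to keep the examples in one language; the real migration is a .sql file, and the constant name REPUTATION_MIGRATION_SQL is hypothetical. Column definitions follow the acceptance criteria. Using IF NOT EXISTS is one way to make a second run harmless, on the assumption that the migration runner does not already guard against re-application.

```typescript
// Sketch of <NN>-reputation.sql. ASSUMPTION: IF NOT EXISTS is how idempotency is achieved.
const REPUTATION_MIGRATION_SQL = `
CREATE TABLE IF NOT EXISTS reputations (
  node_id             TEXT    NOT NULL,
  domain              TEXT    NOT NULL,
  -- typeof() guards against REAL values slipping past INTEGER affinity
  score               INTEGER NOT NULL DEFAULT 0
                      CHECK (typeof(score) = 'integer' AND score >= 0 AND score <= 10000),
  scar_bps            INTEGER NOT NULL DEFAULT 0,
  ban_until_epoch     INTEGER,
  last_activity_epoch INTEGER NOT NULL,
  PRIMARY KEY (node_id, domain)
);

CREATE TABLE IF NOT EXISTS reputation_history (
  id       INTEGER PRIMARY KEY AUTOINCREMENT,
  node_id  TEXT    NOT NULL,
  domain   TEXT    NOT NULL,
  epoch    INTEGER NOT NULL,
  delta    INTEGER NOT NULL,  -- signed bps
  reason   TEXT    NOT NULL,
  event_id TEXT    NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_reputations_lookup
  ON reputations (node_id, domain);
CREATE INDEX IF NOT EXISTS idx_reputations_leaderboard
  ON reputations (domain, score DESC);
CREATE INDEX IF NOT EXISTS idx_history_node
  ON reputation_history (node_id, domain, epoch DESC);
`;
```

The domain column could additionally carry a CHECK listing the five allowed values; that is left out here because the acceptance criteria place the enum check in the TypeScript layer.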

P2.1.2 — Score Computation

Spec source: task-breakdown.md §P2.1.2
Concept reference: reputation.md §The five domains + §Penalty schedule
Spec docs: s04-reputation.md §Computation
Worktree: feature/p2-1-2-score-compute
Branch command: git worktree add .worktrees/claude/p2-1-2-score-compute -b feature/p2-1-2-score-compute origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.1 (schema + history helper), P1.3.1 (κ rule engine — for ack_weight evaluation)
Unblocks: P2.4.1 (capability gates consume score reads), P2.5.1 (reputation_get returns computed scores)

Files to create

  • src/domains/reputation/compute.ts — Pure score-computation functions
  • src/__tests__/domains/reputation/compute.test.ts — Unit + property tests

Acceptance criteria

  • compute_score(node_id, domain, events: ReputationHistoryRow[]): bigint — Σ(ack_weight × event_outcome) over all events in domain.
  • ack_weight is the acknowledger’s reputation in the same domain, bounded to prevent feedback loops (cap at the acknowledger’s current score and never exceed BPS_100_PERCENT = 10000n).
  • All arithmetic uses src/domains/rules/integer-math.ts helpers (bps_mul, bps_div, safe_mul, safe_div).
  • Score cap: min(10000n - scar_bps, computed) — never exceeds 10000 - scar_bps.
  • Score floor: 0n (clamp negatives — penalties applied separately via P2.2.2).
  • Property test: for any sequence of positive-outcome events, score is monotonically non-decreasing.
  • Property test: compute_score(n, d, events) is deterministic — same input array → byte-identical output.
  • Property test: ack_weight cap holds — no acknowledger ever contributes more than their own current score.
  • No Math.*, no Date.*, no Math.random. Validated by the κ P1.1.2 determinism scanner (extend its globs to src/domains/reputation/** in this PR).
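
The fold, cap and clamp described above can be sketched as follows. Several pieces are assumptions made for illustration: bps_mul is a local stand-in for the helper in integer-math.ts (whose real signature may differ), the ScoredEvent shape exposes acker_id as an explicit field, and scar_bps_for is injected as a function rather than read from the reputations table.

```typescript
type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social";

const BPS_100_PERCENT = 10000n; // value stated in the acceptance criteria

// Stand-in for bps_mul from integer-math.ts (ASSUMPTION: value * bps / 10000, truncating).
function bps_mul(value: bigint, bps: bigint): bigint {
  return (value * bps) / BPS_100_PERCENT;
}

// Hypothetical event shape: the acknowledger id is an explicit field in this sketch.
interface ScoredEvent {
  epoch: number;
  delta: number; // signed bps, arrives from SQLite as a JS number
  acker_id: string;
}

function compute_score(
  node_id: string,
  domain: Domain,
  events: readonly ScoredEvent[],
  ack_lookup: (acker_id: string, dom: Domain) => bigint,
  scar_bps_for: (node_id: string, dom: Domain) => bigint, // ASSUMPTION: injected lookup
): bigint {
  // Fold in epoch ASC order; sort is stable, so ties keep their input order.
  const ordered = [...events].sort((a, b) => a.epoch - b.epoch);
  let score = 0n;
  for (const ev of ordered) {
    const ack = ack_lookup(ev.acker_id, domain);
    // Cap the acknowledger weight to [0, 100%] before multiplying.
    const capped = ack > BPS_100_PERCENT ? BPS_100_PERCENT : ack < 0n ? 0n : ack;
    score += bps_mul(BigInt(ev.delta), capped); // convert number -> bigint at the boundary
  }
  let ceiling = BPS_100_PERCENT - scar_bps_for(node_id, domain);
  if (ceiling < 0n) ceiling = 0n;
  if (score < 0n) score = 0n; // floor
  if (score > ceiling) score = ceiling; // scar-limited ceiling
  return score;
}
```

An acknowledger at 5000 bps contributes half of an event's delta, and one reported above 10000 bps is treated as exactly 10000 bps, which is the feedback-loop cap the criteria call for.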

Pre-flight reading

  • CLAUDE.md
  • task-breakdown.md §P2.1.2
  • reputation.md §The five domains
  • s04-reputation.md §Computation
  • src/domains/rules/integer-math.ts (P1.1.1)
  • src/domains/rules/bps-constants.ts (P1.1.3)
  • src/domains/reputation/schema.ts (P2.1.1 — selectHistory)
  • src/__tests__/domains/rules/determinism.test.ts (P1.1.2 — extend its src/domains/rules/** glob)

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.1.2 — Score Computation
Implement compute_score(node_id, domain, events[]) using κ integer-math; enforce ack_weight feedback-loop cap; ensure monotonicity and determinism.

FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.1.2
3. docs/3-world/social/reputation.md §The five domains
4. docs/spec/s04-reputation.md §Computation
5. src/domains/rules/integer-math.ts (P1.1.1 helpers)
6. src/domains/rules/bps-constants.ts (P1.1.3 named constants)
7. src/domains/reputation/schema.ts (P2.1.1 row types + selectHistory)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-1-2-score-compute -b feature/p2-1-2-score-compute origin/main
cd .worktrees/claude/p2-1-2-score-compute

FILES TO CREATE:
- src/domains/reputation/compute.ts
  * export function compute_score(
      node_id: string,
      domain: Domain,
      events: ReputationHistoryRow[],
      ack_lookup: (acker_id: string, dom: Domain) => bigint  // current score of the acknowledger
    ): bigint
  * Algorithm:
    1. score := 0n
    2. for each event in events (in epoch ASC order):
       2a. ack := ack_lookup(event.acker_id, domain)  // acker id is encoded in event.reason or event.event_id metadata
       2b. ack_capped := min(ack, BPS_100_PERCENT)    // hard cap to prevent runaway
       2c. weighted := bps_mul(BigInt(event.delta), ack_capped)
       2d. score := score + weighted
    3. score := max(0n, score)                        // floor
    4. score := min(score, BPS_100_PERCENT - scar_bps_for(node_id, domain))  // ceiling
    5. return score
  * All ops via integer-math.ts; no Math.*, no Date.*

- src/__tests__/domains/reputation/compute.test.ts
  * Unit: empty events -> 0n
  * Unit: one positive event with ack_lookup -> 10000n -> score equals event.delta
  * Unit: ack_lookup capped at 10000n even when acker has higher conceptual score
  * Unit: scar_bps reduces ceiling
  * Property (fast-check, 1000 iter): for events with delta > 0, sorting by epoch and folding produces monotonically non-decreasing partial scores
  * Property (1000 iter): determinism — two runs with same inputs produce byte-identical output
  * Extend src/__tests__/domains/rules/determinism.test.ts (or create domains/reputation/determinism.test.ts) to scan src/domains/reputation/**

ACCEPTANCE CRITERIA (headline):
✓ compute_score uses integer-math only
✓ ack_weight feedback-loop cap at BPS_100_PERCENT
✓ Score clamped to [0n, 10000n - scar_bps]
✓ Monotonicity property (positive-only -> non-decreasing)
✓ Determinism scanner extended to src/domains/reputation/**

SUCCESS CHECK:
cd .worktrees/claude/p2-1-2-score-compute && npm run build && npm run lint && npm test

WRITEBACK (after success):
task_update(id="P2.1.2", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.1.2
branch: feature/p2-1-2-score-compute
worktree: .worktrees/claude/p2-1-2-score-compute
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Implemented compute_score(node_id, domain, events, ack_lookup) using κ integer-math; ack_weight feedback-loop cap; monotonicity + determinism property tests over 1000 iterations each.
blockers: none")

FORBIDDENS:
✗ No Math.*, no Date.*, no Math.random in src/domains/reputation/**
✗ No floats anywhere — score is bigint
✗ Do not redefine bps_mul / safe_mul — import from integer-math.ts
✗ Do not edit main checkout

NEXT:
P2.4.1 — Capability Gates (consumes compute_score for max_parallel_tasks, can_arbitrate, etc.)

Verification checklist (for reviewer agent)

  • All scoring arithmetic via integer-math.ts (grep import)
  • ack_weight cap applied before multiplication
  • Score clamped to [0n, 10000n - scar_bps]
  • Property tests run 1000 iterations each
  • Determinism scanner covers src/domains/reputation/**
  • npm run build && npm run lint && npm test green

Writeback template

task_update:
  task_id: P2.1.2
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.1.2
    branch: feature/p2-1-2-score-compute
    worktree: .worktrees/claude/p2-1-2-score-compute
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: Score computation: Σ(bounded_ack_weight × event_outcome). Property tests assert monotonicity under positive-only sequences and byte-identical determinism over 1000 iterations.
    blockers: none

Common gotchas

  • bigint vs number — event.delta from SQLite arrives as a JS number. Wrap in BigInt(event.delta) at the boundary; never multiply a bigint by a number (throws TypeError).
  • Acknowledger lookup recursion — if the acker’s score is itself derived from this fold, you get infinite recursion. The signature uses ack_lookup as an opaque function; the implementation in P2.5.1 will read the snapshot from the reputations table (not recursively re-fold).
  • Empty events array — return 0n, not null or undefined. Downstream consumers (P2.4.1, P2.5.1) assume a bigint always.
  • Property test fast-check seed — pin a seed so CI flakes are reproducible. Use the same seed-pinning pattern as src/__tests__/domains/rules/determinism.test.ts.
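
The recursion gotcha above is avoided when ack_lookup reads from a precomputed snapshot instead of re-folding history. The following is a minimal sketch of that idea; make_ack_lookup, the "node_id|domain" key format and the 0n default for unknown acknowledgers are all assumptions made for illustration, not the P2.5.1 design.

```typescript
type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social";

// Hypothetical factory: builds an ack_lookup over a frozen-in-time score snapshot.
// ASSUMPTION: snapshot keys have the form "<node_id>|<domain>".
function make_ack_lookup(
  snapshot: ReadonlyMap<string, bigint>,
): (acker_id: string, dom: Domain) => bigint {
  // ASSUMPTION: an acknowledger with no snapshot row carries zero weight.
  return (acker_id, dom) => snapshot.get(`${acker_id}|${dom}`) ?? 0n;
}
```

Because the closure only reads the map, calling it from inside a fold cannot trigger another fold.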

P2.2.1 — Exponential Decay

Spec source: task-breakdown.md §P2.2.1
Concept reference: reputation.md §The five domains (decay rates table)
Spec docs: s04-reputation.md §Decay
Worktree: feature/p2-2-1-decay
Branch command: git worktree add .worktrees/claude/p2-2-1-decay -b feature/p2-2-1-decay origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.1 (schema), P1.1.1 (decay() from integer-math)
Unblocks: P2.5.1 (reputation_get returns decayed score)

Files to create

  • src/domains/reputation/decay.ts — Per-domain decay application
  • src/__tests__/domains/reputation/decay.test.ts — Unit + batch tests

Acceptance criteria

  • apply_decay(row: ReputationRow, current_epoch: bigint): ReputationRow — pure function; returns new row with decayed score and unchanged last_activity_epoch.
  • Per-domain decay rates (from bps-constants.ts):
    • DECAY_EXECUTION = 500n (execution)
    • DECAY_COMMISSIONING = 300n (commissioning)
    • DECAY_ARBITRATION = 1000n (arbitration)
    • DECAY_GOVERNANCE = 200n (governance)
    • DECAY_SOCIAL = 100n (social)
  • Decay applies only during inactivity: inactive_epochs = max(0, current_epoch - last_activity_epoch).
  • Activity resets timer: last_activity_epoch is updated by the event-write path (P2.1.1 helper) — decay does not modify it.
  • Multi-epoch compound: uses decay(score, rate, epochs) from integer-math.ts (P1.1.1).
  • Score floor at 0 (already enforced by decay() per P1.1.1 acceptance criteria — re-test here).
  • Batch helper apply_decay_batch(rows: ReputationRow[], current_epoch: bigint): ReputationRow[] — pure; efficient for 10,000+ rows (no per-row I/O).
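  • Property test: compound effect tested explicitly. Compare two successive decays (decay → decay) against one decay over the summed epochs, and compare multi-epoch decay against linear subtraction of rate × epochs (10000 at 500 bps over 2 epochs is 9025, not 9000). Whether the first pair is equal depends on where decay() floors (after every epoch, or once on a closed form); assert the relation the P1.1.1 implementation guarantees. A blanket inequality cannot hold for every input, since a score of 0 stays 0 on both sides.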
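
The decay criteria above can be sketched as follows. The decay function here is a local stand-in for the P1.1.1 helper and assumes the score is floored after every epoch; the real decay() in integer-math.ts may compute the result differently. DECAY_BPS is a hypothetical lookup that mirrors the DECAY_* values listed above.

```typescript
type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social";

interface ReputationRow {
  node_id: string;
  domain: Domain;
  score: number;
  scar_bps: number;
  ban_until_epoch: number | null;
  last_activity_epoch: number;
}

const BPS_100_PERCENT = 10000n;

// Per-domain rates from the acceptance criteria (bps per inactive epoch).
const DECAY_BPS: Record<Domain, bigint> = {
  execution: 500n,
  commissioning: 300n,
  arbitration: 1000n,
  governance: 200n,
  social: 100n,
};

// Stand-in for decay() from integer-math.ts. ASSUMPTION: floors after each epoch.
function decay(score: bigint, rate_bps: bigint, epochs: bigint): bigint {
  let s = score;
  // Stops early at 0, so the loop is bounded even for very large epoch counts.
  for (let e = 0n; e < epochs && s > 0n; e += 1n) {
    s = (s * (BPS_100_PERCENT - rate_bps)) / BPS_100_PERCENT;
  }
  return s;
}

// Pure: returns a new row and never touches last_activity_epoch.
function apply_decay(row: Readonly<ReputationRow>, current_epoch: bigint): ReputationRow {
  const inactive = current_epoch - BigInt(row.last_activity_epoch);
  if (inactive <= 0n) return { ...row };
  const new_score = decay(BigInt(row.score), DECAY_BPS[row.domain], inactive);
  return { ...row, score: Number(new_score) };
}

// Pure map over apply_decay; no I/O per row.
function apply_decay_batch(rows: readonly ReputationRow[], current_epoch: bigint): ReputationRow[] {
  return rows.map((r) => apply_decay(r, current_epoch));
}
```

For an execution row with score 10000 and two inactive epochs, the sketch yields 9025 (10000 → 9500 → 9025), which is where compounding departs from the 9000 that linear subtraction would give.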

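Pre-flight reading

  • CLAUDE.md
  • task-breakdown.md §P2.2.1
  • reputation.md §The five domains
  • s04-reputation.md §Decay
  • src/domains/rules/integer-math.ts (P1.1.1 decay() helper)
  • src/domains/rules/bps-constants.ts (P1.1.3 DECAY_*)
  • src/domains/reputation/schema.ts (P2.1.1)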

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.2.1 — Exponential Decay
Implement per-domain reputation decay using the κ integer-math decay() primitive. Pure functions; no DB writes here (those land in P2.5.1's tool surface).

FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.2.1
3. docs/3-world/social/reputation.md §The five domains
4. docs/spec/s04-reputation.md §Decay
5. src/domains/rules/integer-math.ts (P1.1.1 decay() helper)
6. src/domains/rules/bps-constants.ts (P1.1.3 DECAY_*)
7. src/domains/reputation/schema.ts (P2.1.1)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-2-1-decay -b feature/p2-2-1-decay origin/main
cd .worktrees/claude/p2-2-1-decay

FILES TO CREATE:
- src/domains/reputation/decay.ts
  * export function rate_for(domain: Domain): bigint
    Returns DECAY_EXECUTION | DECAY_COMMISSIONING | DECAY_ARBITRATION | DECAY_GOVERNANCE | DECAY_SOCIAL.
  * export function apply_decay(row: ReputationRow, current_epoch: bigint): ReputationRow
    1. inactive = current_epoch - BigInt(row.last_activity_epoch)
    2. if inactive <= 0n: return row unchanged
    3. rate = rate_for(row.domain)
    4. new_score = decay(BigInt(row.score), rate, inactive)  // from integer-math.ts
    5. return { ...row, score: Number(new_score) }
  * export function apply_decay_batch(rows: ReputationRow[], current_epoch: bigint): ReputationRow[]
    Pure map over apply_decay; no side effects.

- src/__tests__/domains/reputation/decay.test.ts
  * Unit: row at epoch 100, current 100 -> unchanged
  * Unit: row at epoch 100, current 110, execution domain -> decay by 500bps × 10 epochs
  * Unit: row with score 0 stays at 0 (floor)
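  * Property: compound effect — compare decay(decay(x, r, e1), r, e2) with decay(x, r, e1+e2) and assert the relation the P1.1.1 decay() actually guarantees (equal if it floors after every epoch; possibly off by rounding if it floors once on a closed form). Do not assert a blanket != because it fails for x = 0. Also assert that multi-epoch decay is not linear: 10000 at 500 bps over 2 epochs is 9025, not 9000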
  * Batch test: 10,000 rows decay in under 50ms on dev hardware (smoke perf check)
  * Determinism: same input -> same output across two runs

ACCEPTANCE CRITERIA (headline):
✓ Per-domain decay via rate_for(domain)
✓ apply_decay pure; never mutates input
✓ Inactivity-only (epoch >= last_activity)
✓ Score floored at 0
✓ Batch helper for 10k+ rows

SUCCESS CHECK:
cd .worktrees/claude/p2-2-1-decay && npm run build && npm run lint && npm test

WRITEBACK (after success):
task_update(id="P2.2.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.2.1
branch: feature/p2-2-1-decay
worktree: .worktrees/claude/p2-2-1-decay
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Per-domain reputation decay using integer-math decay() + DECAY_* constants. apply_decay (single row) and apply_decay_batch (10k+ rows pure). Compound effect verified property-style.
blockers: none")

FORBIDDENS:
✗ Do not write to DB in decay.ts — pure functions only
✗ Do not redefine decay() — import from integer-math.ts
✗ Do not edit last_activity_epoch in decay (that's the event-write path's job)
✗ Do not skip build / lint / test

NEXT:
P2.5.1 — Reputation Query MCP Tools (composes compute + decay + penalties + tokens + limits)

Verification checklist (for reviewer agent)

  • rate_for covers all 5 domains; exhaustive switch with TS never check
  • apply_decay does not mutate input (test with Object.freeze)
  • Batch helper has no per-row I/O
  • Floor at 0 verified explicitly
  • npm run build && npm run lint && npm test green

Writeback template

task_update:
  task_id: P2.2.1
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.2.1
    branch: feature/p2-2-1-decay
    worktree: .worktrees/claude/p2-2-1-decay
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: Per-domain reputation decay (DECAY_EXECUTION 500bps/epoch ... DECAY_SOCIAL 100bps/epoch). apply_decay pure single-row; apply_decay_batch pure 10k+ rows. Floor at 0 enforced.
    blockers: none

Common gotchas

  • Last-activity update belongs elsewhere — decay never touches last_activity_epoch; that field updates only when an event is inserted into reputation_history. Test enforces this.
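  • Frozen rows and strict mode — assigning to row.score on a frozen row throws a TypeError in strict mode (ES modules, and TypeScript compiled with alwaysStrict) and silently no-ops only in sloppy mode. Object.freeze is also shallow, although ReputationRow has no nested objects. Always build the result with spread ({...row, score: ...}).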
  • Batch perf without I/O — the 10k row smoke test must not import better-sqlite3; that’s a P2.5.1 concern. Build the row array in-memory.
  • Number ↔ bigint at boundary — schema stores INTEGER; TS reads it as number. Convert at the function boundary: BigInt(row.score) going in, Number(new_score) coming out. Document the precision risk (score is 0–10000, well within Number range).

P2.2.2 — Offense Penalties

Spec source: task-breakdown.md §P2.2.2
Concept reference: reputation.md §Penalty schedule
Spec docs: s04-reputation.md §Damage table + §Permanent scars; s05-experience-tokens.md §Decay and scar supersession; s09-arbitration.md §Arbiter constraints (overturned-decision row only)
Worktree: feature/p2-2-2-penalties
Branch command: git worktree add .worktrees/claude/p2-2-2-penalties -b feature/p2-2-2-penalties origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.1 (schema), P1.1.1 (apply_bps from integer-math), P1.1.3 (DAMAGE_* constants)
Unblocks: P2.5.1 (reputation_get exposes scar_bps and ban_until)

Files to create

  • src/domains/reputation/penalties.ts — Severity band → penalty application
  • src/__tests__/domains/reputation/penalties.test.ts — Per-band unit tests + double-jeopardy property

Acceptance criteria

  • Severity band enum (from reputation.md §Penalty schedule + task-breakdown.md §P2.2.2):
    • Minor → DAMAGE_MINOR (1500 bps)
    • Moderate → DAMAGE_MODERATE (3000 bps)
    • Severe → DAMAGE_SEVERE (5000 bps)
    • Critical → DAMAGE_CRITICAL (8000 bps) + ban
    • Fraud → DAMAGE_FRAUD (10000 bps) + ban + scar
  • apply_penalty(row, band, current_epoch, event_id): { row, history_event } — returns updated row + the history event to insert via P2.1.1’s append-only helper.
  • Scar mechanism: Fraud band adds DAMAGE_FRAUD = 10000n to scar_bps, capped at 10000n. Per s04 §Permanent scars, max achievable reputation is capped at 10000 - scar_bps, so a scar_bps of 10000n pins the ceiling at 0 and the score can never recover above 0.
  • Ban mechanism: Critical/Fraud bands set ban_until_epoch = current_epoch + BAN_DURATION_EPOCHS (BAN_DURATION_EPOCHS exported as a constant; recommend 100 epochs as a starting governance parameter).
  • Double-jeopardy guard: apply_penalty rejects (throws or returns { row: unchanged }) when the (event_id, band) tuple already exists in reputation_history. Caller responsible for the lookup; helper exposes is_double_penalty(event_id, band, history): boolean.
  • Recovery path: after ban_until_epoch passes, the node resumes at its scar-limited maximum (no additional logic — the gate is a read-only check that ban_until_epoch <= current_epoch).
  • All BPS math via integer-math.ts (apply_bps).
  • No deletion of history; penalties append a new row with negative delta.
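
The band, scar, and ban rules above can be sketched as one pure function. This is a minimal sketch under stated assumptions: `apply_bps` and the damage table are local stand-ins for the imports from integer-math.ts and bps-constants.ts, and `PenaltyState` and `penalize` are hypothetical names that only carry the fields the arithmetic needs.

```typescript
// Local stand-ins for integer-math.ts and bps-constants.ts (assumed shapes).
type SeverityBand = "minor" | "moderate" | "severe" | "critical" | "fraud";

const DAMAGE_BPS: Record<SeverityBand, bigint> = {
  minor: 1500n,
  moderate: 3000n,
  severe: 5000n,
  critical: 8000n,
  fraud: 10000n,
};
const BAN_DURATION_EPOCHS = 100n; // starting governance parameter

// val * bps / 10000 with truncating bigint division.
const apply_bps = (val: bigint, bps: bigint): bigint => (val * bps) / 10000n;

// Hypothetical reduced row: only the fields the penalty math touches.
interface PenaltyState {
  score: bigint;
  scar_bps: bigint;
  ban_until_epoch: bigint | null;
}

function penalize(state: PenaltyState, band: SeverityBand, current_epoch: bigint) {
  const loss = apply_bps(state.score, DAMAGE_BPS[band]);
  const raw_score = state.score - loss;
  const score = raw_score < 0n ? 0n : raw_score; // floor at 0

  // Only fraud scars; the scar is clamped at 10000 bps.
  const scar_sum = state.scar_bps + (band === "fraud" ? DAMAGE_BPS.fraud : 0n);
  const scar_bps = scar_sum > 10000n ? 10000n : scar_sum;

  // Only critical and fraud set a ban; other bands keep the existing value.
  const bans = band === "critical" || band === "fraud";
  const ban_until_epoch = bans ? current_epoch + BAN_DURATION_EPOCHS : state.ban_until_epoch;

  // delta is negative because a penalty reduces the score.
  return { state: { score, scar_bps, ban_until_epoch }, delta: -loss };
}
```

For example, a severe penalty on a score of 8000 removes 8000 × 5000 / 10000 = 4000 and leaves the ban and scar untouched.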

Pre-flight reading

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.2.2 — Offense Penalties
Implement the 5-band penalty system (Minor / Moderate / Severe / Critical / Fraud) with scar + ban mechanisms and double-jeopardy guard.

FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.2.2
3. docs/3-world/social/reputation.md §Penalty schedule
4. docs/spec/s04-reputation.md §Damage table + §Permanent scars
5. docs/spec/s05-experience-tokens.md §Decay and scar supersession
6. docs/spec/s09-arbitration.md §Arbiter constraints
7. src/domains/rules/integer-math.ts (apply_bps)
8. src/domains/rules/bps-constants.ts (DAMAGE_*)
9. src/domains/reputation/schema.ts (P2.1.1)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-2-2-penalties -b feature/p2-2-2-penalties origin/main
cd .worktrees/claude/p2-2-2-penalties

FILES TO CREATE:
- src/domains/reputation/penalties.ts
  * export type SeverityBand = "minor" | "moderate" | "severe" | "critical" | "fraud"
  * export const BAN_DURATION_EPOCHS = 100n  // governance parameter
  * export function damage_for(band: SeverityBand): bigint
    Returns DAMAGE_MINOR ... DAMAGE_FRAUD.
  * export function is_double_penalty(event_id: string, band: SeverityBand, history: ReputationHistoryRow[]): boolean
  * export function apply_penalty(
      row: ReputationRow,
      band: SeverityBand,
      current_epoch: bigint,
      event_id: string,
      reason: string
    ): { row: ReputationRow; history_event: Omit<ReputationHistoryRow, "id"> }
    1. damage = damage_for(band)
    2. new_score = max(0n, BigInt(row.score) - apply_bps(BigInt(row.score), damage))   // apply_bps -> (val * bps / 10000)
    3. new_scar_bps = (band === "fraud") ? min(10000n, BigInt(row.scar_bps) + DAMAGE_FRAUD) : BigInt(row.scar_bps)
    4. new_ban_until = (band === "critical" || band === "fraud")
                     ? current_epoch + BAN_DURATION_EPOCHS
                     : (row.ban_until_epoch ?? null)
    5. row_out = { ...row, score: Number(new_score), scar_bps: Number(new_scar_bps), ban_until_epoch: ... }
    6. history_event = { node_id: row.node_id, domain: row.domain, epoch: Number(current_epoch), delta: -Number(apply_bps(BigInt(row.score), damage)), reason, event_id }
    7. return { row: row_out, history_event }

- src/__tests__/domains/reputation/penalties.test.ts
  * Per-band unit: each of {minor, moderate, severe, critical, fraud} produces expected delta
  * Scar: fraud adds 10000n to scar_bps; clamped at 10000n max
  * Ban: critical/fraud sets ban_until_epoch = current + 100
  * Non-critical bands leave ban_until_epoch unchanged
  * Double-jeopardy: is_double_penalty returns true when (event_id, band) already in history
  * Floor: applying minor to score=0 stays at 0
  * Append-only: penalty produces a history_event for caller to insert (no DB write inside apply_penalty)

ACCEPTANCE CRITERIA (headline):
✓ 5 severity bands wired to DAMAGE_*
✓ Scar appends only on fraud, capped at 10000n
✓ Ban only on critical/fraud
✓ Double-jeopardy guard helper
✓ Pure function — caller does the DB insert

SUCCESS CHECK:
cd .worktrees/claude/p2-2-2-penalties && npm run build && npm run lint && npm test

WRITEBACK (after success):
task_update(id="P2.2.2", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.2.2
branch: feature/p2-2-2-penalties
worktree: .worktrees/claude/p2-2-2-penalties
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: 5-band penalty system (minor/moderate/severe/critical/fraud) wired to DAMAGE_* constants. Scar on fraud (permanent ceiling cut). Ban on critical/fraud (100 epoch governance parameter). Double-jeopardy guard. Pure apply_penalty returns row + history_event for caller insert.
blockers: none")

FORBIDDENS:
✗ Do not DELETE or UPDATE reputation_history rows — append-only
✗ Do not subtract negative damage from score — damage is always positive bps
✗ Do not write to DB inside apply_penalty
✗ Do not edit main checkout

NEXT:
P2.5.1 — Reputation Query MCP Tools (composes the apply_penalty output with reputation_get + reputation_history reads)

Verification checklist (for reviewer agent)

  • All 5 bands have unit-test coverage
  • Scar is additive and clamped at 10000n
  • Ban set only for critical/fraud
  • No DB writes inside apply_penalty (grep for db.prepare / db.exec)
  • Double-jeopardy guard helper present and tested
  • npm run build && npm run lint && npm test green

Writeback template

task_update:
  task_id: P2.2.2
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.2.2
    branch: feature/p2-2-2-penalties
    worktree: .worktrees/claude/p2-2-2-penalties
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: 5-band penalty system (minor 1500 → fraud 10000 bps). Scar on fraud (permanent ceiling reduction). Ban on critical/fraud (100 epochs). Double-jeopardy guard. apply_penalty is pure; caller does the history append.
    blockers: none

Common gotchas

  • Damage vs delta sign — damage_for returns positive bps; history_event.delta is negative (the penalty reduces the score). Get the sign right at the boundary.
  • Scar accumulation vs cap — sequential fraud penalties keep adding to scar_bps but it’s clamped at 10000n — at full scar, the node’s score ceiling is 0 forever. Test this explicitly.
  • Reason string source — reason should match a κ denial-reason taxonomy entry (R87 P1.4.2), e.g. "REP_FRAUD_PROVEN". Reference the κ taxonomy file; do not invent new reason strings.
  • Ban duration as a governance parameter — BAN_DURATION_EPOCHS = 100n is a starting value. Phase 6 π governance will tune it via rule upgrade. Document this in the comment block.
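
The double-jeopardy guard and the repeated-fraud scar clamp can be sketched as below. The history shape is an assumption: `PenaltyHistoryEntry` is a hypothetical reduced row that carries `band` directly, whereas the real ReputationHistoryRow may encode the band differently (for example inside `reason`). `add_fraud_scar` is likewise an illustrative name.

```typescript
type SeverityBand = "minor" | "moderate" | "severe" | "critical" | "fraud";

// Assumed minimal history shape; the real row type may store the band elsewhere.
interface PenaltyHistoryEntry {
  event_id: string;
  band: SeverityBand;
}

// True when the same (event_id, band) tuple was already penalized.
function is_double_penalty(
  event_id: string,
  band: SeverityBand,
  history: readonly PenaltyHistoryEntry[],
): boolean {
  return history.some((h) => h.event_id === event_id && h.band === band);
}

// Each fraud adds 10000 bps of scar, but the total never exceeds 10000 bps.
function add_fraud_scar(scar_bps: bigint): bigint {
  const next = scar_bps + 10000n;
  return next > 10000n ? 10000n : next;
}
```

The guard keys on the tuple rather than on `event_id` alone, so the same event can still be penalized under a different band if the rule engine escalates it.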

P2.3.1 — Experience Tokens (L0–L2b)

Spec source: task-breakdown.md §P2.3.1
Concept reference: reputation.md §Experience tokens
Spec docs: s05-experience-tokens.md §Token levels + §Promotion flow + §Witness registry + §Pattern-matching algorithm
Worktree: feature/p2-3-1-tokens
Branch command: git worktree add .worktrees/claude/p2-3-1-tokens -b feature/p2-3-1-tokens origin/main
Estimated effort: L (Large — 8–16 hours)
Depends on: P2.1.1 (schema; reuses migration pattern), P1.1.1 (bigint encoding)
Unblocks: P2.5.1 (reputation_get exposes token counts per domain)

Files to create

  • src/domains/reputation/tokens.ts — Token issuance + promotion engine
  • src/domains/reputation/witnesses.ts — Witness registry CRUD + independence rule
  • src/db/migrations/<NN>-experience-tokens.sql — experience_tokens + mcp_witnesses tables
  • src/__tests__/domains/reputation/tokens.test.ts
  • src/__tests__/domains/reputation/witnesses.test.ts

Acceptance criteria

  • 5 token levels: L0 | L1 | L1.5 | L2a | L2b (no L3 — deferred to Phase 6).
  • experience_tokens table with columns: id TEXT PRIMARY KEY (ULID like tok_01HXYZ...), node_id TEXT NOT NULL, level TEXT NOT NULL CHECK (level IN ('L0','L1','L1.5','L2a','L2b')), domain TEXT NOT NULL, scenario TEXT, counterparty TEXT, action TEXT NOT NULL, outcome_class TEXT NOT NULL, outcome_delta INTEGER NOT NULL, witnesses TEXT (JSON array of witness_ids), created_at INTEGER NOT NULL, promoted_from TEXT (chain to upstream token), feature_hash TEXT (SHA-256 for L2a+ matching).
  • mcp_witnesses table with columns: witness_id TEXT PRIMARY KEY, agent_id TEXT NOT NULL, target_node_id TEXT NOT NULL, target_episode_id TEXT NOT NULL, reputation_at_witness INTEGER NOT NULL (frozen; see s05), weight_cap INTEGER NOT NULL (≤ 30, expressed in hundredths to stay integer: 30 ⇒ 0.30 = 3000 bps), counterparty_class TEXT NOT NULL, created_at INTEGER NOT NULL.
  • L0 auto-mint on event completion: mint_L0(node_id, domain, action, outcome): Token.
  • L0 → L1 promotion requires interaction cycle complete (all GSD FSM phases per β P0.3.1; counterparty confirms delivery). promote_to_L1(L0_token, cycle_proof): Token.
  • L1 → L1.5 promotion requires ≥1 witness; witness rules per s05 §Witness registry:
    • reputation_at_witness ≥ 200 (frozen at witness time)
    • per-witness weight_cap ≤ 0.3 (encoded as the integer 30, in hundredths, to stay float-free; multiply by 100 to get the bps value 3000)
    • sum cap Σ weights ≤ 0.4 × MIN_EPISODES
    • independence: ≤1 witness per counterparty class per rolling 7-day window
  • L1/L1.5 → L2a promotion requires ≥5 tokens with same feature_hash, spanning ≥3 distinct scenario values AND ≥3 distinct counterparty classes (diversity gate).
  • L2a → L2b promotion requires invariance check: replay token’s outcome via κ engine with context zeroed; outcome_class must match original.
  • Pattern matching: feature_hash = SHA-256(canonical_context || action_type || outcome_class). canonical_context is JSON-stringified with sorted keys; unknown values bucketed to *.
  • Append-only: no UPDATE or DELETE on experience_tokens or mcp_witnesses. Promotion creates a new row with promoted_from referencing the upstream.
  • Non-transferable: no helper allows changing node_id on an existing token row.
  • Token-counts query: count_tokens_by_level(node_id, domain, level): number.
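
The L2a diversity gate described above can be sketched as a pure predicate. This is a minimal sketch, assuming the token's `counterparty` column holds the counterparty class; `GateToken` and `passes_diversity_gate` are hypothetical names, not part of s05.

```typescript
// Assumed minimal token shape; field names follow the experience_tokens columns.
interface GateToken {
  feature_hash: string;
  scenario: string | null;
  counterparty: string | null; // treated here as the counterparty class (assumption)
}

// Counts distinct non-null values.
function distinct(values: ReadonlyArray<string | null>): number {
  return new Set(values.filter((v): v is string => v !== null)).size;
}

function passes_diversity_gate(tokens: readonly GateToken[]): boolean {
  // At least 5 tokens, all sharing one feature_hash.
  if (tokens.length < 5) return false;
  if (new Set(tokens.map((t) => t.feature_hash)).size !== 1) return false;

  // At least 3 distinct scenarios AND at least 3 distinct counterparty classes.
  const scenarios = distinct(tokens.map((t) => t.scenario));
  const counterparties = distinct(tokens.map((t) => t.counterparty));
  return scenarios >= 3 && counterparties >= 3;
}
```

Null scenarios and counterparties are ignored when counting, so a token with missing context cannot help satisfy the gate.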

Pre-flight reading

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.3.1 — Experience Tokens (L0–L2b)
Ship the full token issuance + promotion pipeline per s05. Includes witness registry with independence rule and SHA-256 feature hashing for L2a pattern matching.

FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.3.1
3. docs/3-world/social/reputation.md §Experience tokens
4. docs/spec/s05-experience-tokens.md (FULL READ — every subsection: Token levels, Promotion flow, Witness registry, Pattern-matching algorithm, L3 namespace [defer; do not implement], Decay and scar supersession, Append-only vs garbage collection)
5. src/domains/reputation/schema.ts (P2.1.1 — migration pattern)
6. src/domains/rules/engine.ts (P1.3.1 — for L2b invariance replay)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-3-1-tokens -b feature/p2-3-1-tokens origin/main
cd .worktrees/claude/p2-3-1-tokens

FILES TO CREATE:
- src/domains/reputation/tokens.ts
  * export type TokenLevel = "L0" | "L1" | "L1.5" | "L2a" | "L2b"
  * export interface Token {
      id: string;          // tok_01HXYZ... (ULID)
      node_id: string;
      level: TokenLevel;
      domain: Domain;
      scenario: string | null;
      counterparty: string | null;
      action: string;
      outcome_class: string;
      outcome_delta: number;
      witnesses: string[];  // JSON-stored
      created_at: number;
      promoted_from: string | null;
      feature_hash: string | null;  // SHA-256 hex; null for L0/L1
    }
  * mint_L0(node_id, domain, action, outcome): Token            // auto-mint on event
  * promote_to_L1(l0: Token, cycle_proof: CycleProof): Token   // requires full FSM cycle
  * promote_to_L1_5(l1: Token, witnesses: Witness[]): Token    // witness rules enforced
  * promote_to_L2a(tokens: Token[]): Token                      // 5+ same feature_hash, diversity gate
  * promote_to_L2b(l2a: Token, engine: RuleEngine): Token       // invariance replay via κ
  * count_tokens_by_level(node_id, domain, level): number
  * feature_hash(context, action_type, outcome_class): string   // SHA-256 hex
  * canonicalize_context(ctx): object                            // sorted keys, * for unknowns
  * NO updateToken / deleteToken exports — append-only

- src/domains/reputation/witnesses.ts
  * export interface Witness {
      witness_id: string;
      agent_id: string;
      target_node_id: string;
      target_episode_id: string;
      reputation_at_witness: number;
      weight_cap: number;        // hundredths (30 ⇒ 0.30 = 3000 bps), ≤ 30
      counterparty_class: string;
      created_at: number;
    }
  * register_witness(input): Witness                            // floor check + weight cap + independence rule
  * check_independence(target, counterparty_class, now): boolean  // max 1 per class per 7-day window
  * total_witness_weight(target_episode_id): number             // sum cap ≤ 0.4 × MIN_EPISODES

- src/db/migrations/<NN>-experience-tokens.sql
  * CREATE TABLE experience_tokens (...) with CHECK on level enum
  * CREATE TABLE mcp_witnesses (...) with reputation_at_witness ≥ 200 CHECK
  * Indexes: (node_id, domain, level), (feature_hash), (target_node_id, counterparty_class, created_at)

- src/__tests__/domains/reputation/tokens.test.ts
  * L0 mint: every completed event produces a token
  * L1 promotion: requires cycle_proof valid; absent proof → stays at L0
  * L1.5 promotion: witness floor enforced (rep < 200 rejected)
  * L1.5 weight cap: per-witness ≤ 0.3 (encoded as 30, in hundredths)
  * L2a promotion: requires ≥5 same-feature_hash tokens, ≥3 scenarios, ≥3 counterparty classes
  * L2a diversity gate: 4 tokens or 2 scenarios or 2 counterparties → rejected
  * L2b invariance: pass + fail cases (context-removed outcome matches / differs)
  * Feature hash determinism: same input → same SHA-256 across runs
  * Append-only: no exports allow updates or deletes (TS-level)
  * Non-transferable: no helper changes node_id

- src/__tests__/domains/reputation/witnesses.test.ts
  * register_witness: rep<200 rejected
  * register_witness: weight_cap>30 rejected
  * check_independence: same counterparty class within 7 days rejected
  * check_independence: same class >7 days apart accepted
  * total_witness_weight: sum cap enforced at registration time

ACCEPTANCE CRITERIA (headline):
✓ 5 token levels (L0/L1/L1.5/L2a/L2b)
✓ SHA-256 feature hash with canonical context
✓ Witness registry with floor + per-cap + sum-cap + 7-day independence
✓ L2a diversity gate (≥5, ≥3 scenarios, ≥3 counterparties)
✓ L2b invariance via κ engine replay
✓ Append-only; non-transferable

SUCCESS CHECK:
cd .worktrees/claude/p2-3-1-tokens && npm run build && npm run lint && npm test

WRITEBACK (after success):
task_update(id="P2.3.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.3.1
branch: feature/p2-3-1-tokens
worktree: .worktrees/claude/p2-3-1-tokens
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Experience token pipeline (L0→L1→L1.5→L2a→L2b) with SHA-256 feature hashing, witness registry (floor 200, per-cap 0.3, sum-cap 0.4×MIN_EPISODES, 7-day independence), diversity gate, and κ invariance replay for L2b. L3 deferred to Phase 6. Append-only and non-transferable enforced at TS + SQL layers.
blockers: none")

FORBIDDENS:
✗ Do not ship L3 — that's Phase 6 π governance
✗ Do not allow updateToken / deleteToken — AX-01 violation
✗ Do not allow node_id change on a token — non-transferability
✗ Do not use Math.random for token IDs — use crypto.randomUUID or ULID with seeded entropy
✗ Do not skip CLAUDE.md gates

NEXT:
P2.4.1 — Capability Gates (consumes count_tokens_by_level for tier unlocks)

Verification checklist (for reviewer agent)

  • No L3 anywhere
  • No updateToken / deleteToken / node_id mutation helpers
  • SHA-256 used (not other hash)
  • Witness reputation floor and 7-day independence both tested
  • L2a diversity gate (3 scenarios × 3 counterparties) tested
  • L2b invariance replay calls into src/domains/rules/engine.ts
  • The <NN>-experience-tokens.sql migration applies cleanly and creates both tables (experience_tokens, mcp_witnesses)

Writeback template

task_update:
  task_id: P2.3.1
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.3.1
    branch: feature/p2-3-1-tokens
    worktree: .worktrees/claude/p2-3-1-tokens
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: L0–L2b experience tokens shipped (L3 deferred per spec). Witness registry with floor 200, per-cap 0.3, sum-cap 0.4×MIN_EPISODES, 7-day independence rule. Pattern matching via SHA-256 feature_hash; diversity gate (≥3 scenarios × ≥3 counterparties); L2b invariance via κ engine replay. Append-only enforced via no-mutation exports + SQL.
    blockers: none

Common gotchas

  • L3 temptation — s05 §L3 namespace describes the L3 aggregation; it is explicitly out of scope for Phase 2 (audit §3 row 3). Do not add a promote_to_L3 even if it seems like a small step. Phase 6 will define it against the governance rule weights.
  • Canonical-context JSON ordering — JSON.stringify emits keys in insertion order, so two equal contexts built in different orders serialize differently; use a sorted-key serializer (canonical-json library or hand-rolled). The feature hash must be byte-identical across runs.
  • ULID vs UUID — s05 uses ULID prefixes (tok_01HXYZ...); pick a deterministic-but-monotonic ID generator. crypto.randomUUID is fine if you prefix manually; or import ulid (small dep). Do not use Math.random.
  • Witness weight encoding — s05 says weight_cap ≤ 0.3. Float-free representation: store as integer 30 and document the unit (hundredths: 30 ⇒ 0.3, i.e. 3000 bps). Update total_witness_weight to apply the same scaling.
  • MIN_EPISODES — s05 names this constant but does not define it. Define MIN_EPISODES = 5n in bps-constants.ts (extend file if needed) and document in PR.
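
The canonical-context gotcha above can be illustrated with a hand-rolled sorted-key serializer feeding SHA-256. This is a sketch under assumptions: the context is treated as a flat record, null and undefined are bucketed to `*`, and the three parts are joined by plain concatenation because the formula shows no delimiter. `ContextValue` and `canonical_json` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Assumed flat context; nested objects are out of scope for this sketch.
type ContextValue = string | number | boolean | null | undefined;

// Serializes keys in sorted order and buckets unknown values to "*".
function canonical_json(ctx: Record<string, ContextValue>): string {
  const parts = Object.keys(ctx)
    .sort()
    .map((key) => {
      const value = ctx[key];
      const bucketed = value === null || value === undefined ? "*" : value;
      return `${JSON.stringify(key)}:${JSON.stringify(bucketed)}`;
    });
  return `{${parts.join(",")}}`;
}

// SHA-256 over canonical_context || action_type || outcome_class, as lowercase hex.
function feature_hash(
  ctx: Record<string, ContextValue>,
  action_type: string,
  outcome_class: string,
): string {
  return createHash("sha256")
    .update(canonical_json(ctx) + action_type + outcome_class, "utf8")
    .digest("hex");
}
```

Building the string by hand, rather than re-inserting sorted keys into an object, avoids relying on how the runtime orders integer-like keys.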

P2.4.1 — Capability Gates (derived limits)

Spec source: task-breakdown.md §P2.4.1
Concept reference: reputation.md §Derived limits
Spec docs: s04-reputation.md §Derived limits; s09-arbitration.md §Arbiter selection (eligibility)
Worktree: feature/p2-4-1-limits
Branch command: git worktree add .worktrees/claude/p2-4-1-limits -b feature/p2-4-1-limits origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.2 (score reads), P1.3.2 (κ built-ins — sqrt_floor, log2_floor)
Unblocks: P2.5.1 (reputation_check_gates exposes all derived limits)

Files to create

  • src/domains/reputation/limits.ts — Pure derivation from current score row
  • src/__tests__/domains/reputation/limits.test.ts — Boundary tests per derivation

Acceptance criteria

  • max_parallel_tasks(rep: ReputationRow): bigint = min(sqrt_floor(execution_rep), 20n) per s04 §Derived limits. Uses sqrt_floor from src/domains/rules/builtins.ts (P1.3.2).
  • rate_limit_bonus(rep: ReputationRow, base_rate: bigint): bigint = bps_mul(base_rate, log2_floor(max(execution_rep, 1n))) per s04. Uses log2_floor from κ built-ins.
  • stake_discount(required_stake: bigint, rep: ReputationRow): bigint = safe_div(safe_mul(required_stake, BPS_100_PERCENT), max(execution_rep, 1000n)) per task-breakdown.md §P2.4.1.
  • can_arbitrate(rep_arbitration: ReputationRow, rep_execution: ReputationRow): boolean = arbitration_score >= 5000n AND execution_score >= 3000n per task-breakdown.md §P2.4.1.
  • can_govern(rep_governance: ReputationRow): boolean = governance_score >= 4000n.
  • Banned nodes: can_arbitrate and can_govern return false when ban_until_epoch > current_epoch. Pure function; takes current_epoch as a parameter (no Date.now).
  • All BPS math via integer-math.ts; all built-ins via src/domains/rules/builtins.ts.
  • Each derivation has 3 boundary unit tests: zero, threshold, above-threshold.
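
The numeric derivations above can be checked against a small stand-alone sketch. The `sqrt_floor` and `log2_floor` bodies here are local stand-ins for the κ built-ins (their real signatures may differ), `BPS_100_PERCENT` is assumed to be 10000n, and plain bigint division stands in for safe_div / safe_mul. rate_limit_bonus is omitted because the semantics of bps_mul are not stated in this section.

```typescript
const BPS_100_PERCENT = 10000n; // assumed value of the κ constant

// Integer square root by Newton iteration (stand-in for the κ built-in).
function sqrt_floor(n: bigint): bigint {
  if (n < 0n) throw new RangeError("sqrt_floor: negative input");
  if (n < 2n) return n;
  let x = n;
  let y = (x + 1n) / 2n;
  while (y < x) {
    x = y;
    y = (x + n / x) / 2n;
  }
  return x;
}

// Floor of log2 by repeated halving; undefined below 1, hence max(rep, 1n) at call sites.
function log2_floor(n: bigint): bigint {
  if (n < 1n) throw new RangeError("log2_floor: input must be >= 1");
  let result = 0n;
  for (let v = n; v > 1n; v >>= 1n) result += 1n;
  return result;
}

const min = (a: bigint, b: bigint): bigint => (a < b ? a : b);
const max = (a: bigint, b: bigint): bigint => (a > b ? a : b);

function max_parallel_tasks(execution_score: bigint): bigint {
  return min(sqrt_floor(execution_score), 20n);
}

function stake_discount(required_stake: bigint, execution_score: bigint): bigint {
  return (required_stake * BPS_100_PERCENT) / max(execution_score, 1000n);
}
```

Under these assumptions the cap of 20 parallel tasks is reached at a score of 400, and the stake formula yields the unmodified stake at a score of 10000 and ten times the stake at or below the 1000 floor.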

Pre-flight reading

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.4.1 — Capability Gates (derived limits)
Ship the pure-function derivation layer: max_parallel_tasks, rate_limit_bonus, stake_discount, can_arbitrate, can_govern. All BPS math; all use κ built-ins for sqrt and log2.

FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.4.1
3. docs/3-world/social/reputation.md §Derived limits
4. docs/spec/s04-reputation.md §Derived limits
5. docs/spec/s09-arbitration.md §Arbiter selection
6. src/domains/rules/builtins.ts (P1.3.2 — sqrt_floor, log2_floor)
7. src/domains/rules/integer-math.ts (P1.1.1)
8. src/domains/reputation/schema.ts (P2.1.1)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-4-1-limits -b feature/p2-4-1-limits origin/main
cd .worktrees/claude/p2-4-1-limits

FILES TO CREATE:
- src/domains/reputation/limits.ts
  * export function max_parallel_tasks(rep_execution: ReputationRow): bigint
    return min(sqrt_floor(BigInt(rep_execution.score)), 20n)
  * export function rate_limit_bonus(rep_execution: ReputationRow, base_rate: bigint): bigint
    return bps_mul(base_rate, log2_floor(max(BigInt(rep_execution.score), 1n)))
  * export function stake_discount(required_stake: bigint, rep_execution: ReputationRow): bigint
    return safe_div(safe_mul(required_stake, BPS_100_PERCENT), max(BigInt(rep_execution.score), 1000n))
  * export function can_arbitrate(rep_arbitration: ReputationRow, rep_execution: ReputationRow, current_epoch: bigint): boolean
    1. if rep_arbitration.ban_until_epoch && BigInt(rep_arbitration.ban_until_epoch) > current_epoch: return false
    2. return BigInt(rep_arbitration.score) >= 5000n && BigInt(rep_execution.score) >= 3000n
  * export function can_govern(rep_governance: ReputationRow, current_epoch: bigint): boolean
    1. if rep_governance.ban_until_epoch && BigInt(rep_governance.ban_until_epoch) > current_epoch: return false
    2. return BigInt(rep_governance.score) >= 4000n

- src/__tests__/domains/reputation/limits.test.ts
  * max_parallel_tasks: rep=0 -> 0n; rep=10000 -> sqrt_floor(10000)=100, capped at 20 -> 20n
  * max_parallel_tasks: rep=400 -> sqrt_floor(400)=20 -> 20n (exact cap)
  * max_parallel_tasks: rep=399 -> sqrt_floor(399)=19 -> 19n (just below cap)
  * rate_limit_bonus: rep=1 -> log2_floor(1)=0 -> 0n; rep=1024 -> log2_floor(1024)=10
  * stake_discount (with BPS_100_PERCENT = 10000n): rep=10000 -> stake × 10000 / 10000 = stake; rep=1000 (floor) -> stake × 10; rep=0 -> clamped to the floor, also stake × 10
  * can_arbitrate: arb=4999 -> false; arb=5000 + exec=2999 -> false; arb=5000 + exec=3000 -> true
  * can_arbitrate: ban_until_epoch > current -> false even if scores qualify
  * can_govern: gov=4000 -> true; gov=3999 -> false; gov=4000 + ban -> false

ACCEPTANCE CRITERIA (headline):
✓ 5 derivations: max_parallel_tasks, rate_limit_bonus, stake_discount, can_arbitrate, can_govern
✓ All use κ built-ins (sqrt_floor, log2_floor) and integer-math
✓ Ban check (ban_until_epoch > current_epoch) on the two boolean gates
✓ 3 boundary tests per derivation
✓ No Date.now, no Math.*

SUCCESS CHECK:
cd .worktrees/claude/p2-4-1-limits && npm run build && npm run lint && npm test

WRITEBACK (after success):
task_update(id="P2.4.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.4.1
branch: feature/p2-4-1-limits
worktree: .worktrees/claude/p2-4-1-limits
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Derived-limits layer: max_parallel_tasks (sqrt_floor cap 20), rate_limit_bonus (log2_floor × base_rate), stake_discount (BPS × inverse rep with floor 1000), can_arbitrate (≥5000 arb + ≥3000 exec + not banned), can_govern (≥4000 + not banned). All κ-built-in derived; pure functions; current_epoch parameterized.
blockers: none")

FORBIDDENS:
✗ Do not call sqrt() from Math — use sqrt_floor from κ builtins
✗ Do not call Date.now — current_epoch is a parameter
✗ Do not write to DB
✗ Do not edit main checkout

NEXT:
P2.5.1 — Reputation Query MCP Tools (composes all 5 derivations into reputation_check_gates)

Verification checklist (for reviewer agent)

  • All 5 derivations exported and tested at 3 boundaries each
  • Math.sqrt / Math.log not used anywhere
  • Date.now not used anywhere
  • Ban check applied on can_arbitrate and can_govern
  • npm run build && npm run lint && npm test green

Writeback template

task_update:
  task_id: P2.4.1
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.4.1
    branch: feature/p2-4-1-limits
    worktree: .worktrees/claude/p2-4-1-limits
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: Derived-limit pure functions: max_parallel_tasks, rate_limit_bonus, stake_discount, can_arbitrate, can_govern. All BPS math; all via κ built-ins (sqrt_floor, log2_floor). Ban gate enforced on booleans.
    blockers: none

Common gotchas

  • sqrt_floor input type — κ P1.3.2 expects bigint; convert row.score to bigint at the boundary. The output is also bigint; the consumer (P2.5.1 tool) converts to number at the JSON boundary.
  • log2_floor(0) — undefined for 0; the spec says max(rep, 1) for this reason. Test the boundary explicitly.
  • stake_discount floor at rep=1000 — this is to prevent runaway discounts at near-zero reputation. Do not change the floor without amending task-breakdown.md.
  • Ban check semantics — ban_until_epoch is the first epoch the ban is over; the ban is active when ban_until_epoch > current_epoch. Off-by-one is easy here; test both sides explicitly.
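
The ban semantics above can be pinned down with a small predicate and a gate that uses it. This is a minimal sketch; `ban_active` and `can_govern_sketch` are hypothetical names, and the gate takes a bare score rather than a ReputationRow to stay self-contained.

```typescript
// A ban is active while ban_until_epoch is strictly greater than the current epoch.
function ban_active(ban_until_epoch: bigint | null, current_epoch: bigint): boolean {
  return ban_until_epoch !== null && ban_until_epoch > current_epoch;
}

// Governance gate: not banned and governance score at or above 4000 bps.
function can_govern_sketch(
  governance_score: bigint,
  ban_until_epoch: bigint | null,
  current_epoch: bigint,
): boolean {
  if (ban_active(ban_until_epoch, current_epoch)) return false;
  return governance_score >= 4000n;
}
```

With ban_until_epoch = 110, the node is still banned at epoch 109 and eligible again from epoch 110 onward, which is the boundary the tests should cover on both sides.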

P2.5.1 — Reputation Query MCP Tools

Spec source: task-breakdown.md §P2.5.1
Concept reference: reputation.md §Phase 0 posture (read-only tool surface)
Spec docs: s04-reputation.md §Computation (read-side surface)
Worktree: feature/p2-5-1-tools
Branch command: git worktree add .worktrees/claude/p2-5-1-tools -b feature/p2-5-1-tools origin/main
Estimated effort: S (Small — 1–2 hours; integration glue + 4 thin tool wrappers)
Depends on: P2.1.2 (score read), P2.2.1 (decay applied at read time), P2.2.2 (scar / ban exposed), P2.3.1 (token counts), P2.4.1 (gate booleans), P0.3.4 (ε tool registration)
Unblocks: closes λ Phase 2 (7/7)

Files to create

  • src/domains/reputation/tools.ts — 4 MCP tool definitions
  • src/__tests__/domains/reputation/tools.test.ts — Integration tests (create node → apply events → verify tool output)

Acceptance criteria

  • reputation_get(node_id: string, domain?: Domain):
    • Returns { domain, score, scar_bps, ban_until_epoch, last_activity_epoch } for the specified domain (or array of all 5 if domain omitted).
    • Applies decay (P2.2.1) lazily before returning — score reflects current epoch.
    • Score values returned as numbers (bps); ban as nullable number.
  • reputation_history(node_id: string, domain: Domain, limit?: number, offset?: number):
    • Paginated history events; ordered by epoch DESC, then id DESC.
    • Default limit = 50, max limit = 500.
  • reputation_leaderboard(domain: Domain, limit?: number):
    • Top N nodes by current (decayed) score in domain.
    • Default limit = 100, max limit = 1000.
  • reputation_check_gates(node_id: string, current_epoch: number):
    • Returns { can_arbitrate, can_govern, max_parallel_tasks, rate_limit_bonus_factor, effective_stake_bps }.
    • Combines reads across all 5 domain rows for the node.
  • All 4 tools registered as MCP tools via ε Skill Registry (P0.6.x); registration glue in src/server.ts if needed, matching the κ admission tool registration pattern (R87 P1.4.1).
  • Integration test: Create a node → write 5 positive events at varying epochs → apply decay across 10 epochs → query reputation_get → assert score matches hand-calculated value within 1 bps.
  • Zod schemas for inputs and outputs.
  • Idempotent reads: no tool mutates the DB; reads do not advance last_activity_epoch.
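
The reputation_history contract above (default limit 50, max 500, ordered by epoch DESC then id DESC) can be sketched in plain TypeScript. This is an illustration only: in the real tool the bounds would live in the Zod input schema and the ordering in SQL. `HistoryEntry`, `resolve_limit`, and `page_history` are hypothetical names, and a numeric `id` is an assumption.

```typescript
// Assumed minimal history shape with a numeric id.
interface HistoryEntry {
  id: number;
  epoch: number;
  delta: number;
}

// Applies the default and rejects out-of-range limits instead of clamping.
function resolve_limit(limit: number | undefined, fallback: number, max: number): number {
  if (limit === undefined) return fallback;
  if (!Number.isInteger(limit) || limit < 1 || limit > max) {
    throw new RangeError(`limit must be an integer in 1..${max}`);
  }
  return limit;
}

function page_history(
  events: readonly HistoryEntry[],
  limit?: number,
  offset: number = 0,
): HistoryEntry[] {
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  const size = resolve_limit(limit, 50, 500);
  // Copy before sorting so the caller's array is never mutated (read-only tool).
  return [...events]
    .sort((a, b) => b.epoch - a.epoch || b.id - a.id)
    .slice(offset, offset + size);
}
```

Rejecting an oversized limit, rather than silently clamping it, matches the "Zod rejects: limit > max" test case in the prompt; either behavior satisfies the reviewer checklist as long as it is tested.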

Pre-flight reading

  • CLAUDE.md
  • task-breakdown.md §P2.5.1
  • reputation.md §Phase 0 posture
  • s04-reputation.md §Computation
  • src/domains/reputation/compute.ts (P2.1.2)
  • src/domains/reputation/decay.ts (P2.2.1)
  • src/domains/reputation/penalties.ts (P2.2.2)
  • src/domains/reputation/tokens.ts (P2.3.1)
  • src/domains/reputation/limits.ts (P2.4.1)
  • src/domains/skills/ (P0.6.x — tool registration)
  • src/server.ts (registration pattern from κ P1.4.1 admission tools)

Ready-to-paste agent prompt

You are a Phase 2 builder agent for Colibri (λ Reputation).

TASK: P2.5.1 — Reputation Query MCP Tools
Final λ Phase 2 sub-task. Wire the 4 read-only tools that compose all prior λ outputs into the MCP surface.

FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.5.1
3. docs/3-world/social/reputation.md §Phase 0 posture
4. docs/spec/s04-reputation.md §Computation
5. src/domains/reputation/{schema,compute,decay,penalties,tokens,limits}.ts (all upstream P2 outputs)
6. src/domains/skills/ (P0.6.x — tool registration glue)
7. src/server.ts (registration pattern from κ P1.4.1 admission tools)

WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-5-1-tools -b feature/p2-5-1-tools origin/main
cd .worktrees/claude/p2-5-1-tools

FILES TO CREATE:
- src/domains/reputation/tools.ts
  * Tool 1: reputation_get
    - Input Zod: { node_id: string; domain?: Domain }
    - Implementation: selectReputation(node_id, domain), apply_decay to each row, return projection
  * Tool 2: reputation_history
    - Input Zod: { node_id: string; domain: Domain; limit?: number (default 50, max 500); offset?: number }
    - Implementation: selectHistory(node_id, domain, opts)
  * Tool 3: reputation_leaderboard
    - Input Zod: { domain: Domain; limit?: number (default 100, max 1000) }
    - Implementation: SELECT * FROM reputations WHERE domain = ? ORDER BY score DESC LIMIT ?
      (decay must be applied: since pre-computed decay would require a batch job, the
       Phase 2 implementation applies decay in-memory after the SELECT and re-sorts the
       rows by decayed score before returning, because stored order can differ from
       decayed order; later phases may materialize a decay job)
  * Tool 4: reputation_check_gates
    - Input Zod: { node_id: string; current_epoch: number }
    - Implementation: read all 5 domain rows, call max_parallel_tasks/can_arbitrate/can_govern/rate_limit_bonus/stake_discount
  * All 4 register via the skill registry (P0.6.x); registration happens at server boot per κ pattern in src/server.ts
  * NO mutation tools — Phase 2 is read-only at the MCP surface

- src/__tests__/domains/reputation/tools.test.ts
  * Integration: insert 5 history events at epochs 100–104 with deltas +1000, +500, +200, +800, +1500
    → reputation_get at epoch 104 → score matches hand-calc
    → reputation_get at epoch 200 → score reflects 96 epochs of execution-domain decay (500bps/epoch)
  * reputation_history: 100 events → page 1 returns 50 ordered DESC, page 2 returns next 50
  * reputation_leaderboard: 10 nodes → top 3 returned in DESC score order
  * reputation_check_gates: known scores → gates match P2.4.1 derivations
  * No tool mutates DB: assert reputation_get does not change reputations or reputation_history
  * Zod rejects: invalid domain, limit > max, negative offset

ACCEPTANCE CRITERIA (headline):
✓ 4 tools registered as MCP tools via ε
✓ Decay applied lazily on every read
✓ Zod input/output schemas
✓ Limit clamping (max enforced)
✓ Integration test: write events → read score → hand-calc match
✓ Read-only — no DB mutation in any of the 4 tools

SUCCESS CHECK:
cd .worktrees/claude/p2-5-1-tools && npm run build && npm run lint && npm test

WRITEBACK (after success):
task_update(id="P2.5.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
  content="task_id: P2.5.1
branch: feature/p2-5-1-tools
worktree: .worktrees/claude/p2-5-1-tools
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: 4 read-only MCP tools (reputation_get, reputation_history, reputation_leaderboard, reputation_check_gates) registered via ε. Decay applied lazily on every read. Integration test: hand-calc score matches across 96-epoch decay span. Closes λ Phase 2 at 7/7.
blockers: none")

FORBIDDENS:
✗ Do not add mutation tools — Phase 2 is read-only at the MCP surface
✗ Do not use Date.now — current_epoch is a Zod-validated input
✗ Do not skip decay on read — score must reflect current_epoch
✗ Do not edit main checkout

NEXT:
λ Phase 2 closes at 7/7. PM updates memory file with λ status partial,
then opens the colibri_code: partial graduation hygiene PR per ADR-006.

Verification checklist (for reviewer agent)

  • All 4 tools register via the ε skill registry pattern
  • No INSERT / UPDATE / DELETE SQL anywhere in tools.ts
  • Decay applied on every score-returning read
  • Limit clamp enforced (test that limit = 10000 is either rejected or clamped to the max)
  • Integration test exists and hand-calc matches
  • Zod schemas validate input edges (negative offset, invalid domain, etc.)
  • npm run build && npm run lint && npm test green
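
A reviewer can spot-check the decay and limit-clamp items above against a small reference model. The following is a minimal sketch, not the P2.5.1 implementation. It assumes compounding per-epoch decay with truncating integer division; the authoritative decay formula is whatever docs/spec/s04-reputation.md specifies. The ReputationRow shape, the MAX_LIMIT value, and both function names are hypothetical.

```typescript
// Hypothetical row shape; the real schema comes from the earlier P2 sub-tasks.
interface ReputationRow {
  nodeId: string;
  score: bigint;     // basis points, as stored at lastEpoch
  lastEpoch: bigint; // epoch at which `score` was last written
}

interface RankedNode {
  nodeId: string;
  score: bigint;
}

const BPS = 10000n;
const MAX_LIMIT = 100; // hypothetical ceiling; use the value from the tool's schema

// ASSUMPTION: decay compounds once per elapsed epoch and truncates toward zero.
// Pure function: the stored row is never modified, so the read stays read-only.
function decayedScore(row: ReputationRow, currentEpoch: bigint, decayBps: bigint): bigint {
  if (currentEpoch < row.lastEpoch) {
    throw new RangeError("current_epoch precedes the stored epoch");
  }
  let score = row.score;
  for (let e = row.lastEpoch; e < currentEpoch && score > 0n; e += 1n) {
    score = (score * (BPS - decayBps)) / BPS; // bigint division truncates
  }
  return score;
}

// Orders by score descending, then nodeId ascending so ties are deterministic.
function byScoreDescThenNodeId(a: RankedNode, b: RankedNode): number {
  if (a.score !== b.score) return a.score > b.score ? -1 : 1;
  if (a.nodeId === b.nodeId) return 0;
  return a.nodeId < b.nodeId ? -1 : 1;
}

// Decays every row at read time (one decay per row), then clamps the page size.
function leaderboard(
  rows: readonly ReputationRow[],
  currentEpoch: bigint,
  decayBps: bigint,
  limit: number,
): RankedNode[] {
  const clamped = Math.min(Math.max(limit, 1), MAX_LIMIT);
  return rows
    .map((row) => ({ nodeId: row.nodeId, score: decayedScore(row, currentEpoch, decayBps) }))
    .sort(byScoreDescThenNodeId)
    .slice(0, clamped);
}
```

Because currentEpoch is a parameter and nothing is written back, calling either function twice with the same inputs yields the same result, which is the property the read-only and determinism checks rely on.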

Writeback template

task_update:
  task_id: P2.5.1
  status: done
  progress: 100

thought_record:
  session_id: r91-lambda-phase-2
  thought_type: reflection
  content: |
    task_id: P2.5.1
    branch: feature/p2-5-1-tools
    worktree: .worktrees/claude/p2-5-1-tools
    commit: <sha>
    tests: npm run build && npm run lint && npm test
    summary: Closes λ Phase 2 at 7/7. Four read-only MCP tools (reputation_get, reputation_history, reputation_leaderboard, reputation_check_gates) registered via ε. Lazy decay on every score read. Integration test verifies write→decay→read against hand-calc.
    blockers: none

Common gotchas

  • Decay-on-read is O(N) per leaderboard call — for reputation_leaderboard(domain, 1000), that’s 1000 decays per call. Acceptable for Phase 2 (single-actor posture); a decay-materialization job is a Phase 6+ optimization. Document this in the PR.
  • current_epoch source — never use Date.now(). The tool takes it as a Zod-validated input. For reputation_check_gates, the caller (e.g. admission middleware in κ) supplies the current epoch from src/domains/rules/admission.ts’s epoch field.
  • Tool registration pattern — match the κ admission tool registration in src/server.ts exactly. Grep for admission_evaluate and pattern-match the registration block.
  • Zod output schemas — even though MCP doesn’t strictly require output schemas, define them. They serve as the contract for downstream consumers and the integration-test assertions.
  • Number vs bigint at the JSON boundary: score is bigint internally and is serialized as a JSON number (bps; the 0–10000 range is well within Number.MAX_SAFE_INTEGER). Document this conversion.
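
The bigint-to-number conversion in the last gotcha has to be explicit, because JSON.stringify throws a TypeError when it encounters a bigint value. A minimal sketch of one way to do it follows. The function names and the output object shape are hypothetical and only illustrate the range guard.

```typescript
const SCORE_MIN = 0n;
const SCORE_MAX = 10000n; // bps ceiling before any scar reduction

// Converts an internal bigint score to a JSON-safe number.
// Throws instead of silently emitting an out-of-range value.
function scoreToJsonNumber(score: bigint): number {
  if (score < SCORE_MIN || score > SCORE_MAX) {
    throw new RangeError(`score ${score} is outside 0..10000 bps`);
  }
  return Number(score);
}

// Hypothetical output shape, for illustration only.
function serializeScore(nodeId: string, domain: string, score: bigint): string {
  return JSON.stringify({ node_id: nodeId, domain, score: scoreToJsonNumber(score) });
}
```

Keeping the guard at the serialization boundary means the arithmetic stays in bigint everywhere else and the conversion happens in exactly one place.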


R89.B λ Phase 2 Staging — 2026-05-12. 7 sub-task prompts authored against base fab4bf57. Phase 2 implementation officially starts at R91 per roadmap.md.


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
