P2 — λ Reputation — Agent Prompts
Copy-paste-ready prompts for agents tackling each of the 7 sub-tasks in Phase 2 λ Reputation.
Phase 2 starts at R91 (the next round after the R89 staging round and the R90 Phase 1 sealing round) per docs/5-time/roadmap.md. These prompts are staged during R89.B (2026-05-12) so that when R91 opens, every sub-task is a zero-friction T3 dispatch.
Canonical spec: task-breakdown.md §Phase 2
Concept doc: docs/3-world/social/reputation.md
Reputation spec: docs/spec/s04-reputation.md
Experience tokens spec: docs/spec/s05-experience-tokens.md
Arbitration spec (λ-coupling only): docs/spec/s09-arbitration.md
Master bootstrap prompt: agent-bootstrap.md
Executor rules: CLAUDE.md — §3 worktree, §5 gate, §6 5-step chain, §7 writeback
Design invariants preserved in every sub-task:
- 64-bit signed integer arithmetic — no floats anywhere (inherits κ P1.1.1).
- Deterministic execution — no RNG / clock / async I/O / network / filesystem in λ computation paths (inherits κ P1.1.2). SHA-256 via crypto.createHash is the only crypto primitive permitted, and only for pattern-matching feature hashes (s05 §Pattern-matching).
- Reputation is non-transferable (s04, s05). No code path may copy a reputation row between node_id values.
- Token log is append-only (s05 §Append-only vs garbage collection, AX-01). Derived caches may be pruned; the event log is immutable.
- Score floor is 0; score ceiling is 10000 - scar_bps per domain. Neither is bypassable.
- Scars are permanent (s04 §Permanent scars). A scar row in the schema is never deleted; its weight may be clamped to 0 by arbitration outcome (s05 §Scars supersede; s09 reverses scars only via arbitration appeal).
- All penalty applications are rule evaluations, not administrative writes — they flow through the κ rule engine and carry a version hash (reputation.md §Penalty schedule).
- Five domains: execution, commissioning, arbitration, governance, social. No sixth domain may be added without a roadmap amendment.
- Five token levels in Phase 2: L0, L1, L1.5, L2a, L2b. L3 is deferred to Phase 6 π governance (aggregation across L2b sets — see audit R89.B §3 row 3).
- All λ computation lives under src/domains/reputation/. The determinism scanner from κ P1.1.2 must be extended to cover this path.
Scope bound: do not graduate the prompts to dependency-less parallel PRs. The sub-task graph has hard prerequisites; respect the Depends-on field in each section.
Axis goals
λ Reputation is the per-domain, non-transferable history of an agent’s behavior. It is not a single number; it is five independent scalars (one per domain of action), each with its own decay rate and penalty schedule. Phase 2 ships the reputation arithmetic (compute, decay, penalties, scars), the experience-token issuance pipeline (L0 → L2b), the derived capability gates that downstream axes (θ consensus, π governance) consume, and the read-only MCP tool surface that exposes all of the above. Phase 2 does not ship voting credits’ write-side, L3 aggregation, or arbitration write-side — those are Phase 3/6 surfaces.
Dependency map (upstream closure)
λ Phase 2 has zero unmet upstream dependencies:
| Upstream | Provided by | Path | Shipped in |
|---|---|---|---|
| BPS integer math | bps_mul, bps_div, apply_bps, decay, safe_mul, safe_div | src/domains/rules/integer-math.ts | R83 (P1.1.1) |
| BPS named constants | DECAY_*, DAMAGE_*, BPS_* | src/domains/rules/bps-constants.ts | R83 (P1.1.3) |
| Determinism scanner | src/__tests__/domains/rules/determinism.test.ts (extend to reputation/**) | — | R83 (P1.1.2) |
| Rule evaluator | first-match-wins, collect-then-apply | src/domains/rules/engine.ts | R85 (P1.3.1) |
| Built-in functions | sqrt_floor, log2_floor, etc. | src/domains/rules/builtins.ts | R86 (P1.3.2) |
| State access (frozen-Map) | read-only with_binding wrapper | src/domains/rules/state-access.ts | R86 (P1.3.3) |
| Policy gating | P1–P13 pre-guards | src/domains/rules/policy.ts | R86 (P1.3.4) |
| Rule loader / registry | specificity-ordered | src/domains/rules/registry.ts | R86 (P1.2.4) |
| Admission evaluator | tool-call gates | src/domains/rules/admission.ts | R87 (P1.4.1) |
| Admission budgets | per-actor budget tracker | src/domains/rules/budgets.ts | R87 (P1.4.3) |
| Denial reason taxonomy | extensible enum | src/domains/rules/denial-reasons.ts | R87 (P1.4.2) |
| Version hash | SHA-256 over canonical JSON | src/domains/rules/versioning.ts | R86 P1.5.1 / R88.A wire |
| Tool-lock middleware | admission adapter | src/middleware/tool-lock.ts | R87 (P1.4.4) |
| β Task pipeline (CRUD + writeback hard-block) | src/domains/tasks/ | — | R75 (P0.3.x) |
| ζ Decision trail | src/domains/trail/ | — | R75 (P0.7.x) |
| η Proof store | src/domains/proof/ | — | R75 (P0.8.x) |
| ε Skill registry | src/domains/skills/ | — | R75 (P0.6.x) |
P0 closed at 28/28 in R75 Wave I (d5f6a1ff, 2026-04-18). κ Phase 1 closed at 20/20 in R87 (f327936b, 2026-05-07). λ Phase 2 inherits all of the above as load-bearing dependencies.
Ordering rationale (4 waves, 7 tasks)
Sub-tasks form a strict DAG anchored on P2.1.1 (schema):
                              P2.1.1 (schema; S; foundation; depends on P1.1.1)
                                 │
                  ┌──────────────┼──────────────┬──────────────┐
                  ▼              ▼              ▼              ▼
              P2.1.2 (M)     P2.2.1 (M)     P2.2.2 (M)     P2.3.1 (L)
              score-compute  decay          penalties      tokens
                  │              │              │              │
                  ▼              │              │              │
P1.3.2 ──▶    P2.4.1 (M)         │              │              │
(κ builtins)  derived-limits     │              │              │
                  │              │              │              │
                  └──────────────┴──────┬───────┴──────────────┘
                                        ▼
P0.3.4 ──────────────────────────▶  P2.5.1 (S)
(tool binding)                      MCP tools
- Wave 1 — P2.1.1 alone. Foundation gate. Cannot parallelize.
- Wave 2 — P2.1.2 + P2.2.1 + P2.2.2 in parallel (all depend only on P2.1.1, zero file overlap).
- Wave 3 — P2.3.1 + P2.4.1 in parallel (P2.3.1 is L; P2.4.1 depends on P2.1.2 from Wave 2).
- Wave 4 — P2.5.1 alone. Closes λ Phase 2.
R87 shipped 7 κ tasks across 3 waves in ~1h45m. λ adds the foundation gate (Wave 1) and one L-effort task (P2.3.1), so the expected wallclock is ~3–4 hours at R87 pace. The roadmap budget at task-breakdown.md §Task Summary (line ~1136) is 2–3 weeks human-paced; AI-paced compresses substantially.
Roadmap budget reference
task-breakdown.md §Task Summary (around line 1136) declares Phase 2 = 7 tasks / 2–3 weeks / depends on P0 + P1. This prompt file’s seven sub-task entries match those seven roadmap rows 1:1 with no renumbering or scope drift.
Group summary
| Task ID | Title | Depends on | Effort | Unblocks |
|---|---|---|---|---|
| P2.1.1 | Reputation Record Schema | P1.1.1 | S | P2.1.2, P2.2.1, P2.2.2, P2.3.1 |
| P2.1.2 | Score Computation | P2.1.1, P1.3.1 | M | P2.4.1, P2.5.1 |
| P2.2.1 | Exponential Decay | P2.1.1, P1.1.1 | M | P2.5.1 |
| P2.2.2 | Offense Penalties | P2.1.1, P1.1.1 | M | P2.5.1 |
| P2.3.1 | Experience Tokens (L0–L2b) | P2.1.1 | L | P2.5.1 |
| P2.4.1 | Capability Gates (derived limits) | P2.1.2, P1.3.2 | M | P2.5.1 |
| P2.5.1 | Reputation Query MCP Tools | P2.1.2, P2.2.1, P2.2.2, P2.3.1, P2.4.1, P0.3.4 | S | — (closes λ Phase 2) |
Out-of-scope (do not build in Phase 2)
These surfaces are explicitly OUT OF SCOPE for λ Phase 2 and must not appear in any of the seven sub-task implementations:
- L3 aggregate tokens — defer to Phase 6 π governance (s05 §L3 namespace; aggregation across L2b sets is a governance concern).
- Quadratic voting math — defer to Phase 6 π governance; λ exposes only the credit-balance read.
- θ commit-reveal arbiter voting — defer to Phase 3 θ consensus (s09 §Voting); λ exposes only the read-side for the VRF selector.
- ξ Soul Vector binding — defer to Phase 7 ξ identity; Phase 2 uses raw node_id as the reputation primary key.
- Cross-fork L3 recomputation — defer to Phase 5 ι fork (s05 §L3 namespace, “non-transferable across forks”).
- Arbitration write-side (panel selection, escalation, slashing) — defer to Phase 3 θ + Phase 6 π.
- Reputation transfer or staking — constitutional violation (AX-01 append-only + s04 §Non-transferability). Never implemented.
P2.1.1 — Reputation Record Schema
Spec source: task-breakdown.md §P2.1.1
Concept reference: docs/3-world/social/reputation.md §The five domains
Spec docs: s04-reputation.md §Three domains + §Computation (record fields)
Worktree: feature/p2-1-1-rep-schema
Branch command: git worktree add .worktrees/claude/p2-1-1-rep-schema -b feature/p2-1-1-rep-schema origin/main
Estimated effort: S (Small — 1–2 hours)
Depends on: P1.1.1 (BPS integer math — schema columns use bigint encoded as INTEGER)
Unblocks: P2.1.2 (compute), P2.2.1 (decay), P2.2.2 (penalties), P2.3.1 (tokens)
Files to create
- src/domains/reputation/schema.ts — TypeScript schema (Zod or hand-rolled) + domain enum + row types
- src/db/migrations/<NN>-reputation.sql — SQLite migration for reputations + reputation_history tables (next migration number after the most recent file in src/db/migrations/)
- src/__tests__/domains/reputation/schema.test.ts — Zod validators + migration smoke test
Acceptance criteria
- 5 domain enum: execution | commissioning | arbitration | governance | social (no other values; sixth-domain attempts must fail validation).
- reputations table schema columns: node_id TEXT NOT NULL, domain TEXT NOT NULL, score INTEGER NOT NULL DEFAULT 0 (0–10000 bps), scar_bps INTEGER NOT NULL DEFAULT 0 (cumulative scar reduction), ban_until_epoch INTEGER (nullable), last_activity_epoch INTEGER NOT NULL. Primary key (node_id, domain).
- reputation_history table schema columns: id INTEGER PRIMARY KEY AUTOINCREMENT, node_id TEXT NOT NULL, domain TEXT NOT NULL, epoch INTEGER NOT NULL, delta INTEGER NOT NULL (signed bps), reason TEXT NOT NULL, event_id TEXT NOT NULL. Append-only (no UPDATE or DELETE statements anywhere in src/domains/reputation/**).
- Indexes: idx_reputations_lookup ON (node_id, domain), idx_reputations_leaderboard ON (domain, score DESC), idx_history_node ON (node_id, domain, epoch DESC).
- Score constraints: score >= 0 AND score <= 10000 SQL CHECK (per design invariant 5).
- TypeScript types (Domain, ReputationRow, ReputationHistoryRow) exported from schema.ts with full property typings.
- Migration applies cleanly to a fresh data/colibri.db; idempotent (running twice does not error).
- No mutation methods — only selectReputation, selectHistory, and an insertHistoryEvent helper that is the only allowed write path.
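As an illustration of the enum and score-bound criteria above, here is a minimal hand-rolled validator sketch. It is not the project's schema.ts: the real file may use Zod, and the helper names isDomain, isIntInRange and validateReputationRow are hypothetical. The row shape and the 0..10000 bounds are taken from the criteria above.

```typescript
// Sketch only: hand-rolled validation of one reputations row.
// isDomain / isIntInRange / validateReputationRow are hypothetical names.
export const DOMAINS = ["execution", "commissioning", "arbitration", "governance", "social"] as const;
export type Domain = (typeof DOMAINS)[number];

export interface ReputationRow {
  node_id: string;
  domain: Domain;
  score: number; // integer bps, 0..10000
  scar_bps: number; // integer bps, 0..10000
  ban_until_epoch: number | null;
  last_activity_epoch: number;
}

const BPS_MAX = 10000; // mirrors the SQL CHECK upper bound

export function isDomain(value: unknown): value is Domain {
  return typeof value === "string" && (DOMAINS as readonly string[]).indexOf(value) !== -1;
}

function isIntInRange(value: unknown, min: number, max: number): value is number {
  // Number.isInteger rejects floats such as 12.5, NaN and Infinity.
  return typeof value === "number" && Number.isInteger(value) && value >= min && value <= max;
}

export function validateReputationRow(input: unknown): ReputationRow {
  if (typeof input !== "object" || input === null) throw new TypeError("row must be an object");
  const { node_id, domain, score, scar_bps, ban_until_epoch, last_activity_epoch } =
    input as Record<string, unknown>;
  if (typeof node_id !== "string" || node_id.length === 0) throw new TypeError("node_id must be a non-empty string");
  if (!isDomain(domain)) throw new TypeError("domain must be one of the five known domains");
  if (!isIntInRange(score, 0, BPS_MAX)) throw new RangeError("score must be an integer in 0..10000");
  if (!isIntInRange(scar_bps, 0, BPS_MAX)) throw new RangeError("scar_bps must be an integer in 0..10000");
  let banUntil: number | null = null;
  if (ban_until_epoch !== null) {
    if (!isIntInRange(ban_until_epoch, 0, Number.MAX_SAFE_INTEGER)) throw new RangeError("ban_until_epoch must be null or a non-negative integer");
    banUntil = ban_until_epoch;
  }
  if (!isIntInRange(last_activity_epoch, 0, Number.MAX_SAFE_INTEGER)) throw new RangeError("last_activity_epoch must be a non-negative integer");
  return { node_id, domain, score, scar_bps, ban_until_epoch: banUntil, last_activity_epoch };
}
```

A validator like this only guards the TypeScript boundary; the SQL CHECK remains the last line of defence for writes that bypass it.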
Pre-flight reading
- CLAUDE.md — §3 worktree, §5 gate, §6 5-step, §7 writeback, §13 git auth
- task-breakdown.md §P2.1.1
- reputation.md §The five domains + §Penalty schedule + §Phase 0 posture
- s04-reputation.md §Three domains + §Computation
- src/domains/rules/bps-constants.ts — DECAY_* / DAMAGE_* / BPS_* constants (read but do not duplicate)
- src/db/index.ts — better-sqlite3 wrapper + migration runner
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.1.1 — Reputation Record Schema
Ship the 5-domain reputation schema (reputations + reputation_history) as the foundation for all of Phase 2 λ.
FILES TO READ FIRST:
1. CLAUDE.md (worktree §3, gate §5, 5-step chain §6, writeback §7, git auth §13)
2. docs/guides/implementation/task-breakdown.md §P2.1.1
3. docs/3-world/social/reputation.md §The five domains + §Phase 0 posture
4. docs/spec/s04-reputation.md §Three domains + §Computation
5. src/domains/rules/bps-constants.ts (consume DECAY_*, DAMAGE_*, BPS_*)
6. src/db/index.ts (migration runner)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-1-1-rep-schema -b feature/p2-1-1-rep-schema origin/main
cd .worktrees/claude/p2-1-1-rep-schema
FILES TO CREATE:
- src/domains/reputation/schema.ts
* export type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social"
* export const DOMAINS: readonly Domain[] = [...]
* export interface ReputationRow { node_id: string; domain: Domain; score: number /* 0..10000 bps */; scar_bps: number; ban_until_epoch: number | null; last_activity_epoch: number }
* export interface ReputationHistoryRow { id: number; node_id: string; domain: Domain; epoch: number; delta: number; reason: string; event_id: string }
* Zod validators for both row types
* selectReputation(node_id, domain?) -> ReputationRow | ReputationRow[]
* selectHistory(node_id, domain, opts: {limit?: number; offset?: number; before_epoch?: number}) -> ReputationHistoryRow[]
* insertHistoryEvent(row: Omit<ReputationHistoryRow, "id">) -> void // ONLY allowed write path
* NO updateReputation / deleteReputation / deleteHistory — those would violate AX-01
- src/db/migrations/<NN>-reputation.sql
* Pick NN as the next number after the most recent file in src/db/migrations/
* CREATE TABLE reputations (...) with PRIMARY KEY (node_id, domain) and CHECK (score >= 0 AND score <= 10000)
* CREATE TABLE reputation_history (...) with id AUTOINCREMENT
* CREATE INDEX idx_reputations_lookup, idx_reputations_leaderboard, idx_history_node
- src/__tests__/domains/reputation/schema.test.ts
* Zod validators reject sixth domain ("foo")
* Zod validators reject score < 0 or > 10000
* Migration applies cleanly to a fresh in-memory better-sqlite3 db
* Inserting via the helper appends a row; SELECT verifies presence
* Confirm no UPDATE / DELETE exports
ACCEPTANCE CRITERIA (headline):
✓ 5-domain enum, no sixth permitted
✓ reputations + reputation_history schemas with PK and CHECK constraints
✓ Three indexes
✓ Migration idempotent
✓ Append-only write helper (no updateReputation / deleteReputation)
SUCCESS CHECK:
cd .worktrees/claude/p2-1-1-rep-schema && npm run build && npm run lint && npm test
WRITEBACK (after success, per CLAUDE.md §7):
task_update(id="P2.1.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.1.1
branch: feature/p2-1-1-rep-schema
worktree: .worktrees/claude/p2-1-1-rep-schema
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Shipped 5-domain reputation schema (reputations + reputation_history) with PK, CHECK constraint, 3 indexes. Append-only write helper; no update / delete exports.
blockers: none")
FORBIDDENS:
✗ No updateReputation / deleteReputation / deleteHistory exports
✗ No floats — score is integer bps
✗ Do not edit main checkout (CLAUDE.md §3)
✗ Do not skip any of build / lint / test (CLAUDE.md §5)
NEXT:
P2.1.2 — Score Computation (consumes selectHistory + insertHistoryEvent, computes domain scores)
Verification checklist (for reviewer agent)
- Domain enum has exactly 5 members; no extra
- reputations.score CHECK constraint present
- No UPDATE reputation* or DELETE FROM reputation* SQL anywhere in src/domains/reputation/**
- All three indexes created
- Migration is idempotent (test runs it twice)
- npm run build && npm run lint && npm test green
Writeback template
task_update:
task_id: P2.1.1
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.1.1
branch: feature/p2-1-1-rep-schema
worktree: .worktrees/claude/p2-1-1-rep-schema
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: Shipped 5-domain reputation schema with append-only history table. CHECK constraint enforces score bounds; primary key (node_id, domain) prevents duplicate domain rows.
blockers: none
Common gotchas
- score INTEGER not REAL — SQLite will silently accept floats unless the CHECK explicitly forbids them. Use typeof(score) = 'integer' in the CHECK, or rely on the Zod validator + integer-only inserts at the TS layer.
- Migration numbering collisions — list src/db/migrations/ first and pick the next number; do not assume “00X-reputation.sql” is free. If a number is taken, jump to the next.
- Index on (domain, score DESC) — SQLite has supported descending indexes since 3.3.0, so any SQLite build bundled with a current better-sqlite3 should handle this. Check the dependency floor in package.json only if the project pins an unusually old release.
- Five-domain enum drift — if a downstream test hardcodes 4 domains, it must be updated alongside the migration. Grep for "execution" | "commissioning" patterns and update.
P2.1.2 — Score Computation
Spec source: task-breakdown.md §P2.1.2
Concept reference: reputation.md §The five domains + §Penalty schedule
Spec docs: s04-reputation.md §Computation
Worktree: feature/p2-1-2-score-compute
Branch command: git worktree add .worktrees/claude/p2-1-2-score-compute -b feature/p2-1-2-score-compute origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.1 (schema + history helper), P1.3.1 (κ rule engine — for ack_weight evaluation)
Unblocks: P2.4.1 (capability gates consume score reads), P2.5.1 (reputation_get returns computed scores)
Files to create
- src/domains/reputation/compute.ts — Pure score-computation functions
- src/__tests__/domains/reputation/compute.test.ts — Unit + property tests
Acceptance criteria
- compute_score(node_id, domain, events: ReputationHistoryRow[]): bigint — Σ(ack_weight × event_outcome) over all events in domain.
- ack_weight is the acknowledger’s reputation in the same domain, bounded to prevent feedback loops (cap at the acknowledger’s current score and never exceed BPS_100_PERCENT = 10000n).
- All arithmetic uses src/domains/rules/integer-math.ts helpers (bps_mul, bps_div, safe_mul, safe_div).
- Score cap: min(10000n - scar_bps, computed) — never exceeds 10000 - scar_bps.
- Score floor: 0n (clamp negatives — penalties applied separately via P2.2.2).
- Property test: for any sequence of positive-outcome events, score is monotonically non-decreasing.
- Property test: compute_score(n, d, events) is deterministic — same input array → byte-identical output.
- Property test: ack_weight cap holds — no acknowledger ever contributes more than their own current score.
- No Math.*, no Date.*, no Math.random. Validated by the κ P1.1.2 determinism scanner (extend its globs to src/domains/reputation/** in this PR).
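The fold, cap, floor and ceiling described above can be sketched as follows. This is an illustration under stated assumptions, not the shipped compute.ts: bpsMul is a local stand-in for bps_mul from integer-math.ts, the ScoredEvent shape with an explicit acker_id field is hypothetical (reputation_history has no such column), and scarBps is passed in as a parameter rather than looked up.

```typescript
type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social";

// Hypothetical event shape: acker_id is assumed to be decoded upstream.
interface ScoredEvent {
  epoch: number;
  delta: number; // signed bps, as read from SQLite
  acker_id: string;
}

const BPS_100_PERCENT = 10000n;

// Stand-in for bps_mul; BigInt division truncates toward zero.
function bpsMul(value: bigint, bps: bigint): bigint {
  return (value * bps) / BPS_100_PERCENT;
}

function minBig(a: bigint, b: bigint): bigint {
  return a < b ? a : b;
}

function maxBig(a: bigint, b: bigint): bigint {
  return a > b ? a : b;
}

export function computeScoreSketch(
  domain: Domain,
  events: readonly ScoredEvent[],
  ackLookup: (acker_id: string, dom: Domain) => bigint,
  scarBps: bigint
): bigint {
  // Copy before sorting so the caller's array is never mutated.
  const ordered = events.slice().sort((a, b) => a.epoch - b.epoch);
  let score = 0n;
  for (const ev of ordered) {
    // Cap the acknowledger's weight to [0, 100%] before multiplying.
    const ackCapped = maxBig(0n, minBig(ackLookup(ev.acker_id, domain), BPS_100_PERCENT));
    score += bpsMul(BigInt(ev.delta), ackCapped);
  }
  const ceiling = maxBig(0n, BPS_100_PERCENT - scarBps);
  return minBig(maxBig(0n, score), ceiling); // floor at 0, then scar-limited ceiling
}
```

For example, a delta of 200 acknowledged by a node at 5000 bps contributes 100, and a delta of 300 acknowledged at full weight contributes 300, for a score of 400n with no scar.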
Pre-flight reading
- CLAUDE.md
- task-breakdown.md §P2.1.2
- reputation.md §The five domains
- s04-reputation.md §Computation
- src/domains/rules/integer-math.ts (P1.1.1)
- src/domains/rules/bps-constants.ts (P1.1.3)
- src/domains/reputation/schema.ts (P2.1.1 — selectHistory)
- src/__tests__/domains/rules/determinism.test.ts (P1.1.2 — extend its src/domains/rules/** glob)
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.1.2 — Score Computation
Implement compute_score(node_id, domain, events[]) using κ integer-math; enforce ack_weight feedback-loop cap; ensure monotonicity and determinism.
FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.1.2
3. docs/3-world/social/reputation.md §The five domains
4. docs/spec/s04-reputation.md §Computation
5. src/domains/rules/integer-math.ts (P1.1.1 helpers)
6. src/domains/rules/bps-constants.ts (P1.1.3 named constants)
7. src/domains/reputation/schema.ts (P2.1.1 row types + selectHistory)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-1-2-score-compute -b feature/p2-1-2-score-compute origin/main
cd .worktrees/claude/p2-1-2-score-compute
FILES TO CREATE:
- src/domains/reputation/compute.ts
* export function compute_score(
node_id: string,
domain: Domain,
events: ReputationHistoryRow[],
ack_lookup: (acker_id: string, dom: Domain) => bigint // current score of the acknowledger
): bigint
* Algorithm:
1. score := 0n
2. for each event in events (in epoch ASC order):
2a. ack := ack_lookup(event.acker_id, domain) // acker id is encoded in event.reason or event.event_id metadata
2b. ack_capped := min(ack, BPS_100_PERCENT) // hard cap to prevent runaway
2c. weighted := bps_mul(BigInt(event.delta), ack_capped)
2d. score := score + weighted
3. score := max(0n, score) // floor
4. score := min(score, BPS_100_PERCENT - scar_bps_for(node_id, domain)) // ceiling
5. return score
* All ops via integer-math.ts; no Math.*, no Date.*
- src/__tests__/domains/reputation/compute.test.ts
* Unit: empty events -> 0n
* Unit: one positive event with ack_lookup -> 10000n -> score equals event.delta
* Unit: ack_lookup capped at 10000n even when acker has higher conceptual score
* Unit: scar_bps reduces ceiling
* Property (fast-check, 1000 iter): for events with delta > 0, sorting by epoch and folding produces monotonically non-decreasing partial scores
* Property (1000 iter): determinism — two runs with same inputs produce byte-identical output
* Extend src/__tests__/domains/rules/determinism.test.ts (or create domains/reputation/determinism.test.ts) to scan src/domains/reputation/**
ACCEPTANCE CRITERIA (headline):
✓ compute_score uses integer-math only
✓ ack_weight feedback-loop cap at BPS_100_PERCENT
✓ Score clamped to [0n, 10000n - scar_bps]
✓ Monotonicity property (positive-only -> non-decreasing)
✓ Determinism scanner extended to src/domains/reputation/**
SUCCESS CHECK:
cd .worktrees/claude/p2-1-2-score-compute && npm run build && npm run lint && npm test
WRITEBACK (after success):
task_update(id="P2.1.2", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.1.2
branch: feature/p2-1-2-score-compute
worktree: .worktrees/claude/p2-1-2-score-compute
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Implemented compute_score(node_id, domain, events, ack_lookup) using κ integer-math; ack_weight feedback-loop cap; monotonicity + determinism property tests over 1000 iterations each.
blockers: none")
FORBIDDENS:
✗ No Math.*, no Date.*, no Math.random in src/domains/reputation/**
✗ No floats anywhere — score is bigint
✗ Do not redefine bps_mul / safe_mul — import from integer-math.ts
✗ Do not edit main checkout
NEXT:
P2.4.1 — Capability Gates (consumes compute_score for max_parallel_tasks, can_arbitrate, etc.)
Verification checklist (for reviewer agent)
- All scoring arithmetic via integer-math.ts (grep import)
- ack_weight cap applied before multiplication
- Score clamped to [0n, 10000n - scar_bps]
- Property tests run 1000 iterations each
- Determinism scanner covers src/domains/reputation/**
- npm run build && npm run lint && npm test green
Writeback template
task_update:
task_id: P2.1.2
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.1.2
branch: feature/p2-1-2-score-compute
worktree: .worktrees/claude/p2-1-2-score-compute
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: Score computation: Σ(bounded_ack_weight × event_outcome). Property tests assert monotonicity under positive-only sequences and byte-identical determinism over 1000 iterations.
blockers: none
Common gotchas
- bigint vs number — event.delta from SQLite arrives as a JS number. Wrap in BigInt(event.delta) at the boundary; never multiply a bigint by a number (throws TypeError).
- Acknowledger lookup recursion — if the acker’s score is itself derived from this fold, you get infinite recursion. The signature uses ack_lookup as an opaque function; the implementation in P2.5.1 will read the snapshot from the reputations table (not recursively re-fold).
- Empty events array — return 0n, not null or undefined. Downstream consumers (P2.4.1, P2.5.1) assume a bigint always.
- Property test fast-check seed — pin a seed so CI flakes are reproducible. Use the same seed-pinning pattern as src/__tests__/domains/rules/determinism.test.ts.
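One way to avoid the recursion gotcha is to build ack_lookup from a snapshot of already-stored rows, so a lookup never triggers another fold. The sketch below assumes a hypothetical makeAckLookup helper and a trimmed SnapshotRow shape; returning 0n for unknown acknowledgers is also an assumption, not something the spec states.

```typescript
type Domain = "execution" | "commissioning" | "arbitration" | "governance" | "social";

// Trimmed, hypothetical view of a reputations row.
interface SnapshotRow {
  node_id: string;
  domain: Domain;
  score: number; // integer bps as read from SQLite
}

// Encode (node_id, domain) as an unambiguous Map key.
function keyOf(node_id: string, domain: Domain): string {
  return JSON.stringify([node_id, domain]);
}

export function makeAckLookup(
  rows: readonly SnapshotRow[]
): (acker_id: string, dom: Domain) => bigint {
  // Scores are copied once, so later changes to `rows` cannot leak in.
  const snapshot = new Map<string, bigint>();
  for (const row of rows) {
    snapshot.set(keyOf(row.node_id, row.domain), BigInt(row.score));
  }
  return (acker_id, dom) => {
    const found = snapshot.get(keyOf(acker_id, dom));
    return found === undefined ? 0n : found; // assumption: unknown ackers carry no weight
  };
}
```

Because the returned function only reads from the copied Map, calling it during a fold is a constant-time read with no side effects.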
P2.2.1 — Exponential Decay
Spec source: task-breakdown.md §P2.2.1
Concept reference: reputation.md §The five domains (decay rates table)
Spec docs: s04-reputation.md §Decay
Worktree: feature/p2-2-1-decay
Branch command: git worktree add .worktrees/claude/p2-2-1-decay -b feature/p2-2-1-decay origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.1 (schema), P1.1.1 (decay() from integer-math)
Unblocks: P2.5.1 (reputation_get returns decayed score)
Files to create
- src/domains/reputation/decay.ts — Per-domain decay application
- src/__tests__/domains/reputation/decay.test.ts — Unit + batch tests
Acceptance criteria
- apply_decay(row: ReputationRow, current_epoch: bigint): ReputationRow — pure function; returns new row with decayed score and unchanged last_activity_epoch.
- Per-domain decay rates (from bps-constants.ts):
  - DECAY_EXECUTION = 500n (execution)
  - DECAY_COMMISSIONING = 300n (commissioning)
  - DECAY_ARBITRATION = 1000n (arbitration)
  - DECAY_GOVERNANCE = 200n (governance)
  - DECAY_SOCIAL = 100n (social)
- Decay applies only during inactivity: inactive_epochs = max(0, current_epoch - last_activity_epoch).
- Activity resets timer: last_activity_epoch is updated by the event-write path (P2.1.1 helper) — decay does not modify it.
- Multi-epoch compound: uses decay(score, rate, epochs) from integer-math.ts (P1.1.1).
- Score floor at 0 (already enforced by decay() per P1.1.1 acceptance criteria — re-test here).
- Batch helper apply_decay_batch(rows: ReputationRow[], current_epoch: bigint): ReputationRow[] — pure; efficient for 10,000+ rows (no per-row I/O).
- Property test: re-applying decay twice (decay → decay) is not equivalent to one decay with double epochs (compound effect tested explicitly).
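To make the compounding concrete, here is a sketch of inactivity-only decay. decaySketch is a hypothetical stand-in for decay() in integer-math.ts and assumes the result is floored after every epoch; the real helper may round differently, so treat the numbers as illustrative. DecayRow is a trimmed row shape, and the rate is passed in rather than resolved per domain.

```typescript
const BPS_100_PERCENT = 10000n;

// Hypothetical stand-in for decay(score, rate, epochs); floors each epoch.
export function decaySketch(score: bigint, rateBps: bigint, epochs: bigint): bigint {
  if (score < 0n || epochs < 0n || rateBps < 0n || rateBps > BPS_100_PERCENT) {
    throw new RangeError("score/epochs must be >= 0 and rate within 0..10000 bps");
  }
  let current = score;
  // Stop early at 0: further epochs cannot change the result.
  for (let i = 0n; i < epochs && current > 0n; i += 1n) {
    current = (current * (BPS_100_PERCENT - rateBps)) / BPS_100_PERCENT;
  }
  return current;
}

// Trimmed, hypothetical row shape for the example.
interface DecayRow {
  score: number; // integer bps, 0..10000
  last_activity_epoch: number;
}

export function applyDecaySketch(row: DecayRow, rateBps: bigint, currentEpoch: bigint): DecayRow {
  const inactive = currentEpoch - BigInt(row.last_activity_epoch);
  if (inactive <= 0n) return row; // active this epoch: nothing to decay
  const decayed = decaySketch(BigInt(row.score), rateBps, inactive);
  // Spread into a new object; last_activity_epoch is deliberately untouched.
  return { ...row, score: Number(decayed) };
}

// 10 inactive epochs at 500 bps: compound decay, not a linear 50% cut.
const after = applyDecaySketch({ score: 10000, last_activity_epoch: 100 }, 500n, 110n);
console.log(after.score); // 5984
```

In this per-epoch-floor sketch, decaying over 4 epochs and then 6 gives the same 5984 as decaying over 10 at once. Whether the real decay() behaves that way depends on where it rounds, so the compound-effect property test should be written against its actual behaviour.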
Pre-flight reading
- CLAUDE.md
- task-breakdown.md §P2.2.1
- reputation.md §The five domains
- s04-reputation.md §Decay
- src/domains/rules/integer-math.ts (P1.1.1, decay())
- src/domains/rules/bps-constants.ts (P1.1.3 — DECAY_* exports)
- src/domains/reputation/schema.ts (P2.1.1 row types)
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.2.1 — Exponential Decay
Implement per-domain reputation decay using the κ integer-math decay() primitive. Pure functions; no DB writes here (those land in P2.5.1's tool surface).
FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.2.1
3. docs/3-world/social/reputation.md §The five domains
4. docs/spec/s04-reputation.md §Decay
5. src/domains/rules/integer-math.ts (P1.1.1 decay() helper)
6. src/domains/rules/bps-constants.ts (P1.1.3 DECAY_*)
7. src/domains/reputation/schema.ts (P2.1.1)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-2-1-decay -b feature/p2-2-1-decay origin/main
cd .worktrees/claude/p2-2-1-decay
FILES TO CREATE:
- src/domains/reputation/decay.ts
* export function rate_for(domain: Domain): bigint
Returns DECAY_EXECUTION | DECAY_COMMISSIONING | DECAY_ARBITRATION | DECAY_GOVERNANCE | DECAY_SOCIAL.
* export function apply_decay(row: ReputationRow, current_epoch: bigint): ReputationRow
1. inactive = current_epoch - BigInt(row.last_activity_epoch)
2. if inactive <= 0n: return row unchanged
3. rate = rate_for(row.domain)
4. new_score = decay(BigInt(row.score), rate, inactive) // from integer-math.ts
5. return { ...row, score: Number(new_score) }
* export function apply_decay_batch(rows: ReputationRow[], current_epoch: bigint): ReputationRow[]
Pure map over apply_decay; no side effects.
- src/__tests__/domains/reputation/decay.test.ts
* Unit: row at epoch 100, current 100 -> unchanged
* Unit: row at epoch 100, current 110, execution domain -> decay by 500bps × 10 epochs
* Unit: row with score 0 stays at 0 (floor)
* Property: compound effect — decay(decay(x, r, e1), r, e2) != decay(x, r, e1+e2) because per-step floor compounds
* Batch test: 10,000 rows decay in under 50ms on dev hardware (smoke perf check)
* Determinism: same input -> same output across two runs
ACCEPTANCE CRITERIA (headline):
✓ Per-domain decay via rate_for(domain)
✓ apply_decay pure; never mutates input
✓ Inactivity-only (epoch >= last_activity)
✓ Score floored at 0
✓ Batch helper for 10k+ rows
SUCCESS CHECK:
cd .worktrees/claude/p2-2-1-decay && npm run build && npm run lint && npm test
WRITEBACK (after success):
task_update(id="P2.2.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.2.1
branch: feature/p2-2-1-decay
worktree: .worktrees/claude/p2-2-1-decay
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Per-domain reputation decay using integer-math decay() + DECAY_* constants. apply_decay (single row) and apply_decay_batch (10k+ rows pure). Compound effect verified property-style.
blockers: none")
FORBIDDENS:
✗ Do not write to DB in decay.ts — pure functions only
✗ Do not redefine decay() — import from integer-math.ts
✗ Do not edit last_activity_epoch in decay (that's the event-write path's job)
✗ Do not skip build / lint / test
NEXT:
P2.5.1 — Reputation Query MCP Tools (composes compute + decay + penalties + tokens + limits)
Verification checklist (for reviewer agent)
- rate_for covers all 5 domains; exhaustive switch with TS never check
- apply_decay does not mutate input (test with Object.freeze)
- Batch helper has no per-row I/O
- Floor at 0 verified explicitly
- npm run build && npm run lint && npm test green
Writeback template
task_update:
task_id: P2.2.1
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.2.1
branch: feature/p2-2-1-decay
worktree: .worktrees/claude/p2-2-1-decay
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: Per-domain reputation decay (DECAY_EXECUTION 500bps/epoch ... DECAY_SOCIAL 100bps/epoch). apply_decay pure single-row; apply_decay_batch pure 10k+ rows. Floor at 0 enforced.
blockers: none
Common gotchas
- Last-activity update belongs elsewhere — decay never touches last_activity_epoch; that field updates only when an event is inserted into reputation_history. Test enforces this.
- Object.freeze does not make mutation safe — assigning to row.score on a frozen row throws a TypeError in strict mode (ES modules and most TypeScript builds) and silently no-ops in sloppy mode. Never mutate the input; always use spread ({...row, score: ...}).
- Batch perf without I/O — the 10k row smoke test must not import better-sqlite3; that’s a P2.5.1 concern. Build the row array in-memory.
- Number ↔ bigint at boundary — schema stores INTEGER; TS reads it as number. Convert at the function boundary: BigInt(row.score) going in, Number(new_score) coming out. Document the precision risk (score is 0–10000, well within the Number safe-integer range).
P2.2.2 — Offense Penalties
Spec source: task-breakdown.md §P2.2.2
Concept reference: reputation.md §Penalty schedule
Spec docs: s04-reputation.md §Damage table + §Permanent scars; s05-experience-tokens.md §Decay and scar supersession; s09-arbitration.md §Arbiter constraints (overturned-decision row only)
Worktree: feature/p2-2-2-penalties
Branch command: git worktree add .worktrees/claude/p2-2-2-penalties -b feature/p2-2-2-penalties origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.1 (schema), P1.1.1 (apply_bps from integer-math), P1.1.3 (DAMAGE_* constants)
Unblocks: P2.5.1 (reputation_get exposes scar_bps and ban_until)
Files to create
- src/domains/reputation/penalties.ts — Severity band → penalty application
- src/__tests__/domains/reputation/penalties.test.ts — Per-band unit tests + double-jeopardy property
Acceptance criteria
- Severity band enum (from reputation.md §Penalty schedule + task-breakdown.md §P2.2.2):
  - Minor → DAMAGE_MINOR (1500 bps)
  - Moderate → DAMAGE_MODERATE (3000 bps)
  - Severe → DAMAGE_SEVERE (5000 bps)
  - Critical → DAMAGE_CRITICAL (8000 bps) + ban
  - Fraud → DAMAGE_FRAUD (10000 bps) + ban + scar
- apply_penalty(row, band, current_epoch, event_id): { row, history_event } — returns updated row + the history event to insert via P2.1.1’s append-only helper.
- Scar mechanism: Fraud band adds DAMAGE_FRAUD = 10000n to scar_bps (capped at 10000n — score can never recover beyond 0% if fraud reaches absolute ceiling). Per s04 §Permanent scars: max achievable reputation capped at 10000 - scar_bps.
- Ban mechanism: Critical / Fraud bands set ban_until_epoch = current_epoch + BAN_DURATION_EPOCHS (BAN_DURATION_EPOCHS exported as a constant; recommend 100 epochs as a starting governance parameter).
- Double-jeopardy guard: apply_penalty rejects (throws or returns { row: unchanged }) when the (event_id, band) tuple already exists in reputation_history. Caller responsible for the lookup; helper exposes is_double_penalty(event_id, band, history): boolean.
- Recovery path: after ban_until_epoch passes, node resumes at scar-limited maximum (no additional logic — the gate is read-only on ban_until_epoch < current_epoch).
- All BPS math via integer-math.ts (apply_bps).
- No deletion of history; penalties append a new row with negative delta.
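The band arithmetic above can be sketched as follows. This is an illustration, not penalties.ts: applyBps is a local stand-in for apply_bps, PenaltyRow is a trimmed row shape, and the "penalty:<band>" reason encoding used by the double-jeopardy check is purely hypothetical, since the criteria do not say where the band is stored in reputation_history.

```typescript
type SeverityBand = "minor" | "moderate" | "severe" | "critical" | "fraud";

// Trimmed, hypothetical row shape for the example.
interface PenaltyRow {
  score: number;
  scar_bps: number;
  ban_until_epoch: number | null;
}

const BPS_100_PERCENT = 10000n;
const BAN_DURATION_EPOCHS = 100n; // starting governance parameter from the criteria
const DAMAGE_BPS: Record<SeverityBand, bigint> = {
  minor: 1500n,
  moderate: 3000n,
  severe: 5000n,
  critical: 8000n,
  fraud: 10000n,
};

// Stand-in for apply_bps: value * bps / 10000, truncated.
function applyBps(value: bigint, bps: bigint): bigint {
  return (value * bps) / BPS_100_PERCENT;
}

// Hypothetical encoding of the band into the history reason column.
const reasonFor = (band: SeverityBand): string => `penalty:${band}`;

export function isDoublePenaltySketch(
  event_id: string,
  band: SeverityBand,
  history: readonly { event_id: string; reason: string }[]
): boolean {
  return history.some((h) => h.event_id === event_id && h.reason === reasonFor(band));
}

export function applyPenaltySketch(
  row: PenaltyRow,
  band: SeverityBand,
  currentEpoch: bigint
): { row: PenaltyRow; delta: number } {
  const oldScore = BigInt(row.score);
  const lowered = oldScore - applyBps(oldScore, DAMAGE_BPS[band]);
  const newScore = lowered < 0n ? 0n : lowered; // floor at 0
  // Only fraud adds scar, capped at 100%.
  const scarSum = BigInt(row.scar_bps) + DAMAGE_BPS.fraud;
  const newScar = band === "fraud"
    ? (scarSum > BPS_100_PERCENT ? BPS_100_PERCENT : scarSum)
    : BigInt(row.scar_bps);
  const bans = band === "critical" || band === "fraud";
  const banUntil = bans ? Number(currentEpoch + BAN_DURATION_EPOCHS) : row.ban_until_epoch;
  return {
    row: { ...row, score: Number(newScore), scar_bps: Number(newScar), ban_until_epoch: banUntil },
    delta: Number(newScore - oldScore), // negative (or zero) delta for the appended history row
  };
}
```

For a row at 8000 bps, a minor penalty removes 1200 bps (leaving 6800), while fraud at epoch 50 drops the score to 0, sets scar_bps to 10000 and bans until epoch 150.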
Pre-flight reading
- CLAUDE.md
- task-breakdown.md §P2.2.2
- reputation.md §Penalty schedule
- s04-reputation.md §Damage table + §Permanent scars
- s05-experience-tokens.md §Decay and scar supersession
- s09-arbitration.md §Arbiter constraints
- src/domains/rules/integer-math.ts (apply_bps)
- src/domains/rules/bps-constants.ts (DAMAGE_*)
- src/domains/reputation/schema.ts (P2.1.1 row types)
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.2.2 — Offense Penalties
Implement the 5-band penalty system (Minor / Moderate / Severe / Critical / Fraud) with scar + ban mechanisms and double-jeopardy guard.
FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.2.2
3. docs/3-world/social/reputation.md §Penalty schedule
4. docs/spec/s04-reputation.md §Damage table + §Permanent scars
5. docs/spec/s05-experience-tokens.md §Decay and scar supersession
6. docs/spec/s09-arbitration.md §Arbiter constraints
7. src/domains/rules/integer-math.ts (apply_bps)
8. src/domains/rules/bps-constants.ts (DAMAGE_*)
9. src/domains/reputation/schema.ts (P2.1.1)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-2-2-penalties -b feature/p2-2-2-penalties origin/main
cd .worktrees/claude/p2-2-2-penalties
FILES TO CREATE:
- src/domains/reputation/penalties.ts
* export type SeverityBand = "minor" | "moderate" | "severe" | "critical" | "fraud"
* export const BAN_DURATION_EPOCHS = 100n // governance parameter
* export function damage_for(band: SeverityBand): bigint
Returns DAMAGE_MINOR ... DAMAGE_FRAUD.
* export function is_double_penalty(event_id: string, band: SeverityBand, history: ReputationHistoryRow[]): boolean
* export function apply_penalty(
row: ReputationRow,
band: SeverityBand,
current_epoch: bigint,
event_id: string,
reason: string
): { row: ReputationRow; history_event: Omit<ReputationHistoryRow, "id"> }
1. damage = damage_for(band)
2. new_score = max(0n, BigInt(row.score) - apply_bps(BigInt(row.score), damage)) // apply_bps -> (val * bps / 10000)
3. new_scar_bps = (band === "fraud") ? min(10000n, BigInt(row.scar_bps) + DAMAGE_FRAUD) : BigInt(row.scar_bps)
4. new_ban_until = (band === "critical" || band === "fraud")
? current_epoch + BAN_DURATION_EPOCHS
: (row.ban_until_epoch ?? null)
5. row_out = { ...row, score: Number(new_score), scar_bps: Number(new_scar_bps), ban_until_epoch: ... }
6. history_event = { node_id: row.node_id, domain: row.domain, epoch: Number(current_epoch), delta: -Number(apply_bps(BigInt(row.score), damage)), reason, event_id }
7. return { row: row_out, history_event }
- src/__tests__/domains/reputation/penalties.test.ts
* Per-band unit: each of {minor, moderate, severe, critical, fraud} produces expected delta
* Scar: fraud adds 10000n to scar_bps; clamped at 10000n max
* Ban: critical/fraud sets ban_until_epoch = current + 100
* Non-critical bands leave ban_until_epoch unchanged
* Double-jeopardy: is_double_penalty returns true when (event_id, band) already in history
* Floor: applying minor to score=0 stays at 0
* Append-only: penalty produces a history_event for caller to insert (no DB write inside apply_penalty)
ACCEPTANCE CRITERIA (headline):
✓ 5 severity bands wired to DAMAGE_*
✓ Scar appends only on fraud, capped at 10000n
✓ Ban only on critical/fraud
✓ Double-jeopardy guard helper
✓ Pure function — caller does the DB insert
SUCCESS CHECK:
cd .worktrees/claude/p2-2-2-penalties && npm run build && npm run lint && npm test
WRITEBACK (after success):
task_update(id="P2.2.2", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.2.2
branch: feature/p2-2-2-penalties
worktree: .worktrees/claude/p2-2-2-penalties
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: 5-band penalty system (minor/moderate/severe/critical/fraud) wired to DAMAGE_* constants. Scar on fraud (permanent ceiling cut). Ban on critical/fraud (100 epoch governance parameter). Double-jeopardy guard. Pure apply_penalty returns row + history_event for caller insert.
blockers: none")
FORBIDDENS:
✗ Do not DELETE or UPDATE reputation_history rows — append-only
✗ Do not subtract negative damage from score — damage is always positive bps
✗ Do not write to DB inside apply_penalty
✗ Do not edit main checkout
NEXT:
P2.5.1 — Reputation Query MCP Tools (composes the apply_penalty output with reputation_get + reputation_history reads)
Verification checklist (for reviewer agent)
- All 5 bands have unit-test coverage
- Scar is additive and clamped at 10000n
- Ban set only for critical/fraud
- No DB writes inside apply_penalty (grep for db.prepare / db.exec)
- Double-jeopardy guard helper present and tested
- npm run build && npm run lint && npm test green
Writeback template
task_update:
task_id: P2.2.2
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.2.2
branch: feature/p2-2-2-penalties
worktree: .worktrees/claude/p2-2-2-penalties
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: 5-band penalty system (minor 1500 → fraud 10000 bps). Scar on fraud (permanent ceiling reduction). Ban on critical/fraud (100 epochs). Double-jeopardy guard. apply_penalty is pure; caller does the history append.
blockers: none
Common gotchas
- Damage vs delta sign: damage_for returns positive bps; history_event.delta is negative (a penalty reduces the score). Get the sign right at the boundary.
- Scar accumulation vs cap: each fraud penalty adds DAMAGE_FRAUD (10000n) to scar_bps, so the first one already reaches the 10000n clamp and later ones change nothing. At full scar, the node's score ceiling is 0 forever. Test this explicitly.
- Reason string source: reason should match a κ denial-reason taxonomy entry (R87 P1.4.2), e.g. "REP_FRAUD_PROVEN". Reference the κ taxonomy file; do not invent new reason strings.
- Ban duration as a governance parameter: BAN_DURATION_EPOCHS = 100n is a starting value. Phase 6 π governance will tune it via rule upgrade. Document this in the comment block.
P2.3.1 — Experience Tokens (L0–L2b)
Spec source: task-breakdown.md §P2.3.1
Concept reference: reputation.md §Experience tokens
Spec docs: s05-experience-tokens.md §Token levels + §Promotion flow + §Witness registry + §Pattern-matching algorithm
Worktree: feature/p2-3-1-tokens
Branch command: git worktree add .worktrees/claude/p2-3-1-tokens -b feature/p2-3-1-tokens origin/main
Estimated effort: L (Large — 8–16 hours)
Depends on: P2.1.1 (schema; reuses migration pattern), P1.1.1 (bigint encoding)
Unblocks: P2.5.1 (reputation_get exposes token counts per domain)
Files to create
- src/domains/reputation/tokens.ts: Token issuance + promotion engine
- src/domains/reputation/witnesses.ts: Witness registry CRUD + independence rule
- src/db/migrations/<NN>-experience-tokens.sql: experience_tokens + mcp_witnesses tables
- src/__tests__/domains/reputation/tokens.test.ts
- src/__tests__/domains/reputation/witnesses.test.ts
Acceptance criteria
- 5 token levels: L0 | L1 | L1.5 | L2a | L2b (no L3; deferred to Phase 6).
- experience_tokens table with columns:
  - id TEXT PRIMARY KEY (ULID like tok_01HXYZ...)
  - node_id TEXT NOT NULL
  - level TEXT NOT NULL CHECK (level IN ('L0','L1','L1.5','L2a','L2b'))
  - domain TEXT NOT NULL
  - scenario TEXT
  - counterparty TEXT
  - action TEXT NOT NULL
  - outcome_class TEXT NOT NULL
  - outcome_delta INTEGER NOT NULL
  - witnesses TEXT (JSON array of witness_ids)
  - created_at INTEGER NOT NULL
  - promoted_from TEXT (chain to upstream token)
  - feature_hash TEXT (SHA-256 for L2a+ matching)
- mcp_witnesses table with columns:
  - witness_id TEXT PRIMARY KEY
  - agent_id TEXT NOT NULL
  - target_node_id TEXT NOT NULL
  - target_episode_id TEXT NOT NULL
  - reputation_at_witness INTEGER NOT NULL (frozen; see s05)
  - weight_cap INTEGER NOT NULL (≤ 30, expressed as bps × 100, i.e. one unit = 100 bps, to stay integer)
  - counterparty_class TEXT NOT NULL
  - created_at INTEGER NOT NULL
- L0 auto-mint on event completion: mint_L0(node_id, domain, action, outcome): Token.
- L0 → L1 promotion requires a complete interaction cycle (all GSD FSM phases per β P0.3.1; counterparty confirms delivery): promote_to_L1(L0_token, cycle_proof): Token.
- L1 → L1.5 promotion requires ≥1 witness; witness rules per s05 §Witness registry:
  - reputation_at_witness ≥ 200 (frozen at witness time)
  - per-witness weight_cap ≤ 0.3 (stored as the integer 30 to keep the math float-free; multiply by 100 to get 3000 bps)
  - sum cap Σ weights ≤ 0.4 × MIN_EPISODES
  - independence: ≤1 witness per counterparty class per rolling 7-day window
- L1/L1.5 → L2a promotion requires ≥5 tokens with the same feature_hash, spanning ≥3 distinct scenario values AND ≥3 distinct counterparty classes (diversity gate).
- L2a → L2b promotion requires an invariance check: replay the token's outcome via the κ engine with context zeroed; outcome_class must match the original.
- Pattern matching: feature_hash = SHA-256(canonical_context || action_type || outcome_class). canonical_context is JSON-stringified with sorted keys; unknown values are bucketed to *.
- Append-only: no UPDATE or DELETE on experience_tokens or mcp_witnesses. Promotion creates a new row with promoted_from referencing the upstream token.
- Non-transferable: no helper allows changing node_id on an existing token row.
- Token-counts query: count_tokens_by_level(node_id, domain, level): number.
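The pattern-matching criterion can be sketched with Node's built-in crypto module. This is a minimal illustration under stated assumptions: "unknown values" is read here as null or undefined, the context is flat, the three inputs are concatenated without a delimiter (the criterion names none), and canonicalize_context returns a string of sorted key/value pairs rather than the object the prompt's signature suggests.

```typescript
import { createHash } from "node:crypto";

// Assumed context shape: a flat record of primitive values.
type ContextValue = string | number | boolean | null | undefined;

// Canonical form: keys sorted, null/undefined bucketed to "*", emitted as
// an array of [key, value] pairs so the output never depends on the order
// in which the caller built the object.
function canonicalize_context(ctx: Record<string, ContextValue>): string {
  const pairs = Object.keys(ctx)
    .sort()
    .map((key) => {
      const value = ctx[key];
      return [key, value === null || value === undefined ? "*" : value];
    });
  return JSON.stringify(pairs);
}

// SHA-256 over canonical_context || action_type || outcome_class, hex-encoded.
function feature_hash(
  ctx: Record<string, ContextValue>,
  action_type: string,
  outcome_class: string,
): string {
  return createHash("sha256")
    .update(canonicalize_context(ctx))
    .update(action_type)
    .update(outcome_class)
    .digest("hex");
}
```

Two contexts holding the same keys and values hash identically regardless of construction order, which is the property the L2a matcher depends on.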
Pre-flight reading
- CLAUDE.md
- task-breakdown.md §P2.3.1
- reputation.md §Experience tokens
- s05-experience-tokens.md full read (every subsection matters)
- src/domains/reputation/schema.ts (P2.1.1: migration pattern)
- src/domains/rules/engine.ts (P1.3.1: for L2b invariance replay)
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.3.1 — Experience Tokens (L0–L2b)
Ship the full token issuance + promotion pipeline per s05. Includes witness registry with independence rule and SHA-256 feature hashing for L2a pattern matching.
FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.3.1
3. docs/3-world/social/reputation.md §Experience tokens
4. docs/spec/s05-experience-tokens.md (FULL READ — every subsection: Token levels, Promotion flow, Witness registry, Pattern-matching algorithm, L3 namespace [defer; do not implement], Decay and scar supersession, Append-only vs garbage collection)
5. src/domains/reputation/schema.ts (P2.1.1 — migration pattern)
6. src/domains/rules/engine.ts (P1.3.1 — for L2b invariance replay)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-3-1-tokens -b feature/p2-3-1-tokens origin/main
cd .worktrees/claude/p2-3-1-tokens
FILES TO CREATE:
- src/domains/reputation/tokens.ts
* export type TokenLevel = "L0" | "L1" | "L1.5" | "L2a" | "L2b"
* export interface Token {
id: string; // tok_01HXYZ... (ULID)
node_id: string;
level: TokenLevel;
domain: Domain;
scenario: string | null;
counterparty: string | null;
action: string;
outcome_class: string;
outcome_delta: number;
witnesses: string[]; // JSON-stored
created_at: number;
promoted_from: string | null;
feature_hash: string | null; // SHA-256 hex; null for L0/L1
}
* mint_L0(node_id, domain, action, outcome): Token // auto-mint on event
* promote_to_L1(l0: Token, cycle_proof: CycleProof): Token // requires full FSM cycle
* promote_to_L1_5(l1: Token, witnesses: Witness[]): Token // witness rules enforced
* promote_to_L2a(tokens: Token[]): Token // 5+ same feature_hash, diversity gate
* promote_to_L2b(l2a: Token, engine: RuleEngine): Token // invariance replay via κ
* count_tokens_by_level(node_id, domain, level): number
* feature_hash(context, action_type, outcome_class): string // SHA-256 hex
* canonicalize_context(ctx): object // sorted keys, * for unknowns
* NO updateToken / deleteToken exports — append-only
- src/domains/reputation/witnesses.ts
* export interface Witness {
witness_id: string;
agent_id: string;
target_node_id: string;
target_episode_id: string;
reputation_at_witness: number;
weight_cap: number; // bps × 100, ≤ 30
counterparty_class: string;
created_at: number;
}
* register_witness(input): Witness // floor check + weight cap + independence rule
* check_independence(target, counterparty_class, now): boolean // max 1 per class per 7-day window
* total_witness_weight(target_episode_id): number // sum cap ≤ 0.4 × MIN_EPISODES
- src/db/migrations/<NN>-experience-tokens.sql
* CREATE TABLE experience_tokens (...) with CHECK on level enum
* CREATE TABLE mcp_witnesses (...) with reputation_at_witness ≥ 200 CHECK
* Indexes: (node_id, domain, level), (feature_hash), (target_node_id, counterparty_class, created_at)
- src/__tests__/domains/reputation/tokens.test.ts
* L0 mint: every completed event produces a token
* L1 promotion: requires cycle_proof valid; absent proof → stays at L0
* L1.5 promotion: witness floor enforced (rep < 200 rejected)
* L1.5 weight cap: per-witness ≤ 0.3 (encoded as 30 in bps×100)
* L2a promotion: requires ≥5 same-feature_hash tokens, ≥3 scenarios, ≥3 counterparty classes
* L2a diversity gate: 4 tokens or 2 scenarios or 2 counterparties → rejected
* L2b invariance: pass + fail cases (context-removed outcome matches / differs)
* Feature hash determinism: same input → same SHA-256 across runs
* Append-only: no exports allow updates or deletes (TS-level)
* Non-transferable: no helper changes node_id
- src/__tests__/domains/reputation/witnesses.test.ts
* register_witness: rep<200 rejected
* register_witness: weight_cap>30 rejected
* check_independence: same counterparty class within 7 days rejected
* check_independence: same class >7 days apart accepted
* total_witness_weight: sum cap enforced at registration time
ACCEPTANCE CRITERIA (headline):
✓ 5 token levels (L0/L1/L1.5/L2a/L2b)
✓ SHA-256 feature hash with canonical context
✓ Witness registry with floor + per-cap + sum-cap + 7-day independence
✓ L2a diversity gate (≥5, ≥3 scenarios, ≥3 counterparties)
✓ L2b invariance via κ engine replay
✓ Append-only; non-transferable
SUCCESS CHECK:
cd .worktrees/claude/p2-3-1-tokens && npm run build && npm run lint && npm test
WRITEBACK (after success):
task_update(id="P2.3.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.3.1
branch: feature/p2-3-1-tokens
worktree: .worktrees/claude/p2-3-1-tokens
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Experience token pipeline (L0→L1→L1.5→L2a→L2b) with SHA-256 feature hashing, witness registry (floor 200, per-cap 0.3, sum-cap 0.4×MIN_EPISODES, 7-day independence), diversity gate, and κ invariance replay for L2b. L3 deferred to Phase 6. Append-only and non-transferable enforced at TS + SQL layers.
blockers: none")
FORBIDDENS:
✗ Do not ship L3 — that's Phase 6 π governance
✗ Do not allow updateToken / deleteToken — AX-01 violation
✗ Do not allow node_id change on a token — non-transferability
✗ Do not use Math.random for token IDs — use crypto.randomUUID or ULID with seeded entropy
✗ Do not skip CLAUDE.md gates
NEXT:
P2.4.1 — Capability Gates (consumes count_tokens_by_level for tier unlocks)
Verification checklist (for reviewer agent)
- No L3 anywhere
- No updateToken / deleteToken / node_id mutation helpers
- SHA-256 used (not other hash)
- Witness reputation floor and 7-day independence both tested
- L2a diversity gate (3 scenarios × 3 counterparties) tested
- L2b invariance replay calls into src/domains/rules/engine.ts
- Migration applies cleanly and creates both tables
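A reviewer checking the L2a diversity gate may find a reference predicate handy. The sketch below is illustrative only: TokenLike is a hypothetical three-field subset of the Token interface, passes_l2a_gate is not a name from the spec, and the counterparty field is treated as the counterparty class.

```typescript
// Hypothetical subset of the Token interface: only the gate's inputs.
interface TokenLike {
  feature_hash: string | null;
  scenario: string | null;
  counterparty: string | null;
}

// True when the set qualifies for L2a promotion: at least 5 tokens sharing
// one feature_hash, at least 3 distinct scenarios, and at least 3 distinct
// counterparty classes.
function passes_l2a_gate(tokens: TokenLike[]): boolean {
  if (tokens.length < 5) return false;
  const first = tokens[0];
  if (first === undefined || first.feature_hash === null) return false;
  const hash = first.feature_hash;
  if (!tokens.every((t) => t.feature_hash === hash)) return false;

  // Null scenario/counterparty values do not count toward diversity.
  const scenarios = new Set<string>();
  const counterparties = new Set<string>();
  for (const t of tokens) {
    if (t.scenario !== null) scenarios.add(t.scenario);
    if (t.counterparty !== null) counterparties.add(t.counterparty);
  }
  return scenarios.size >= 3 && counterparties.size >= 3;
}
```

Each of the three rejection cases in the test plan (4 tokens, 2 scenarios, 2 counterparties) fails a different clause, so they are worth asserting separately.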
Writeback template
task_update:
task_id: P2.3.1
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.3.1
branch: feature/p2-3-1-tokens
worktree: .worktrees/claude/p2-3-1-tokens
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: L0–L2b experience tokens shipped (L3 deferred per spec). Witness registry with floor 200, per-cap 0.3, sum-cap 0.4×MIN_EPISODES, 7-day independence rule. Pattern matching via SHA-256 feature_hash; diversity gate (≥3 scenarios × ≥3 counterparties); L2b invariance via κ engine replay. Append-only enforced via no-mutation exports + SQL.
blockers: none
Common gotchas
- L3 temptation: s05 §L3 namespace describes the L3 aggregation; it is explicitly out of scope for Phase 2 (audit §3 row 3). Do not add a promote_to_L3 even if it seems like a small step. Phase 6 will define it against the governance rule weights.
- Canonical-context JSON ordering: JSON.stringify emits keys in insertion order, so two equal contexts built in different orders serialize differently. Use a sorted-key serializer (canonical-json library or hand-rolled). The feature hash must be byte-identical across runs.
- ULID vs UUID: s05 uses ULID prefixes (tok_01HXYZ...); pick a deterministic-but-monotonic ID generator. crypto.randomUUID is fine if you prefix manually; or import ulid (small dep). Do not use Math.random.
- Witness weight encoding: s05 says weight_cap ≤ 0.3. Float-free representation: store as the integer 30 and document the unit (bps × 100, i.e. one unit = 100 bps, so 30 ⇒ 0.3). Update total_witness_weight to apply the same scaling.
- MIN_EPISODES: s05 names this constant but does not define it. Define MIN_EPISODES = 5n in bps-constants.ts (extend the file if needed) and document it in the PR.
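The witness weight encoding and the sum cap can be checked with integer arithmetic alone. The following is a rough sketch, assuming MIN_EPISODES = 5n as suggested above and one stored unit = 100 bps, so the sum cap 0.4 × MIN_EPISODES becomes 40 × MIN_EPISODES in stored units. witness_allowed and the constant names are hypothetical, not names from s05.

```typescript
// Assumed constants; one stored weight unit = 100 bps (30 means 0.3).
const WITNESS_REP_FLOOR = 200n;
const PER_WITNESS_CAP_UNITS = 30n; // 0.3
const SUM_CAP_UNITS_PER_EPISODE = 40n; // 0.4
const MIN_EPISODES = 5n; // suggested value; s05 leaves it undefined

// Hypothetical registration check: floor, per-witness cap, then sum cap.
function witness_allowed(
  reputation_at_witness: bigint,
  weight_units: bigint,
  existing_weight_units: bigint[],
): boolean {
  if (reputation_at_witness < WITNESS_REP_FLOOR) return false;
  if (weight_units < 0n || weight_units > PER_WITNESS_CAP_UNITS) return false;
  const total = existing_weight_units.reduce((sum, w) => sum + w, 0n) + weight_units;
  return total <= SUM_CAP_UNITS_PER_EPISODE * MIN_EPISODES;
}
```

With these values the sum cap is 200 units, so six witnesses at the full 30 units leave room for only 20 more.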
P2.4.1 — Capability Gates (derived limits)
Spec source: task-breakdown.md §P2.4.1
Concept reference: reputation.md §Derived limits
Spec docs: s04-reputation.md §Derived limits; s09-arbitration.md §Arbiter selection (eligibility)
Worktree: feature/p2-4-1-limits
Branch command: git worktree add .worktrees/claude/p2-4-1-limits -b feature/p2-4-1-limits origin/main
Estimated effort: M (Medium — 4–8 hours)
Depends on: P2.1.2 (score reads), P1.3.2 (κ built-ins — sqrt_floor, log2_floor)
Unblocks: P2.5.1 (reputation_check_gates exposes all derived limits)
Files to create
- src/domains/reputation/limits.ts: Pure derivation from current score row
- src/__tests__/domains/reputation/limits.test.ts: Boundary tests per derivation
Acceptance criteria
- max_parallel_tasks(rep: ReputationRow): bigint = min(sqrt_floor(execution_rep), 20n) per s04 §Derived limits. Uses sqrt_floor from src/domains/rules/builtins.ts (P1.3.2).
- rate_limit_bonus(rep: ReputationRow, base_rate: bigint): bigint = bps_mul(base_rate, log2_floor(max(execution_rep, 1n))) per s04. Uses log2_floor from κ built-ins.
- stake_discount(required_stake: bigint, rep: ReputationRow): bigint = safe_div(safe_mul(required_stake, BPS_100_PERCENT), max(execution_rep, 1000n)) per task-breakdown.md §P2.4.1.
- can_arbitrate(rep_arbitration: ReputationRow, rep_execution: ReputationRow, current_epoch: bigint): boolean = arbitration_score >= 5000n AND execution_score >= 3000n per task-breakdown.md §P2.4.1.
- can_govern(rep_governance: ReputationRow, current_epoch: bigint): boolean = governance_score >= 4000n.
- Banned nodes: can_arbitrate and can_govern return false when ban_until_epoch > current_epoch. Both are pure functions that take current_epoch as a parameter (no Date.now).
- All BPS math via integer-math.ts; all built-ins via src/domains/rules/builtins.ts.
- Each derivation has 3 boundary unit tests: zero, threshold, above-threshold.
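The numeric derivations above reduce to a few lines of bigint arithmetic. This is a minimal sketch, not the κ code: sqrt_floor and log2_floor are local stand-ins for the built-ins in builtins.ts, BPS_100_PERCENT is assumed to be 10000n, and the functions take a bare score instead of a ReputationRow. bps_mul is left out because its semantics are not defined in this document, so only the log2 term of rate_limit_bonus is shown (as the hypothetical rate_limit_exponent).

```typescript
// Stand-in for the κ built-in: integer square root via Newton's method.
function sqrt_floor(n: bigint): bigint {
  if (n < 0n) throw new RangeError("sqrt_floor needs n >= 0");
  if (n < 2n) return n;
  let x = n;
  let y = (x + 1n) / 2n;
  while (y < x) {
    x = y;
    y = (x + n / x) / 2n;
  }
  return x;
}

// Stand-in for the κ built-in: position of the highest set bit.
function log2_floor(n: bigint): bigint {
  if (n < 1n) throw new RangeError("log2_floor needs n >= 1");
  return BigInt(n.toString(2).length - 1);
}

const BPS_100_PERCENT = 10000n; // assumed value
const max_big = (a: bigint, b: bigint): bigint => (a > b ? a : b);
const min_big = (a: bigint, b: bigint): bigint => (a < b ? a : b);

function max_parallel_tasks(execution_score: bigint): bigint {
  return min_big(sqrt_floor(execution_score), 20n);
}

// The log2 term of rate_limit_bonus; max(score, 1n) guards log2_floor(0).
function rate_limit_exponent(execution_score: bigint): bigint {
  return log2_floor(max_big(execution_score, 1n));
}

// required_stake * 10000 / max(score, 1000): the score is the divisor.
function stake_discount(required_stake: bigint, execution_score: bigint): bigint {
  return (required_stake * BPS_100_PERCENT) / max_big(execution_score, 1000n);
}
```

Under these assumptions a score of 10000 leaves the required stake unchanged, while any score at or below the 1000 floor multiplies it by 10.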
Pre-flight reading
- CLAUDE.md
- task-breakdown.md §P2.4.1
- reputation.md §Derived limits
- s04-reputation.md §Derived limits
- s09-arbitration.md §Arbiter selection
- src/domains/rules/builtins.ts (P1.3.2: sqrt_floor, log2_floor)
- src/domains/rules/integer-math.ts (P1.1.1 helpers)
- src/domains/reputation/schema.ts (P2.1.1 row types)
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.4.1 — Capability Gates (derived limits)
Ship the pure-function derivation layer: max_parallel_tasks, rate_limit_bonus, stake_discount, can_arbitrate, can_govern. All BPS math; all use κ built-ins for sqrt and log2.
FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.4.1
3. docs/3-world/social/reputation.md §Derived limits
4. docs/spec/s04-reputation.md §Derived limits
5. docs/spec/s09-arbitration.md §Arbiter selection
6. src/domains/rules/builtins.ts (P1.3.2 — sqrt_floor, log2_floor)
7. src/domains/rules/integer-math.ts (P1.1.1)
8. src/domains/reputation/schema.ts (P2.1.1)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-4-1-limits -b feature/p2-4-1-limits origin/main
cd .worktrees/claude/p2-4-1-limits
FILES TO CREATE:
- src/domains/reputation/limits.ts
* export function max_parallel_tasks(rep_execution: ReputationRow): bigint
return min(sqrt_floor(BigInt(rep_execution.score)), 20n)
* export function rate_limit_bonus(rep_execution: ReputationRow, base_rate: bigint): bigint
return bps_mul(base_rate, log2_floor(max(BigInt(rep_execution.score), 1n)))
* export function stake_discount(required_stake: bigint, rep_execution: ReputationRow): bigint
return safe_div(safe_mul(required_stake, BPS_100_PERCENT), max(BigInt(rep_execution.score), 1000n))
* export function can_arbitrate(rep_arbitration: ReputationRow, rep_execution: ReputationRow, current_epoch: bigint): boolean
1. if rep_arbitration.ban_until_epoch && BigInt(rep_arbitration.ban_until_epoch) > current_epoch: return false
2. return BigInt(rep_arbitration.score) >= 5000n && BigInt(rep_execution.score) >= 3000n
* export function can_govern(rep_governance: ReputationRow, current_epoch: bigint): boolean
1. if rep_governance.ban_until_epoch && BigInt(rep_governance.ban_until_epoch) > current_epoch: return false
2. return BigInt(rep_governance.score) >= 4000n
- src/__tests__/domains/reputation/limits.test.ts
* max_parallel_tasks: rep=0 -> 0n; rep=10000 -> sqrt_floor(10000)=100, capped at 20 -> 20n
* max_parallel_tasks: rep=400 -> sqrt_floor(400)=20 -> 20n (exact cap)
* max_parallel_tasks: rep=399 -> sqrt_floor(399)=19 -> 19n (just below cap)
* rate_limit_bonus: rep=1 -> log2_floor(1)=0 -> 0n; rep=1024 -> log2_floor(1024)=10
* stake_discount (assuming BPS_100_PERCENT = 10000n): rep=10000 -> stake × 1 (10000 / 10000); rep=1000 (floor) -> stake × 10; rep=0 -> stake × 10 (clamped to the floor)
* can_arbitrate: arb=4999 -> false; arb=5000 + exec=2999 -> false; arb=5000 + exec=3000 -> true
* can_arbitrate: ban_until_epoch > current -> false even if scores qualify
* can_govern: gov=4000 -> true; gov=3999 -> false; gov=4000 + ban -> false
ACCEPTANCE CRITERIA (headline):
✓ 5 derivations: max_parallel_tasks, rate_limit_bonus, stake_discount, can_arbitrate, can_govern
✓ All use κ built-ins (sqrt_floor, log2_floor) and integer-math
✓ Ban check (ban_until_epoch > current_epoch) on the two boolean gates
✓ 3 boundary tests per derivation
✓ No Date.now, no Math.*
SUCCESS CHECK:
cd .worktrees/claude/p2-4-1-limits && npm run build && npm run lint && npm test
WRITEBACK (after success):
task_update(id="P2.4.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.4.1
branch: feature/p2-4-1-limits
worktree: .worktrees/claude/p2-4-1-limits
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: Derived-limits layer: max_parallel_tasks (sqrt_floor cap 20), rate_limit_bonus (log2_floor × base_rate), stake_discount (BPS × inverse rep with floor 1000), can_arbitrate (≥5000 arb + ≥3000 exec + not banned), can_govern (≥4000 + not banned). All κ-built-in derived; pure functions; current_epoch parameterized.
blockers: none")
FORBIDDENS:
✗ Do not call sqrt() from Math — use sqrt_floor from κ builtins
✗ Do not call Date.now — current_epoch is a parameter
✗ Do not write to DB
✗ Do not edit main checkout
NEXT:
P2.5.1 — Reputation Query MCP Tools (composes all 5 derivations into reputation_check_gates)
Verification checklist (for reviewer agent)
- All 5 derivations exported and tested at 3 boundaries each
- Math.sqrt / Math.log not used anywhere
- Date.now not used anywhere
- Ban check applied on can_arbitrate and can_govern
- npm run build && npm run lint && npm test green
Writeback template
task_update:
task_id: P2.4.1
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.4.1
branch: feature/p2-4-1-limits
worktree: .worktrees/claude/p2-4-1-limits
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: Derived-limit pure functions: max_parallel_tasks, rate_limit_bonus, stake_discount, can_arbitrate, can_govern. All BPS math; all via κ built-ins (sqrt_floor, log2_floor). Ban gate enforced on booleans.
blockers: none
Common gotchas
- sqrt_floor input type: κ P1.3.2 expects bigint; convert row.score to bigint at the boundary. The output is also bigint; the consumer (P2.5.1 tool) converts to number at the JSON boundary.
- log2_floor(0): undefined for 0; the spec says max(rep, 1) for this reason. Test the boundary explicitly.
- stake_discount floor at rep=1000: rep is the divisor, so the floor keeps the result from growing without bound (and avoids division by zero) at near-zero reputation. Do not change the floor without amending task-breakdown.md.
- Ban check semantics: ban_until_epoch is the first epoch at which the ban is over; the ban is active when ban_until_epoch > current_epoch. Off-by-one is easy here; test both sides explicitly.
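The off-by-one in the ban check is easiest to pin down with a tiny predicate and both sides of the boundary. This is a sketch with simplified, hypothetical signatures (a bare score and ban field instead of a ReputationRow); is_banned and can_govern_sketch are illustrative names.

```typescript
// A ban is active strictly before ban_until_epoch; at ban_until_epoch
// itself the node is already free again. null means "never banned".
function is_banned(ban_until_epoch: number | null, current_epoch: bigint): boolean {
  return ban_until_epoch !== null && BigInt(ban_until_epoch) > current_epoch;
}

// Simplified governance gate: not banned and score at or above 4000 bps.
function can_govern_sketch(
  governance_score: number,
  ban_until_epoch: number | null,
  current_epoch: bigint,
): boolean {
  return !is_banned(ban_until_epoch, current_epoch) && BigInt(governance_score) >= 4000n;
}
```

For a ban with ban_until_epoch = 200, epoch 199 is still banned and epoch 200 is not, which are the two cases the boundary tests should cover.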
P2.5.1 — Reputation Query MCP Tools
Spec source: task-breakdown.md §P2.5.1
Concept reference: reputation.md §Phase 0 posture (read-only tool surface)
Spec docs: s04-reputation.md §Computation (read-side surface)
Worktree: feature/p2-5-1-tools
Branch command: git worktree add .worktrees/claude/p2-5-1-tools -b feature/p2-5-1-tools origin/main
Estimated effort: S (Small — 1–2 hours; integration glue + 4 thin tool wrappers)
Depends on: P2.1.2 (score read), P2.2.1 (decay applied at read time), P2.2.2 (scar / ban exposed), P2.3.1 (token counts), P2.4.1 (gate booleans), P0.3.4 (ε tool registration)
Unblocks: closes λ Phase 2 (7/7)
Files to create
- src/domains/reputation/tools.ts: 4 MCP tool definitions
- src/__tests__/domains/reputation/tools.test.ts: Integration tests (create node → apply events → verify tool output)
Acceptance criteria
- reputation_get(node_id: string, domain?: Domain):
  - Returns { domain, score, scar_bps, ban_until_epoch, last_activity_epoch } for the specified domain (or an array of all 5 if domain is omitted).
  - Applies decay (P2.2.1) lazily before returning, so the score reflects the current epoch.
  - Score values returned as numbers (bps); ban as nullable number.
- reputation_history(node_id: string, domain: Domain, limit?: number, offset?: number):
  - Paginated history events; ordered by epoch DESC, then id DESC.
  - Default limit = 50, max limit = 500.
- reputation_leaderboard(domain: Domain, limit?: number):
  - Top N nodes by current (decayed) score in domain.
  - Default limit = 100, max limit = 1000.
- reputation_check_gates(node_id: string, current_epoch: number):
  - Returns { can_arbitrate, can_govern, max_parallel_tasks, rate_limit_bonus_factor, effective_stake_bps }.
  - Combines reads across all 5 domain rows for the node.
- All 4 tools registered as MCP tools via ε Skill Registry (P0.6.x); registration glue in src/server.ts if needed, matching the κ admission tool registration pattern (R87 P1.4.1).
- Integration test: Create a node → write 5 positive events at varying epochs → apply decay across 10 epochs → query reputation_get → assert score matches hand-calculated value within 1 bps.
- Zod schemas for inputs and outputs.
- Idempotent reads: no tool mutates the DB; reads do not advance last_activity_epoch.
Pre-flight reading
- CLAUDE.md
- task-breakdown.md §P2.5.1
- reputation.md §Phase 0 posture
- s04-reputation.md §Computation
- src/domains/reputation/compute.ts (P2.1.2)
- src/domains/reputation/decay.ts (P2.2.1)
- src/domains/reputation/penalties.ts (P2.2.2)
- src/domains/reputation/tokens.ts (P2.3.1)
- src/domains/reputation/limits.ts (P2.4.1)
- src/domains/skills/ (P0.6.x: tool registration)
- src/server.ts (registration pattern from κ P1.4.1 admission tools)
Ready-to-paste agent prompt
You are a Phase 2 builder agent for Colibri (λ Reputation).
TASK: P2.5.1 — Reputation Query MCP Tools
Final λ Phase 2 sub-task. Wire the 4 read-only tools that compose all prior λ outputs into the MCP surface.
FILES TO READ FIRST:
1. CLAUDE.md
2. docs/guides/implementation/task-breakdown.md §P2.5.1
3. docs/3-world/social/reputation.md §Phase 0 posture
4. docs/spec/s04-reputation.md §Computation
5. src/domains/reputation/{schema,compute,decay,penalties,tokens,limits}.ts (all upstream P2 outputs)
6. src/domains/skills/ (P0.6.x — tool registration glue)
7. src/server.ts (registration pattern from κ P1.4.1 admission tools)
WORKTREE SETUP:
git fetch origin
git worktree add .worktrees/claude/p2-5-1-tools -b feature/p2-5-1-tools origin/main
cd .worktrees/claude/p2-5-1-tools
FILES TO CREATE:
- src/domains/reputation/tools.ts
* Tool 1: reputation_get
- Input Zod: { node_id: string; domain?: Domain }
- Implementation: selectReputation(node_id, domain), apply_decay to each row, return projection
* Tool 2: reputation_history
- Input Zod: { node_id: string; domain: Domain; limit?: number (default 50, max 500); offset?: number }
- Implementation: selectHistory(node_id, domain, opts)
* Tool 3: reputation_leaderboard
- Input Zod: { domain: Domain; limit?: number (default 100, max 1000) }
- Implementation: SELECT * FROM reputations WHERE domain = ? ORDER BY score DESC LIMIT ?
(decay must be applied: since pre-computed decay would require a batch job, the
Phase 2 implementation applies decay in-memory after the SELECT and re-sorts by the
decayed score, because stored scores can be stale relative to each other; later
phases may materialize a decay job)
* Tool 4: reputation_check_gates
- Input Zod: { node_id: string; current_epoch: number }
- Implementation: read all 5 domain rows, call max_parallel_tasks/can_arbitrate/can_govern/rate_limit_bonus/stake_discount
* All 4 register via the skill registry (P0.6.x); registration happens at server boot per κ pattern in src/server.ts
* NO mutation tools — Phase 2 is read-only at the MCP surface
- src/__tests__/domains/reputation/tools.test.ts
* Integration: insert 5 history events at epochs 100–104 with deltas +1000, +500, +200, +800, +1500
→ reputation_get at epoch 104 → score matches hand-calc
→ reputation_get at epoch 200 → score reflects 96 epochs of execution-domain decay (500bps/epoch)
* reputation_history: 100 events → page 1 returns 50 ordered DESC, page 2 returns next 50
* reputation_leaderboard: 10 nodes → top 3 returned in DESC score order
* reputation_check_gates: known scores → gates match P2.4.1 derivations
* No tool mutates DB: assert reputation_get does not change reputations or reputation_history
* Zod rejects: invalid domain, limit > max, negative offset
ACCEPTANCE CRITERIA (headline):
✓ 4 tools registered as MCP tools via ε
✓ Decay applied lazily on every read
✓ Zod input/output schemas
✓ Limit clamping (max enforced)
✓ Integration test: write events → read score → hand-calc match
✓ Read-only — no DB mutation in any of the 4 tools
SUCCESS CHECK:
cd .worktrees/claude/p2-5-1-tools && npm run build && npm run lint && npm test
WRITEBACK (after success):
task_update(id="P2.5.1", status="done", progress=100)
thought_record(session_id="r91-lambda-phase-2", thought_type="reflection",
content="task_id: P2.5.1
branch: feature/p2-5-1-tools
worktree: .worktrees/claude/p2-5-1-tools
commit: <SHA>
tests: npm run build && npm run lint && npm test
summary: 4 read-only MCP tools (reputation_get, reputation_history, reputation_leaderboard, reputation_check_gates) registered via ε. Decay applied lazily on every read. Integration test: hand-calc score matches across 96-epoch decay span. Closes λ Phase 2 at 7/7.
blockers: none")
FORBIDDENS:
✗ Do not add mutation tools — Phase 2 is read-only at the MCP surface
✗ Do not use Date.now — current_epoch is a Zod-validated input
✗ Do not skip decay on read — score must reflect current_epoch
✗ Do not edit main checkout
NEXT:
λ Phase 2 closes at 7/7. PM updates memory file with λ status partial,
then opens the colibri_code: partial graduation hygiene PR per ADR-006.
Verification checklist (for reviewer agent)
- All 4 tools register via the ε skill registry pattern
- No INSERT / UPDATE / DELETE SQL anywhere in tools.ts
- Decay applied on every score-returning read
- Limit clamp enforced (test with limit = 10000 rejected or clamped)
- Integration test exists and hand-calc matches
- Zod schemas validate input edges (negative offset, invalid domain, etc.)
- npm run build && npm run lint && npm test green
Writeback template
task_update:
task_id: P2.5.1
status: done
progress: 100
thought_record:
session_id: r91-lambda-phase-2
thought_type: reflection
content: |
task_id: P2.5.1
branch: feature/p2-5-1-tools
worktree: .worktrees/claude/p2-5-1-tools
commit: <sha>
tests: npm run build && npm run lint && npm test
summary: Closes λ Phase 2 at 7/7. Four read-only MCP tools (reputation_get, reputation_history, reputation_leaderboard, reputation_check_gates) registered via ε. Lazy decay on every score read. Integration test verifies write→decay→read against hand-calc.
blockers: none
Common gotchas
- Decay-on-read is O(N) per leaderboard call: for reputation_leaderboard(domain, 1000), that's 1000 decays per call. Acceptable for Phase 2 (single-actor posture); a decay-materialization job is a Phase 6+ optimization. Document this in the PR.
- current_epoch source: never use Date.now(). The tool takes it as a Zod-validated input. For reputation_check_gates, the caller (e.g. admission middleware in κ) supplies the current epoch from src/domains/rules/admission.ts's epoch field.
- Tool registration pattern: match the κ admission tool registration in src/server.ts exactly. Grep for admission_evaluate and pattern-match the registration block.
- Zod output schemas: even though MCP doesn't strictly require output schemas, define them. They serve as the contract for downstream consumers and the integration-test assertions.
- Number vs bigint at the JSON boundary: score is bigint internally, serialized as a JSON number (bps; the 0–10000 range is well within Number.MAX_SAFE_INTEGER). Document this conversion.
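Decay-on-read and the bigint-to-number conversion meet in the leaderboard path. The sketch below is illustrative and makes several assumptions: StoredRow is a hypothetical projection of a reputations row, the decay function is injected, toy_decay is a placeholder rather than the P2.2.1 formula, and ties are broken by node_id purely to keep the output deterministic.

```typescript
// Hypothetical projection of a stored reputations row.
interface StoredRow {
  node_id: string;
  score: number;
  last_activity_epoch: number;
}
type DecayFn = (score: bigint, elapsed_epochs: bigint) => bigint;

const MAX_LIMIT = 1000;

// Decay every row in memory, sort by decayed score, then take the top N.
// Scores stay bigint until the final projection to JSON-safe numbers.
function rank_leaderboard(
  rows: StoredRow[],
  current_epoch: bigint,
  limit: number,
  decay: DecayFn,
): { node_id: string; score: number }[] {
  const capped = limit > MAX_LIMIT ? MAX_LIMIT : limit;
  const decayed = rows.map((r) => {
    const elapsed = current_epoch - BigInt(r.last_activity_epoch);
    return { node_id: r.node_id, score: decay(BigInt(r.score), elapsed > 0n ? elapsed : 0n) };
  });
  decayed.sort((a, b) => {
    if (a.score !== b.score) return a.score > b.score ? -1 : 1;
    return a.node_id < b.node_id ? -1 : a.node_id > b.node_id ? 1 : 0;
  });
  return decayed.slice(0, capped).map((r) => ({ node_id: r.node_id, score: Number(r.score) }));
}

// Toy linear decay for demonstration only; NOT the P2.2.1 formula.
const toy_decay: DecayFn = (score, elapsed) => {
  const left = score - elapsed * 10n;
  return left > 0n ? left : 0n;
};
```

With the toy decay, a node with a lower stored score but more recent activity can outrank a staler node, which is why the sketch sorts after decaying rather than relying on the stored order.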
See also
- agent-bootstrap.md — Master bootstrap prompt (read FIRST)
- task-breakdown.md — Canonical 63-task breakdown (all phases)
- reputation.md — λ concept doc
- s04-reputation.md — authoritative reputation spec
- s05-experience-tokens.md — authoritative experience-token spec
- s09-arbitration.md — arbitration spec (λ-coupling read-only)
- p1.1-kappa-rule-engine.md — κ Phase 1 prompts (upstream dependency)
- PHASE-0-EXECUTION-GUIDE.md — Phase 0 entry point reference
R89.B λ Phase 2 Staging — 2026-05-12. 7 sub-task prompts authored against base fab4bf57. Phase 2 implementation officially starts at R91 per roadmap.md.