P2.5.1 — Reputation Query MCP Tools — Verification

Slice: p2-5-1-tools (R89 Wave 4), closes λ Phase 2 at 7/7
Branch: feature/p2-5-1-tools
Worktree: .worktrees/claude/p2-5-1-tools
Base: origin/main @ 618b1a13
Implementation commit: 3ce7b189

1. Gate evidence

1.1 npm run build

> colibri@0.0.1 build
> tsc

> colibri@0.0.1 postbuild
> node scripts/copy-migrations.mjs

copy-migrations: copied 8 migration(s) ...src/db/migrations -> ...dist/db/migrations

PASS. No TypeScript compilation errors. Migrations 001–008 copied unchanged.

1.2 npm run lint

> colibri@0.0.1 lint
> eslint src

PASS. No ESLint warnings or errors. No eslint-disable directives added in this slice.

1.3 npm test

Test Suites: 56 passed, 56 total
Tests:       2647 passed, 2647 total
Snapshots:   0 total
Time:        22.671 s

PASS. All 56 suites green, all 2647 tests green.

2. Test delta

Baseline at origin/main @ 618b1a13: 2619 tests in 55 suites. The baseline is derived by subtraction from the post-slice count (2647 − 28 new = 2619) rather than measured directly, since R89.B’s witnesses ENOENT flake makes an isolated baseline run unreliable.

After the slice:

  • +1 suite (src/__tests__/domains/reputation/tools.test.ts)
  • +28 tests (all in the new suite — 6 + 5 + 4 + 4 + 6 + 2 + 1 per the contract §10 AC matrix)
  • 0 regressions in the pre-existing 55 suites

3. Acceptance-criteria traceability

| AC | Description | Where verified | Status |
|---|---|---|---|
| AC-1 | 4 tools registered via ε at boot | tools.test.ts §7 "registers exactly 4 reputation_* tool names against the context" | PASS |
| AC-2 | Zod rejects bad input | tools.test.ts §5 (6 tests): unknown domain, negative offset, limit<1, empty node_id, neg current_epoch, strict-mode extra keys | PASS |
| AC-3 | Decay applied lazily on every score-returning read | tools.test.ts §1 "single-domain: decay applied across 96 epochs"; score at epoch 196 strictly less than 8000 | PASS |
| AC-4 | History DESC ordering | tools.test.ts §2 "pagination: 100 events"; page 1 first event epoch=100, page 2 last event epoch=1 | PASS |
| AC-5 | Limit clamp (max enforced) | tools.test.ts §2 "Zod rejects oversized limit (501)" | PASS |
| AC-6 | Leaderboard reflects decayed score order | tools.test.ts §3 "decay reorders the leaderboard"; agent-A drops below agent-B after 100 epochs of execution decay | PASS |
| AC-7 | reputation_check_gates composes P2.4.1 derivations | tools.test.ts §4 (4 tests): happy, banned, missing, cross-domain | PASS |
| AC-8 | Read-only, no DB mutation | tools.test.ts §6 "row + history counts unchanged"; counts before/after match | PASS |
| AC-9 | tools.ts contains no INSERT/UPDATE/DELETE SQL | tools.test.ts §6 "source self-scan"; regex scan of source file | PASS |
| AC-10 | npm run build && npm run lint && npm test green | §1.1 / §1.2 / §1.3 above | PASS |
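The AC-9 source self-scan can be pictured roughly as follows. This is a minimal sketch, not the test in tools.test.ts: the helper name `findMutatingSql` and the regex are assumptions, and the real test presumably reads tools.ts from disk rather than taking a string.

```typescript
// Hypothetical pattern for mutating SQL; the regex used by the real test
// is not shown in this document.
const MUTATING_SQL = /\b(INSERT\s+INTO|UPDATE\s+\w+\s+SET|DELETE\s+FROM)\b/gi;

// Returns every mutating statement prefix found in the given source text,
// normalized to upper case with single spaces.
function findMutatingSql(source: string): string[] {
  const matches = source.match(MUTATING_SQL) || [];
  return matches.map((m) => m.toUpperCase().replace(/\s+/g, " "));
}

const readOnly = "db.prepare('SELECT score FROM reputations WHERE node_id = ?')";
const mutating = "db.prepare('UPDATE reputations SET score = ? WHERE node_id = ?')";

console.log(findMutatingSql(readOnly).length); // 0 hits: source is read-only
console.log(findMutatingSql(mutating));        // one hit: "UPDATE REPUTATIONS SET"
```

A scan like this only checks static text, which is why AC-8 pairs it with a runtime before/after row count.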

4. Determinism corpus self-scan

src/__tests__/domains/reputation/determinism.test.ts extends the forbidden-pattern scan to tools.ts (and to every other src/domains/reputation/*.ts file except schema.ts):

no forbidden tokens in src/domains/reputation/*.ts (excluding schema.ts)
  PASS — all files clean

tools.ts initially shipped two forbidden tokens caught by the scanner:

  1. Math.min(lim * 2, LEADERBOARD_OVERSHOOT_CAP): replaced with an inline ternary on plain number values (lines 287-290 of the final source).
  2. 'Compose P2.4.1 ...' description string literal: the substring 4.1 matched the float-literal pattern. Replaced with 'Compose capability derivations from src/domains/reputation/limits.ts ...'.

Both fixes preserve runtime behavior.
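The two fixes can be illustrated with a short sketch. The cap value of 200 is inferred from the N = min(2·limit, 200) formula in §9, and the `FLOAT_LITERAL` regex is a guess at the scanner's pattern (chosen because it flags 4.1 in the old string); neither is copied from the source.

```typescript
// Assumed value, inferred from N = min(2·limit, 200) in section 9.
const LEADERBOARD_OVERSHOOT_CAP = 200;

// Inline-ternary equivalent of Math.min(lim * 2, LEADERBOARD_OVERSHOOT_CAP)
// for plain (non-NaN) number inputs.
function overshoot(lim: number): number {
  const doubled = lim * 2;
  return doubled < LEADERBOARD_OVERSHOOT_CAP ? doubled : LEADERBOARD_OVERSHOOT_CAP;
}

// Hypothetical float-literal pattern; the scanner's real regex is not shown.
const FLOAT_LITERAL = /\b\d+\.\d+\b/;

const oldDescription = "Compose P2.4.1 ...";
const newDescription =
  "Compose capability derivations from src/domains/reputation/limits.ts ...";

console.log(overshoot(10));                      // 20
console.log(overshoot(150));                     // 200
console.log(FLOAT_LITERAL.test(oldDescription)); // true ("4.1" matches)
console.log(FLOAT_LITERAL.test(newDescription)); // false
```

For ordinary numeric inputs the ternary returns the same value as Math.min, which is the sense in which runtime behavior is preserved.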

5. MCP surface delta

Pre-slice (count from MEMORY.md Phase 0 sealed surface): 14 tools
Post-slice: 18 tools (4 added, 0 removed)

Final 18-tool roster (alphabetical):

| # | Tool | Axis |
|---|---|---|
| 1 | audit_session_start | η |
| 2 | audit_verify_chain | ζ |
| 3 | merkle_finalize | η |
| 4 | merkle_root | η |
| 5 | reputation_check_gates | λ (NEW) |
| 6 | reputation_get | λ (NEW) |
| 7 | reputation_history | λ (NEW) |
| 8 | reputation_leaderboard | λ (NEW) |
| 9 | server_health | α |
| 10 | server_ping | α |
| 11 | skill_list | ε |
| 12 | task_create | β |
| 13 | task_get | β |
| 14 | task_list | β |
| 15 | task_next_actions | β |
| 16 | task_update | β |
| 17 | thought_record | ζ |
| 18 | thought_record_list | ζ |

6. λ Phase 2 closure manifest

| Sub-task | PR | Status |
|---|---|---|
| P2.1.1: Reputation schema + history | (prior) | shipped |
| P2.1.2: Score compute | (prior) | shipped |
| P2.2.1: Decay | (prior) | shipped |
| P2.2.2: Penalties | (prior) | shipped |
| P2.3.1: Tokens L0–L2b | #231 | shipped |
| P2.4.1: Derived limits | (prior) | shipped |
| P2.5.1: Query tools | THIS PR | closes 7/7 |

After this PR, λ graduates from colibri_code: none to colibri_code: partial per ADR-006. Greek concepts shipping code: 10/15 (α β γ δ ε ζ η κ λ ν). The concept-doc frontmatter update is a separate hygiene PR (matches κ’s R85 staging pattern).

7. Invariants enforced

Every invariant from the contract (§2.4, §3.4, §4.4, §5.4) is asserted by at least one test:

| Invariant | Test |
|---|---|
| I-G1 (no mutation in reputation_get) | §6 read-only invariant |
| I-G2 (no last_activity_epoch advancement) | §1 “does NOT mutate stored last_activity_epoch” |
| I-G3 (score in [0, 10000] after decay) | implicit: decay.ts AX-05 + score-decreasing test |
| I-G4 (deterministic output) | implicit: pure handlers + Zod-validated input + determinism scanner |
| I-G5 (clock-skew safety) | §1 “clock-skew safety: current_epoch < last_activity_epoch” |
| I-H1 (no mutation in reputation_history) | §6 read-only invariant |
| I-H2 (Zod rejects bad inputs) | §5 multiple |
| I-H3 (DESC ordering total) | §2 pagination: assertions on first/last events |
| I-H4 (disjoint pages) | §2 pagination: epoch ranges asserted |
| I-L1 (no mutation in reputation_leaderboard) | §6 read-only invariant |
| I-L2 (output length ≤ limit ≤ 1000) | §3 default limit + tie-break |
| I-L3 (score DESC, node_id ASC) | §3 tie-break test |
| I-L4 (O(N log N) cost documented) | source comment + audit §7 + PR body |
| I-L5 (uses index) | implicit: SQL uses ORDER BY score DESC matched by idx_reputations_leaderboard |
| I-CG1 (no mutation in reputation_check_gates) | §6 read-only invariant |
| I-CG2 (bit-for-bit identical to P2.4.1) | §4 happy / banned / cross-domain |
| I-CG3 (missing-row defaults) | §4 “missing node” |
| I-CG4 (max_parallel_tasks ∈ [0, 20]) | §4 happy (score 3000 → 20 cap) |
| I-CG5 (can_arbitrate composition) | §4 happy + cross-domain + banned |
| I-CG6 (can_govern composition) | §4 happy |
| I-CG7 (decay NOT pre-applied to gates) | doc’d in §5 source comment; tests use current_epoch parameter pass-through |
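As an illustration of what the I-H3 and I-H4 pagination assertions check, here is a minimal in-memory sketch. It stands in for the SQL query rather than reproducing it: the `HistoryEvent` shape, the `page` helper, and the page size of 50 are assumptions (100 events spanning two pages suggests 50, but the test's actual limit is not stated here).

```typescript
// Hypothetical event shape; the real history rows carry more fields.
interface HistoryEvent {
  epoch: number;
}

// In-memory stand-in for an ORDER BY epoch DESC LIMIT ? OFFSET ? query.
function page(events: HistoryEvent[], limit: number, offset: number): HistoryEvent[] {
  const sorted = events.slice().sort((a, b) => b.epoch - a.epoch);
  return sorted.slice(offset, offset + limit);
}

// 100 events at epochs 1..100, mirroring the "pagination: 100 events" test.
const events: HistoryEvent[] = [];
for (let epoch = 1; epoch <= 100; epoch++) {
  events.push({ epoch });
}

const page1 = page(events, 50, 0);
const page2 = page(events, 50, 50);

console.log(page1[0].epoch);                // 100 (newest event first)
console.log(page2[page2.length - 1].epoch); // 1 (oldest event last)

// Disjoint pages: the oldest epoch on page 1 is newer than the newest on page 2.
console.log(page1[page1.length - 1].epoch > page2[0].epoch); // true (51 > 50)
```

Because the ordering is total over distinct epochs, adjacent pages cannot share an event, which is the property I-H4 names.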

8. Commit chain

| # | SHA | Subject |
|---|---|---|
| 1 | f8dc2f62 | audit(p2-5-1-tools): inventory surface |
| 2 | 2bd55de0 | contract(p2-5-1-tools): behavioral contract |
| 3 | a8a18e61 | packet(p2-5-1-tools): execution plan |
| 4 | 3ce7b189 | feat(p2-5-1-tools): 4 read-only MCP tools wiring lambda surface (closes Phase 2 at 7/7) |
| 5 | this commit | verify(p2-5-1-tools): test evidence |

9. Anomalies + carry-overs

  • Transient ENOENT in witnesses.test.ts when running the full test suite under parallel workers (Windows): observed once during slice work, cleared on retry. Pre-existing; not introduced by this slice. Tracked in MEMORY.md as a known full-suite flake.
  • reputation_leaderboard decay-on-read cost is O(N log N) where N = min(2·limit, 200). Acceptable for Phase 2 single-actor posture; rigorous fix (materialized decay rows) deferred to Phase 6+ per source-prompt §11 gotcha. Documented in audit §7 + contract §4.4 I-L4 + tools.ts source comment + PR body.
  • Math.min and float-literal 4.1 rejected by reputation determinism scanner: fixed in the implementation phase (inline ternary + reworded description string).
  • No proof-grade Merkle anchor in this slice: no MCP client was attached in this round, and the ERR_NO_RECORDS failure mode documented in R89.A (#222) keeps a proof-grade anchor out of reach until that fix lands.
  • λ concept-doc frontmatter graduation (colibri_code: none → partial) is intentionally out of scope for this slice; it goes in a separate hygiene PR per ADR-006 (matches the κ post-R87 pattern).
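The decay-on-read cost noted for reputation_leaderboard above can be sketched as follows. This is an illustration under stated assumptions, not the code in tools.ts: the `Row` shape, the helper names, and in particular the linear decay of 10 points per idle epoch are hypothetical (the real rule lives in decay.ts and is not reproduced in this document). Only the overshoot formula N = min(2·limit, 200) and the score DESC, node_id ASC ordering come from the text.

```typescript
interface Row {
  node_id: string;
  score: number;               // stored score in [0, 10000]
  last_activity_epoch: number;
}

const OVERSHOOT_CAP = 200; // from N = min(2·limit, 200)

// Hypothetical decay: lose 10 points per idle epoch, floored at 0.
// A current epoch earlier than the last activity counts as zero idle time.
function decayedScore(row: Row, currentEpoch: number): number {
  const idle =
    currentEpoch > row.last_activity_epoch ? currentEpoch - row.last_activity_epoch : 0;
  const next = row.score - idle * 10;
  return next > 0 ? next : 0;
}

function leaderboard(rows: Row[], limit: number, currentEpoch: number) {
  const n = limit * 2 < OVERSHOOT_CAP ? limit * 2 : OVERSHOOT_CAP;
  // Stand-in for SELECT ... ORDER BY score DESC LIMIT n on stored scores.
  const candidates = rows.slice().sort((a, b) => b.score - a.score).slice(0, n);
  // Decay at read time (input rows are never modified), then re-sort:
  // decayed score DESC, node_id ASC as the tie-break.
  const decayed = candidates.map((r) => ({
    node_id: r.node_id,
    score: decayedScore(r, currentEpoch),
  }));
  decayed.sort((a, b) => {
    if (a.score !== b.score) return b.score - a.score;
    return a.node_id < b.node_id ? -1 : a.node_id > b.node_id ? 1 : 0;
  });
  return decayed.slice(0, limit);
}

const rows: Row[] = [
  { node_id: "agent-A", score: 8000, last_activity_epoch: 0 },
  { node_id: "agent-B", score: 7500, last_activity_epoch: 100 },
];

// agent-A leads on stored score but has been idle for 100 epochs.
console.log(leaderboard(rows, 10, 100).map((r) => r.node_id)); // agent-B, then agent-A
```

The re-sort over at most N candidates is where the O(N log N) term comes from; materialized decay rows would remove it at the cost of writes, which is the deferred fix.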

10. Sign-off

All gates green:

  • npm run build: PASS
  • npm run lint: PASS
  • npm test: PASS (2647/2647 tests, 56/56 suites)

Slice ready for PR + writeback.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
