P2.5.1 — Reputation Query MCP Tools — Behavioral Contract

Slice: p2-5-1-tools (closes λ Phase 2 at 7/7)
Audit: docs/audits/p2-5-1-tools-audit.md @ f8dc2f62
Public surface: src/domains/reputation/tools.ts, exporting registerReputationTools(ctx) plus 4 internal handler functions (exported for test access)

1. Public surface

// src/domains/reputation/tools.ts
import type { ColibriServerContext } from '../../server.js';

export function registerReputationTools(ctx: ColibriServerContext): void;

// Exported for test access (handlers are pure given a `db` parameter)
export function reputationGet(db: Database, input: ReputationGetInput): ReputationGetOutput;
export function reputationHistory(db: Database, input: ReputationHistoryInput): ReputationHistoryOutput;
export function reputationLeaderboard(db: Database, input: ReputationLeaderboardInput): ReputationLeaderboardOutput;
export function reputationCheckGates(db: Database, input: ReputationCheckGatesInput): ReputationCheckGatesOutput;

2. Tool 1 — `reputation_get`

2.1 Input schema (Zod)

z.object({
  node_id: z.string().min(1),
  domain: DomainSchema.optional(),          // one of 5 canonical; omit for "all 5"
  current_epoch: z.number().int().nonnegative(),
}).strict()

2.2 Output

When domain is provided:

{ row: { node_id, domain, score, scar_bps, ban_until_epoch, last_activity_epoch } | null }

null ⟺ no row exists at (node_id, domain).
All numeric fields are number: score and scar_bps are in bps; ban_until_epoch (nullable) and last_activity_epoch are epochs.
score is the decayed value at current_epoch; last_activity_epoch is unchanged from the DB row (it is never advanced by reads).

When domain is omitted:

{ rows: ReputationRow[] }

Length ∈ {0, 1, 2, 3, 4, 5} depending on how many rows exist for node_id.
Order is domain ASC (matches selectReputation reader DB ordering).
Every returned score has been decayed via apply_decay.

2.3 Behavior

rows := selectReputation(db, node_id, domain?)   // P2.1.1 reader
if domain provided AND rows is null:
  return { row: null }
if domain provided AND rows is ReputationRow:
  return { row: apply_decay(rows, BigInt(current_epoch)) }
if domain omitted:
  return { rows: apply_decay_batch(rows, BigInt(current_epoch)) }

2.4 Invariants

I-G1 No SQL INSERT / UPDATE / DELETE executed.
I-G2 last_activity_epoch field in the returned payload equals the stored DB value (never mutated by this tool, even when the stored value is stale).
I-G3 score field is in [0, 10000] after decay (decay primitive can only reduce or hold score, per decay.ts:I3).
I-G4 Deterministic — two consecutive calls with identical input yield byte-identical output.
I-G5 When BigInt(current_epoch) < BigInt(row.last_activity_epoch) (clock skew), the row is returned with score unchanged (decay short-circuits per decay.ts:124).
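
The read path and invariants I-G2, I-G3 and I-G5 can be illustrated with a minimal sketch. The decay formula below (10 bps lost per elapsed epoch) is a hypothetical stand-in, since this contract does not restate the decay.ts formula; applyDecaySketch and reputationGetSketch are illustrative names, not the real exports.

```typescript
// Row shape taken from section 2.2; everything else here is illustrative.
interface ReputationRow {
  node_id: string;
  domain: string;
  score: number; // bps, 0..10000
  scar_bps: number;
  ban_until_epoch: number | null;
  last_activity_epoch: number;
}

// HYPOTHETICAL stand-in for the decay primitive: lose 10 bps per elapsed epoch,
// floored at 0. The real formula lives in decay.ts and is not specified here.
function applyDecaySketch(row: ReputationRow, currentEpoch: bigint): ReputationRow {
  const last = BigInt(row.last_activity_epoch);
  // Clock skew (current < last) or no elapsed epochs: return the score unchanged.
  if (currentEpoch <= last) return { ...row };
  const lost = BigInt(10) * (currentEpoch - last);
  const score = BigInt(row.score);
  const next = lost >= score ? 0 : Number(score - lost);
  // last_activity_epoch is copied as-is: reads never advance it.
  return { ...row, score: next };
}

// Single-domain read: null when no row exists, otherwise a decayed copy.
function reputationGetSketch(
  stored: ReputationRow | null,
  currentEpoch: number,
): { row: ReputationRow | null } {
  if (stored === null) return { row: null };
  return { row: applyDecaySketch(stored, BigInt(currentEpoch)) };
}

const stored: ReputationRow = {
  node_id: "node-a",
  domain: "execution",
  score: 8000,
  scar_bps: 0,
  ban_until_epoch: null,
  last_activity_epoch: 100,
};

const fresh = reputationGetSketch(stored, 100); // no epochs elapsed
const later = reputationGetSketch(stored, 196); // 96 epochs elapsed
const skewed = reputationGetSketch(stored, 90); // caller epoch behind the row
```

Under this stand-in formula, later.row.score is 7040 while fresh and skewed keep 8000, and the stored object is never modified.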

3. Tool 2 — `reputation_history`

3.1 Input schema (Zod)

z.object({
  node_id: z.string().min(1),
  domain: DomainSchema,
  limit: z.number().int().min(1).max(500).optional(),   // default 50
  offset: z.number().int().nonnegative().optional(),    // default 0
}).strict()

3.2 Output

{ events: ReputationHistoryRow[] }

Empty array if no rows match.
Ordered epoch DESC, id DESC (matches selectHistory reader).
Each event carries { id, node_id, domain, epoch, delta, reason, event_id }.

3.3 Behavior

return { events: selectHistory(db, node_id, domain, { limit: limit ?? 50, offset: offset ?? 0 }) }

The reader (schema.ts:248) already clamps limit to min(1000, requestedLimit) internally, but the Zod schema caps the externally-visible max at 500 per source prompt §P2.5.1 acceptance criteria.

3.4 Invariants

I-H1 No SQL INSERT / UPDATE / DELETE executed.
I-H2 Zod rejects limit > 500, limit < 1, offset < 0, unknown domains, and empty node_id via the per-field validators; .strict() additionally rejects unknown keys.
I-H3 (epoch DESC, id DESC) ordering is total and deterministic (matches selectHistory SQL ORDER BY epoch DESC, id DESC).
I-H4 With limit=50, page 1 (offset=0) and page 2 (offset=50) are disjoint; when total events ≥ 100, their ordered concatenation yields the first 100 events in canonical order.
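
The paging invariants I-H3 and I-H4 can be checked against an in-memory stand-in. This is a sketch only: selectHistorySketch is a hypothetical substitute for the real selectHistory reader, and the event shape is reduced to the two ordering fields.

```typescript
// Only the two fields that drive ordering are modelled here.
interface HistoryEvent {
  id: number;
  epoch: number;
}

// In-memory stand-in for the reader: ORDER BY epoch DESC, id DESC, then page.
function selectHistorySketch(
  events: HistoryEvent[],
  limit: number = 50,
  offset: number = 0,
): HistoryEvent[] {
  return events
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.epoch - a.epoch || b.id - a.id)
    .slice(offset, offset + limit);
}

// 120 events, three per epoch, so the id tie-break is exercised.
const events: HistoryEvent[] = Array.from({ length: 120 }, (_, i) => ({
  id: i + 1,
  epoch: Math.floor(i / 3),
}));

const page1 = selectHistorySketch(events, 50, 0);
const page2 = selectHistorySketch(events, 50, 50);
const first100 = selectHistorySketch(events, 100, 0);

// Pages share no event, and page1 followed by page2 equals the first 100 events.
const page1Ids = new Set(page1.map((e) => e.id));
const disjoint = page2.every((e) => !page1Ids.has(e.id));
const concatMatches =
  JSON.stringify(page1.concat(page2)) === JSON.stringify(first100);
```

Because (epoch, id) is a total order, the same offset always lands on the same event, which is what makes the two pages disjoint.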

4. Tool 3 — `reputation_leaderboard`

4.1 Input schema (Zod)

z.object({
  domain: DomainSchema,
  limit: z.number().int().min(1).max(1000).optional(),  // default 100
  current_epoch: z.number().int().nonnegative(),
}).strict()

4.2 Output

{ rows: ReputationRow[] }

Length ≤ limit ?? 100.
Ordered by decayed score DESC. Ties broken by node_id ASC (deterministic).
Each row is a decay-applied snapshot at current_epoch.

4.3 Behavior

const lim = limit ?? 100;
const overshoot = Math.min(lim * 2, 200);                       // fetch up to 2× limit candidates (capped at 200) so rows reordered by decay are less likely to be missed
const candidates = db.prepare(
  `SELECT node_id, domain, score, scar_bps, ban_until_epoch, last_activity_epoch
   FROM reputations
   WHERE domain = ?
   ORDER BY score DESC
   LIMIT ?`
).all(domain, overshoot) as ReputationRow[];
const decayed = apply_decay_batch(candidates, BigInt(current_epoch));
decayed.sort((a, b) => {
  if (b.score !== a.score) return b.score - a.score;
  return a.node_id < b.node_id ? -1 : a.node_id > b.node_id ? 1 : 0;
});
return { rows: decayed.slice(0, lim) };

4.4 Invariants

I-L1 No SQL INSERT / UPDATE / DELETE executed.
I-L2 Output length ≤ limit ?? 100 ≤ 1000.
I-L3 Output ordering: score DESC strict, node_id ASC for ties.
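I-L4 O(N) memory + O(N log N) sort cost where N = min(2 × limit, 200). Because the candidate set is capped at 200 rows, a request with limit > 200 returns at most 200 rows. Acceptable for Phase 2 single-actor posture per source-prompt §11 gotcha. Document the materialization-deferred caveat in PR.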
I-L5 The leaderboard uses the existing idx_reputations_leaderboard ON (domain, score DESC) index (P2.1.1).
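
The fetch, decay, re-sort sequence of section 4.3 can be exercised without a database. In this sketch the SQL query is replaced by an in-memory sort, and decaySketch is a hypothetical linear decay (10 bps per elapsed epoch) standing in for apply_decay_batch; only the overshoot expression and the comparator are taken from the contract.

```typescript
interface Row {
  node_id: string;
  score: number;
  last_activity_epoch: number;
}

// HYPOTHETICAL decay: 10 bps per elapsed epoch, floored at 0.
function decaySketch(row: Row, currentEpoch: number): Row {
  const elapsed = Math.max(0, currentEpoch - row.last_activity_epoch);
  return { ...row, score: Math.max(0, row.score - 10 * elapsed) };
}

function leaderboardSketch(stored: Row[], currentEpoch: number, limit: number = 100): Row[] {
  const overshoot = Math.min(limit * 2, 200); // same candidate cap as section 4.3
  // Stand-in for: ORDER BY score DESC LIMIT overshoot (stored, undecayed scores).
  const candidates = stored
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, overshoot);
  const decayed = candidates.map((r) => decaySketch(r, currentEpoch));
  // Re-sort on decayed score DESC, node_id ASC for ties.
  decayed.sort((a, b) => {
    if (b.score !== a.score) return b.score - a.score;
    return a.node_id < b.node_id ? -1 : a.node_id > b.node_id ? 1 : 0;
  });
  return decayed.slice(0, limit);
}

const storedRows: Row[] = [
  { node_id: "stale", score: 9000, last_activity_epoch: 0 }, // highest stored score
  { node_id: "b-node", score: 8500, last_activity_epoch: 100 },
  { node_id: "a-node", score: 8500, last_activity_epoch: 100 },
];
const board = leaderboardSketch(storedRows, 100, 2);

// 300 equal rows with limit 250: the candidate cap, not the limit, bounds the output.
const many: Row[] = Array.from({ length: 300 }, (_, i) => ({
  node_id: "n" + i,
  score: 5000,
  last_activity_epoch: 100,
}));
const capped = leaderboardSketch(many, 100, 250);
```

Here "stale" leads on stored score but drops to 8000 after decay, so the board is a-node then b-node (tie broken by node_id). The second call returns 200 rows, not 250, which shows the effect of the 200-row candidate cap.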

5. Tool 4 — `reputation_check_gates`

5.1 Input schema (Zod)

z.object({
  node_id: z.string().min(1),
  current_epoch: z.number().int().nonnegative(),
}).strict()

5.2 Output

{
  can_arbitrate: boolean,
  can_govern: boolean,
  max_parallel_tasks: number,        // 0..20
  rate_limit_bonus_factor: number,   // bps (the 1.00x-base multiplier; consumer scales their own base rate)
  effective_stake_bps: number,        // effective stake when input = 10000 bps (1.00x)
}

5.3 Behavior

const all = selectReputation(db, node_id);  // ReputationRow[] (possibly empty)
const byDomain = new Map<Domain, ReputationRow>(all.map(r => [r.domain, r]));
const fallback = (d: Domain): ReputationRow => ({
  node_id, domain: d, score: 0, scar_bps: 0,
  ban_until_epoch: null, last_activity_epoch: current_epoch,
});
const rep_arb  = byDomain.get('arbitration') ?? fallback('arbitration');
const rep_gov  = byDomain.get('governance')  ?? fallback('governance');
const rep_exec = byDomain.get('execution')   ?? fallback('execution');
const e = BigInt(current_epoch);
return {
  can_arbitrate: can_arbitrate(rep_arb, rep_exec, e),
  can_govern:    can_govern(rep_gov, e),
  max_parallel_tasks:      Number(max_parallel_tasks(rep_exec)),
  rate_limit_bonus_factor: Number(rate_limit_bonus(rep_exec, BPS_100_PERCENT)),
  effective_stake_bps:      Number(stake_discount(BPS_100_PERCENT, rep_exec)),
};

5.4 Invariants

I-CG1 No SQL INSERT / UPDATE / DELETE executed.
I-CG2 Output composition is bit-for-bit identical to invoking the P2.4.1 limits functions directly with the same row inputs.
I-CG3 Missing-row default: the three fallback rows (arbitration, governance, execution) have score=0, scar_bps=0, ban_until_epoch=null ⟹ can_arbitrate=false, can_govern=false, max_parallel_tasks=0, rate_limit_bonus_factor=0, effective_stake_bps=10000 (1.00× of the input stake, i.e. no discount).
I-CG4 max_parallel_tasks ∈ [0, 20] (P2.4.1 hard cap).
I-CG5 can_arbitrate ⟹ rep_arb.score ≥ 5000 ∧ rep_exec.score ≥ 3000 ∧ ¬banned(rep_arb, e) (P2.4.1 contract).
I-CG6 can_govern ⟹ rep_gov.score ≥ 4000 ∧ ¬banned(rep_gov, e) (P2.4.1 contract).
I-CG7 Decay is NOT pre-applied to the rows passed to can_arbitrate / can_govern / max_parallel_tasks. P2.4.1 contract §3 specifies these gates take stored scores. Future Phase 3+ may revise this.
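
The missing-row fallback and the two boolean gates can be sketched as follows. The gate predicates are hypothetical: they use only the thresholds named in I-CG5 and I-CG6, and the ban check (banned while the current epoch is before ban_until_epoch) is an assumption; the real P2.4.1 functions may check more. The Domain type is narrowed to the three domains this tool reads.

```typescript
type Domain = "arbitration" | "governance" | "execution";

interface Row {
  node_id: string;
  domain: Domain;
  score: number;
  scar_bps: number;
  ban_until_epoch: number | null;
  last_activity_epoch: number;
}

// ASSUMPTION: a row is banned while the current epoch is before ban_until_epoch.
const banned = (r: Row, e: number): boolean =>
  r.ban_until_epoch !== null && e < r.ban_until_epoch;

// HYPOTHETICAL gates built only from the thresholds in I-CG5 / I-CG6.
const canArbitrate = (arb: Row, exec: Row, e: number): boolean =>
  arb.score >= 5000 && exec.score >= 3000 && !banned(arb, e);
const canGovern = (gov: Row, e: number): boolean =>
  gov.score >= 4000 && !banned(gov, e);

function checkGatesSketch(rows: Row[], node_id: string, currentEpoch: number) {
  const byDomain = new Map<Domain, Row>(
    rows.map((r): [Domain, Row] => [r.domain, r]),
  );
  // Zero-reputation row used when a domain has no stored row.
  const fallback = (d: Domain): Row => ({
    node_id,
    domain: d,
    score: 0,
    scar_bps: 0,
    ban_until_epoch: null,
    last_activity_epoch: currentEpoch,
  });
  const arb = byDomain.get("arbitration") ?? fallback("arbitration");
  const gov = byDomain.get("governance") ?? fallback("governance");
  const exec = byDomain.get("execution") ?? fallback("execution");
  return {
    can_arbitrate: canArbitrate(arb, exec, currentEpoch),
    can_govern: canGovern(gov, currentEpoch),
  };
}

const mk = (domain: Domain, score: number, ban: number | null): Row => ({
  node_id: "node-a", domain, score, scar_bps: 0,
  ban_until_epoch: ban, last_activity_epoch: 100,
});

const unknownNode = checkGatesSketch([], "node-x", 100);
const qualified = checkGatesSketch(
  [mk("arbitration", 5000, null), mk("governance", 4000, null), mk("execution", 3000, null)],
  "node-a", 100,
);
const bannedArb = checkGatesSketch(
  [mk("arbitration", 5000, 200), mk("governance", 4000, null), mk("execution", 3000, null)],
  "node-a", 100,
);
```

A node with no rows fails both gates, a node sitting exactly on the thresholds passes both, and an arbitration ban blocks only can_arbitrate in this sketch.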

6. Registration glue

// src/server.ts (bootstrap()), AFTER registerMerkleTools(ctx):
registerReputationTools(ctx);

registerReputationTools body:

export function registerReputationTools(ctx: ColibriServerContext): void {
  registerColibriTool(ctx, 'reputation_get',         { title, description, inputSchema: GetIn },         (input) => reputationGet(getDb(), input));
  registerColibriTool(ctx, 'reputation_history',     { title, description, inputSchema: HistoryIn },     (input) => reputationHistory(getDb(), input));
  registerColibriTool(ctx, 'reputation_leaderboard', { title, description, inputSchema: LeaderboardIn }, (input) => reputationLeaderboard(getDb(), input));
  registerColibriTool(ctx, 'reputation_check_gates', { title, description, inputSchema: CheckGatesIn }, (input) => reputationCheckGates(getDb(), input));
}

DB resolution is lazy at call-time via getDb() per the canonical Phase-2-startup contract (matches src/domains/trail/repository.ts:419).

7. Error semantics

All four tools follow the s17 envelope contract — handlers return raw data; α middleware wraps to { ok: true, data } on success or { ok: false, error: { code, message, details } } on failure. The handlers themselves only throw when:

getDb() is called pre-Phase-2 (DB not initialized) — surfaces as a HANDLER_ERROR envelope. In production this is unreachable because bootstrap() registers tools before start(ctx) and the DB opens before any tool call arrives.
A defensive never-narrowed switch in a downstream P2.4.1 helper hits an impossible 6th-domain string — also unreachable because Zod’s DomainSchema blocks bad inputs at stage 2.

No tool throws on missing rows, empty histories, decayed-out scores, or zero-rep nodes — those are returned as null / [] / 0-value gates per the per-tool sections above.
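
The split between raw handler output and the wrapped envelope might look like the sketch below. wrapHandler is a hypothetical stand-in for the α middleware, not its actual implementation; only the envelope shape and the HANDLER_ERROR code come from this section, and details is treated as optional here by assumption.

```typescript
// Envelope shape from this section; `details` is optional in this sketch.
type Envelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details?: unknown } };

// HYPOTHETICAL wrapper: handlers return raw data, the wrapper builds the envelope.
function wrapHandler<I, O>(handler: (input: I) => O): (input: I) => Envelope<O> {
  return (input) => {
    try {
      return { ok: true, data: handler(input) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: { code: "HANDLER_ERROR", message } };
    }
  };
}

// A missing row is ordinary data, not an error.
const getMissing = wrapHandler((_input: { node_id: string }) => ({ row: null }));

// Simulates a handler that runs before the database is initialized.
const preInit = wrapHandler((_input: { node_id: string }): { row: null } => {
  throw new Error("database not initialized");
});

const okResult = getMissing({ node_id: "node-x" });
const errResult = preInit({ node_id: "node-x" });
```

The first call yields a success envelope whose data carries row: null; the second yields a failure envelope with code HANDLER_ERROR.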

8. Lint / determinism posture

tools.ts may import from node:crypto (none needed in current design) but must not use Date.*, Math.random(), setTimeout, or setInterval per CLAUDE.md §5.
All time arithmetic uses caller-supplied current_epoch (bigint after the BigInt(...) bridge at the function boundary).
apply_decay_batch is the documented batch path; we do not unwind its internals.
All 4 handler functions are synchronous (return raw data, no Promise). Matches src/tools/health.ts:226 posture.

9. Type matrix (TS exports)

export interface ReputationGetInput {
  readonly node_id: string;
  readonly current_epoch: number;
  readonly domain?: Domain;
}
export type ReputationGetOutput =
  | { readonly row: ReputationRow | null }
  | { readonly rows: ReputationRow[] };

export interface ReputationHistoryInput {
  readonly node_id: string;
  readonly domain: Domain;
  readonly limit?: number;
  readonly offset?: number;
}
export interface ReputationHistoryOutput {
  readonly events: ReputationHistoryRow[];
}

export interface ReputationLeaderboardInput {
  readonly domain: Domain;
  readonly current_epoch: number;
  readonly limit?: number;
}
export interface ReputationLeaderboardOutput {
  readonly rows: ReputationRow[];
}

export interface ReputationCheckGatesInput {
  readonly node_id: string;
  readonly current_epoch: number;
}
export interface ReputationCheckGatesOutput {
  readonly can_arbitrate: boolean;
  readonly can_govern: boolean;
  readonly max_parallel_tasks: number;
  readonly rate_limit_bonus_factor: number;
  readonly effective_stake_bps: number;
}

10. Acceptance criteria (executor self-check)

#	Criterion	Verification
AC-1	4 tools registered via ε at boot	unit test asserts `ctx._registeredToolNames.has('reputation_get'/...)` after `registerReputationTools`
AC-2	Zod schemas reject bad inputs	direct schema `.safeParse` tests on bad domain / negative offset / oversized limit / missing node_id / negative current_epoch
AC-3	Decay applied lazily on every score-returning read	integration test: insert row at epoch 100 score 8000 (execution) → read at 100 score=8000 → read at 196 score < 8000 (matches `decay(8000n, 500n, 96n)`)
AC-4	History DESC ordering correct	insert 100 events; read default → first event has highest epoch
AC-5	Limit clamp (max enforced)	`limit=10000` rejected by Zod (`reputation_history.limit > 500` AND `reputation_leaderboard.limit > 1000`)
AC-6	Leaderboard reflects decayed score order	insert nodes with different `last_activity_epoch`; verify ordering flips correctly after decay
AC-7	`reputation_check_gates` composes P2.4.1 derivations	hand-craft scenario (rep_arb=5000, rep_exec=3000, ban_until_epoch=null) → all 5 outputs match the P2.4.1 contract derivations
AC-8	Read-only — no DB mutation	row + history counts unchanged after every tool invocation
AC-9	grep — `tools.ts` contains no `INSERT`/`UPDATE`/`DELETE` SQL	source self-scan in test
AC-10	`npm run build && npm run lint && npm test` green	gate before PR
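
One way the AC-9 source self-scan could be written is sketched below. findMutatingSql is a hypothetical helper name; a real test would pass it the contents of tools.ts (for example read with readFileSync from node:fs), which is omitted here to keep the sketch self-contained.

```typescript
// Returns the distinct mutating SQL keywords found in a source string
// (case-insensitive, whole words only), upper-cased and sorted.
function findMutatingSql(source: string): string[] {
  const matches = source.match(/\b(INSERT|UPDATE|DELETE)\b/gi) ?? [];
  const unique = new Set(matches.map((m) => m.toUpperCase()));
  return Array.from(unique).sort();
}

const readOnlySource = `
  SELECT node_id, domain, score FROM reputations
  WHERE domain = ? ORDER BY score DESC LIMIT ?
`;
const mutatingSource = "delete from reputations; INSERT INTO reputations VALUES (?)";

const cleanHits = findMutatingSql(readOnlySource);
const dirtyHits = findMutatingSql(mutatingSource);
```

The scan is deliberately naive: it also flags these keywords inside comments or string literals, so a test built on it should treat any hit as a prompt for review.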

11. Phase posture

After this PR merges, λ axis graduates from colibri_code: none to colibri_code: partial per ADR-006 (Phase 2 of multi-phase spec — exactly the same posture κ had post-R87). Greek concepts shipping code: 10/15 (α β γ δ ε ζ η κ λ ν). Frontmatter graduation for docs/3-world/social/reputation.md is a separate hygiene PR not in scope here (matches κ’s R85 staging).