P2.5.1 — Reputation Query MCP Tools — Behavioral Contract
Slice: p2-5-1-tools — closes λ Phase 2 at 7/7
Audit: docs/audits/p2-5-1-tools-audit.md @ f8dc2f62
Public surface: src/domains/reputation/tools.ts exporting registerReputationTools(ctx) + 4 internal handler functions (exported for test access)
1. Public surface
// src/domains/reputation/tools.ts
import type { ColibriServerContext } from '../../server.js';
export function registerReputationTools(ctx: ColibriServerContext): void;
// Exported for test access (handlers are pure given a `db` parameter)
export function reputationGet(db: Database, input: ReputationGetInput): ReputationGetOutput;
export function reputationHistory(db: Database, input: ReputationHistoryInput): ReputationHistoryOutput;
export function reputationLeaderboard(db: Database, input: ReputationLeaderboardInput): ReputationLeaderboardOutput;
export function reputationCheckGates(db: Database, input: ReputationCheckGatesInput): ReputationCheckGatesOutput;
2. Tool 1 — reputation_get
2.1 Input schema (Zod)
z.object({
node_id: z.string().min(1),
domain: DomainSchema.optional(), // one of 5 canonical; omit for "all 5"
current_epoch: z.number().int().nonnegative(),
}).strict()
2.2 Output
When domain is provided:
{ row: { node_id, domain, score, scar_bps, ban_until_epoch, last_activity_epoch } | null }
- `null` ⟺ no row exists at `(node_id, domain)`.
- All numeric fields are `number` (bps). `score` is the decayed value at `current_epoch`; `last_activity_epoch` is unchanged from the DB row (it is never advanced by reads).
When domain is omitted:
{ rows: ReputationRow[] }
- Length ∈ {0, 1, 2, 3, 4, 5} depending on how many rows exist for `node_id`.
- Order is `domain ASC` (matches `selectReputation` reader DB ordering).
- Every returned `score` has been decayed via `apply_decay`.
2.3 Behavior
rows := selectReputation(db, node_id, domain?) // P2.1.1 reader
if domain provided AND rows is null:
return { row: null }
if domain provided AND rows is ReputationRow:
return { row: apply_decay(rows, BigInt(current_epoch)) }
if domain omitted:
return { rows: apply_decay_batch(rows, BigInt(current_epoch)) }
2.4 Invariants
- I-G1 No SQL `INSERT`/`UPDATE`/`DELETE` executed.
- I-G2 `last_activity_epoch` field in the returned payload equals the stored DB value (never mutated by this tool, even when the stored value is stale).
- I-G3 `score` field is in `[0, 10000]` after decay (decay primitive can only reduce or hold score, per `decay.ts:I3`).
- I-G4 Deterministic: two consecutive calls with identical input yield byte-identical output.
- I-G5 When `BigInt(current_epoch) < BigInt(row.last_activity_epoch)` (clock skew), the row is returned with `score` unchanged (decay short-circuits per `decay.ts:124`).
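The read path and its invariants can be illustrated with a small in-memory sketch. Everything here is a hypothetical stand-in: the `Row` shape copies the fields from section 2.2, `decaySketch` uses a placeholder rule (lose 1% of the score per elapsed epoch) rather than the real curve in `decay.ts`, and epochs are plain `number` values instead of the `bigint` the real primitives take.

```typescript
// Hypothetical row shape, mirroring the fields listed in section 2.2.
interface Row {
  readonly node_id: string;
  readonly domain: string;
  readonly score: number; // bps, 0..10000
  readonly scar_bps: number;
  readonly ban_until_epoch: number | null;
  readonly last_activity_epoch: number;
}

// ASSUMPTION: placeholder decay rule, not the curve implemented in decay.ts.
function decaySketch(row: Row, currentEpoch: number): Row {
  // Clock skew or same-epoch read (I-G5): the score is returned unchanged.
  if (currentEpoch <= row.last_activity_epoch) return row;
  const elapsed = currentEpoch - row.last_activity_epoch;
  let score = row.score;
  for (let i = 0; i < elapsed && score > 0; i++) {
    score -= Math.ceil(score / 100); // can only reduce the score (I-G3)
  }
  // A new object is returned; last_activity_epoch is copied as stored (I-G2).
  return { ...row, score };
}

// Domain provided: one decayed row, or null when nothing is stored.
function getOneSketch(
  stored: readonly Row[],
  nodeId: string,
  domain: string,
  currentEpoch: number,
): { row: Row | null } {
  const hit = stored.find((r) => r.node_id === nodeId && r.domain === domain);
  return { row: hit === undefined ? null : decaySketch(hit, currentEpoch) };
}

// Domain omitted: every stored row for the node, ordered by domain ASC.
function getAllSketch(
  stored: readonly Row[],
  nodeId: string,
  currentEpoch: number,
): { rows: Row[] } {
  const rows = stored
    .filter((r) => r.node_id === nodeId)
    .sort((a, b) => (a.domain < b.domain ? -1 : a.domain > b.domain ? 1 : 0))
    .map((r) => decaySketch(r, currentEpoch));
  return { rows };
}
```

Because both functions build new objects and never write back to `stored`, repeated calls with the same input return equal payloads, which is the property I-G4 asks for.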
3. Tool 2 — reputation_history
3.1 Input schema (Zod)
z.object({
node_id: z.string().min(1),
domain: DomainSchema,
limit: z.number().int().min(1).max(500).optional(), // default 50
offset: z.number().int().nonnegative().optional(), // default 0
}).strict()
3.2 Output
{ events: ReputationHistoryRow[] }
- Empty array if no rows match.
- Ordered `epoch DESC, id DESC` (matches `selectHistory` reader).
- Each event carries `{ id, node_id, domain, epoch, delta, reason, event_id }`.
3.3 Behavior
return { events: selectHistory(db, node_id, domain, { limit: limit ?? 50, offset: offset ?? 0 }) }
The reader (schema.ts:248) already clamps limit to min(1000, requestedLimit) internally, but the Zod schema caps the externally-visible max at 500 per source prompt §P2.5.1 acceptance criteria.
3.4 Invariants
- I-H1 No SQL `INSERT`/`UPDATE`/`DELETE` executed.
- I-H2 Zod rejects `limit > 500`, `limit < 1`, `offset < 0`, unknown domains, and empty `node_id`; `.strict()` additionally rejects unknown keys.
- I-H3 `(epoch DESC, id DESC)` ordering is total and deterministic (matches `selectHistory` SQL `ORDER BY epoch DESC, id DESC`).
- I-H4 Page 1 (`offset=0`) and page 2 (`offset=50`), each at the default `limit=50`, are disjoint when total events ≥ 100; their ordered concatenation yields the first 100 events in canonical order.
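The ordering and paging invariants (I-H3, I-H4) can be checked against an in-memory model. The sketch below is only an illustration: `HistoryEvent` keeps just the two fields that drive the order, and `pageSketch` is a hypothetical stand-in for the `selectHistory` reader, sorting an array where the real reader uses SQL `ORDER BY ... LIMIT ... OFFSET`.

```typescript
// Only the fields that determine ordering are modelled here.
interface HistoryEvent {
  readonly id: number;
  readonly epoch: number;
}

// Total order from I-H3: newest epoch first, then highest id first.
function compareHistory(a: HistoryEvent, b: HistoryEvent): number {
  if (a.epoch !== b.epoch) return b.epoch - a.epoch;
  return b.id - a.id;
}

// Hypothetical stand-in for the reader, using the tool's defaults (50 / 0).
function pageSketch(
  events: readonly HistoryEvent[],
  limit: number = 50,
  offset: number = 0,
): HistoryEvent[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...events].sort(compareHistory).slice(offset, offset + limit);
}
```

Since ids are unique, `compareHistory` never returns 0 for two distinct events, so consecutive pages cannot overlap or reorder between calls.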
4. Tool 3 — reputation_leaderboard
4.1 Input schema (Zod)
z.object({
domain: DomainSchema,
limit: z.number().int().min(1).max(1000).optional(), // default 100
current_epoch: z.number().int().nonnegative(),
}).strict()
4.2 Output
{ rows: ReputationRow[] }
- Length ≤ `limit ?? 100`.
- Ordered by decayed `score DESC`. Ties broken by `node_id ASC` (deterministic).
- Each row is a decay-applied snapshot at `current_epoch`.
4.3 Behavior
const lim = limit ?? 100;
const overshoot = Math.min(lim * 2, 200); // O(1) overshoot to limit decayed-out misses
const candidates = db.prepare(
`SELECT node_id, domain, score, scar_bps, ban_until_epoch, last_activity_epoch
FROM reputations
WHERE domain = ?
ORDER BY score DESC
LIMIT ?`
).all(domain, overshoot) as ReputationRow[];
const decayed = apply_decay_batch(candidates, BigInt(current_epoch));
decayed.sort((a, b) => {
if (b.score !== a.score) return b.score - a.score;
return a.node_id < b.node_id ? -1 : a.node_id > b.node_id ? 1 : 0;
});
return { rows: decayed.slice(0, lim) };
4.4 Invariants
- I-L1 No SQL `INSERT`/`UPDATE`/`DELETE` executed.
- I-L2 Output length ≤ `limit ?? 100` ≤ 1000.
- I-L3 Output ordering: `score DESC` strict, `node_id ASC` for ties.
- I-L4 O(N) memory + O(N log N) sort cost where N = `min(2 × limit, 200)`. Because N is capped at 200, a request with `limit > 200` returns at most 200 rows. Acceptable for Phase 2 single-actor posture per source-prompt §11 gotcha. Document the materialization-deferred caveat in PR.
- I-L5 The leaderboard uses the existing `idx_reputations_leaderboard ON (domain, score DESC)` index (P2.1.1).
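The reason section 4.3 sorts again after decay is that stored order and decayed order can differ: a stale high scorer may fall below a recently active node. A minimal sketch of that effect follows, under loud assumptions: `decayScoreSketch` is a placeholder rule (1% per elapsed epoch), not the real `apply_decay_batch`, epochs are plain numbers, and the array sort stands in for the SQL `ORDER BY score DESC LIMIT ?`.

```typescript
interface BoardRow {
  readonly node_id: string;
  readonly score: number; // bps
  readonly last_activity_epoch: number;
}

// ASSUMPTION: placeholder decay, not the curve in decay.ts.
function decayScoreSketch(score: number, elapsed: number): number {
  let s = score;
  // A non-positive `elapsed` skips the loop, leaving the score unchanged.
  for (let i = 0; i < elapsed && s > 0; i++) s -= Math.ceil(s / 100);
  return s;
}

function leaderboardSketch(
  stored: readonly BoardRow[],
  currentEpoch: number,
  limit: number = 100,
): BoardRow[] {
  const overshoot = Math.min(limit * 2, 200);
  // Stand-in for the indexed query: top `overshoot` rows by stored score.
  const candidates = [...stored]
    .sort((a, b) => b.score - a.score)
    .slice(0, overshoot);
  const decayed = candidates.map((r) => ({
    ...r,
    score: decayScoreSketch(r.score, currentEpoch - r.last_activity_epoch),
  }));
  // Re-sort on the decayed score; node_id ASC breaks ties deterministically.
  decayed.sort((a, b) => {
    if (b.score !== a.score) return b.score - a.score;
    return a.node_id < b.node_id ? -1 : a.node_id > b.node_id ? 1 : 0;
  });
  return decayed.slice(0, limit);
}
```

With a stale node at stored score 9000 and two fresh nodes at 8500, the stale node leads the stored order but ends up last once decay is applied.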
5. Tool 4 — reputation_check_gates
5.1 Input schema (Zod)
z.object({
node_id: z.string().min(1),
current_epoch: z.number().int().nonnegative(),
}).strict()
5.2 Output
{
can_arbitrate: boolean,
can_govern: boolean,
max_parallel_tasks: number, // 0..20
rate_limit_bonus_factor: number, // bps (the 1.00x-base multiplier; consumer scales their own base rate)
effective_stake_bps: number, // effective stake when input = 10000 bps (1.00x)
}
5.3 Behavior
const all = selectReputation(db, node_id); // ReputationRow[] (possibly empty)
const byDomain = new Map<Domain, ReputationRow>(all.map(r => [r.domain, r]));
const fallback = (d: Domain): ReputationRow => ({
node_id, domain: d, score: 0, scar_bps: 0,
ban_until_epoch: null, last_activity_epoch: current_epoch,
});
const rep_arb = byDomain.get('arbitration') ?? fallback('arbitration');
const rep_gov = byDomain.get('governance') ?? fallback('governance');
const rep_exec = byDomain.get('execution') ?? fallback('execution');
const e = BigInt(current_epoch);
return {
can_arbitrate: can_arbitrate(rep_arb, rep_exec, e),
can_govern: can_govern(rep_gov, e),
max_parallel_tasks: Number(max_parallel_tasks(rep_exec)),
rate_limit_bonus_factor: Number(rate_limit_bonus(rep_exec, BPS_100_PERCENT)),
effective_stake_bps: Number(stake_discount(BPS_100_PERCENT, rep_exec)),
};
5.4 Invariants
- I-CG1 No SQL `INSERT`/`UPDATE`/`DELETE` executed.
- I-CG2 Output composition is bit-for-bit identical to invoking the P2.4.1 limits functions directly with the same row inputs.
- I-CG3 Missing-row default: each fallback row has `score=0`, `scar_bps=0`, `ban_until_epoch=null` ⟹ `can_arbitrate=false, can_govern=false, max_parallel_tasks=0, rate_limit_bonus_factor=0, effective_stake_bps=10000` (1.00× stake, the floor multiplier).
- I-CG4 `max_parallel_tasks ∈ [0, 20]` (P2.4.1 hard cap).
- I-CG5 `can_arbitrate` ⟹ `rep_arb.score ≥ 5000 ∧ rep_exec.score ≥ 3000 ∧ ¬banned(rep_arb, e)` (P2.4.1 contract).
- I-CG6 `can_govern` ⟹ `rep_gov.score ≥ 4000 ∧ ¬banned(rep_gov, e)` (P2.4.1 contract).
- I-CG7 Decay is NOT pre-applied to the rows passed to `can_arbitrate` / `can_govern` / `max_parallel_tasks`. P2.4.1 contract §3 specifies these gates take stored scores. Future Phase 3+ may revise this.
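The two boolean gates and the missing-row fallback can be sketched as follows. This is an illustration under assumptions, not the P2.4.1 implementation: the thresholds come from I-CG5 and I-CG6 but are treated here as the complete predicate, `bannedSketch` assumes a ban is active while the current epoch is below `ban_until_epoch`, and epochs are plain numbers.

```typescript
interface GateRow {
  readonly score: number; // stored score in bps; decay is not applied (I-CG7)
  readonly ban_until_epoch: number | null;
}

// ASSUMPTION: a ban is active while epoch < ban_until_epoch.
function bannedSketch(row: GateRow, epoch: number): boolean {
  return row.ban_until_epoch !== null && epoch < row.ban_until_epoch;
}

// ASSUMPTION: the I-CG5 conditions are taken as necessary and sufficient.
function canArbitrateSketch(arb: GateRow, exec: GateRow, epoch: number): boolean {
  return arb.score >= 5000 && exec.score >= 3000 && !bannedSketch(arb, epoch);
}

// ASSUMPTION: the I-CG6 conditions are taken as necessary and sufficient.
function canGovernSketch(gov: GateRow, epoch: number): boolean {
  return gov.score >= 4000 && !bannedSketch(gov, epoch);
}

// Zero-reputation fallback used when a domain has no stored row (I-CG3).
const ZERO_ROW: GateRow = { score: 0, ban_until_epoch: null };

type GateDomains = 'arbitration' | 'governance' | 'execution';

function gatesSketch(
  rows: Partial<Record<GateDomains, GateRow>>,
  epoch: number,
): { can_arbitrate: boolean; can_govern: boolean } {
  const arb = rows.arbitration ?? ZERO_ROW;
  const gov = rows.governance ?? ZERO_ROW;
  const exec = rows.execution ?? ZERO_ROW;
  return {
    can_arbitrate: canArbitrateSketch(arb, exec, epoch),
    can_govern: canGovernSketch(gov, epoch),
  };
}
```

A node with no rows at all therefore fails both gates without any special-case branch, which matches the "no throw on zero-rep nodes" posture of the tool.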
6. Registration glue
// src/server.ts (bootstrap()), AFTER registerMerkleTools(ctx):
registerReputationTools(ctx);
registerReputationTools body:
export function registerReputationTools(ctx: ColibriServerContext): void {
registerColibriTool(ctx, 'reputation_get', { title, description, inputSchema: GetIn }, (input) => reputationGet(getDb(), input));
registerColibriTool(ctx, 'reputation_history', { title, description, inputSchema: HistoryIn }, (input) => reputationHistory(getDb(), input));
registerColibriTool(ctx, 'reputation_leaderboard', { title, description, inputSchema: LeaderboardIn }, (input) => reputationLeaderboard(getDb(), input));
registerColibriTool(ctx, 'reputation_check_gates', { title, description, inputSchema: CheckGatesIn }, (input) => reputationCheckGates(getDb(), input));
}
DB resolution is lazy at call-time via getDb() per the canonical Phase-2-startup contract (matches src/domains/trail/repository.ts:419).
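A small sketch of why lazy resolution matters: the handler closure looks up the DB on each call, so registration can run before the DB is opened. All names below (`FakeDb`, `getDbSketch`, `openDbSketch`, `registerSketch`, `registrySketch`) are hypothetical stand-ins, not the real `getDb()` or `registerColibriTool`.

```typescript
type HandlerSketch = (input: unknown) => unknown;

interface FakeDb {
  readonly name: string;
}

// Hypothetical module-level handle, unset until the DB is opened.
let dbHandle: FakeDb | null = null;

function getDbSketch(): FakeDb {
  if (dbHandle === null) throw new Error('DB not initialized');
  return dbHandle;
}

function openDbSketch(name: string): void {
  dbHandle = { name };
}

const registrySketch = new Map<string, HandlerSketch>();

function registerSketch(name: string, handler: HandlerSketch): void {
  registrySketch.set(name, handler);
}

// Registered while dbHandle is still null: getDbSketch() is not invoked here,
// only when the closure eventually runs.
registerSketch('reputation_get', (input) => ({
  db: getDbSketch().name,
  input,
}));
```

Had the registration captured `getDbSketch()` eagerly (for example by passing its result into the closure), it would have thrown at boot instead of deferring the lookup to the first tool call.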
7. Error semantics
All four tools follow the s17 envelope contract — handlers return raw data; α middleware wraps to { ok: true, data } on success or { ok: false, error: { code, message, details } } on failure. The handlers themselves only throw when:
- `getDb()` is called pre-Phase-2 (DB not initialized): surfaces as a `HANDLER_ERROR` envelope. In production this is unreachable because `bootstrap()` registers tools before `start(ctx)` and the DB opens before any tool call arrives.
- A defensive `never`-narrowed switch in a downstream P2.4.1 helper hits an impossible 6th-domain string: also unreachable because Zod's `DomainSchema` blocks bad inputs at stage 2.
No tool throws on missing rows, empty histories, decayed-out scores, or zero-rep nodes — those are returned as null / [] / 0-value gates per the per-tool sections above.
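The split between raw-data handlers and the wrapping middleware can be sketched as below. `withEnvelopeSketch` is a hypothetical illustration of the envelope shape described above, not the actual α middleware; treating `details` as optional and mapping every thrown value to `HANDLER_ERROR` are assumptions made for brevity.

```typescript
type EnvelopeSketch<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details?: unknown } };

// Wraps a synchronous handler so callers always receive an envelope.
function withEnvelopeSketch<I, O>(
  handler: (input: I) => O,
): (input: I) => EnvelopeSketch<O> {
  return (input: I): EnvelopeSketch<O> => {
    try {
      // Success path: the handler's raw return value becomes `data`.
      return { ok: true, data: handler(input) };
    } catch (err) {
      // ASSUMPTION: any thrown value is reported under HANDLER_ERROR.
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: { code: 'HANDLER_ERROR', message } };
    }
  };
}
```

Under this shape a missing row is an ordinary success (`{ ok: true, data: { row: null } }`), while only a genuine throw, such as an uninitialized DB, produces the failure branch.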
8. Lint / determinism posture
- `tools.ts` may import from `node:crypto` (none needed in current design) but must not use `Date.*`, `Math.random()`, `setTimeout`, or `setInterval` per CLAUDE.md §5.
- All time arithmetic uses caller-supplied `current_epoch` (bigint after the `BigInt(...)` bridge at the function boundary).
- `apply_decay_batch` is the documented batch path; we do not unwind its internals.
- All 4 handler functions are synchronous (return raw data, no `Promise`). Matches `src/tools/health.ts:226` posture.
9. Type matrix (TS exports)
export interface ReputationGetInput {
readonly node_id: string;
readonly current_epoch: number;
readonly domain?: Domain;
}
export type ReputationGetOutput =
| { readonly row: ReputationRow | null }
| { readonly rows: ReputationRow[] };
export interface ReputationHistoryInput {
readonly node_id: string;
readonly domain: Domain;
readonly limit?: number;
readonly offset?: number;
}
export interface ReputationHistoryOutput {
readonly events: ReputationHistoryRow[];
}
export interface ReputationLeaderboardInput {
readonly domain: Domain;
readonly current_epoch: number;
readonly limit?: number;
}
export interface ReputationLeaderboardOutput {
readonly rows: ReputationRow[];
}
export interface ReputationCheckGatesInput {
readonly node_id: string;
readonly current_epoch: number;
}
export interface ReputationCheckGatesOutput {
readonly can_arbitrate: boolean;
readonly can_govern: boolean;
readonly max_parallel_tasks: number;
readonly rate_limit_bonus_factor: number;
readonly effective_stake_bps: number;
}
10. Acceptance criteria (executor self-check)
| # | Criterion | Verification |
|---|---|---|
| AC-1 | 4 tools registered via ε at boot | unit test asserts ctx._registeredToolNames.has('reputation_get'/...) after registerReputationTools |
| AC-2 | Zod schemas reject bad inputs | direct schema .safeParse tests on bad domain / negative offset / oversized limit / missing node_id / negative current_epoch |
| AC-3 | Decay applied lazily on every score-returning read | integration test: insert row at epoch 100 score 8000 (execution) → read at 100 score=8000 → read at 196 score < 8000 (matches decay(8000n, 500n, 96n)) |
| AC-4 | History DESC ordering correct | insert 100 events; read default → first event has highest epoch |
| AC-5 | Limit clamp (max enforced) | limit=10000 rejected by Zod (reputation_history.limit > 500 AND reputation_leaderboard.limit > 1000) |
| AC-6 | Leaderboard reflects decayed score order | insert nodes with different last_activity_epoch; verify ordering flips correctly after decay |
| AC-7 | reputation_check_gates composes P2.4.1 derivations | hand-craft scenario (rep_arb=5000, rep_exec=3000, ban_until_epoch=null) → all 5 outputs match the P2.4.1 contract derivations |
| AC-8 | Read-only — no DB mutation | row + history counts unchanged after every tool invocation |
| AC-9 | grep: tools.ts contains no INSERT/UPDATE/DELETE SQL | source self-scan in test |
| AC-10 | npm run build && npm run lint && npm test green | gate before PR |
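The source self-scan named in AC-9 could look roughly like the sketch below. `findMutatingSql` and its regular expression are hypothetical; the patterns are an assumption about what counts as mutating SQL and a real test would read the contents of `tools.ts` and pass them in as `source`.

```typescript
// Returns the mutating SQL verbs found in `source`, in order of appearance.
// ASSUMPTION: statements are recognised by the shapes INSERT INTO,
// UPDATE <table> SET, and DELETE FROM (case-insensitive).
function findMutatingSql(source: string): string[] {
  const pattern = /\b(INSERT\s+INTO|UPDATE\s+\w+\s+SET|DELETE\s+FROM)\b/gi;
  const matches = source.match(pattern) ?? [];
  // Keep only the leading verb of each match, upper-cased.
  return matches.map((m) => m.toUpperCase().replace(/\s[\s\S]*$/, ''));
}
```

Matching the full statement shape rather than the bare verb avoids false positives on identifiers such as `updated_at`; the trade-off is that unusually formatted SQL could slip through, so this complements the count-based check in AC-8 rather than replacing it.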
11. Phase posture
After this PR merges, λ axis graduates from colibri_code: none to colibri_code: partial per ADR-006 (Phase 2 of multi-phase spec — exactly the same posture κ had post-R87). Greek concepts shipping code: 10/15 (α β γ δ ε ζ η κ λ ν). Frontmatter graduation for docs/3-world/social/reputation.md is a separate hygiene PR not in scope here (matches κ’s R85 staging).