P1.5.7 — router_* MCP Tools Execution Packet
1. File plan
1.1. Create src/domains/router/tools.ts
Structure (mirrors src/domains/consensus/tools.ts):
§1. JSDoc header
§2. Imports
§3. Constants (ModelId enum tuple)
§4. Zod input schemas (4)
§5. Zod output schemas (3 — router_call omits output schema per contract §3.2)
§6. Public types (inferred from the Zod schemas via z.infer)
§7. Handlers (4 — pure functions of input)
§8. registerRouterTools(ctx)
Estimated size: ~350 LOC including JSDoc.
1.2. Create src/__tests__/domains/router/tools.test.ts
Structure (mirrors src/__tests__/domains/consensus/tools.test.ts):
§1. Imports + fixtures
§2. router_score schema + handler tests
§3. router_call schema rejection tests + direct handler tests with mocked options
§4. router_fallback schema + handler tests
§5. router_stats schema + handler tests
§6. registerRouterTools (count + duplicate guard)
Estimated size: ~400 LOC (28-30 tests).
1.3. Modify src/server.ts
Two edits:
Edit 1 — line 50 area (import block):
+ import { registerRouterTools } from './domains/router/tools.js';
Edit 2 — line 593 area (after registerConsensusTools(ctx);):
+ // P1.5.7: register δ Router MCP tools — router_score, router_call,
+ // router_fallback, router_stats. Phase 1.5 W6 graduation: first δ
+ // MCP surface (Phase 0 P0.5.1/P0.5.2 shipped library-only stubs per
+ // ADR-005). Closes ADR-004 R75 Wave H tool-surface amendment for δ.
+ // Tool count moves from 23 → 27 (Phase 0: 14, λ R89A: 4, θ R89B: 5,
+ // δ Phase 1.5: 4).
+ registerRouterTools(ctx);
1.4. Modify src/domains/router/index.ts
One edit — append export * from './tools.js'; to the barrel.
2. Implementation walk
2.1. router_score handler
export function routerScore(input: RouterScoreInput): RouterScoreOutput {
  const scoringResult = scoreIntent(input.prompt, input.context ?? {});
  const ruleVersionHash = computeScoringRuleVersionHash();
  return Object.freeze({
    scores: scoringResult.scores,
    winner: scoringResult.winner,
    rule_version_hash: ruleVersionHash,
  });
}
Notes:
- scoreIntent's output is already frozen; we wrap with Object.freeze for the additional rule_version_hash field.
- Passes through to the Phase 0 fallback constant when candidatesSnapshot is absent.
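The re-freeze in the first note matters because a new object literal is mutable even when every field comes from a frozen source. A minimal sketch of that behavior, assuming a hypothetical stub in place of the real scoreIntent (the stub's scores and the placeholder hash are illustrative only):

```typescript
// Hypothetical stand-in for the real scoreIntent; returns a frozen result.
type ScoringResultSketch = Readonly<{
  scores: Readonly<Record<string, number>>;
  winner: string;
}>;

function scoreIntentStub(_prompt: string): ScoringResultSketch {
  const scores: Record<string, number> = { claude: 1.0 };
  return Object.freeze({ scores: Object.freeze(scores), winner: 'claude' });
}

// Placeholder shaped like a SHA-256 hex digest (64 lowercase hex chars).
const PLACEHOLDER_HASH = 'a'.repeat(64);

const result = scoreIntentStub('hello');

// Building the wider object creates a fresh, unfrozen wrapper.
const unfrozen = {
  scores: result.scores,
  winner: result.winner,
  rule_version_hash: PLACEHOLDER_HASH,
};

// Freezing the wrapper restores the top-level immutability guarantee.
const frozen = Object.freeze({ ...unfrozen });

console.log(Object.isFrozen(unfrozen)); // false
console.log(Object.isFrozen(frozen)); // true
console.log(Object.isFrozen(frozen.scores)); // true (frozen by the stub itself)
```

Object.freeze is shallow, so the nested scores map stays immutable only because the stub froze it; the handler relies on scoreIntent doing the same.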
2.2. router_call handler
export async function routerCall(
  input: RouterCallInput,
): Promise<RouteResult> {
  // Project the validated MCP input to RouteOptions. The validated shape
  // is already a structural subset; this projection just renames keys
  // and drops `undefined` fields.
  const options: RouteOptions = {};
  if (input.options) {
    if (input.options.maxTokens !== undefined) options.maxTokens = input.options.maxTokens;
    if (input.options.systemPrompt !== undefined) options.systemPrompt = input.options.systemPrompt;
    if (input.options.model !== undefined) options.model = input.options.model;
    if (input.options.task !== undefined) options.task = input.options.task;
    if (input.options.operatorPreference !== undefined) options.operatorPreference = input.options.operatorPreference;
  }
  return await routeRequest(input.prompt, options);
}
Notes:
- Does NOT pass through apiKey, completionFn, etc.; those keys are not in the Zod-validated input.options shape.
- routeRequest returns a frozen RouteResult; we pass it through unchanged.
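The field-by-field projection acts as a second line of defense behind the strict schema: only named keys are ever copied, and undefined values never become own properties. A small sketch of that pattern, using hypothetical trimmed-down option types rather than the real RouteOptions:

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface McpOptionsSketch {
  maxTokens?: number | undefined;
  systemPrompt?: string | undefined;
  model?: string | undefined;
}

interface RouteOptionsSketch {
  maxTokens?: number;
  systemPrompt?: string;
  model?: string;
}

function projectOptions(input: McpOptionsSketch): RouteOptionsSketch {
  const options: RouteOptionsSketch = {};
  // Copy only the named keys, and only when a value is actually present.
  if (input.maxTokens !== undefined) options.maxTokens = input.maxTokens;
  if (input.systemPrompt !== undefined) options.systemPrompt = input.systemPrompt;
  if (input.model !== undefined) options.model = input.model;
  return options;
}

// An object carrying an extra runtime key and an explicit undefined.
const smuggled = { maxTokens: 256, model: undefined, apiKey: 'secret' };
const projected = projectOptions(smuggled);

console.log(JSON.stringify(projected)); // {"maxTokens":256}
console.log('apiKey' in projected); // false
console.log('model' in projected); // false
```

In the planned handler the strict schema should already have rejected the apiKey input before this point; the sketch only shows that the projection would not forward it even if it got through.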
2.3. router_fallback handler
export function routerFallback(input: RouterFallbackInput): RouterFallbackOutput {
  if (input.reset === true) {
    if (input.model_id !== undefined) {
      resetCircuitBreaker(input.model_id);
    } else {
      resetCircuitBreaker();
    }
  }
  const stateMap = getCircuitBreakerState();
  const circuitState: Record<string, CircuitState> = {};
  for (const [modelId, state] of stateMap) {
    circuitState[modelId] = state;
  }
  return Object.freeze({ circuitState: Object.freeze(circuitState) });
}
Notes:
- The reset happens BEFORE the snapshot read — caller sees post-reset state.
- Object.fromEntries would also work; the explicit loop matches the codebase style and is type-narrower (avoids implicit any).
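The reset-then-read ordering can be exercised against a toy breaker store. The sketch below assumes a hypothetical in-memory Map in place of circuit.ts, and assumes that a reset removes the entry outright; the real module may instead zero the counters.

```typescript
interface CircuitStateSketch {
  failures: number;
  openedAt: number | null;
}

// Hypothetical in-memory store standing in for circuit.ts.
const breakers = new Map<string, CircuitStateSketch>();

function resetBreaker(modelId?: string): void {
  // Assumption: reset deletes entries; no model_id means "all models".
  if (modelId === undefined) breakers.clear();
  else breakers.delete(modelId);
}

function fallbackSnapshot(
  input: { reset?: boolean; model_id?: string },
): Readonly<Record<string, CircuitStateSketch>> {
  // Reset first, so the returned snapshot reflects post-reset state.
  if (input.reset === true) resetBreaker(input.model_id);
  const out: Record<string, CircuitStateSketch> = {};
  for (const [modelId, state] of breakers) {
    out[modelId] = state;
  }
  return Object.freeze(out);
}

breakers.set('gpt-4o', { failures: 3, openedAt: 1700000000000 });
breakers.set('claude', { failures: 1, openedAt: null });

const readOnly = Object.keys(fallbackSnapshot({}));
const afterOne = Object.keys(fallbackSnapshot({ reset: true, model_id: 'gpt-4o' }));
const afterAll = Object.keys(fallbackSnapshot({ reset: true }));

console.log(readOnly); // [ 'gpt-4o', 'claude' ]
console.log(afterOne); // [ 'claude' ]
console.log(afterAll); // []
```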
2.4. router_stats handler
export function routerStats(_input: RouterStatsInput): RouterStatsOutput {
  return getRouterStats();
}
Notes:
- getRouterStats() already returns the exact wire shape; pass through.
- Input ignored (empty object).
3. Zod schemas — full text
3.1. ModelId enum schema
const MODEL_IDS = [
  'claude', 'claude-sonnet-3-5', 'claude-haiku-3-5',
  'gpt-4o', 'gpt-4o-mini', 'gemini-1-5-pro',
  'llama-3-3-70b', 'mixtral-8x22b', 'kimi-k2',
] as const;
const ModelIdSchema = z.enum(MODEL_IDS);
3.2. TaskShape schema
const TaskShapeSchema = z.object({
  domain: z.string().optional(),
  tokens: z.number().int().nonnegative().optional(),
  deadline_ms: z.number().int().nonnegative().optional(),
  skill: z.array(z.string()).optional(),
}).strict();
3.3. router_score schemas
export const RouterScoreInputSchema = z.object({
  prompt: z.string().min(1),
  context: z.object({
    task: TaskShapeSchema.optional(),
    operatorPreference: z.record(z.string(), z.number().min(0).max(1)).optional(),
  }).strict().optional(),
}).strict();
export const RouterScoreOutputSchema = z.object({
  scores: z.record(z.string(), z.number()),
  winner: ModelIdSchema,
  rule_version_hash: z.string().regex(/^[0-9a-f]{64}$/),
}).strict();
3.4. router_call schemas
export const RouterCallInputSchema = z.object({
  prompt: z.string().min(1),
  options: z.object({
    maxTokens: z.number().int().positive().optional(),
    systemPrompt: z.string().optional(),
    model: z.string().optional(),
    task: TaskShapeSchema.optional(),
    operatorPreference: z.record(z.string(), z.number().min(0).max(1)).optional(),
  }).strict().optional(),
}).strict();
// Output schema OMITTED per contract §3.2: RouteResult.content shape is variable.
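The forbidden-key guarantee rests on Zod's .strict(), which fails parsing when an object carries keys the schema does not declare. The dependency-free sketch below imitates only that unknown-key check so it can run without Zod installed; the helper name is hypothetical and is not part of the planned module.

```typescript
// Keys declared by the options object in RouterCallInputSchema above.
const DECLARED_OPTION_KEYS = new Set<string>([
  'maxTokens',
  'systemPrompt',
  'model',
  'task',
  'operatorPreference',
]);

// Hypothetical helper: lists the keys a strict schema would reject as unrecognized.
function unrecognizedOptionKeys(options: Readonly<Record<string, unknown>>): string[] {
  return Object.keys(options).filter((key) => !DECLARED_OPTION_KEYS.has(key));
}

const okKeys = unrecognizedOptionKeys({ maxTokens: 128 });
const apiKeyKeys = unrecognizedOptionKeys({ apiKey: 'x' });
const seamKeys = unrecognizedOptionKeys({ model: 'claude', completionFn: () => null });

console.log(okKeys); // []
console.log(apiKeyKeys); // [ 'apiKey' ]
console.log(seamKeys); // [ 'completionFn' ]
```

A non-empty result corresponds to the cases the 5.2 tests expect the real schema to reject.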
3.5. router_fallback schemas
const CircuitStateSchema = z.object({
  failures: z.number().int().nonnegative(),
  openedAt: z.number().nullable(),
}).strict();
export const RouterFallbackInputSchema = z.object({
  model_id: ModelIdSchema.optional(),
  reset: z.boolean().optional(),
}).strict();
export const RouterFallbackOutputSchema = z.object({
  circuitState: z.record(z.string(), CircuitStateSchema),
}).strict();
3.6. router_stats schemas
export const RouterStatsInputSchema = z.object({}).strict();
const RouterStatsRowSchema = z.object({
  calls_total: z.number().int().nonnegative(),
  successes: z.number().int().nonnegative(),
  failures: z.number().int().nonnegative(),
  avg_cost_usd: z.number().nonnegative(),
  p50_latency_ms: z.number().nonnegative(),
  success_rate: z.number().min(0).max(1),
}).strict();
export const RouterStatsOutputSchema = z.object({
  models: z.record(z.string(), RouterStatsRowSchema),
}).strict();
4. registerRouterTools function
export function registerRouterTools(ctx: ColibriServerContext): void {
  registerColibriTool(
    ctx,
    'router_score',
    {
      title: 'router_score',
      description:
        'Score a prompt across the 9-member δ candidate cohort and return the winning ModelId. Returns the full {scores, winner, rule_version_hash} triple. The rule_version_hash anchors the κ rule pack the scoring formula consulted, so callers can prove which version of policy produced this routing decision.',
      inputSchema: RouterScoreInputSchema,
      outputSchema: RouterScoreOutputSchema,
    },
    (input): RouterScoreOutput => routerScore(input),
  );
  registerColibriTool(
    ctx,
    'router_call',
    {
      title: 'router_call',
      description:
        'Route a prompt through the δ N-member fallback chain (P1.5.5) and return the upstream completion. The chain walks scoreIntent-ranked candidates, skips models whose circuit breakers are open, and races each attempt against COLIBRI_MODEL_TIMEOUT. Throws via MCP HANDLER_ERROR on FallbackChainExhaustedError. apiKey + injection seams are rejected at the strict-input boundary.',
      inputSchema: RouterCallInputSchema,
      // outputSchema omitted: RouteResult.content shape varies by upstream model.
    },
    async (input): Promise<RouteResult> => routerCall(input),
  );
  registerColibriTool(
    ctx,
    'router_fallback',
    {
      title: 'router_fallback',
      description:
        'Inspect or reset the δ circuit-breaker state. With {reset:true, model_id:X}, clears one model. With {reset:true} alone, clears every model. With {reset:false} or {} alone, reads the current snapshot. Returns {circuitState: Record<ModelId, {failures, openedAt}>}.',
      inputSchema: RouterFallbackInputSchema,
      outputSchema: RouterFallbackOutputSchema,
    },
    (input): RouterFallbackOutput => routerFallback(input),
  );
  registerColibriTool(
    ctx,
    'router_stats',
    {
      title: 'router_stats',
      description:
        'Snapshot per-ModelId router aggregates: calls_total, successes, failures, avg_cost_usd, p50_latency_ms, success_rate. Models never touched are absent from the response. Resets are out of scope for this tool; use the test-only resetRouterStats() in cost.ts directly.',
      inputSchema: RouterStatsInputSchema,
      outputSchema: RouterStatsOutputSchema,
    },
    (_input): RouterStatsOutput => routerStats(_input),
  );
}
5. Test plan — 28 tests
5.1. router_score (7 tests)
- Empty-candidate fallback returns {claude: 1.0, winner: 'claude'} shape.
- Output includes rule_version_hash as 64 lowercase hex chars.
- Same input → same output (determinism).
- Schema rejects empty prompt ("").
- Schema rejects extra key in input (strict mode).
- Schema rejects negative task.tokens.
- Output passes the output schema.
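The hash-shape test in the list above comes down to one regular expression. As a quick illustration of what it accepts and rejects, here is a sketch using the same pattern as RouterScoreOutputSchema; the sample strings are made up for the example and are not real rule hashes.

```typescript
// Same pattern as the rule_version_hash field in RouterScoreOutputSchema.
const HASH_PATTERN = /^[0-9a-f]{64}$/;

// Label / candidate pairs; values are synthetic.
const candidates: Array<[string, string]> = [
  ['valid', 'ab'.repeat(32)], // 64 lowercase hex chars
  ['uppercase', 'AB'.repeat(32)], // right length, wrong case
  ['tooShort', 'ab'.repeat(31)], // 62 chars
  ['nonHex', 'zz' + 'ab'.repeat(31)], // 64 chars, but 'z' is not a hex digit
];

const verdicts: Record<string, boolean> = {};
candidates.forEach(([label, value]) => {
  verdicts[label] = HASH_PATTERN.test(value);
});

console.log(verdicts);
// { valid: true, uppercase: false, tooShort: false, nonHex: false }
```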
5.2. router_call (5 tests — schema-only direct-handler)
- Schema accepts {prompt: "hi"} (no options).
- Schema accepts valid options (subset of RouteOptions).
- Schema REJECTS {prompt: "hi", options: {apiKey: "x"}} (forbidden key).
- Schema REJECTS {prompt: "hi", options: {completionFn: () => null}} (forbidden key).
- Schema REJECTS {prompt: "", ...} (min(1) constraint).
5.3. router_fallback (6 tests)
- Empty initial state → {circuitState: {}}.
- After 3 failures on gpt-4o (via recordFailure test-API) → circuitState['gpt-4o'].openedAt !== null.
- {reset: true, model_id: 'gpt-4o'} clears that model's state.
- {reset: true} (no model_id) clears all models.
- Read-only call (no reset) does NOT mutate state.
- Schema rejects unknown model_id (e.g. 'made-up').
5.4. router_stats (5 tests)
- Empty initial state → {models: {}}.
- After one success record → models['claude'].calls_total === 1.
- After one success + one failure → success_rate === 0.5 for that model.
- Schema rejects extra keys (strict mode).
- Output passes the output schema.
5.5. registerRouterTools (2 tests)
- After call, ctx._registeredToolNames contains all 4 names; size is exactly 4 (on a fresh ctx).
- Second call on same ctx throws.
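Both checks depend on the registry refusing duplicate names. A minimal sketch of that guard, assuming _registeredToolNames is a Set of strings and using a hypothetical registerTool helper in place of registerColibriTool:

```typescript
// Hypothetical minimal context; the real ColibriServerContext carries more state.
interface CtxSketch {
  _registeredToolNames: Set<string>;
}

// Stand-in for registerColibriTool: only the duplicate-name guard is modeled.
function registerTool(ctx: CtxSketch, name: string): void {
  if (ctx._registeredToolNames.has(name)) {
    throw new Error(`tool already registered: ${name}`);
  }
  ctx._registeredToolNames.add(name);
}

const ROUTER_TOOL_NAMES = ['router_score', 'router_call', 'router_fallback', 'router_stats'] as const;

function registerRouterToolsSketch(ctx: CtxSketch): void {
  for (const name of ROUTER_TOOL_NAMES) registerTool(ctx, name);
}

const freshCtx: CtxSketch = { _registeredToolNames: new Set<string>() };
registerRouterToolsSketch(freshCtx);
const sizeAfterFirstCall = freshCtx._registeredToolNames.size;

let secondCallThrew = false;
try {
  registerRouterToolsSketch(freshCtx);
} catch {
  secondCallThrew = true;
}

console.log(sizeAfterFirstCall); // 4
console.log(secondCallThrew); // true
```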
5.6. Cross-cutting (3 tests)
- All 4 handlers return frozen-at-top-level outputs.
- Source file tools.ts does NOT contain apiKey, completionFn, completionFnRegistry, scoringFn, fetchFn, delayFn, nowFn in any input schema (compile-time grep test).
- MODEL_IDS tuple has exactly 9 entries (matches ModelId union).
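The tuple-size check can be enforced by the type checker as well as by a runtime assertion, because the length of an `as const` tuple is a numeric literal type. A sketch using the MODEL_IDS tuple from section 3.1; the other names are illustrative:

```typescript
const MODEL_IDS = [
  'claude', 'claude-sonnet-3-5', 'claude-haiku-3-5',
  'gpt-4o', 'gpt-4o-mini', 'gemini-1-5-pro',
  'llama-3-3-70b', 'mixtral-8x22b', 'kimi-k2',
] as const;

// Union of the tuple's literal members; it changes whenever an entry is added or removed.
type ModelIdFromTuple = (typeof MODEL_IDS)[number];

// Compile-time check: MODEL_IDS.length has the literal type 9, so this
// assignment stops compiling if the tuple grows or shrinks.
const EXPECTED_COUNT: 9 = MODEL_IDS.length;

// Runtime checks suitable for a test file: exact count and no duplicate IDs.
const uniqueIds = new Set<string>(MODEL_IDS);

const sampleId: ModelIdFromTuple = 'kimi-k2';
const sampleIsMember = MODEL_IDS.indexOf(sampleId) >= 0;

console.log(EXPECTED_COUNT, uniqueIds.size); // 9 9
console.log(sampleIsMember); // true
```

This only proves the tuple has nine distinct entries; matching it against the separately declared ModelId union would need an additional type-level equality check in the real module.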
6. Build / lint / test gate
npm run build && npm run lint && npm test
Expected delta:
- TypeScript build: clean (new module, no breaking edits).
- Lint: clean (matches existing patterns).
- Tests: +28 → 3325 + 28 = 3353 (baseline 3325 on cf6221c9).
- Server test will need a bump 23 → 27 if it asserts a global tool count.
7. Pre-existing flakes (carry through, do not fix in this slice)
- consensus/parity-harness G7.1 perf budget: CI-load sensitive, retry-clean
- reputation/tools.test.ts parallel-migration prefix race: retry-clean
- kimi.test.ts ● injection seams › 7. latency measurement: timer imprecision under CI load (introduced by P1.5.2 W3)
Document these in the verification doc as known follow-ups.
8. Forbiddens checklist
- No edits to scoring.ts, fallback.ts, circuit.ts, cost.ts, adapters/*.ts, scoring-weights.ts.
- No κ rule edits.
- No AMS_* env vars (only COLIBRI_*).
- No apiKey or injection seams in MCP input schemas.
- No internal thought_record emission (P1.5.10 scope).
- No δ frontmatter graduation (stays partial).
- No new tools beyond the 4 named.
- No agent_spawn or any other deferred Phase 1.5+ tool.
- No interactive cmds (git rebase -i, git add -i).
- No edits to main checkout.
9. Implementation order
- Write src/domains/router/tools.ts (handlers + schemas + register).
- Edit src/domains/router/index.ts: append re-export.
- Edit src/server.ts: add import + add registerRouterTools(ctx); call in bootstrap.
- Write src/__tests__/domains/router/tools.test.ts.
- Run npm run build && npm run lint && npm test.
- Fix any issues (especially if server.test.ts asserts a fixed tool count).
- Write the Step 5 verification doc.
10. Rollback plan
If npm run build fails: the new file is self-contained; remove the registerRouterTools import + call from src/server.ts and the build is green. The new file can stay (it’s unused without the call site) until the bug is fixed.
If npm test fails on a flaky existing test: retry once; if flake confirmed (consensus parity, reputation prefix, kimi latency timer), document as known and proceed.
If npm test fails on a new test: fix the handler / schema, do NOT skip.