P1.5.7 — `router_*` MCP Tools Execution Packet

1. File plan

1.1. Create `src/domains/router/tools.ts`

Structure (mirrors src/domains/consensus/tools.ts):

§1. JSDoc header
§2. Imports
§3. Constants (ModelId enum tuple)
§4. Zod input schemas (4)
§5. Zod output schemas (3 — router_call omits output schema per contract §3.2)
§6. Public types (ts inferences)
§7. Handlers (4 — pure functions of input)
§8. registerRouterTools(ctx)

Estimated size: ~350 LOC including JSDoc.

1.2. Create `src/tests/domains/router/tools.test.ts`

Structure (mirrors src/__tests__/domains/consensus/tools.test.ts):

§1. Imports + fixtures
§2. router_score schema + handler tests
§3. router_call schema rejection tests + direct handler tests with mocked options
§4. router_fallback schema + handler tests
§5. router_stats schema + handler tests
§6. registerRouterTools (count + duplicate guard)

Estimated size: ~400 LOC (28-30 tests).

1.3. Modify `src/server.ts`

Two edits:

Edit 1 — line 50 area (import block):

+ import { registerRouterTools } from './domains/router/tools.js';

Edit 2 — line 593 area (after registerConsensusTools(ctx);):

+    // P1.5.7: register δ Router MCP tools — router_score, router_call,
+    // router_fallback, router_stats. Phase 1.5 W6 graduation: first δ
+    // MCP surface (Phase 0 P0.5.1/P0.5.2 shipped library-only stubs per
+    // ADR-005). Closes ADR-004 R75 Wave H tool-surface amendment for δ.
+    // Tool count moves from 23 → 27 (Phase 0: 14, λ R89A: 4, θ R89B: 5,
+    // δ Phase 1.5: 4).
+    registerRouterTools(ctx);

1.4. Modify `src/domains/router/index.ts`

One edit — append export * from './tools.js'; to the barrel.

2. Implementation walk

2.1. router_score handler

export function routerScore(input: RouterScoreInput): RouterScoreOutput {
  const scoringResult = scoreIntent(input.prompt, input.context ?? {});
  const ruleVersionHash = computeScoringRuleVersionHash();
  return Object.freeze({
    scores: scoringResult.scores,
    winner: scoringResult.winner,
    rule_version_hash: ruleVersionHash,
  });
}

Notes:

scoreIntent’s output is already frozen; we wrap with Object.freeze for the additional rule_version_hash field.
Passes through to the Phase 0 fallback constant when candidatesSnapshot is absent.

2.2. router_call handler

export async function routerCall(
  input: RouterCallInput,
): Promise<RouteResult> {
  // Project the validated MCP input to RouteOptions. The validated shape
  // is already a structural subset; this projection just renames keys
  // and drops `undefined` fields.
  const options: RouteOptions = {};
  if (input.options) {
    if (input.options.maxTokens !== undefined) options.maxTokens = input.options.maxTokens;
    if (input.options.systemPrompt !== undefined) options.systemPrompt = input.options.systemPrompt;
    if (input.options.model !== undefined) options.model = input.options.model;
    if (input.options.task !== undefined) options.task = input.options.task;
    if (input.options.operatorPreference !== undefined) options.operatorPreference = input.options.operatorPreference;
  }
  return await routeRequest(input.prompt, options);
}

Notes:

Does NOT pass through apiKey, completionFn, etc. — those keys are not in the Zod-validated input.options shape.
routeRequest returns a frozen RouteResult; we pass through unchanged.

2.3. router_fallback handler

export function routerFallback(input: RouterFallbackInput): RouterFallbackOutput {
  if (input.reset === true) {
    if (input.model_id !== undefined) {
      resetCircuitBreaker(input.model_id);
    } else {
      resetCircuitBreaker();
    }
  }
  const stateMap = getCircuitBreakerState();
  const circuitState: Record<string, CircuitState> = {};
  for (const [modelId, state] of stateMap) {
    circuitState[modelId] = state;
  }
  return Object.freeze({ circuitState: Object.freeze(circuitState) });
}

Notes:

The reset happens BEFORE the snapshot read — caller sees post-reset state.
Object.fromEntries would also work; explicit loop matches the codebase style and is type-narrower (avoid implicit any).

2.4. router_stats handler

export function routerStats(_input: RouterStatsInput): RouterStatsOutput {
  return getRouterStats();
}

Notes:

getRouterStats() already returns the exact wire shape; pass through.
Input ignored (empty object).

3. Zod schemas — full text

3.1. ModelId enum schema

const MODEL_IDS = [
  'claude', 'claude-sonnet-3-5', 'claude-haiku-3-5',
  'gpt-4o', 'gpt-4o-mini', 'gemini-1-5-pro',
  'llama-3-3-70b', 'mixtral-8x22b', 'kimi-k2',
] as const;

const ModelIdSchema = z.enum(MODEL_IDS);

3.2. TaskShape schema

const TaskShapeSchema = z.object({
  domain: z.string().optional(),
  tokens: z.number().int().nonnegative().optional(),
  deadline_ms: z.number().int().nonnegative().optional(),
  skill: z.array(z.string()).optional(),
}).strict();

3.3. router_score schemas

export const RouterScoreInputSchema = z.object({
  prompt: z.string().min(1),
  context: z.object({
    task: TaskShapeSchema.optional(),
    operatorPreference: z.record(z.string(), z.number().min(0).max(1)).optional(),
  }).strict().optional(),
}).strict();

export const RouterScoreOutputSchema = z.object({
  scores: z.record(z.string(), z.number()),
  winner: ModelIdSchema,
  rule_version_hash: z.string().regex(/^[0-9a-f]{64}$/),
}).strict();

3.4. router_call schemas

export const RouterCallInputSchema = z.object({
  prompt: z.string().min(1),
  options: z.object({
    maxTokens: z.number().int().positive().optional(),
    systemPrompt: z.string().optional(),
    model: z.string().optional(),
    task: TaskShapeSchema.optional(),
    operatorPreference: z.record(z.string(), z.number().min(0).max(1)).optional(),
  }).strict().optional(),
}).strict();

// Output schema OMITTED per contract §3.2 — RouteResult.content shape is variable.

3.5. router_fallback schemas

const CircuitStateSchema = z.object({
  failures: z.number().int().nonnegative(),
  openedAt: z.number().nullable(),
}).strict();

export const RouterFallbackInputSchema = z.object({
  model_id: ModelIdSchema.optional(),
  reset: z.boolean().optional(),
}).strict();

export const RouterFallbackOutputSchema = z.object({
  circuitState: z.record(z.string(), CircuitStateSchema),
}).strict();

3.6. router_stats schemas

export const RouterStatsInputSchema = z.object({}).strict();

const RouterStatsRowSchema = z.object({
  calls_total: z.number().int().nonnegative(),
  successes: z.number().int().nonnegative(),
  failures: z.number().int().nonnegative(),
  avg_cost_usd: z.number().nonnegative(),
  p50_latency_ms: z.number().nonnegative(),
  success_rate: z.number().min(0).max(1),
}).strict();

export const RouterStatsOutputSchema = z.object({
  models: z.record(z.string(), RouterStatsRowSchema),
}).strict();

4. registerRouterTools function

export function registerRouterTools(ctx: ColibriServerContext): void {
  registerColibriTool(
    ctx,
    'router_score',
    {
      title: 'router_score',
      description:
        'Score a prompt across the 9-member δ candidate cohort and return the winning ModelId. Returns the full {scores, winner, rule_version_hash} triple. The rule_version_hash anchors the κ rule pack the scoring formula consulted, so callers can prove which version of policy produced this routing decision.',
      inputSchema: RouterScoreInputSchema,
      outputSchema: RouterScoreOutputSchema,
    },
    (input): RouterScoreOutput => routerScore(input),
  );

  registerColibriTool(
    ctx,
    'router_call',
    {
      title: 'router_call',
      description:
        'Route a prompt through the δ N-member fallback chain (P1.5.5) and return the upstream completion. The chain walks scoreIntent-ranked candidates, skips models whose circuit breakers are open, and races each attempt against COLIBRI_MODEL_TIMEOUT. Throws via MCP HANDLER_ERROR on FallbackChainExhaustedError. apiKey + injection seams are rejected at the strict-input boundary.',
      inputSchema: RouterCallInputSchema,
      // outputSchema omitted: RouteResult.content shape varies by upstream model.
    },
    async (input): Promise<RouteResult> => routerCall(input),
  );

  registerColibriTool(
    ctx,
    'router_fallback',
    {
      title: 'router_fallback',
      description:
        'Inspect or reset the δ circuit-breaker state. With {reset:true, model_id:X}, clears one model. With {reset:true} alone, clears every model. With {reset:false} or {} alone, reads the current snapshot. Returns {circuitState: Record<ModelId, {failures, openedAt}>}.',
      inputSchema: RouterFallbackInputSchema,
      outputSchema: RouterFallbackOutputSchema,
    },
    (input): RouterFallbackOutput => routerFallback(input),
  );

  registerColibriTool(
    ctx,
    'router_stats',
    {
      title: 'router_stats',
      description:
        'Snapshot per-ModelId router aggregates: calls_total, successes, failures, avg_cost_usd, p50_latency_ms, success_rate. Models never touched are absent from the response. Resets are out of scope for this tool; use the test-only resetRouterStats() in cost.ts directly.',
      inputSchema: RouterStatsInputSchema,
      outputSchema: RouterStatsOutputSchema,
    },
    (_input): RouterStatsOutput => routerStats(_input),
  );
}

5. Test plan — 28 tests

5.1. router_score (7 tests)

Empty-candidate fallback returns {claude: 1.0, winner: 'claude'} shape.
Output includes rule_version_hash as 64 lowercase hex chars.
Same input → same output (determinism).
Schema rejects empty prompt ("").
Schema rejects extra key in input (strict mode).
Schema rejects negative task.tokens.
Output passes the output schema.

5.2. router_call (5 tests — schema-only direct-handler)

Schema accepts {prompt: "hi"} (no options).
Schema accepts valid options (subset of RouteOptions).
Schema REJECTS {prompt: "hi", options: {apiKey: "x"}} (forbidden key).
Schema REJECTS {prompt: "hi", options: {completionFn: () => null}} (forbidden key).
Schema REJECTS {prompt: "", ...} (min(1) constraint).

5.3. router_fallback (6 tests)

Empty initial state → {circuitState: {}}.
After 3 failures on gpt-4o (via recordFailure test-API) → circuitState['gpt-4o'].openedAt !== null.
{reset: true, model_id: 'gpt-4o'} clears that model’s state.
{reset: true} (no model_id) clears all models.
Read-only call (no reset) does NOT mutate state.
Schema rejects unknown model_id (e.g. 'made-up').

5.4. router_stats (5 tests)

Empty initial state → {models: {}}.
After one success record → models['claude'].calls_total === 1.
After one success + one failure → success_rate === 0.5 for that model.
Schema rejects extra keys (strict mode).
Output passes the output schema.

5.5. registerRouterTools (2 tests)

After call, ctx._registeredToolNames contains all 4 names; size is exactly 4 (on a fresh ctx).
Second call on same ctx throws.

5.6. Cross-cutting (3 tests)

All 4 handlers return frozen-at-top-level outputs.
Source file tools.ts does NOT contain apiKey, completionFn, completionFnRegistry, scoringFn, fetchFn, delayFn, nowFn in any input schema (compile-time grep test).
MODEL_IDS tuple has exactly 9 entries (matches ModelId union).

6. Build / lint / test gate

npm run build && npm run lint && npm test

Expected delta:

TypeScript build: clean (new module, no breaking edits).
Lint: clean (matches existing patterns).
Tests: +28 → 3325 + 28 = ~3353 (baseline 3325 on cf6221c9).
Server test (if it asserts global count) will need a bump 23 → 27 if such an assertion exists.

7. Pre-existing flakes (carry through, do not fix in this slice)

consensus/parity-harness G7.1 perf budget — CI-load sensitive, retry-clean
reputation/tools.test.ts parallel-migration prefix race — retry-clean
kimi.test.ts ● injection seams › 7. latency measurement — timer-imprecision under CI load (introduced by P1.5.2 W3)

Document these in the verification doc as known follow-ups.

8. Forbiddens checklist

9. Implementation order

Write src/domains/router/tools.ts (handlers + schemas + register).
Edit src/domains/router/index.ts — append re-export.
Edit src/server.ts — add import + add registerRouterTools(ctx); call in bootstrap.
Write src/__tests__/domains/router/tools.test.ts.
Run npm run build && npm run lint && npm test.
Fix any issues (especially if server.test.ts asserts a fixed tool count).
Step 5 verify doc.

10. Rollback plan

If npm run build fails: the new file is self-contained; remove the registerRouterTools import + call from src/server.ts and the build is green. The new file can stay (it’s unused without the call site) until the bug is fixed.

If npm test fails on a flaky existing test: retry once; if flake confirmed (consensus parity, reputation prefix, kimi latency timer), document as known and proceed.

If npm test fails on a new test: fix the handler / schema, do NOT skip.

P1.5.7 — router_* MCP Tools Execution Packet

1. File plan

1.1. Create src/domains/router/tools.ts

1.2. Create src/__tests__/domains/router/tools.test.ts

1.3. Modify src/server.ts

1.4. Modify src/domains/router/index.ts

2. Implementation walk

2.1. router_score handler

2.2. router_call handler

2.3. router_fallback handler

2.4. router_stats handler

3. Zod schemas — full text

3.1. ModelId enum schema

3.2. TaskShape schema

3.3. router_score schemas

3.4. router_call schemas

3.5. router_fallback schemas

3.6. router_stats schemas

4. registerRouterTools function

5. Test plan — 28 tests

5.1. router_score (7 tests)

5.2. router_call (5 tests — schema-only direct-handler)

5.3. router_fallback (6 tests)

5.4. router_stats (5 tests)

5.5. registerRouterTools (2 tests)

5.6. Cross-cutting (3 tests)

6. Build / lint / test gate

7. Pre-existing flakes (carry through, do not fix in this slice)

8. Forbiddens checklist

9. Implementation order

10. Rollback plan

P1.5.7 — `router_*` MCP Tools Execution Packet

1.1. Create `src/domains/router/tools.ts`

1.2. Create `src/tests/domains/router/tools.test.ts`

1.3. Modify `src/server.ts`

1.4. Modify `src/domains/router/index.ts`