P0.5.2 — δ Fallback Chain — Contract

Step 2 of the 5-step chain. Behavioral contract, type shapes, invariants, error taxonomy, and AC → test mapping. Gates Step 3 (packet).

1. Types

import type { ModelId, ScoreContext } from './scoring.js';
import type { CompletionResult, AnthropicTool } from '../integrations/claude.js';

/**
 * Options accepted by `routeRequest`.
 *
 * Phase 0: honours `maxTokens`, `systemPrompt`, `model`, `apiKey`, `tools`,
 * the injection seams (`completionFn`, `scoringFn`, `fetchFn`, `logger`,
 * `delayFn`), and `ScoreContext` fields. Phase 1.5 will start reading more
 * of the scoring factors.
 */
export interface RouteOptions extends ScoreContext {
  /** Override max tokens; passed through to the adapter. */
  readonly maxTokens?: number;
  /** Optional system prompt. */
  readonly systemPrompt?: string;
  /** Override the specific model name (rarely useful in Phase 0). */
  readonly model?: string;
  /** Override the API key (mostly for tests). */
  readonly apiKey?: string;
  /** Tool descriptors; when provided, calls `createCompletionWithTools`. */
  readonly tools?: ReadonlyArray<AnthropicTool>;
  /**
   * Inject a custom `createCompletion` (Phase 0) / `createCompletionWithTools`
   * function. Defaults to the real wrapper from integrations/claude.ts.
   * Tests MUST inject a mock to avoid real network calls.
   */
  readonly completionFn?: CompletionFn;
  /** Inject a custom scoring function (tests / diagnostics). */
  readonly scoringFn?: ScoringFn;
  /** Inject a custom `fetch` (passed through to the adapter). */
  readonly fetchFn?: typeof fetch;
  /** Inject a custom logger (passed through to the adapter). */
  readonly logger?: (...args: unknown[]) => void;
  /** Inject a custom delay function (passed through to the adapter). */
  readonly delayFn?: (ms: number) => Promise<void>;
}

/** Pluggable completion callable. Matches `createCompletion` / `createCompletionWithTools`. */
export type CompletionFn = (
  prompt: string,
  options: {
    readonly model?: string;
    readonly maxTokens?: number;
    readonly systemPrompt?: string;
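    readonly apiKey?: string;
    /** Forwarded from `RouteOptions.tools` when the caller provides them (Path B, AC15). */
    readonly tools?: ReadonlyArray<AnthropicTool>;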
    readonly fetchFn?: typeof fetch;
    readonly logger?: (...args: unknown[]) => void;
    readonly delayFn?: (ms: number) => Promise<void>;
  },
) => Promise<CompletionResult>;

/** Pluggable scoring callable. Matches `scoreIntent`. */
export type ScoringFn = (
  prompt: string,
  context?: ScoreContext,
) => { readonly winner: ModelId; readonly scores: Readonly<Record<ModelId, number>> };

/**
 * Result returned by `routeRequest` on success.
 *
 * Shape is minimal — Phase 1.5 can append fields (e.g. `costUsd`,
 * `modelsAttempted`, `fallbackDepth`) without breaking existing callers.
 */
export interface RouteResult {
  readonly model: ModelId;
  readonly content: string;
  readonly finishReason: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
}

/**
 * One attempted upstream call. Phase 0 always has exactly 1 in the
 * `attempts` array on failure; Phase 1.5 grows it to N.
 */
export interface FallbackAttempt {
  readonly model: ModelId;
  readonly error: Error;
}

/**
 * Raised when every member of the fallback chain failed.
 *
 * Phase 0: `attempts.length === 1` always (chain has one member — Claude).
 * Phase 1.5: `attempts.length` grows to match the number of models tried.
 *
 * The name intentionally reflects Phase 1.5 semantics; a single-member
 * "chain exhaustion" is just "the only call failed", but the class name
 * stays stable across the Phase 0 → 1.5 boundary so callers do not need
 * to change their `instanceof` checks.
 */
export class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: ReadonlyArray<FallbackAttempt>;
  readonly cause: Error | undefined;

  constructor(message: string, attempts: ReadonlyArray<FallbackAttempt>) { /* ... */ }
}

/** Phase 0 compile-time + runtime marker — asserts ADR-005 §Decision. */
export const ROUTER_PHASE_0_SHAPE: {
  readonly members: 1;
  readonly hasCircuitBreaker: false;
  readonly modelsSupported: readonly ['claude'];
};

Rationale

RouteOptions extends ScoreContext: scoring-factor fields can be populated today without a type error, and Phase 1.5 starts consuming them with no caller-side change.
CompletionFn as a seam: tests inject a stub. The pattern parallels fetchFn in claude.ts and scoringFn here, so every external dependency of the module can be replaced through options.
RouteResult fields model, content, finishReason, promptTokens, completionTokens, latencyMs: a minimal projection of CompletionResult with model narrowed to ModelId. Callers that need more fields can get them through rawCompletion in Phase 1.5, which is an additive change.
FallbackAttempt.error is Error, not unknown: callers can instanceof-check it without narrowing first. Phase 0 stores an AnthropicApiError, an AnthropicConfigError, a plain Error if the mock throws one, or an Error synthesized from a thrown non-Error value (see section 4).
ROUTER_PHASE_0_SHAPE: a marker const whose literal types are the ADR-005 invariants. A test asserts expect(ROUTER_PHASE_0_SHAPE.members).toBe(1), so the marker cannot change without that test failing. Graduation to Phase 1.5 requires both (a) updating the impl and (b) updating the marker together with its test.
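
The constructor body and the marker's initializer are elided in the type shapes above. A minimal sketch of both follows, under two assumptions the contract does not state: `cause` mirrors the last attempt's error, and the marker is frozen with `Object.freeze`. `ModelId` is a local stand-in for the export from `./scoring.js`.

```typescript
// Local stand-in for the ModelId exported by './scoring.js'.
type ModelId = 'claude';

interface FallbackAttempt {
  readonly model: ModelId;
  readonly error: Error;
}

class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: ReadonlyArray<FallbackAttempt>;
  readonly cause: Error | undefined;

  constructor(message: string, attempts: ReadonlyArray<FallbackAttempt>) {
    super(message);
    this.name = 'FallbackChainExhaustedError';
    // Copy, then freeze, so later changes to the caller's array do not leak in.
    this.attempts = Object.freeze([...attempts]);
    // Assumption: `cause` mirrors the last attempt's error (the only one in Phase 0).
    const last = attempts[attempts.length - 1];
    this.cause = last ? last.error : undefined;
  }
}

// Literal types come from `as const`; Object.freeze adds a shallow runtime guarantee.
const ROUTER_PHASE_0_SHAPE = Object.freeze({
  members: 1,
  hasCircuitBreaker: false,
  modelsSupported: ['claude'],
} as const);
```

Because the literals are part of the type, assigning `members: 2` to a variable of `typeof ROUTER_PHASE_0_SHAPE` is a compile-time error as well as a test failure.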

2. Function signature

/**
 * Route a prompt to a model and return the completion.
 *
 * Phase 0: single-member chain (Claude only). Picks the winner via
 * `scoreIntent` (always returns 'claude'), delegates to `createCompletion`
 * (or `createCompletionWithTools` when `options.tools` is non-empty), and
 * throws `FallbackChainExhaustedError` on failure.
 */
export async function routeRequest(
  prompt: string,
  options?: RouteOptions,
): Promise<RouteResult>;

3. Invariants (ADR-005 §Decision)

#	Invariant	Enforcement
I1	Exactly one upstream call per `routeRequest` invocation, regardless of success or failure.	Single `await completionFn()` in the impl; no retry loop around it.
I2	Winner is always `'claude'` in Phase 0.	Delegates to `scoreIntent`, whose P0.5.1 invariant guarantees `'claude'`.
I3	On `completionFn` success, returns `RouteResult` with `model === 'claude'`.	Impl writes `'claude'` explicitly (not from upstream `result.model`, because upstream model name may be e.g. `'claude-sonnet-4-5'` — a specific version, not the abstract router ID).
I4	On `completionFn` error, throws `FallbackChainExhaustedError`.	`try/catch` wraps the upstream call; every caught error becomes a `FallbackChainExhaustedError` with a 1-entry `attempts` array.
I5	`err.attempts.length === 1` on every Phase 0 failure.	Single-member chain has exactly one attempt. Test asserts this explicitly.
I6	`err.attempts[0].model === 'claude'`.	Only candidate in Phase 0.
I7	`err.attempts[0].error` preserves the original error (both `AnthropicApiError` and `AnthropicConfigError`).	`err.attempts[0].error` is the exact `Error` instance thrown by `completionFn`, not re-wrapped. Only a thrown non-Error value is replaced by a synthesized `Error` (see section 4).
I8	No `setTimeout` / `setInterval` / `Date.now()`-based circuit breaker state.	Impl uses no timers, no module-level state. Pure function of `(prompt, options)` + the upstream call.
I9	No MCP tool registered.	Impl exports only functions/classes/types; `src/server.ts` is not imported.
I10	No env var reads.	Impl never reads `process.env`; `ANTHROPIC_API_KEY` is resolved inside the adapter by `createCompletion`.
I11	`ROUTER_PHASE_0_SHAPE.members === 1` and `.hasCircuitBreaker === false`.	Literal-type frozen const.
I12	Deterministic routing: two calls with identical `prompt` (and any context) route to the same model.	Transitively true because `scoreIntent` is constant.
I13	Tools passthrough: when `options.tools` is provided and non-empty, the call routes to `createCompletionWithTools` instead of `createCompletion`.	`routeRequest` selects the upstream fn based on `options.tools`; default injection is a dispatcher that picks the correct wrapper.
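
Read together, I1 through I7 pin down most of the control flow. The sketch below is one way to satisfy them, not the packet's implementation: the types are trimmed local stand-ins, `completionFn` is required rather than defaulted so the example never touches the network, and the hard-coded winner stands in for the `scoreIntent` call.

```typescript
type ModelId = 'claude';

// Trimmed stand-ins; field names follow RouteResult in section 1.
interface CompletionResult {
  readonly model: string; // versioned upstream name, e.g. 'claude-sonnet-4-5'
  readonly content: string;
  readonly finishReason: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
}
type RouteResult = Omit<CompletionResult, 'model'> & { readonly model: ModelId };
interface UpstreamOptions {
  readonly maxTokens?: number;
  readonly systemPrompt?: string;
}
type CompletionFn = (prompt: string, options: UpstreamOptions) => Promise<CompletionResult>;
interface SketchRouteOptions extends UpstreamOptions {
  readonly completionFn: CompletionFn; // required here; optional in the contract
}
type Attempts = ReadonlyArray<{ readonly model: ModelId; readonly error: Error }>;

class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: Attempts;
  constructor(message: string, attempts: Attempts) {
    super(message);
    this.attempts = attempts;
  }
}

async function routeRequest(prompt: string, options: SketchRouteOptions): Promise<RouteResult> {
  const { completionFn, ...upstreamOptions } = options;
  const winner: ModelId = 'claude'; // I2: stands in for scoreIntent(prompt, options).winner
  try {
    // I1: the only upstream call. No loop and no retry around it.
    const result = await completionFn(prompt, upstreamOptions);
    // I3: report the abstract router ID, not the versioned upstream name.
    return {
      model: winner,
      content: result.content,
      finishReason: result.finishReason,
      promptTokens: result.promptTokens,
      completionTokens: result.completionTokens,
      latencyMs: result.latencyMs,
    };
  } catch (thrown: unknown) {
    // I7: keep the thrown Error instance; synthesize one only for non-Error values.
    const error = thrown instanceof Error ? thrown : new Error(String(thrown));
    // I4 to I6: one attempt, attributed to 'claude'.
    throw new FallbackChainExhaustedError(
      `δ fallback chain exhausted after 1 attempt: [${winner}] ${error.message}`,
      [{ model: winner, error }],
    );
  }
}
```

Nothing in the function touches timers, module-level state, or `process.env`, which is how I8 and I10 hold in this sketch.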

4. Error taxonomy

Error thrown by `completionFn`	Propagates to caller as	`attempts[0].error`
`AnthropicApiError` (429 / 5xx exhausted, 4xx terminal, network)	`FallbackChainExhaustedError`	the same `AnthropicApiError` instance
`AnthropicConfigError` (missing API key)	`FallbackChainExhaustedError`	the same `AnthropicConfigError` instance
Mock/stub throws generic `Error`	`FallbackChainExhaustedError`	the generic `Error` instance
Thrown non-Error value (e.g. `throw 'oops'`)	`FallbackChainExhaustedError`	a synthesized `Error` with message `String(value)`

FallbackChainExhaustedError.message format:

"δ fallback chain exhausted after 1 attempt: [claude] <original-error-message>"

Rationale: single line, human-readable, includes the single attempted model name and the original error’s message. Phase 1.5 will list all N attempts in the message.
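
The last table row and the message format can each be captured by a small helper. This is a sketch; the helper names `toError` and `exhaustedMessage` are hypothetical and do not appear in the contract.

```typescript
type ModelId = 'claude';

/** Keep Error instances untouched; synthesize an Error only for non-Error throws. */
function toError(thrown: unknown): Error {
  return thrown instanceof Error ? thrown : new Error(String(thrown));
}

/** Build the single-attempt message in the format quoted above. */
function exhaustedMessage(model: ModelId, error: Error): string {
  return `δ fallback chain exhausted after 1 attempt: [${model}] ${error.message}`;
}
```

For example, `exhaustedMessage('claude', toError('oops'))` yields `δ fallback chain exhausted after 1 attempt: [claude] oops`.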

5. Invocation paths

Path A — no tools (standard completion)

Caller: await routeRequest("Say hello.", { maxTokens: 256 })
Router: scoreIntent(prompt, context) → { winner: 'claude', ... }
Router: completionFn("Say hello.", { maxTokens: 256, ...passthrough }) → CompletionResult
Router: project CompletionResult → RouteResult { model: 'claude', content, ... }
Return.

Path B — with tools (tool-use completion)

Caller: await routeRequest("What's the weather?", { tools: [...] })
Router: scoreIntent(...) → { winner: 'claude', ... }
Router: completionFn dispatches to createCompletionWithTools internally (default injection honors this; test injection receives the tools array in options passthrough).
Router: project → RouteResult.
Return.
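
I13 and Path B describe the default injection as a dispatcher. A minimal sketch follows, assuming both wrappers accept `(prompt, options)` and return the same result type. The real signatures in `integrations/claude.ts` are not shown in this contract, so the types here are stand-ins and `makeDispatcher` is a hypothetical name.

```typescript
// Stand-in types; the real AnthropicTool and CompletionResult come from integrations/claude.ts.
interface AnthropicTool { readonly name: string }
interface CompletionResult { readonly content: string }
interface UpstreamOptions {
  readonly maxTokens?: number;
  readonly tools?: ReadonlyArray<AnthropicTool>;
}
type Upstream = (prompt: string, options: UpstreamOptions) => Promise<CompletionResult>;

/** Build a CompletionFn-shaped dispatcher over the two wrappers. */
function makeDispatcher(createCompletion: Upstream, createCompletionWithTools: Upstream): Upstream {
  return (prompt, options) => {
    // I13: only a provided, non-empty tools array selects the tool-use wrapper.
    const hasTools = options.tools !== undefined && options.tools.length > 0;
    return hasTools
      ? createCompletionWithTools(prompt, options)
      : createCompletion(prompt, options);
  };
}
```

An empty `tools` array falls through to the plain wrapper, matching the "provided and non-empty" wording of I13.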

Path C — failure (Claude terminal error or exhausted retries)

Caller: await routeRequest("...")
Router: scoreIntent(...) → 'claude'
Router: completionFn(...) throws AnthropicApiError
Router: catch, wrap in FallbackChainExhaustedError { attempts: [{ model: 'claude', error: <api-error> }] }
Throw — caller catches.

Path D — failure (missing API key)

Caller: await routeRequest("...") (no apiKey in options, process.env.ANTHROPIC_API_KEY unset)
Router: scoreIntent(...) → 'claude'
Router: completionFn(...) throws AnthropicConfigError
Router: catch, wrap in FallbackChainExhaustedError { attempts: [{ model: 'claude', error: <config-error> }] }
Throw.

No Path E exists — no cascade to another model. Single-member invariant.
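
On the caller side, Paths C and D look identical until `attempts` is inspected. A sketch of one way to tell them apart follows; the two Anthropic error classes are empty local stand-ins for the real ones, and `describeFailure` is a hypothetical helper. It walks every attempt rather than reading only `attempts[0]`, so it does not depend on the Phase 0 length.

```typescript
type ModelId = 'claude';
type Attempts = ReadonlyArray<{ readonly model: ModelId; readonly error: Error }>;

// Empty stand-ins for the adapter's error classes (real shapes are not shown in this contract).
class AnthropicApiError extends Error {}
class AnthropicConfigError extends Error {}

class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: Attempts;
  constructor(message: string, attempts: Attempts) {
    super(message);
    this.attempts = attempts;
  }
}

/** Summarize a routeRequest failure as "model:kind" pairs. */
function describeFailure(err: unknown): string {
  if (!(err instanceof FallbackChainExhaustedError)) {
    return 'unexpected'; // not an error the router contract raises
  }
  return err.attempts
    .map((attempt) => {
      // The original instances are preserved (I7), so instanceof works directly.
      const isConfig = attempt.error instanceof AnthropicConfigError;
      const isApi = attempt.error instanceof AnthropicApiError;
      const kind = isConfig ? 'config' : isApi ? 'api' : 'other';
      return `${attempt.model}:${kind}`;
    })
    .join(', ');
}
```

A Path D failure would be summarized as `claude:config` and a Path C failure as `claude:api`.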

6. Acceptance criteria → test mapping

AC	Test name (in `fallback.test.ts`)	Assertion
AC1 — happy path returns RouteResult	`returns RouteResult with content and model='claude' on success`	`expect(result.model).toBe('claude'); expect(result.content).toBe('Hello!')`
AC2 — delegates to scoring	`consults scoreIntent before calling upstream`	scoring spy called once
AC3 — delegates to upstream	`calls completionFn exactly once on success`	`completionFn` spy called 1×
AC4 — forwards prompt	`passes prompt to upstream completionFn`	`completionFn` arg[0] === prompt
AC5 — forwards maxTokens	`passes maxTokens to upstream`	upstream opts `.maxTokens` matches
AC6 — forwards systemPrompt	`passes systemPrompt to upstream`	upstream opts `.systemPrompt` matches
AC7 — failure wraps into FallbackChainExhaustedError	`throws FallbackChainExhaustedError when upstream throws`	`await expect(p).rejects.toBeInstanceOf(FallbackChainExhaustedError)`
AC8 — single attempt in chain	`FallbackChainExhaustedError has exactly 1 attempt`	`err.attempts.length === 1`
AC9 — attempt records claude as model	`attempts[0].model === 'claude'`	`err.attempts[0].model === 'claude'`
AC10 — preserves AnthropicApiError	`preserves AnthropicApiError in attempts[0].error`	`instanceof AnthropicApiError`
AC11 — preserves AnthropicConfigError	`preserves AnthropicConfigError in attempts[0].error`	`instanceof AnthropicConfigError`
AC12 — ZERO cascade on failure	`does not retry after upstream failure`	`completionFn` spy called exactly 1× even after throw
AC13 — determinism	`two identical calls route to same model`	both results `.model === 'claude'`
AC14 — ROUTER_PHASE_0_SHAPE marker	`ROUTER_PHASE_0_SHAPE asserts ADR-005 invariants`	`members === 1`, `hasCircuitBreaker === false`, `modelsSupported` contains only `'claude'`
AC15 — tools passthrough	`passes tools array through to completionFn`	upstream opts `.tools` matches
AC16 — FallbackChainExhaustedError message	`error message includes attempt count and model`	`err.message` contains “1 attempt” and “claude”
AC17 — non-Error throw	`wraps non-Error thrown values into Error instances`	`err.attempts[0].error instanceof Error`

Target: 10–15 tests (some ACs collapse into shared tests). Actual count in the packet.
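
Several rows (AC3, AC4, AC12) rest on counting calls to an injected `completionFn`. A framework-free sketch of such a recording stub follows. The actual suite in `fallback.test.ts` would more likely use its test runner's spy utilities, and every name here (`recordingStub`, `RouteLike`, `callsAfterFailure`) is hypothetical.

```typescript
interface StubResult { readonly content: string }
interface StubOptions { readonly maxTokens?: number }
type StubCompletionFn = (prompt: string, options: StubOptions) => Promise<StubResult>;

// Anything shaped like routeRequest with an injected completionFn.
type RouteLike = (
  prompt: string,
  options: StubOptions & { readonly completionFn: StubCompletionFn },
) => Promise<unknown>;

/** Wrap a behaviour in a stub that records every call it receives. */
function recordingStub(behaviour: () => Promise<StubResult>) {
  const calls: Array<{ prompt: string; options: StubOptions }> = [];
  const fn: StubCompletionFn = (prompt, options) => {
    calls.push({ prompt, options });
    return behaviour();
  };
  return { fn, calls };
}

/** AC12: number of upstream calls made when the upstream always throws. */
async function callsAfterFailure(route: RouteLike): Promise<number> {
  const stub = recordingStub(() => Promise.reject(new Error('upstream down')));
  // Swallow the rejection; only the call count matters for this check.
  await route('hi', { completionFn: stub.fn }).catch(() => undefined);
  return stub.calls.length; // the contract requires exactly 1
}
```

A router that conforms to I1 makes `callsAfterFailure` resolve to 1; a larger value indicates a retry loop around the upstream call.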

7. Forward-compatibility guarantees for Phase 1.5

RouteOptions extends an open ScoreContext — new scoring factors slot in without breaking callers.
ModelId widens additively in scoring.ts; RouteResult.model remains assignable from the widened union.
FallbackChainExhaustedError.attempts grows in length, not shape.
ROUTER_PHASE_0_SHAPE type literals change to reflect Phase 1.5 (members: N, hasCircuitBreaker: true). This is an intentional breaking change of the marker const — it signals “Phase 0 is over.”
completionFn, scoringFn, fetchFn, logger, delayFn remain as injection seams. Phase 1.5 adapters (Kimi, Codex, OpenAI) live in a separate adapters/ directory and plug into the fallback chain through the same seam.

8. Out-of-scope (explicitly)

MCP tool registration (router_call, router_fallback) — Phase 1.5.
Multi-model adapters — Phase 1.5.
Circuit breaker — Phase 1.5.
Cost / latency tracking — Phase 1.5.
Cross-call state (sessions, rate-limit tracking, per-model availability) — Phase 1.5.
Scoring-driven ordering — trivial in Phase 0 (1 model).

9. Conclusion

The contract above encodes ADR-005 §Decision invariants as types + test assertions. Step 3 (packet) lays out the three-file implementation plan and the 10–15 tests. Step 4 implements; Step 5 verifies.