P0.5.2 — δ Fallback Chain — Contract
Step 2 of the 5-step chain. Behavioral contract, type shapes, invariants, error taxonomy, and AC → test mapping. Gates Step 3 (packet).
1. Types
import type { ModelId, ScoreContext } from './scoring.js';
import type { CompletionResult, AnthropicTool } from '../integrations/claude.js';
/**
* Options accepted by `routeRequest`.
*
* Phase 0: honours `maxTokens`, `systemPrompt`, `model`, `apiKey`, `tools`,
* and the injection seams (`completionFn`, `scoringFn`, `fetchFn`, `logger`,
* `delayFn`). `ScoreContext` fields are accepted and handed to the scoring
* function; Phase 1.5 will start reading more of the scoring factors.
*/
export interface RouteOptions extends ScoreContext {
/** Override max tokens; passed through to the adapter. */
readonly maxTokens?: number;
/** Optional system prompt. */
readonly systemPrompt?: string;
/** Override the specific model name (rarely useful in Phase 0). */
readonly model?: string;
/** Override the API key (mostly for tests). */
readonly apiKey?: string;
/** Tool descriptors; when provided, calls `createCompletionWithTools`. */
readonly tools?: ReadonlyArray<AnthropicTool>;
/**
* Inject a custom `createCompletion` (Phase 0) / `createCompletionWithTools`
* function. Defaults to the real wrapper from integrations/claude.ts.
* Tests MUST inject a mock to avoid real network calls.
*/
readonly completionFn?: CompletionFn;
/** Inject a custom scoring function (tests / diagnostics). */
readonly scoringFn?: ScoringFn;
/** Inject a custom `fetch` (passed through to the adapter). */
readonly fetchFn?: typeof fetch;
/** Inject a custom logger (passed through to the adapter). */
readonly logger?: (...args: unknown[]) => void;
/** Inject a custom delay function (passed through to the adapter). */
readonly delayFn?: (ms: number) => Promise<void>;
}
/** Pluggable completion callable. Matches `createCompletion` / `createCompletionWithTools`. */
export type CompletionFn = (
prompt: string,
options: {
readonly model?: string;
readonly maxTokens?: number;
readonly systemPrompt?: string;
readonly apiKey?: string;
readonly fetchFn?: typeof fetch;
readonly logger?: (...args: unknown[]) => void;
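readonly delayFn?: (ms: number) => Promise<void>;
/** Present only on the tool-use path (see I13 / AC15). */
readonly tools?: ReadonlyArray<AnthropicTool>;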
},
) => Promise<CompletionResult>;
/** Pluggable scoring callable. Matches `scoreIntent`. */
export type ScoringFn = (
prompt: string,
context?: ScoreContext,
) => { readonly winner: ModelId; readonly scores: Readonly<Record<ModelId, number>> };
/**
* Result returned by `routeRequest` on success.
*
* Shape is minimal — Phase 1.5 can append fields (e.g. `costUsd`,
* `modelsAttempted`, `fallbackDepth`) without breaking existing callers.
*/
export interface RouteResult {
readonly model: ModelId;
readonly content: string;
readonly finishReason: string;
readonly promptTokens: number;
readonly completionTokens: number;
readonly latencyMs: number;
}
/**
* One attempted upstream call. On failure, Phase 0 always records exactly
* one entry in the `attempts` array; Phase 1.5 grows it to N.
*/
export interface FallbackAttempt {
readonly model: ModelId;
readonly error: Error;
}
/**
* Raised when every member of the fallback chain failed.
*
* Phase 0: `attempts.length === 1` always (chain has one member — Claude).
* Phase 1.5: `attempts.length` grows to match the number of models tried.
*
* The name intentionally reflects Phase 1.5 semantics; a single-member
* "chain exhaustion" is just "the only call failed", but the class name
* stays stable across the Phase 0 → 1.5 boundary so callers do not need
* to change their `instanceof` checks.
*/
export class FallbackChainExhaustedError extends Error {
readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
readonly attempts: ReadonlyArray<FallbackAttempt>;
readonly cause: Error | undefined;
constructor(message: string, attempts: ReadonlyArray<FallbackAttempt>) { /* ... */ }
}
/** Phase 0 compile-time + runtime marker — asserts ADR-005 §Decision. */
export const ROUTER_PHASE_0_SHAPE: {
readonly members: 1;
readonly hasCircuitBreaker: false;
readonly modelsSupported: readonly ['claude'];
};
Rationale
- `RouteOptions extends ScoreContext`: arbitrary scoring-factor fields can be pre-populated today without a type error; Phase 1.5 starts consuming them invisibly.
- `CompletionFn` as a seam: tests inject a stub. Pattern parallels `fetchFn` in `claude.ts` and `scoringFn` here; the module is purely configurable.
- `RouteResult` fields `model`, `content`, `finishReason`, `promptTokens`, `completionTokens`, `latencyMs`: minimal projection of `CompletionResult` with `model` narrowed to `ModelId`. Callers that need more fields can pass through `rawCompletion` in Phase 1.5 (additive).
- `FallbackAttempt.error` is `Error`, not `unknown`: callers can `instanceof`-check without a narrow. Phase 0 stores either `AnthropicApiError`, `AnthropicConfigError`, or (exotically) a plain `Error` if the mock throws one.
- `ROUTER_PHASE_0_SHAPE`: a single-line marker whose shape's literals are the ADR-005 invariants. A test asserts `expect(ROUTER_PHASE_0_SHAPE.members).toBe(1)`; if Phase 1.5 ships without updating this marker, the test fails. Graduation requires both: (a) updating the impl, and (b) updating the marker.
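The contract elides the constructor body and declares the marker without a value. The following is a minimal sketch of both, not the packet's implementation. It assumes that `cause` mirrors the last recorded attempt and that the marker is built with `as const` plus `Object.freeze`; neither detail is confirmed by the contract. `ModelId` is a local stand-in for the type exported by `scoring.ts`.

```typescript
// Local stand-in for the real ModelId from './scoring.js' (assumed shape).
type ModelId = 'claude';

interface FallbackAttempt {
  readonly model: ModelId;
  readonly error: Error;
}

class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: ReadonlyArray<FallbackAttempt>;
  readonly cause: Error | undefined;

  constructor(message: string, attempts: ReadonlyArray<FallbackAttempt>) {
    super(message);
    // Keeps `instanceof` working when compiled to pre-ES2015 targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'FallbackChainExhaustedError';
    // Defensive copy, so later mutation of the caller's array is not observed.
    this.attempts = Object.freeze([...attempts]);
    // Assumption: `cause` points at the last attempt's error.
    const last = attempts[attempts.length - 1];
    this.cause = last ? last.error : undefined;
  }
}

// `as const` preserves the literal types; Object.freeze adds a (shallow)
// runtime guarantee on top of the compile-time one.
const ROUTER_PHASE_0_SHAPE = Object.freeze({
  members: 1,
  hasCircuitBreaker: false,
  modelsSupported: ['claude'],
} as const);
```

With literal types in place, `ROUTER_PHASE_0_SHAPE.members` has type `1` rather than `number`, so the marker can fail at compile time as well as in the runtime test.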
2. Function signature
/**
* Route a prompt to a model and return the completion.
*
* Phase 0: single-member chain (Claude only). Picks winner via
* `scoreIntent` (always returns 'claude'), delegates to `createCompletion`,
* and throws `FallbackChainExhaustedError` on failure.
*/
export async function routeRequest(
prompt: string,
options?: RouteOptions,
): Promise<RouteResult>;
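One way the signature above could be satisfied is sketched below, under simplifying assumptions. All types are trimmed local stand-ins for the real imports, `completionFn` is made mandatory so the sketch can never reach the network, and the passthrough is limited to two options. The actual implementation is defined in Step 3 and may differ.

```typescript
// Trimmed stand-ins for './scoring.js' and '../integrations/claude.js'.
type ModelId = 'claude';
interface CompletionResult {
  readonly content: string;
  readonly finishReason: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
}
interface RouteResult extends CompletionResult {
  readonly model: ModelId;
}
interface UpstreamOptions {
  readonly maxTokens?: number;
  readonly systemPrompt?: string;
}
type CompletionFn = (prompt: string, options: UpstreamOptions) => Promise<CompletionResult>;
type ScoringFn = (prompt: string) => { readonly winner: ModelId };
interface RouteOptions extends UpstreamOptions {
  readonly completionFn: CompletionFn; // required here; optional in the contract
  readonly scoringFn?: ScoringFn;
}
type Attempt = { readonly model: ModelId; readonly error: Error };

class FallbackChainExhaustedError extends Error {
  readonly attempts: ReadonlyArray<Attempt>;
  constructor(message: string, attempts: ReadonlyArray<Attempt>) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.attempts = attempts;
  }
}

async function routeRequest(prompt: string, options: RouteOptions): Promise<RouteResult> {
  const scoringFn: ScoringFn = options.scoringFn ?? (() => ({ winner: 'claude' }));
  const { winner } = scoringFn(prompt);
  try {
    // Single await with no retry loop around it (I1).
    const result = await options.completionFn(prompt, {
      maxTokens: options.maxTokens,
      systemPrompt: options.systemPrompt,
    });
    // `model` comes from the router-level winner, never from the upstream
    // model name (I3). In Phase 0 the winner is always 'claude' (I2).
    return {
      model: winner,
      content: result.content,
      finishReason: result.finishReason,
      promptTokens: result.promptTokens,
      completionTokens: result.completionTokens,
      latencyMs: result.latencyMs,
    };
  } catch (thrown) {
    // Non-Error throws are normalised; Error instances are kept as-is (I7).
    const error = thrown instanceof Error ? thrown : new Error(String(thrown));
    throw new FallbackChainExhaustedError(
      `δ fallback chain exhausted after 1 attempt: [${winner}] ${error.message}`,
      [{ model: winner, error }],
    );
  }
}
```

The catch block is the only place an error is wrapped, which is what keeps `attempts.length === 1` true on every Phase 0 failure.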
3. Invariants (ADR-005 §Decision)
| # | Invariant | Enforcement |
|---|---|---|
| I1 | Exactly one upstream call per `routeRequest` invocation, regardless of success or failure. | Single `await completionFn()` in the impl; no retry loop around it. |
| I2 | Winner is always `'claude'` in Phase 0. | Delegates to `scoreIntent`, whose P0.5.1 invariant guarantees `'claude'`. |
| I3 | On `completionFn` success, returns `RouteResult` with `model === 'claude'`. | Impl writes `'claude'` explicitly (not from upstream `result.model`, because the upstream model name may be e.g. `'claude-sonnet-4-5'`, a specific version rather than the abstract router ID). |
| I4 | On `completionFn` error, throws `FallbackChainExhaustedError`. | `try`/`catch` wraps the upstream call; every caught error becomes a `FallbackChainExhaustedError` with a 1-entry `attempts` array. |
| I5 | `err.attempts.length === 1` on every Phase 0 failure. | Single-member chain has exactly one attempt. Test asserts this explicitly. |
| I6 | `err.attempts[0].model === 'claude'`. | Only candidate in Phase 0. |
| I7 | `err.attempts[0].error` preserves the original error (both `AnthropicApiError` and `AnthropicConfigError`). | `err.attempts[0].error` is the exact `Error` instance thrown by `completionFn`. Not re-wrapped. |
| I8 | No `setTimeout` / `setInterval` / `Date.now()`-based circuit breaker state. | Impl uses no timers, no module-level state. Pure function of `(prompt, options)` + the upstream call. |
| I9 | No MCP tool registered. | Impl exports only functions/classes/types; `src/server.ts` is not imported. |
| I10 | No env var reads. | Impl reads no `process.env` directly; `ANTHROPIC_API_KEY` flows through `createCompletion` inside the adapter. |
| I11 | `ROUTER_PHASE_0_SHAPE.members === 1` and `.hasCircuitBreaker === false`. | Literal-type frozen const. |
| I12 | Deterministic routing: two calls with identical prompt (and any context) route to the same model. | Transitively true because `scoreIntent` is constant. |
| I13 | Tools passthrough: when `options.tools` is provided and non-empty, the call routes to `createCompletionWithTools` instead of `createCompletion`. | `routeRequest` selects the upstream fn based on `options.tools`; default injection is a dispatcher that picks the correct wrapper. |
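I13 describes the default injection as a dispatcher. A minimal sketch of that idea follows. `makeDispatcher` is a hypothetical helper name, and `AnthropicTool` and the two wrappers are reduced to stand-ins; the real ones live in `integrations/claude.ts`.

```typescript
// Hypothetical stand-ins; the real AnthropicTool carries more than a name.
interface AnthropicTool {
  readonly name: string;
}
interface UpstreamOptions {
  readonly maxTokens?: number;
  readonly tools?: ReadonlyArray<AnthropicTool>;
}
type Upstream<R> = (prompt: string, options: UpstreamOptions) => Promise<R>;

// Builds one CompletionFn-shaped callable that picks a wrapper per call:
// `withTools` when a non-empty tools array is present, `plain` otherwise.
function makeDispatcher<R>(plain: Upstream<R>, withTools: Upstream<R>): Upstream<R> {
  return (prompt, options) => {
    const hasTools = options.tools !== undefined && options.tools.length > 0;
    return hasTools ? withTools(prompt, options) : plain(prompt, options);
  };
}
```

Because the choice happens inside a single callable, `routeRequest` still performs exactly one upstream call (I1), and a test stub that replaces `completionFn` simply sees `tools` in its options.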
4. Error taxonomy
| Error raised by upstream | Propagates to caller as | `attempts[0].error` |
|---|---|---|
| `AnthropicApiError` (429 / 5xx exhausted, 4xx terminal, network) | `FallbackChainExhaustedError` | the same `AnthropicApiError` instance |
| `AnthropicConfigError` (missing API key) | `FallbackChainExhaustedError` | the same `AnthropicConfigError` instance |
| Mock/stub throws generic `Error` | `FallbackChainExhaustedError` | the generic `Error` instance |
| Thrown non-`Error` value (e.g. `throw 'oops'`) | `FallbackChainExhaustedError` | a synthesized `Error` with message `String(value)` |
FallbackChainExhaustedError.message format:
"δ fallback chain exhausted after 1 attempt: [claude] <original-error-message>"
Rationale: single line, human-readable, includes the single attempted model name and the original error’s message. Phase 1.5 will list all N attempts in the message.
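The normalisation in the last taxonomy row and the message format can be sketched as two small helpers. The names `toError` and `formatExhaustedMessage` are illustrative and do not come from the contract; `ModelId` is a local stand-in.

```typescript
type ModelId = 'claude'; // stand-in for the type exported by scoring.ts

interface FallbackAttempt {
  readonly model: ModelId;
  readonly error: Error;
}

// Error instances pass through untouched (identity is preserved);
// anything else becomes a new Error whose message is String(value).
function toError(thrown: unknown): Error {
  return thrown instanceof Error ? thrown : new Error(String(thrown));
}

// Phase 0 format only: one attempt, one model, the original message.
function formatExhaustedMessage(attempt: FallbackAttempt): string {
  return `δ fallback chain exhausted after 1 attempt: [${attempt.model}] ${attempt.error.message}`;
}
```

For example, `formatExhaustedMessage({ model: 'claude', error: toError('oops') })` yields `δ fallback chain exhausted after 1 attempt: [claude] oops`.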
5. Invocation paths
Path A — no tools (standard completion)
- Caller: `await routeRequest("Say hello.", { maxTokens: 256 })`
- Router: `scoreIntent(prompt, context)` → `{ winner: 'claude', ... }`
- Router: `completionFn("Say hello.", { maxTokens: 256, ...passthrough })` → `CompletionResult`
- Router: project `CompletionResult` → `RouteResult { model: 'claude', content, ... }`
- Return.
Path B — with tools (tool-use completion)
- Caller: `await routeRequest("What's the weather?", { tools: [...] })`
- Router: `scoreIntent(...)` → `{ winner: 'claude', ... }`
- Router: `completionFn` dispatches to `createCompletionWithTools` internally (default injection honors this; test injection receives the tools array in options passthrough).
- Router: project → `RouteResult`.
- Return.
Path C — failure (Claude terminal error or exhausted retries)
- Caller: `await routeRequest("...")`
- Router: `scoreIntent(...)` → `'claude'`
- Router: `completionFn(...)` throws `AnthropicApiError`
- Router: catch, wrap in `FallbackChainExhaustedError { attempts: [{ model: 'claude', error: <api-error> }] }`
- Throw; the caller catches.
Path D — failure (missing API key)
- Caller: `await routeRequest("...")` (no `apiKey` in options, `process.env.ANTHROPIC_API_KEY` unset)
- Router: `scoreIntent(...)` → `'claude'`
- Router: `completionFn(...)` throws `AnthropicConfigError`
- Router: catch, wrap in `FallbackChainExhaustedError { attempts: [{ model: 'claude', error: <config-error> }] }`
- Throw.
No Path E exists — no cascade to another model. Single-member invariant.
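From the caller's side, Paths C and D differ only in what sits inside `attempts[0].error`. The sketch below shows one way a caller might tell them apart. The error classes are empty stand-ins for the adapter's real ones, and `classifyFailure` is a hypothetical helper.

```typescript
// Shared base so `instanceof` keeps working on pre-ES2015 compile targets.
class SketchError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
// Stand-ins: the real classes in integrations/claude.ts may carry more fields.
class AnthropicApiError extends SketchError {}
class AnthropicConfigError extends SketchError {}

type Attempt = { readonly model: 'claude'; readonly error: Error };

class FallbackChainExhaustedError extends SketchError {
  readonly attempts: ReadonlyArray<Attempt>;
  constructor(message: string, attempts: ReadonlyArray<Attempt>) {
    super(message);
    this.attempts = attempts;
  }
}

// Path D -> 'config', Path C -> 'upstream', anything else -> 'unknown'.
function classifyFailure(err: unknown): 'config' | 'upstream' | 'unknown' {
  if (!(err instanceof FallbackChainExhaustedError)) return 'unknown';
  const original = err.attempts[0]?.error;
  if (original instanceof AnthropicConfigError) return 'config';
  if (original instanceof AnthropicApiError) return 'upstream';
  return 'unknown';
}
```

A caller would typically wrap `await routeRequest(...)` in `try`/`catch` and pass the caught value to a helper like this. The check works without narrowing because `FallbackAttempt.error` is typed as `Error` and the original instance is preserved.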
6. Acceptance criteria → test mapping
| AC | Test name (in `fallback.test.ts`) | Assertion |
|---|---|---|
| AC1: happy path returns `RouteResult` | returns RouteResult with content and model='claude' on success | `expect(result.model).toBe('claude'); expect(result.content).toBe('Hello!')` |
| AC2: delegates to scoring | consults scoreIntent before calling upstream | scoring spy called once |
| AC3: delegates to upstream | calls completionFn exactly once on success | `completionFn` spy called 1× |
| AC4: forwards prompt | passes prompt to upstream completionFn | `completionFn` arg[0] `=== prompt` |
| AC5: forwards `maxTokens` | passes maxTokens to upstream | upstream opts `.maxTokens` matches |
| AC6: forwards `systemPrompt` | passes systemPrompt to upstream | upstream opts `.systemPrompt` matches |
| AC7: failure wraps into `FallbackChainExhaustedError` | throws FallbackChainExhaustedError when upstream throws | `await expect(p).rejects.toBeInstanceOf(FallbackChainExhaustedError)` |
| AC8: single attempt in chain | FallbackChainExhaustedError has exactly 1 attempt | `err.attempts.length === 1` |
| AC9: attempt records claude as model | attempts[0].model === 'claude' | `err.attempts[0].model === 'claude'` |
| AC10: preserves `AnthropicApiError` | preserves AnthropicApiError in attempts[0].error | `instanceof AnthropicApiError` |
| AC11: preserves `AnthropicConfigError` | preserves AnthropicConfigError in attempts[0].error | `instanceof AnthropicConfigError` |
| AC12: zero cascade on failure | does not retry after upstream failure | `completionFn` spy called exactly 1× even after throw |
| AC13: determinism | two identical calls route to same model | both results `.model === 'claude'` |
| AC14: `ROUTER_PHASE_0_SHAPE` marker | ROUTER_PHASE_0_SHAPE asserts ADR-005 invariants | `members === 1`, `hasCircuitBreaker === false`, `modelsSupported` contains only `'claude'` |
| AC15: tools passthrough | passes tools array through to completionFn | upstream opts `.tools` matches |
| AC16: `FallbackChainExhaustedError` message | error message includes attempt count and model | `err.message` contains "1 attempt" and "claude" |
| AC17: non-Error throw | wraps non-Error thrown values into Error instances | `err.attempts[0].error instanceof Error` |
Target: 10–15 tests (some ACs collapse into shared tests). Actual count in the packet.
7. Forward-compatibility guarantees for Phase 1.5
- `RouteOptions` extends an open `ScoreContext`: new scoring factors slot in without breaking callers.
- `ModelId` widens additively in `scoring.ts`; `RouteResult.model` remains assignable from the widened union.
- `FallbackChainExhaustedError.attempts` grows in length, not shape.
- `ROUTER_PHASE_0_SHAPE` type literals change to reflect Phase 1.5 (`members: N`, `hasCircuitBreaker: true`). This is an intentional breaking change of the marker const; it signals "Phase 0 is over."
- `completionFn`, `scoringFn`, `fetchFn`, `logger`, `delayFn` remain as injection seams. Phase 1.5 adapters (Kimi, Codex, OpenAI) attach to a separate `adapters/` directory and plug into the fallback chain through the same seam.
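To illustrate "grows in length, not shape", here is a speculative sketch of how a Phase 1.5 chain might accumulate attempts. Nothing here is specified by this contract: the widened `ModelId`, the `firstSuccess` helper, and its return type are all assumptions made for illustration.

```typescript
// Hypothetical widened union; the contract only says ModelId widens additively.
type ModelId = 'claude' | 'kimi';

interface FallbackAttempt {
  readonly model: ModelId;
  readonly error: Error;
}

// Tries each model in order and returns the first success, or every
// recorded attempt if all of them failed.
async function firstSuccess<T>(
  chain: ReadonlyArray<ModelId>,
  call: (model: ModelId) => Promise<T>,
): Promise<{ value: T } | { attempts: FallbackAttempt[] }> {
  const attempts: FallbackAttempt[] = [];
  for (const model of chain) {
    try {
      return { value: await call(model) };
    } catch (thrown) {
      // Same element shape as Phase 0; only the array length changes.
      const error = thrown instanceof Error ? thrown : new Error(String(thrown));
      attempts.push({ model, error });
    }
  }
  return { attempts };
}
```

With a one-element chain this degenerates to the Phase 0 behaviour: a single attempt, recorded in the same `{ model, error }` shape that callers already inspect.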
8. Out-of-scope (explicitly)
- MCP tool registration (`router_call`, `router_fallback`): Phase 1.5.
- Multi-model adapters: Phase 1.5.
- Circuit breaker: Phase 1.5.
- Cost / latency tracking: Phase 1.5.
- Cross-call state (sessions, rate-limit tracking, per-model availability): Phase 1.5.
- Scoring-driven ordering: trivial in Phase 0 (1 model).
9. Conclusion
The contract above encodes ADR-005 §Decision invariants as types + test assertions. Step 3 (packet) lays out the three-file implementation plan and the 10–15 tests. Step 4 implements; Step 5 verifies.