P0.5.2 — δ Fallback Chain — Contract

Step 2 of the 5-step chain. Behavioral contract, type shapes, invariants, error taxonomy, and AC → test mapping. Gates Step 3 (packet).

1. Types

import type { ModelId, ScoreContext } from './scoring.js';
import type { CompletionResult, AnthropicTool } from '../integrations/claude.js';

/**
 * Options accepted by `routeRequest`.
 *
 * Phase 0: honours `maxTokens`, `systemPrompt`, `model`, `apiKey`, `tools`,
 * the injection seams (`completionFn`, `scoringFn`, `fetchFn`, `logger`,
 * `delayFn`), and `ScoreContext` fields. Phase 1.5 will start reading more
 * of the scoring factors.
 */
export interface RouteOptions extends ScoreContext {
  /** Override max tokens; passed through to the adapter. */
  readonly maxTokens?: number;
  /** Optional system prompt. */
  readonly systemPrompt?: string;
  /** Override the specific model name (rarely useful in Phase 0). */
  readonly model?: string;
  /** Override the API key (mostly for tests). */
  readonly apiKey?: string;
  /** Tool descriptors; when provided, calls `createCompletionWithTools`. */
  readonly tools?: ReadonlyArray<AnthropicTool>;
  /**
   * Inject a custom `createCompletion` (Phase 0) / `createCompletionWithTools`
   * function. Defaults to the real wrapper from integrations/claude.ts.
   * Tests MUST inject a mock to avoid real network calls.
   */
  readonly completionFn?: CompletionFn;
  /** Inject a custom scoring function (tests / diagnostics). */
  readonly scoringFn?: ScoringFn;
  /** Inject a custom `fetch` (passed through to the adapter). */
  readonly fetchFn?: typeof fetch;
  /** Inject a custom logger (passed through to the adapter). */
  readonly logger?: (...args: unknown[]) => void;
  /** Inject a custom delay function (passed through to the adapter). */
  readonly delayFn?: (ms: number) => Promise<void>;
}

/** Pluggable completion callable. Matches `createCompletion` / `createCompletionWithTools`. */
export type CompletionFn = (
  prompt: string,
  options: {
    readonly model?: string;
    readonly maxTokens?: number;
    readonly systemPrompt?: string;
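    readonly apiKey?: string;
    /** Forwarded from `RouteOptions.tools` when provided (see I13, AC15). */
    readonly tools?: ReadonlyArray<AnthropicTool>;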
    readonly fetchFn?: typeof fetch;
    readonly logger?: (...args: unknown[]) => void;
    readonly delayFn?: (ms: number) => Promise<void>;
  },
) => Promise<CompletionResult>;

/** Pluggable scoring callable. Matches `scoreIntent`. */
export type ScoringFn = (
  prompt: string,
  context?: ScoreContext,
) => { readonly winner: ModelId; readonly scores: Readonly<Record<ModelId, number>> };

/**
 * Result returned by `routeRequest` on success.
 *
 * Shape is minimal — Phase 1.5 can append fields (e.g. `costUsd`,
 * `modelsAttempted`, `fallbackDepth`) without breaking existing callers.
 */
export interface RouteResult {
  readonly model: ModelId;
  readonly content: string;
  readonly finishReason: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
}

/**
 * One attempted upstream call. Phase 0 always has exactly 1 in the
 * `attempts` array on failure; Phase 1.5 grows it to N.
 */
export interface FallbackAttempt {
  readonly model: ModelId;
  readonly error: Error;
}

/**
 * Raised when every member of the fallback chain failed.
 *
 * Phase 0: `attempts.length === 1` always (chain has one member — Claude).
 * Phase 1.5: `attempts.length` grows to match the number of models tried.
 *
 * The name intentionally reflects Phase 1.5 semantics; a single-member
 * "chain exhaustion" is just "the only call failed", but the class name
 * stays stable across the Phase 0 → 1.5 boundary so callers do not need
 * to change their `instanceof` checks.
 */
export class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: ReadonlyArray<FallbackAttempt>;
  readonly cause: Error | undefined;

  constructor(message: string, attempts: ReadonlyArray<FallbackAttempt>) { /* ... */ }
}

/** Phase 0 compile-time + runtime marker — asserts ADR-005 §Decision. */
export const ROUTER_PHASE_0_SHAPE: {
  readonly members: 1;
  readonly hasCircuitBreaker: false;
  readonly modelsSupported: readonly ['claude'];
};

Rationale

  • RouteOptions extends ScoreContext: callers can pre-populate scoring-factor fields today without a type error, and Phase 1.5 starts consuming them with no change on the caller side.
  • CompletionFn as a seam: tests inject a stub. The pattern parallels fetchFn in claude.ts and scoringFn here, so every external dependency of the module can be injected.
  • RouteResult fields model, content, finishReason, promptTokens, completionTokens, latencyMs: a minimal projection of CompletionResult with model narrowed to ModelId. Callers that need more fields can get them through a rawCompletion passthrough in Phase 1.5 (an additive change).
  • FallbackAttempt.error is Error, not unknown: callers can run instanceof checks without first narrowing the value. Phase 0 stores an AnthropicApiError, an AnthropicConfigError, or (exotically) a plain Error if the mock throws one.
  • ROUTER_PHASE_0_SHAPE: a small marker whose literal types are the ADR-005 invariants. A test asserts expect(ROUTER_PHASE_0_SHAPE.members).toBe(1), so if Phase 1.5 ships without updating this marker, the test fails. Graduation requires both (a) updating the impl and (b) updating the marker.
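
The marker from the last bullet could be given a value along the following lines. This is a sketch under assumptions, not the shipped definition: the contract only fixes the type, so the use of `as const` together with `Object.freeze`, and the `phase0*` guard names, are illustrative choices.

```typescript
// Sketch only. `as const` keeps the literal types; `Object.freeze` (an
// assumption, not confirmed by the contract) adds a shallow runtime freeze.
const ROUTER_PHASE_0_SHAPE = Object.freeze({
  members: 1,
  hasCircuitBreaker: false,
  modelsSupported: ['claude'],
} as const);

// Compile-time guards: each annotation is a literal type, so these lines
// stop type-checking as soon as the marker's literals change.
const phase0Members: 1 = ROUTER_PHASE_0_SHAPE.members;
const phase0Breaker: false = ROUTER_PHASE_0_SHAPE.hasCircuitBreaker;
const phase0Models: readonly ['claude'] = ROUTER_PHASE_0_SHAPE.modelsSupported;
```

Because the freeze is shallow, the nested `modelsSupported` array is protected by its `readonly` type rather than at runtime.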

2. Function signature

/**
 * Route a prompt to a model and return the completion.
 *
 * Phase 0: single-member chain (Claude only). Picks winner via
 * `scoreIntent` (always returns 'claude'), delegates to `createCompletion`,
 * and throws `FallbackChainExhaustedError` on failure.
 */
export async function routeRequest(
  prompt: string,
  options?: RouteOptions,
): Promise<RouteResult>;

3. Invariants (ADR-005 §Decision)

| # | Invariant | Enforcement |
|---|---|---|
| I1 | Exactly one upstream call per routeRequest invocation, regardless of success or failure. | Single await completionFn() in the impl; no retry loop around it. |
| I2 | Winner is always 'claude' in Phase 0. | Delegates to scoreIntent, whose P0.5.1 invariant guarantees 'claude'. |
| I3 | On completionFn success, returns RouteResult with model === 'claude'. | Impl writes 'claude' explicitly (not from upstream result.model, because the upstream model name may be e.g. 'claude-sonnet-4-5', a specific version rather than the abstract router ID). |
| I4 | On completionFn error, throws FallbackChainExhaustedError. | try/catch wraps the upstream call; every caught error becomes a FallbackChainExhaustedError with a 1-entry attempts array. |
| I5 | err.attempts.length === 1 on every Phase 0 failure. | Single-member chain has exactly one attempt. Test asserts this explicitly. |
| I6 | err.attempts[0].model === 'claude'. | Only candidate in Phase 0. |
| I7 | err.attempts[0].error preserves the original error (both AnthropicApiError and AnthropicConfigError). | err.attempts[0].error is the exact Error instance thrown by completionFn. Not re-wrapped. |
| I8 | No setTimeout / setInterval / Date.now()-based circuit breaker state. | Impl uses no timers, no module-level state. Pure function of (prompt, options) + the upstream call. |
| I9 | No MCP tool registered. | Impl exports only functions/classes/types; src/server.ts is not imported. |
| I10 | No env var reads. | Impl reads no process.env directly; ANTHROPIC_API_KEY flows through createCompletion inside the adapter. |
| I11 | ROUTER_PHASE_0_SHAPE.members === 1 and .hasCircuitBreaker === false. | Literal-type frozen const. |
| I12 | Deterministic routing: two calls with identical prompt (and any context) route to the same model. | Transitively true because scoreIntent is constant. |
| I13 | Tools passthrough: when options.tools is provided and non-empty, the call routes to createCompletionWithTools instead of createCompletion. | routeRequest selects the upstream fn based on options.tools; default injection is a dispatcher that picks the correct wrapper. |
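
To make I1 through I7 concrete, here is a minimal sketch of a `routeRequest` body that satisfies them. It is not the actual implementation: the types are local stand-ins, the `CompletionResult` fields are assumed from the `RouteResult` projection, `completionFn` is made required for brevity, and the `scoreIntent` call is replaced by a constant.

```typescript
// Stand-in types so the sketch compiles on its own. The real shapes live in
// scoring.ts and integrations/claude.ts; these fields are assumptions.
type ModelId = 'claude';
interface CompletionResult {
  readonly model: string;
  readonly content: string;
  readonly finishReason: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
}
type RouteResult = Omit<CompletionResult, 'model'> & { readonly model: ModelId };
type Attempt = { readonly model: ModelId; readonly error: Error };
type CompletionFn = (
  prompt: string,
  options: { readonly maxTokens?: number; readonly systemPrompt?: string },
) => Promise<CompletionResult>;

class FallbackChainExhaustedError extends Error {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: ReadonlyArray<Attempt>;

  constructor(message: string, attempts: ReadonlyArray<Attempt>) {
    super(message);
    // Keeps `instanceof` working when compiled to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'FallbackChainExhaustedError';
    this.attempts = attempts;
  }
}

async function routeRequest(
  prompt: string,
  options: {
    readonly maxTokens?: number;
    readonly systemPrompt?: string;
    readonly completionFn: CompletionFn;
  },
): Promise<RouteResult> {
  // I2: stands in for scoreIntent(prompt, context).winner.
  const winner: ModelId = 'claude';
  try {
    // I1: the single upstream call; there is no loop or retry around it.
    const result = await options.completionFn(prompt, {
      maxTokens: options.maxTokens,
      systemPrompt: options.systemPrompt,
    });
    // I3: write the router ID explicitly instead of copying result.model.
    return {
      model: winner,
      content: result.content,
      finishReason: result.finishReason,
      promptTokens: result.promptTokens,
      completionTokens: result.completionTokens,
      latencyMs: result.latencyMs,
    };
  } catch (thrown) {
    // I7: keep the original instance; only non-Error values get wrapped.
    const error = thrown instanceof Error ? thrown : new Error(String(thrown));
    // I4 to I6: exactly one attempt, recorded against 'claude'.
    throw new FallbackChainExhaustedError(
      `δ fallback chain exhausted after 1 attempt: [${winner}] ${error.message}`,
      [{ model: winner, error }],
    );
  }
}
```

The sketch keeps no module-level state and uses no timers, which is the property I8 asks for.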

4. Error taxonomy

| Error raised by completionFn | Propagates to caller as | attempts[0].error |
|---|---|---|
| AnthropicApiError (429 / 5xx exhausted, 4xx terminal, network) | FallbackChainExhaustedError | the same AnthropicApiError instance |
| AnthropicConfigError (missing API key) | FallbackChainExhaustedError | the same AnthropicConfigError instance |
| Mock/stub throws generic Error | FallbackChainExhaustedError | the generic Error instance |
| Thrown non-Error value (e.g. throw 'oops') | FallbackChainExhaustedError | a synthesized Error with message String(value) |

FallbackChainExhaustedError.message format:

"δ fallback chain exhausted after 1 attempt: [claude] <original-error-message>"

Rationale: single line, human-readable, includes the single attempted model name and the original error’s message. Phase 1.5 will list all N attempts in the message.
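
The taxonomy and message format can be illustrated with two small helpers. Both names, `toError` and `formatExhaustedMessage`, are hypothetical and do not come from the contract; the sketch only shows one way the "same instance" and "synthesized Error" rows might be honoured.

```typescript
// Hypothetical helper: returns Error instances untouched and wraps anything
// else, so attempts[0].error is always an Error.
function toError(value: unknown): Error {
  return value instanceof Error ? value : new Error(String(value));
}

// Hypothetical helper: builds the Phase 0 message for the single attempt.
function formatExhaustedMessage(model: string, error: Error): string {
  return `δ fallback chain exhausted after 1 attempt: [${model}] ${error.message}`;
}
```

For a thrown string, `formatExhaustedMessage('claude', toError('oops'))` yields `δ fallback chain exhausted after 1 attempt: [claude] oops`.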

5. Invocation paths

Path A — no tools (standard completion)

  1. Caller: await routeRequest("Say hello.", { maxTokens: 256 })
  2. Router: scoreIntent(prompt, context) → { winner: 'claude', ... }
  3. Router: completionFn("Say hello.", { maxTokens: 256, ...passthrough }) → CompletionResult
  4. Router: project CompletionResult → RouteResult { model: 'claude', content, ... }
  5. Return.

Path B — with tools (tool-use completion)

  1. Caller: await routeRequest("What's the weather?", { tools: [...] })
  2. Router: scoreIntent(...) → { winner: 'claude', ... }
  3. Router: completionFn dispatches to createCompletionWithTools internally (default injection honors this; test injection receives the tools array in options passthrough).
  4. Router: project → RouteResult.
  5. Return.

Path C — failure (Claude terminal error or exhausted retries)

  1. Caller: await routeRequest("...")
  2. Router: scoreIntent(...) → 'claude'
  3. Router: completionFn(...) throws AnthropicApiError
  4. Router: catch, wrap in FallbackChainExhaustedError { attempts: [{ model: 'claude', error: <api-error> }] }
  5. Throw — caller catches.

Path D — failure (missing API key)

  1. Caller: await routeRequest("...") (no apiKey in options, process.env.ANTHROPIC_API_KEY unset)
  2. Router: scoreIntent(...) → 'claude'
  3. Router: completionFn(...) throws AnthropicConfigError
  4. Router: catch, wrap in FallbackChainExhaustedError { attempts: [{ model: 'claude', error: <config-error> }] }
  5. Throw.

No Path E exists — no cascade to another model. Single-member invariant.
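
From the caller's side, Paths C and D differ only in which error sits in `attempts[0].error`. The sketch below shows one way a caller might tell them apart. Everything in it is illustrative: `describeFailure` is a hypothetical helper, the error classes are local stand-ins, and the `kind` fields exist only to keep the stand-ins structurally distinct for the type checker.

```typescript
// Stand-ins for the adapter's error classes; the real ones live in
// integrations/claude.ts. The `kind` brands are hypothetical.
class StandInError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps `instanceof` working when compiled to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class AnthropicApiError extends StandInError {
  readonly kind = 'api' as const;
}
class AnthropicConfigError extends StandInError {
  readonly kind = 'config' as const;
}

type Attempt = { readonly model: 'claude'; readonly error: Error };

class FallbackChainExhaustedError extends StandInError {
  readonly code = 'FALLBACK_CHAIN_EXHAUSTED' as const;
  readonly attempts: ReadonlyArray<Attempt>;

  constructor(message: string, attempts: ReadonlyArray<Attempt>) {
    super(message);
    this.attempts = attempts;
  }
}

// Hypothetical caller-side helper covering Paths C and D.
function describeFailure(err: unknown): string {
  if (!(err instanceof FallbackChainExhaustedError)) {
    return 'not a router failure';
  }
  // Phase 0: the chain has one member, so there is exactly one attempt.
  const original = err.attempts[0].error;
  if (original instanceof AnthropicConfigError) {
    return `configuration problem: ${original.message}`; // Path D
  }
  if (original instanceof AnthropicApiError) {
    return `upstream failure: ${original.message}`; // Path C
  }
  return `unexpected failure: ${original.message}`;
}
```

The outer `instanceof FallbackChainExhaustedError` check is the one the contract promises to keep stable across the Phase 0 to 1.5 boundary.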

6. Acceptance criteria → test mapping

| AC | Test name (in fallback.test.ts) | Assertion |
|---|---|---|
| AC1: happy path returns RouteResult | returns RouteResult with content and model='claude' on success | expect(result.model).toBe('claude'); expect(result.content).toBe('Hello!') |
| AC2: delegates to scoring | consults scoreIntent before calling upstream | scoring spy called once |
| AC3: delegates to upstream | calls completionFn exactly once on success | completionFn spy called 1× |
| AC4: forwards prompt | passes prompt to upstream completionFn | completionFn arg[0] === prompt |
| AC5: forwards maxTokens | passes maxTokens to upstream | upstream opts .maxTokens matches |
| AC6: forwards systemPrompt | passes systemPrompt to upstream | upstream opts .systemPrompt matches |
| AC7: failure wraps into FallbackChainExhaustedError | throws FallbackChainExhaustedError when upstream throws | expect(p).rejects.toBeInstanceOf(FallbackChainExhaustedError) |
| AC8: single attempt in chain | FallbackChainExhaustedError has exactly 1 attempt | err.attempts.length === 1 |
| AC9: attempt records claude as model | attempts[0].model === 'claude' | err.attempts[0].model === 'claude' |
| AC10: preserves AnthropicApiError | preserves AnthropicApiError in attempts[0].error | instanceof AnthropicApiError |
| AC11: preserves AnthropicConfigError | preserves AnthropicConfigError in attempts[0].error | instanceof AnthropicConfigError |
| AC12: ZERO cascade on failure | does not retry after upstream failure | completionFn spy called exactly 1× even after throw |
| AC13: determinism | two identical calls route to same model | both results .model === 'claude' |
| AC14: ROUTER_PHASE_0_SHAPE marker | ROUTER_PHASE_0_SHAPE asserts ADR-005 invariants | members === 1, hasCircuitBreaker === false, modelsSupported contains only 'claude' |
| AC15: tools passthrough | passes tools array through to completionFn | upstream opts .tools matches |
| AC16: FallbackChainExhaustedError message | error message includes attempt count and model | err.message contains “1 attempt” and “claude” |
| AC17: non-Error throw | wraps non-Error thrown values into Error instances | err.attempts[0].error instanceof Error |

Target: 10–15 tests (some ACs collapse into shared tests). Actual count in the packet.
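
Several assertions above ("spy called once", "arg[0] === prompt") rely on a call-recording spy. The tests presumably use the test runner's own mock facility; the dependency-free version below is only an illustration of what such a spy records, and `makeSpy` is a hypothetical name.

```typescript
// Hypothetical hand-rolled spy: records every argument list, then delegates
// to the wrapped implementation.
function makeSpy<A extends unknown[], R>(impl: (...args: A) => R) {
  const calls: A[] = [];
  const fn = (...args: A): R => {
    calls.push(args);
    return impl(...args);
  };
  return { fn, calls };
}

// Example stub: an upstream that always fails, as AC7 and AC12 need.
const failingUpstream = makeSpy(
  async (prompt: string, options: { maxTokens?: number }): Promise<never> => {
    throw new Error(`upstream failed for "${prompt}" (maxTokens=${options.maxTokens})`);
  },
);
```

After one routed call, `failingUpstream.calls.length` gives the count checked by AC3 and AC12, and `failingUpstream.calls[0][0]` is the prompt checked by AC4.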

7. Forward-compatibility guarantees for Phase 1.5

  • RouteOptions extends an open ScoreContext — new scoring factors slot in without breaking callers.
  • ModelId widens additively in scoring.ts; RouteResult.model remains assignable from the widened union.
  • FallbackChainExhaustedError.attempts grows in length, not shape.
  • ROUTER_PHASE_0_SHAPE type literals change to reflect Phase 1.5 (members: N, hasCircuitBreaker: true). This is an intentional breaking change of the marker const — it signals “Phase 0 is over.”
  • completionFn, scoringFn, fetchFn, logger, delayFn remain as injection seams. Phase 1.5 adapters (Kimi, Codex, OpenAI) live in a separate adapters/ directory and plug into the fallback chain through the same seams.
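
The "grows in length, not shape" guarantee means caller code written against Phase 0 can already iterate over `attempts` instead of hard-coding index 0. A minimal sketch, assuming `ModelId` widens to more string literals (modelled here as plain `string`); `summarizeAttempts` is a hypothetical helper and `'kimi'` is used only as an example of a later chain member.

```typescript
// ModelId is widened to string here as a stand-in for the Phase 1.5 union.
interface FallbackAttempt {
  readonly model: string;
  readonly error: Error;
}

// Hypothetical caller-side helper: formats however many attempts exist.
function summarizeAttempts(attempts: ReadonlyArray<FallbackAttempt>): string {
  return attempts
    .map((attempt, index) => `${index + 1}. [${attempt.model}] ${attempt.error.message}`)
    .join('\n');
}

// Phase 0 shape: one attempt.
const phase0Summary = summarizeAttempts([
  { model: 'claude', error: new Error('rate limited') },
]);

// A possible Phase 1.5 shape: same entries, longer array.
const laterSummary = summarizeAttempts([
  { model: 'claude', error: new Error('rate limited') },
  { model: 'kimi', error: new Error('timeout') },
]);
```

The same function handles both arrays without modification, which is the property the guarantee is meant to protect.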

8. Out-of-scope (explicitly)

  • MCP tool registration (router_call, router_fallback) — Phase 1.5.
  • Multi-model adapters — Phase 1.5.
  • Circuit breaker — Phase 1.5.
  • Cost / latency tracking — Phase 1.5.
  • Cross-call state (sessions, rate-limit tracking, per-model availability) — Phase 1.5.
  • Scoring-driven ordering — trivial in Phase 0 (1 model).

9. Conclusion

The contract above encodes ADR-005 §Decision invariants as types + test assertions. Step 3 (packet) lays out the three-file implementation plan and the 10–15 tests. Step 4 implements; Step 5 verifies.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
