P1.5.5 — N-member Fallback Chain + Circuit Breaker — Audit

Round: R92 Wave 4 of 7
Branch: feature/p1-5-5-fallback-cb
Base: origin/main @ 94ce7f8c
Dispatch: docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md §P1.5.5
Scope override: Wave 3 fold-in (re-export 3 adapters from src/domains/router/index.ts).

1. Goal

Graduate the δ Model Router from the Phase 0 single-member stub (P0.5.2, R75 Wave I) to a real N-member cascade gated by an in-memory circuit breaker. Replace the Phase 0 body of routeRequest so that scoring-descending order drives the chain, each attempt is bounded by a 30-second timeout, and three consecutive failures on the same model_id open a 60-second cooldown.

Additional Wave 3 fold-in: add three adapter re-exports to src/domains/router/index.ts so the router barrel exposes createKimiCompletion, createCodexCompletion, and createOpenAiCompletion along with their option / error types. The three sibling W3 PRs (#253/#254/#255) intentionally skipped editing index.ts to avoid a three-way merge race; P1.5.5 picks the fold-in up because it is the first downstream consumer of all three adapters.

2. Surface inventory

2.1. Files to modify

| Path | Current state | Change |
| --- | --- | --- |
| src/domains/router/fallback.ts | Phase 0 single-call body (357 lines). 13 invariants documented. ROUTER_PHASE_0_SHAPE.members === 1, hasCircuitBreaker === false, modelsSupported === ['claude']. | Replace body of routeRequest. Flip ROUTER_PHASE_0_SHAPE literals. Add adapter registry. Re-export getCircuitBreakerState() and resetCircuitBreaker(modelId?) from ./circuit.js. |
| src/domains/router/index.ts | 12 lines. Re-exports ./scoring.js + ./fallback.js. | Add export * lines for the 3 adapter modules (./adapters/kimi.js, ./adapters/codex.js, ./adapters/openai.js). |
| src/__tests__/domains/router/fallback.test.ts | 615 lines covering 17 Phase 0 ACs. Asserts ROUTER_PHASE_0_SHAPE.members === 1 and the ZERO-cascade invariant. | Drop Phase-0-only assertions (members === 1, “calls completionFn exactly 1×”). Add cascade / CB / timeout / shape-flip coverage. |

2.2. Files to create

| Path | Purpose |
| --- | --- |
| src/domains/router/circuit.ts | In-memory per-model circuit breaker state. Constants CIRCUIT_FAILURE_THRESHOLD = 3 and CIRCUIT_COOLDOWN_MS = 60_000. Pure module with an injectable nowFn for fake-timer tests. Exports recordFailure, recordSuccess, isOpen, resetIfElapsed, snapshot, resetCircuitBreaker, getCircuitBreakerState. |
| src/__tests__/domains/router/circuit.test.ts | Trip + reset + manual-reset + snapshot tests. Uses an injected nowFn (no Jest fake timers needed since the module clock is pluggable). |

2.3. Files NOT modified

  • src/domains/router/adapters/{kimi,codex,openai}.ts — adapters are W3 territory and stable.
  • src/domains/router/scoring.ts — P1.5.5 imports scoreIntent and ModelId; no edits.
  • src/domains/router/scoring-weights.ts — referenced indirectly via scoring.
  • src/domains/integrations/claude.ts — adapters reference this; not touched.
  • src/db/* — CB state is in-memory only this round (per dispatch forbiddens).
  • src/server.ts — no MCP tool registration this round (P1.5.7 scope).

3. Exports that MUST be preserved (signatures unchanged)

From src/domains/router/fallback.ts:

  • routeRequest(prompt, options?) → Promise<RouteResult>
  • FallbackChainExhaustedError (class — attempts and cause properties preserved)
  • RouteOptions interface
  • RouteResult interface
  • FallbackAttempt interface
  • CompletionFn type
  • CompletionFnOptions interface
  • ScoringFn type

The dispatch packet explicitly forbids appending costUsd or modelsAttempted to RouteResult — those land in P1.5.6 W5.

The dispatch packet allows new exports — specifically getCircuitBreakerState() and resetCircuitBreaker(modelId?) — for consumption by P1.5.7’s router_fallback tool.

4. Adapter signatures (from W3 PRs)

| Adapter | Plain entry | Tools entry | Config error class | API error class | Env key |
| --- | --- | --- | --- | --- | --- |
| Claude (W0) | createCompletion(prompt, opts) | createCompletionWithTools(prompt, tools, opts) | AnthropicConfigError | AnthropicApiError | ANTHROPIC_API_KEY |
| Kimi (W3 #254) | createKimiCompletion(prompt, opts) | createKimiCompletionWithTools(prompt, tools, opts) | KimiConfigError | KimiApiError | COLIBRI_KIMI_API_KEY |
| Codex (W3 #253) | createCodexCompletion(prompt, opts) | createCodexCompletionWithTools(prompt, tools, opts) | CodexConfigError | CodexApiError | COLIBRI_CODEX_API_KEY |
| OpenAI (W3 #255) | createOpenAiCompletion(prompt, opts) | createOpenAiCompletionWithTools(prompt, tools, opts) | OpenAiConfigError | OpenAiApiError | COLIBRI_OPENAI_API_KEY |

All four adapters return CompletionResult (from ../integrations/claude.ts). Their plain entry-point signatures align with CompletionFn. OpenAI’s tools entry takes ReadonlyArray<OpenAiTool> (NOT AnthropicTool[]) while Kimi/Codex/Claude take AnthropicTool[]. P1.5.5 routes via the plain entry because P1.5.5 does NOT extend tool routing into the multi-adapter case (the existing defaultCompletionFn dispatcher remains Claude-only for tools — a future P1.5.6+ task may generalize it). Tools support across adapters is out of scope for P1.5.5.

5. ModelId union (current)

From src/domains/router/scoring.ts:

export type ModelId =
  | 'claude'
  | 'claude-sonnet-3-5'
  | 'claude-haiku-3-5'
  | 'gpt-4o'
  | 'gpt-4o-mini'
  | 'gemini-1-5-pro'
  | 'llama-3-3-70b'
  | 'mixtral-8x22b'
  | 'kimi-k2';

ModelId is unchanged in P1.5.5. The router’s chain order is derived from scoreIntent().scores sorted descending. In practice the candidate cohort visible to the router is whatever the caller injects via context.candidatesSnapshot; when the snapshot is omitted, the scoring path returns the Phase 0 frozen constant (winner 'claude').
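The ordering step can be sketched in TypeScript as follows. This is a minimal illustration, assuming decision.scores is a plain record of model ID to numeric score. The helper name chainOrderFromScores is hypothetical, and the explicit ASCII-ascending tie-break in the comparator is an assumption added here so the sketch is deterministic on its own, independent of whatever ordering scoreIntent already provides.

```typescript
// Hypothetical helper: derive the fallback chain from a score map.
// Higher scores come first; equal scores fall back to ASCII-ascending
// model IDs so the resulting chain is deterministic.
function chainOrderFromScores(
  scores: Readonly<Record<string, number>>,
): string[] {
  return Object.entries(scores)
    .sort(([idA, scoreA], [idB, scoreB]) => {
      if (scoreA !== scoreB) {
        return scoreB - scoreA; // descending by score
      }
      return idA < idB ? -1 : idA > idB ? 1 : 0; // ASCII asc on ties
    })
    .map(([modelId]) => modelId);
}
```

With this comparator, two candidates that score equally always appear in the same relative order, so repeated calls with the same snapshot walk the same chain.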

The adapter registry maps from ModelId to a CompletionFn. The current cohort with shipped adapters is:

  • 'claude' → createCompletion (W0)
  • 'claude-sonnet-3-5' → createCompletion (W0; treated as the same upstream as the abstract claude ID — the version override lives in options.model)
  • 'claude-haiku-3-5' → createCompletion (W0; same — variant via options.model)
  • 'kimi-k2' → createKimiCompletion (W3)
  • 'gpt-4o' → createOpenAiCompletion (W3)
  • 'gpt-4o-mini' → createOpenAiCompletion (W3; variant via options.model)

The three currently-unshipped IDs ('gemini-1-5-pro', 'llama-3-3-70b', 'mixtral-8x22b') have NO adapter as of R92 Wave 4. The registry will return undefined for these and the router treats that as “no adapter available” — equivalent to a permanently-tripped breaker, recorded in attempts with a typed error so chain exhaustion is observable.

The Codex adapter (createCodexCompletion) is wired in for completeness through the fold-in re-export, but no current ModelId maps to it, so it has no registry key. It ships ahead of the matching DB seed; Codex will be added under a future ModelId expansion. Until then it is available from the barrel for downstream callers and for the test harness, but does NOT participate in the default chain order.
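A rough TypeScript sketch of the registry lookup described above. The types are simplified stand-ins (the real CompletionFn takes options and returns a CompletionResult), the three stub functions stand in for createCompletion, createKimiCompletion, and createOpenAiCompletion, and resolveAdapter and the NoAdapterError shape are hypothetical names used only for illustration.

```typescript
// Stand-in types: the real ModelId lives in scoring.ts and the real
// CompletionFn in fallback.ts; both are simplified for this sketch.
type ModelId =
  | 'claude' | 'claude-sonnet-3-5' | 'claude-haiku-3-5'
  | 'gpt-4o' | 'gpt-4o-mini' | 'gemini-1-5-pro'
  | 'llama-3-3-70b' | 'mixtral-8x22b' | 'kimi-k2';
type CompletionFn = (prompt: string) => Promise<string>;

// Stubs standing in for the shipped adapters (assumption: not the real ones).
const claudeStub: CompletionFn = async (prompt) => `claude:${prompt}`;
const kimiStub: CompletionFn = async (prompt) => `kimi:${prompt}`;
const openAiStub: CompletionFn = async (prompt) => `openai:${prompt}`;

// Partial record: IDs without a shipped adapter simply have no key.
const adapterRegistry: Partial<Record<ModelId, CompletionFn>> = {
  'claude': claudeStub,
  'claude-sonnet-3-5': claudeStub,
  'claude-haiku-3-5': claudeStub,
  'kimi-k2': kimiStub,
  'gpt-4o': openAiStub,
  'gpt-4o-mini': openAiStub,
};

// Hypothetical typed error recorded in attempts when no adapter exists.
class NoAdapterError extends Error {
  readonly modelId: ModelId;
  constructor(modelId: ModelId) {
    super(`no adapter registered for ${modelId}`);
    this.name = 'NoAdapterError';
    this.modelId = modelId;
  }
}

// Returns the adapter, or a typed error the caller can push onto attempts.
function resolveAdapter(modelId: ModelId): CompletionFn | NoAdapterError {
  return adapterRegistry[modelId] ?? new NoAdapterError(modelId);
}
```

Returning the error as a value rather than throwing keeps the "no adapter" case on the same path as an open breaker: the walk records it and moves on to the next chain member.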

6. ROUTER_PHASE_0_SHAPE flip plan

Current literals (Phase 0):

export const ROUTER_PHASE_0_SHAPE: {
  readonly members: 1;
  readonly hasCircuitBreaker: false;
  readonly modelsSupported: readonly ['claude'];
} = Object.freeze({
  members: 1,
  hasCircuitBreaker: false,
  modelsSupported: Object.freeze(['claude'] as const),
} as const);

Phase 1.5 literals (after flip):

The flipped marker reflects the adapter-bound chain (the set of ModelIds for which a concrete CompletionFn exists in the registry). Four adapters have shipped, but Codex has no ModelId yet, so the other three adapters account for all 6 entries:

  • 'claude', 'claude-sonnet-3-5', 'claude-haiku-3-5' (all via createCompletion)
  • 'kimi-k2' (via createKimiCompletion)
  • 'gpt-4o', 'gpt-4o-mini' (via createOpenAiCompletion)

(The three IDs without a shipped adapter are absent from this list, and Codex contributes no entry because no ModelId maps to it. modelsSupported is “what the chain can actually try”, not “every ModelId the type union admits”.)

export const ROUTER_PHASE_0_SHAPE: {
  readonly members: 6;
  readonly hasCircuitBreaker: true;
  readonly modelsSupported: readonly [
    'claude',
    'claude-haiku-3-5',
    'claude-sonnet-3-5',
    'gpt-4o',
    'gpt-4o-mini',
    'kimi-k2',
  ];
} = Object.freeze({ ... } as const);

The shape name is preserved for backward-compatible imports — callers that asserted Phase 0 literals fail loudly (as designed by the original P0.5.2 contract — “If you find yourself leaving members: 1 to keep old tests passing, you’ve missed the whole point of the marker”).

7. Circuit breaker semantics (target)

Per the dispatch packet and P0.5.2 heritage (with namespace flip):

  • State per model: { failures: number, openedAt: number | null }. Stored in an in-memory Map<ModelId, CircuitState>. No DB write.
  • recordFailure(modelId): increments failures. At failures >= CIRCUIT_FAILURE_THRESHOLD (=3), sets openedAt = now.
  • recordSuccess(modelId): clears failures to 0 but does NOT clear openedAt, because the open-state cooldown is time-bound, not success-bound. A success recorded during the open window therefore neither bypasses the cooldown nor extends it: the failure counter returns to 0, while openedAt is cleared only by resetIfElapsed once the window has expired, or by a manual reset.
  • isOpen(modelId, now): true iff openedAt !== null AND (now - openedAt) < CIRCUIT_COOLDOWN_MS (=60_000).
  • resetIfElapsed(modelId, now): when the 60s window has passed (openedAt !== null && (now - openedAt) >= CIRCUIT_COOLDOWN_MS), clear openedAt and failures. This is the “time-bound reset”, called for each candidate during the routeRequest walk, immediately before that candidate's isOpen check.
  • resetCircuitBreaker(modelId?): manual reset. With argument: clear that model. Without: clear the whole map. For P1.5.7’s router_fallback MCP tool.
  • snapshot(): returns a ReadonlyMap<ModelId, Readonly<CircuitState>> for observability (also re-exported as getCircuitBreakerState).
  • Clock injection: nowFn(): number parameter to every function that consults the clock. Defaults to Date.now. Tests inject a deterministic clock (no need for Jest fake timers).
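The semantics listed above could look roughly like this in TypeScript. It is a minimal sketch, not the shipped circuit.ts: ModelId is widened to string, the private stateFor helper is hypothetical, and two details are assumptions the audit does not settle, namely that recordFailure takes nowFn while isOpen and resetIfElapsed take a numeric now, and that a failure on an already-open breaker leaves openedAt unchanged.

```typescript
type ModelId = string; // stand-in for the union in scoring.ts
interface CircuitState { failures: number; openedAt: number | null; }
type NowFn = () => number;

const CIRCUIT_FAILURE_THRESHOLD = 3;
const CIRCUIT_COOLDOWN_MS = 60_000;
const circuits = new Map<ModelId, CircuitState>();

// Hypothetical helper: fetch or lazily create the state for a model.
function stateFor(modelId: ModelId): CircuitState {
  let state = circuits.get(modelId);
  if (state === undefined) {
    state = { failures: 0, openedAt: null };
    circuits.set(modelId, state);
  }
  return state;
}

function recordFailure(modelId: ModelId, nowFn: NowFn = Date.now): void {
  const state = stateFor(modelId);
  state.failures += 1;
  // Assumption: an already-open breaker keeps its original openedAt.
  if (state.failures >= CIRCUIT_FAILURE_THRESHOLD && state.openedAt === null) {
    state.openedAt = nowFn();
  }
}

function recordSuccess(modelId: ModelId): void {
  stateFor(modelId).failures = 0; // openedAt is deliberately untouched
}

function isOpen(modelId: ModelId, now: number): boolean {
  const state = circuits.get(modelId);
  return state !== undefined && state.openedAt !== null
    && now - state.openedAt < CIRCUIT_COOLDOWN_MS;
}

function resetIfElapsed(modelId: ModelId, now: number): void {
  const state = circuits.get(modelId);
  if (state !== undefined && state.openedAt !== null
      && now - state.openedAt >= CIRCUIT_COOLDOWN_MS) {
    state.openedAt = null;
    state.failures = 0;
  }
}

function resetCircuitBreaker(modelId?: ModelId): void {
  if (modelId === undefined) {
    circuits.clear();
  } else {
    circuits.delete(modelId);
  }
}

// Returns copies so callers cannot mutate live breaker state.
function snapshot(): ReadonlyMap<ModelId, Readonly<CircuitState>> {
  const copy = new Map<ModelId, Readonly<CircuitState>>();
  circuits.forEach((state, id) => copy.set(id, { ...state }));
  return copy;
}
const getCircuitBreakerState = snapshot;
```

Because every time-dependent call receives its clock value from the caller, a test can advance a plain number to step across the 60-second boundary without touching Jest fake timers.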

8. Timeout semantics

  • Per-attempt 30s timeout (COLIBRI_MODEL_TIMEOUT). Default 30_000 ms.
  • Implemented via Promise.race([adapterCall, timeoutPromise]) plus an AbortController that the timeout branch aborts before it rejects. The controller is aborted unconditionally when the timer fires, so an adapter that honours the signal stops work at the deadline. An adapter that ignores the signal still settles eventually, but the race has already settled by then and the late result is discarded rather than leaked to the caller.
  • A timeout is treated as a failed attempt: recordFailure(modelId) is called and the attempt is appended to attempts with a typed RouterTimeoutError. The next chain member is then tried.
  • No setTimeout outside the Promise.race guard, per dispatch forbiddens.
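One way the race-plus-abort pattern above might be written in TypeScript. This is a sketch under assumptions: the raceWithTimeout signature here takes a callback that receives the AbortSignal (the chain-walk pseudocode passes an already-started promise instead), and the RouterTimeoutError fields are hypothetical beyond the class name given in this audit.

```typescript
// Typed error for a timed-out attempt; the timeoutMs field is an assumption.
class RouterTimeoutError extends Error {
  readonly timeoutMs: number;
  constructor(timeoutMs: number) {
    super(`adapter call exceeded ${timeoutMs} ms`);
    this.name = 'RouterTimeoutError';
    this.timeoutMs = timeoutMs;
  }
}

// Runs `call` against a deadline. The only setTimeout lives inside the
// Promise.race guard, and the timer is always cleared once the race settles.
async function raceWithTimeout<T>(
  call: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // signal the adapter first...
      reject(new RouterTimeoutError(timeoutMs)); // ...then fail the race
    }, timeoutMs);
  });
  try {
    return await Promise.race([call(controller.signal), timeout]);
  } finally {
    if (timer !== undefined) {
      clearTimeout(timer); // no stray timer after a fast success or failure
    }
  }
}
```

A caller would wrap the adapter invocation, for example raceWithTimeout((signal) => adapter(prompt, { signal }), 30_000), where passing signal through the adapter options is itself an assumption about the adapter API.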

9. Chain-walk algorithm

async routeRequest(prompt, options):
    scoring = options.scoringFn ?? scoreIntent
    decision = scoring(prompt, options)
    chainOrder = sort(Object.entries(decision.scores), descending by score)
                  .map(([modelId, _]) => modelId)
                  // ensure stable ordering for ties (scoring already does ASCII asc tie-break,
                  //  so the descending sort yields a deterministic chain)

    attempts = []
    completionFn = options.completionFn  // tests inject this directly
    timeoutMs = readTimeoutEnv()

    for modelId in chainOrder:
        resetIfElapsed(modelId, nowFn())
        if isOpen(modelId, nowFn()):
            attempts.push({ model: modelId, error: CircuitOpenError(modelId) })
            continue

        adapter = completionFn ?? adapterRegistry[modelId]
        if adapter is undefined:
            attempts.push({ model: modelId, error: NoAdapterError(modelId) })
            continue

        try:
            upstream = await raceWithTimeout(
                adapter(prompt, projectUpstreamOptions(options, modelId)),
                timeoutMs,
            )
            recordSuccess(modelId)
            return RouteResult.from(modelId, upstream)
        catch err:
            recordFailure(modelId)
            attempts.push({ model: modelId, error: err })

    throw FallbackChainExhaustedError(attempts)

The options.completionFn injection seam is preserved for tests. When present, it shadows the adapter registry for every chain member. P0.5.2 tests injected it, and P1.5.5 tests that inject it still observe cascade behavior: the mock can throw on its first call and succeed on later ones, tracked by an internal counter or sequence.

If we want per-model mock dispatch in tests, we expose a parallel options.completionFnRegistry?: Partial<Record<ModelId, CompletionFn>> seam, falling back to options.completionFn if the per-model entry is missing. This keeps P0.5.2-style tests compiling unchanged while enabling true cascade tests.
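The per-model seam could be resolved along these lines. This TypeScript sketch is illustrative only: SeamOptions, resolveCompletionFn, and walkChain are hypothetical names, CompletionFn is simplified to a string-returning function, the return shape is not the real RouteResult, and the breaker and timeout steps of the walk are omitted to keep the lookup order visible.

```typescript
type ModelId = string; // stand-in for the union in scoring.ts
type CompletionFn = (prompt: string) => Promise<string>;
type CompletionFnRegistry = Partial<Record<ModelId, CompletionFn>>;

interface SeamOptions {
  completionFn?: CompletionFn;
  completionFnRegistry?: CompletionFnRegistry;
}
interface Attempt { model: ModelId; error: unknown; }

// Lookup order: per-model mock, then the shared mock, then the real registry.
function resolveCompletionFn(
  modelId: ModelId,
  options: SeamOptions,
  adapterRegistry: CompletionFnRegistry,
): CompletionFn | undefined {
  return options.completionFnRegistry?.[modelId]
    ?? options.completionFn
    ?? adapterRegistry[modelId];
}

// Simplified walk: no breaker, no timeout, only the cascade itself.
async function walkChain(
  chain: readonly ModelId[],
  prompt: string,
  options: SeamOptions,
  adapterRegistry: CompletionFnRegistry = {},
): Promise<{ model: ModelId; text: string; attempts: Attempt[] }> {
  const attempts: Attempt[] = [];
  for (const modelId of chain) {
    const fn = resolveCompletionFn(modelId, options, adapterRegistry);
    if (fn === undefined) {
      attempts.push({ model: modelId, error: new Error(`no adapter for ${modelId}`) });
      continue;
    }
    try {
      const text = await fn(prompt);
      return { model: modelId, text, attempts };
    } catch (err) {
      attempts.push({ model: modelId, error: err }); // fall through to next member
    }
  }
  throw new Error(`chain exhausted after ${attempts.length} attempts`);
}
```

A test that omits completionFnRegistry and passes only completionFn gets the same function for every chain member, which is the P0.5.2-compatible behavior.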

10. Phase 0 test assertions that will fail post-flip (deliberate breakage)

From src/__tests__/domains/router/fallback.test.ts:

| Line | Assertion | Action |
| --- | --- | --- |
| 434 | expect(ROUTER_PHASE_0_SHAPE.members).toBe(1) | Rewrite to assert members >= 4 and members === modelsSupported.length. |
| 437–438 | expect(ROUTER_PHASE_0_SHAPE.hasCircuitBreaker).toBe(false) | Flip to .toBe(true). |
| 441–443 | expect(ROUTER_PHASE_0_SHAPE.modelsSupported).toEqual(['claude']) and toHaveLength(1) | Replace with assertion that modelsSupported includes 'claude' and has length >= 4 (exact shape pinned in a parallel test that mirrors the literal). |
| 306 | expect(caught?.attempts).toHaveLength(1) | Replace with a chain-exhaustion test (attempts has length N). Add a fail-then-succeed cascade test: that path no longer throws, so it asserts RouteResult.model === 'B' and that mockFn was called for both models. |
| 383 | expect(calls).toHaveLength(1) (ZERO-cascade invariant) | DELETE. ADR-005 Phase 0 invariant retired by the flip. Replace with cascade-progression test: when the per-model completionFn registry exposes A→fails / B→succeeds, both adapters are called. |
| 396 | Same shape as above for AnthropicConfigError | DELETE / replace identically. |
| 416–425 | “different prompts route to the same model (Phase 0 single-member)” | DELETE. Phase 1.5 chain order is prompt-sensitive via scoring. |

The tests for happy-path RouteResult shape (AC1, AC3, AC4–AC6, AC10–AC11, AC15–AC17) all stay as-is — the contract honours the same RouteResult/FallbackAttempt/RouteOptions shapes, and attempts[0].model === 'claude' remains true on the typical “scoring picks claude, claude fails” path (the chain order has claude first because the default Phase 1.5 scoring with no candidates snapshot still returns claude=1.0).

The “default dispatcher” tests (using a real fetchFn mock) continue to exercise the Claude adapter path because the scoring’s empty-snapshot fallback keeps claude as the winner, and the first chain member is therefore claude. Those tests remain unmodified.

11. Heritage references

  • P0.5.2 dispatch packet (docs/guides/implementation/task-prompts/p0.5-delta-router.md §P0.5.2) — 8-model fallback + CB; namespace flipped (AMS_MODEL_* → COLIBRI_*).
  • ADR-005 §Decision §Implementation step 3 — flip ROUTER_PHASE_0_SHAPE literals as the trip-wire; signal intentional Phase boundary crossing.
  • δ concept doc (docs/3-world/social/llm.md §Fallback chain) — chain ordering, ζ recording per attempt (ζ recording itself is out of scope for P1.5.5 — landing in P1.5.7’s router_fallback tool emission per W6).
  • Existing P0.5.2 contract / packet (docs/contracts/p0-5-2-fallback-contract.md / docs/packets/p0-5-2-fallback-packet.md) — Phase 0 17-AC list; the post-flip ACs replace ACs 8, 9, 12, 13, 14 wholesale.

12. Risks

| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| Promise.race leaks if adapter resolves post-timeout | High without AbortController | Pass signal to adapter; abort unconditionally on timer fire. |
| Tests using Jest fake timers cross-leak into circuit state | Medium | Inject nowFn per call; default Date.now only used in production path. |
| ROUTER_PHASE_0_SHAPE literal flip breaks downstream consumers in unrelated test files | Low (no current callers outside fallback.test.ts) | Grep for ROUTER_PHASE_0_SHAPE; only the single test file references it. |
| defaultCompletionFn dispatcher gets routed to non-claude adapter but uses Claude-shaped tools | Low | P1.5.5 keeps tools-path dispatch Claude-only (matches existing AC15). |
| Adapter registry init runs at module load and forces all adapter imports to evaluate | Low | The adapter modules are pure-import safe: their process.env reads happen at call-time per the W3 adapters. |
| Fold-in re-export name collision (CompletionResult exported by three sources) | Medium | OpenAI exports CompletionResult directly; the others re-export from ../integrations/claude.js. Wave 3 fold-in: use a Kimi-first export * ordering so the symbol resolves via the Claude integration’s original; deconflict by named subset re-export if TypeScript reports an ambient conflict. Verified empirically during Step 4. |

13. Acceptance criteria (from dispatch + P1.5.5 staging)

  1. routeRequest signature + return shape unchanged.
  2. Chain order from scoreIntent (descending).
  3. Per-attempt 30s timeout via COLIBRI_MODEL_TIMEOUT.
  4. CB: 3 consecutive fails → 60s open; time-bound reset; per-model.
  5. All-tripped → FallbackChainExhaustedError with N attempts.
  6. ROUTER_PHASE_0_SHAPE literals flipped (members, hasCircuitBreaker, modelsSupported).
  7. getCircuitBreakerState() + resetCircuitBreaker(modelId?) exports present.
  8. Wave-3 fold-in: src/domains/router/index.ts re-exports adapters.
  9. npm run build && npm run lint && npm test green.

14. Audit close

Inventory complete. Ready to write the behavioral contract (Step 2).



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
