P1.5.5 — N-member Fallback Chain + Circuit Breaker — Audit
Round: R92 Wave 4 of 7
Branch: feature/p1-5-5-fallback-cb
Base: origin/main @ 94ce7f8c
Dispatch: docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md §P1.5.5
Scope override: Wave 3 fold-in (re-export 3 adapters from src/domains/router/index.ts).
1. Goal
Graduate the δ Model Router from the Phase 0 single-member stub (P0.5.2, R75 Wave I) to a real N-member cascade gated by an in-memory circuit breaker. Replace the Phase 0 body of routeRequest so that scoring-descending order drives the chain, each attempt is bounded by a 30-second timeout, and three consecutive failures on the same model_id open a 60-second cooldown.
Additional Wave 3 fold-in: add three adapter re-exports to src/domains/router/index.ts so the router barrel exposes createKimiCompletion, createCodexCompletion, and createOpenAiCompletion along with their option / error types. The three sibling W3 PRs (#253/#254/#255) intentionally skipped editing index.ts to avoid a three-way merge race; P1.5.5 picks the fold-in up because it is the first downstream consumer of all three adapters.
2. Surface inventory
2.1. Files to modify
| Path | Current state | Change |
|---|---|---|
| `src/domains/router/fallback.ts` | Phase 0 single-call body (357 lines). 13 invariants documented. `ROUTER_PHASE_0_SHAPE.members === 1`, `hasCircuitBreaker === false`, `modelsSupported === ['claude']`. | Replace body of `routeRequest`. Flip `ROUTER_PHASE_0_SHAPE` literals. Add adapter registry. Export `getCircuitBreakerState()` and `resetCircuitBreaker(modelId?)`, re-exported from `./circuit.js`. |
| `src/domains/router/index.ts` | 12 lines. Re-exports `./scoring.js` + `./fallback.js`. | Add `export *` lines for the 3 adapter modules (`./adapters/kimi.js`, `./adapters/codex.js`, `./adapters/openai.js`). |
| `src/__tests__/domains/router/fallback.test.ts` | 615 lines covering 17 Phase 0 ACs. Asserts `ROUTER_PHASE_0_SHAPE.members === 1` and the ZERO-cascade invariant. | Drop Phase-0-only assertions (`members === 1`, “calls `completionFn` exactly 1×”). Add cascade / CB / timeout / shape-flip coverage. |
2.2. Files to create
| Path | Purpose |
|---|---|
| `src/domains/router/circuit.ts` | In-memory per-model circuit breaker state. Constants `CIRCUIT_FAILURE_THRESHOLD = 3` and `CIRCUIT_COOLDOWN_MS = 60_000`. Pure module: `nowFn` is injectable for fake-timer tests. Exports `recordFailure`, `recordSuccess`, `isOpen`, `resetIfElapsed`, `snapshot`, `resetCircuitBreaker`, `getCircuitBreakerState`. |
| `src/__tests__/domains/router/circuit.test.ts` | Trip + reset + manual-reset + snapshot tests. Uses an injected `nowFn` (no Jest fake timers needed since the module clock is pluggable). |
2.3. Files NOT modified
- `src/domains/router/adapters/{kimi,codex,openai}.ts`: adapters are W3 territory and stable.
- `src/domains/router/scoring.ts`: P1.5.5 imports `scoreIntent` and `ModelId`; no edits.
- `src/domains/router/scoring-weights.ts`: referenced indirectly via scoring.
- `src/domains/integrations/claude.ts`: adapters reference this; not touched.
- `src/db/*`: CB state is in-memory only this round (per dispatch forbiddens).
- `src/server.ts`: no MCP tool registration this round (P1.5.7 scope).
3. Exports that MUST be preserved (signatures unchanged)
From src/domains/router/fallback.ts:
- `routeRequest(prompt, options?) → Promise<RouteResult>`
- `FallbackChainExhaustedError` (class; `attempts` and `cause` properties preserved)
- `RouteOptions` interface
- `RouteResult` interface
- `FallbackAttempt` interface
- `CompletionFn` type
- `CompletionFnOptions` interface
- `ScoringFn` type
The dispatch packet explicitly forbids appending costUsd or modelsAttempted to RouteResult — those land in P1.5.6 W5.
The dispatch packet allows new exports — specifically getCircuitBreakerState() and resetCircuitBreaker(modelId?) — for consumption by P1.5.7’s router_fallback tool.
4. Adapter signatures (from W3 PRs)
| Adapter | Plain entry | Tools entry | Config error class | API error class | Env key |
|---|---|---|---|---|---|
| Claude (W0) | `createCompletion(prompt, opts)` | `createCompletionWithTools(prompt, tools, opts)` | `AnthropicConfigError` | `AnthropicApiError` | `ANTHROPIC_API_KEY` |
| Kimi (W3 #254) | `createKimiCompletion(prompt, opts)` | `createKimiCompletionWithTools(prompt, tools, opts)` | `KimiConfigError` | `KimiApiError` | `COLIBRI_KIMI_API_KEY` |
| Codex (W3 #253) | `createCodexCompletion(prompt, opts)` | `createCodexCompletionWithTools(prompt, tools, opts)` | `CodexConfigError` | `CodexApiError` | `COLIBRI_CODEX_API_KEY` |
| OpenAI (W3 #255) | `createOpenAiCompletion(prompt, opts)` | `createOpenAiCompletionWithTools(prompt, tools, opts)` | `OpenAiConfigError` | `OpenAiApiError` | `COLIBRI_OPENAI_API_KEY` |
All four adapters return CompletionResult (from ../integrations/claude.ts). Their plain entry-point signatures align with CompletionFn. OpenAI’s tools entry takes ReadonlyArray<OpenAiTool> (NOT AnthropicTool[]) while Kimi/Codex/Claude take AnthropicTool[]. P1.5.5 routes via the plain entry because P1.5.5 does NOT extend tool routing into the multi-adapter case (the existing defaultCompletionFn dispatcher remains Claude-only for tools — a future P1.5.6+ task may generalize it). Tools support across adapters is out of scope for P1.5.5.
5. ModelId union (current)
From src/domains/router/scoring.ts:
export type ModelId =
  | 'claude'
  | 'claude-sonnet-3-5'
  | 'claude-haiku-3-5'
  | 'gpt-4o'
  | 'gpt-4o-mini'
  | 'gemini-1-5-pro'
  | 'llama-3-3-70b'
  | 'mixtral-8x22b'
  | 'kimi-k2';
ModelId is unchanged in P1.5.5. The router’s chain order is derived from scoreIntent().scores sorted descending. In practice the candidate cohort visible to the router is whatever the caller injects via context.candidatesSnapshot; when the snapshot is omitted, the scoring path returns the Phase 0 frozen constant (winner 'claude').
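As a rough illustration of "scores sorted descending", the sketch below derives a chain order from a scores map. The helper name `chainOrderFromScores`, the trimmed three-member `ModelId` union, and the explicit ASCII-ascending tie-break inside the comparator are assumptions made for this example; the real ordering lives in `scoreIntent` and the router.

```typescript
// Trimmed stand-in for the real ModelId union (illustration only).
type ModelId = 'claude' | 'gpt-4o' | 'kimi-k2';

// Hypothetical helper: order candidates by score, highest first.
// Ties are broken by ascending model id so the result is deterministic.
function chainOrderFromScores(scores: Readonly<Record<ModelId, number>>): ModelId[] {
  return (Object.entries(scores) as [ModelId, number][])
    .sort(([idA, scoreA], [idB, scoreB]) => {
      if (scoreA !== scoreB) return scoreB - scoreA; // descending by score
      return idA < idB ? -1 : idA > idB ? 1 : 0;     // ascending by id on ties
    })
    .map(([modelId]) => modelId);
}

const exampleOrder = chainOrderFromScores({ 'kimi-k2': 0.7, claude: 1.0, 'gpt-4o': 0.7 });
console.log(exampleOrder); // [ 'claude', 'gpt-4o', 'kimi-k2' ]
```

With an empty snapshot the scoring path returns the frozen constant, so under this sketch `'claude'` would still come out first.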
The adapter registry maps from ModelId to a CompletionFn. The current cohort with shipped adapters is:
- `'claude'` → `createCompletion` (W0)
- `'claude-sonnet-3-5'` → `createCompletion` (W0; treated as the same upstream as the abstract `claude` ID; the version override lives in `options.model`)
- `'claude-haiku-3-5'` → `createCompletion` (W0; same, variant via `options.model`)
- `'kimi-k2'` → `createKimiCompletion` (W3)
- `'gpt-4o'` → `createOpenAiCompletion` (W3)
- `'gpt-4o-mini'` → `createOpenAiCompletion` (W3; variant via `options.model`)
The three currently-unshipped IDs ('gemini-1-5-pro', 'llama-3-3-70b', 'mixtral-8x22b') have NO adapter as of R92 Wave 4. The registry will return undefined for these and the router treats that as “no adapter available” — equivalent to a permanently-tripped breaker, recorded in attempts with a typed error so chain exhaustion is observable.
The Codex adapter (createCodexCompletion) ships ahead of the matching DB seed: it is included in the fold-in re-export for completeness, but no current ModelId maps to it (Codex is added under a future ModelId expansion). It is therefore available from the barrel to downstream callers and to the test harness, but does NOT participate in the default chain order.
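A minimal sketch of the registry shape described in this section, assuming a `Partial<Record<ModelId, CompletionFn>>` and a lookup that turns a missing entry into a typed error. The adapter functions here are local stubs, `CompletionResult` and `CompletionFn` are simplified stand-ins for the real types, and `lookupAdapter` is a hypothetical helper name; `NoAdapterError` follows the name used in the chain-walk pseudocode.

```typescript
type ModelId =
  | 'claude' | 'claude-sonnet-3-5' | 'claude-haiku-3-5'
  | 'gpt-4o' | 'gpt-4o-mini'
  | 'gemini-1-5-pro' | 'llama-3-3-70b' | 'mixtral-8x22b'
  | 'kimi-k2';

// Simplified stand-ins for the real result and adapter types.
interface CompletionResult { text: string }
type CompletionFn = (prompt: string, opts?: { model?: string }) => Promise<CompletionResult>;

// Stubs standing in for the shipped adapters (no network calls here).
const makeStub = (label: string): CompletionFn =>
  async (prompt) => ({ text: `${label}:${prompt}` });
const createCompletion = makeStub('claude');
const createKimiCompletion = makeStub('kimi');
const createOpenAiCompletion = makeStub('openai');

// Six ModelIds have an adapter; the three unshipped ids are simply absent.
const adapterRegistry: Partial<Record<ModelId, CompletionFn>> = {
  'claude': createCompletion,
  'claude-sonnet-3-5': createCompletion,
  'claude-haiku-3-5': createCompletion,
  'kimi-k2': createKimiCompletion,
  'gpt-4o': createOpenAiCompletion,
  'gpt-4o-mini': createOpenAiCompletion,
};

class NoAdapterError extends Error {
  modelId: ModelId;
  constructor(modelId: ModelId) {
    super(`no adapter registered for ${modelId}`);
    this.name = 'NoAdapterError';
    this.modelId = modelId;
  }
}

// Hypothetical helper: a missing entry becomes a typed error the router
// can record in `attempts` instead of throwing.
function lookupAdapter(modelId: ModelId): CompletionFn | NoAdapterError {
  return adapterRegistry[modelId] ?? new NoAdapterError(modelId);
}
```

Returning the error as a value rather than throwing keeps the "no adapter" case on the same code path as an open breaker: the walk records it and moves on.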
6. ROUTER_PHASE_0_SHAPE flip plan
Current literals (Phase 0):
export const ROUTER_PHASE_0_SHAPE: {
  readonly members: 1;
  readonly hasCircuitBreaker: false;
  readonly modelsSupported: readonly ['claude'];
} = Object.freeze({
  members: 1,
  hasCircuitBreaker: false,
  modelsSupported: Object.freeze(['claude'] as const),
} as const);
Phase 1.5 literals (after flip):
The flipped marker reflects the adapter-bound chain (the set of ModelIds for which a concrete CompletionFn exists in the registry). With the four currently-shipped adapters that’s 6 entries:
- `'claude'`, `'claude-sonnet-3-5'`, `'claude-haiku-3-5'` (all via `createCompletion`)
- `'kimi-k2'` (via `createKimiCompletion`)
- `'gpt-4o'`, `'gpt-4o-mini'` (via `createOpenAiCompletion`)
(The three Codex-/unshipped-adapter IDs are absent from this list — modelsSupported is “what the chain can actually try”, not “every ModelId the type union admits”.)
export const ROUTER_PHASE_0_SHAPE: {
  readonly members: 6;
  readonly hasCircuitBreaker: true;
  readonly modelsSupported: readonly [
    'claude',
    'claude-haiku-3-5',
    'claude-sonnet-3-5',
    'gpt-4o',
    'gpt-4o-mini',
    'kimi-k2',
  ];
} = Object.freeze({ ... } as const);
The shape name is preserved for backward-compatible imports — callers that asserted Phase 0 literals fail loudly (as designed by the original P0.5.2 contract — “If you find yourself leaving members: 1 to keep old tests passing, you’ve missed the whole point of the marker”).
7. Circuit breaker semantics (target)
Per the dispatch packet and P0.5.2 heritage (with namespace flip):
- State per model: `{ failures: number, openedAt: number | null }`. Stored in an in-memory `Map<ModelId, CircuitState>`. No DB write.
- `recordFailure(modelId)`: increments `failures`. At `failures >= CIRCUIT_FAILURE_THRESHOLD` (=3), sets `openedAt = now`.
- `recordSuccess(modelId)`: clears `failures` to 0 (does NOT clear `openedAt`; the open-state cooldown is time-bound, not success-bound). A success during the open window therefore neither bypasses the cooldown nor extends it; it only resets the failure counter once the window has expired and the breaker has reset.
- `isOpen(modelId, now)`: `true` iff `openedAt !== null` AND `(now - openedAt) < CIRCUIT_COOLDOWN_MS` (=60_000).
- `resetIfElapsed(modelId, now)`: when the 60s window has passed (`openedAt !== null && (now - openedAt) >= COOLDOWN`), clear `openedAt` and `failures`. This is the “time-bound reset”, called at the start of each `routeRequest` walk for every candidate before the `isOpen` check.
- `resetCircuitBreaker(modelId?)`: manual reset. With argument: clear that model. Without: clear the whole map. For P1.5.7’s `router_fallback` MCP tool.
- `snapshot()`: returns a `ReadonlyMap<ModelId, Readonly<CircuitState>>` for observability (also re-exported as `getCircuitBreakerState`).
- Clock injection: `nowFn(): number` parameter to every function that consults the clock. Defaults to `Date.now`. Tests inject a deterministic clock (no need for Jest fake timers).
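The semantics above can be sketched as a small in-memory module. This is an illustration under stated assumptions, not the shipped `circuit.ts`: `ModelId` is widened to `string`, the `stateFor` helper is hypothetical, and the choice to pass `nowFn` to `recordFailure` while passing a plain `now` to `isOpen` and `resetIfElapsed` mirrors how the chain-walk pseudocode calls them.

```typescript
// Stand-in for the real ModelId union from scoring.ts.
type ModelId = string;

interface CircuitState {
  failures: number;
  openedAt: number | null;
}

const CIRCUIT_FAILURE_THRESHOLD = 3;
const CIRCUIT_COOLDOWN_MS = 60_000;

const circuits = new Map<ModelId, CircuitState>();

// Hypothetical helper: fetch or lazily create the state for one model.
function stateFor(modelId: ModelId): CircuitState {
  let state = circuits.get(modelId);
  if (state === undefined) {
    state = { failures: 0, openedAt: null };
    circuits.set(modelId, state);
  }
  return state;
}

function recordFailure(modelId: ModelId, nowFn: () => number = Date.now): void {
  const state = stateFor(modelId);
  state.failures += 1;
  if (state.failures >= CIRCUIT_FAILURE_THRESHOLD) state.openedAt = nowFn();
}

function recordSuccess(modelId: ModelId): void {
  stateFor(modelId).failures = 0; // openedAt is deliberately left alone
}

function isOpen(modelId: ModelId, now: number): boolean {
  const openedAt = circuits.get(modelId)?.openedAt ?? null;
  return openedAt !== null && now - openedAt < CIRCUIT_COOLDOWN_MS;
}

function resetIfElapsed(modelId: ModelId, now: number): void {
  const state = circuits.get(modelId);
  if (state && state.openedAt !== null && now - state.openedAt >= CIRCUIT_COOLDOWN_MS) {
    state.openedAt = null;
    state.failures = 0;
  }
}

function resetCircuitBreaker(modelId?: ModelId): void {
  if (modelId === undefined) circuits.clear();
  else circuits.delete(modelId);
}

function snapshot(): ReadonlyMap<ModelId, Readonly<CircuitState>> {
  const copy = new Map<ModelId, Readonly<CircuitState>>();
  circuits.forEach((state, id) => copy.set(id, { ...state })); // defensive copies
  return copy;
}

// Deterministic clock: trip the breaker at t=0, then jump past the cooldown.
let fakeNow = 0;
const fakeClock = () => fakeNow;
for (let i = 0; i < 3; i++) recordFailure('kimi-k2', fakeClock);
console.log(isOpen('kimi-k2', fakeClock())); // true
fakeNow = 60_000;
resetIfElapsed('kimi-k2', fakeClock());
console.log(isOpen('kimi-k2', fakeClock())); // false
```

Because every clock read goes through a parameter, a test can move time forward by assigning to a variable, which is why Jest fake timers are not needed.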
8. Timeout semantics
- Per-attempt 30s timeout (`COLIBRI_MODEL_TIMEOUT`). Default `30_000` ms.
- Implemented via `Promise.race([adapterCall, timeoutPromise])` + an `AbortController` that the timeout branch signals before throwing. Long-lived adapter promises that resolve after the timeout deadline must therefore not leak (the AbortController is unconditionally aborted on timer fire; if the adapter doesn’t honour the signal it still resolves eventually, but its result is discarded by the race).
- A timeout is treated as a failed attempt: `recordFailure(modelId)` is called and the attempt is appended to `attempts` with a typed `RouterTimeoutError`. The next chain member is tried.
- No `setTimeout` outside the `Promise.race` guard, per dispatch forbiddens.
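One way the timeout guard could look, as a sketch only. It assumes `COLIBRI_MODEL_TIMEOUT` is expressed in milliseconds, that `readTimeoutEnv` receives the environment as a plain record, and that `raceWithTimeout` takes a callback receiving the `AbortSignal` (the pseudocode in this audit passes the adapter promise directly; the callback form is used here so the signal can reach the adapter).

```typescript
class RouterTimeoutError extends Error {
  timeoutMs: number;
  constructor(timeoutMs: number) {
    super(`attempt timed out after ${timeoutMs}ms`);
    this.name = 'RouterTimeoutError';
    this.timeoutMs = timeoutMs;
  }
}

// Assumption: the env value is in milliseconds; anything unusable falls back to 30s.
function readTimeoutEnv(env: Record<string, string | undefined>): number {
  const parsed = Number(env['COLIBRI_MODEL_TIMEOUT']);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : 30_000;
}

async function raceWithTimeout<T>(
  call: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      controller.abort();                        // signal the adapter first...
      reject(new RouterTimeoutError(timeoutMs)); // ...then fail the race
    }, timeoutMs);
  });
  try {
    // If the adapter ignores the signal and settles later, the race has
    // already been decided and its late result is discarded.
    return await Promise.race([call(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // no stray timer once the attempt has settled
  }
}

// Usage: a slow stand-in adapter (50ms) against a 10ms budget.
raceWithTimeout(
  () => new Promise<string>((resolve) => setTimeout(() => resolve('late'), 50)),
  10,
).catch((err) => console.log(err.name)); // logs "RouterTimeoutError"
```

The only `setTimeout` that guards an attempt lives inside `raceWithTimeout`, and the `finally` block clears it on the success path so a fast adapter does not leave a pending 30-second timer behind.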
9. Chain-walk algorithm
async routeRequest(prompt, options):
    scoring = options.scoringFn ?? scoreIntent
    decision = scoring(prompt, options)
    chainOrder = sort(Object.entries(decision.scores), descending by score)
        .map(([modelId, _]) => modelId)
    // ensure stable ordering for ties (scoring already does ASCII asc tie-break,
    // so the descending sort yields a deterministic chain)
    attempts = []
    completionFn = options.completionFn // tests inject this directly
    timeoutMs = readTimeoutEnv()
    for modelId in chainOrder:
        resetIfElapsed(modelId, nowFn())
        if isOpen(modelId, nowFn()):
            attempts.push({ model: modelId, error: CircuitOpenError(modelId) })
            continue
        adapter = completionFn ?? adapterRegistry[modelId]
        if adapter is undefined:
            attempts.push({ model: modelId, error: NoAdapterError(modelId) })
            continue
        try:
            upstream = await raceWithTimeout(
                adapter(prompt, projectUpstreamOptions(options, modelId)),
                timeoutMs,
            )
            recordSuccess(modelId)
            return RouteResult.from(modelId, upstream)
        catch err:
            recordFailure(modelId)
            attempts.push({ model: modelId, error: err })
    throw FallbackChainExhaustedError(attempts)
The options.completionFn injection seam is preserved for tests — when present, it shadows the adapter registry for every chain member (P0.5.2 tests injected it; P1.5.5 tests injecting it observe the cascade behavior because the mock fn can throw on the first call and succeed on subsequent ones, distinguished by an internal counter or sequence).
If we want per-model mock dispatch in tests, we expose a parallel options.completionFnRegistry?: Partial<Record<ModelId, CompletionFn>> seam, falling back to options.completionFn if the per-model entry is missing. This keeps P0.5.2-style tests compiling unchanged while enabling true cascade tests.
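To make the per-model seam concrete, here is a reduced sketch of a chain walk that resolves its adapter from `completionFnRegistry` first and `completionFn` second. It deliberately leaves out the circuit breaker, the timeout, and the production adapter registry, and every type in it is a simplified stand-in: `SeamOptions`, `resolveAdapter`, and `walkChain` are hypothetical names, and the returned `{ model, text }` object is not the real `RouteResult`.

```typescript
type ModelId = 'claude' | 'gpt-4o' | 'kimi-k2'; // trimmed union for illustration

interface CompletionResult { text: string }
type CompletionFn = (prompt: string) => Promise<CompletionResult>;
interface FallbackAttempt { model: ModelId; error: unknown }

// Hypothetical subset of RouteOptions covering only the two test seams.
interface SeamOptions {
  completionFn?: CompletionFn;
  completionFnRegistry?: Partial<Record<ModelId, CompletionFn>>;
}

// Simplified stand-in; the real class also carries `cause`.
class FallbackChainExhaustedError extends Error {
  attempts: FallbackAttempt[];
  constructor(attempts: FallbackAttempt[]) {
    super(`all ${attempts.length} chain members failed`);
    this.name = 'FallbackChainExhaustedError';
    this.attempts = attempts;
  }
}

// Per-model entry wins; the shared completionFn is the fallback.
function resolveAdapter(modelId: ModelId, options: SeamOptions): CompletionFn | undefined {
  return options.completionFnRegistry?.[modelId] ?? options.completionFn;
}

async function walkChain(prompt: string, chainOrder: ModelId[], options: SeamOptions) {
  const attempts: FallbackAttempt[] = [];
  for (const modelId of chainOrder) {
    const adapter = resolveAdapter(modelId, options);
    if (adapter === undefined) {
      attempts.push({ model: modelId, error: new Error(`no adapter for ${modelId}`) });
      continue;
    }
    try {
      const upstream = await adapter(prompt);
      return { model: modelId, text: upstream.text }; // first success ends the walk
    } catch (err) {
      attempts.push({ model: modelId, error: err });  // record and try the next member
    }
  }
  throw new FallbackChainExhaustedError(attempts);
}

// Usage: claude fails, gpt-4o succeeds, kimi-k2 is never reached.
const seenModels: ModelId[] = [];
const seam: SeamOptions = {
  completionFnRegistry: {
    claude: async () => { seenModels.push('claude'); throw new Error('claude down'); },
    'gpt-4o': async (prompt) => { seenModels.push('gpt-4o'); return { text: `echo:${prompt}` }; },
  },
};
walkChain('hi', ['claude', 'gpt-4o', 'kimi-k2'], seam)
  .then((result) => console.log(result.model, seenModels));
```

A test that only sets `completionFn` still compiles and runs against this shape, which is the backward-compatibility property the seam is meant to keep.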
10. Phase 0 test assertions that will fail post-flip (deliberate breakage)
From src/__tests__/domains/router/fallback.test.ts:
| Line | Assertion | Action |
|---|---|---|
| 434 | `expect(ROUTER_PHASE_0_SHAPE.members).toBe(1)` | Rewrite to assert `members >= 4` and `members === modelsSupported.length`. |
| 437–438 | `expect(ROUTER_PHASE_0_SHAPE.hasCircuitBreaker).toBe(false)` | Flip to `.toBe(true)`. |
| 441–443 | `expect(ROUTER_PHASE_0_SHAPE.modelsSupported).toEqual(['claude'])` and `toHaveLength(1)` | Replace with assertion that `modelsSupported` includes `'claude'` and has length >= 4 (exact shape pinned in a parallel test that mirrors the literal). |
| 306 | `expect(caught?.attempts).toHaveLength(1)` | Replace with chain-exhaustion test (length N) and add a per-attempt cascade test (length 2 on the fail-then-succeed path; that path no longer throws, so it asserts `RouteResult.model === 'B'` and that `mockFn` was called for both models). |
| 383 | `expect(calls).toHaveLength(1)` (ZERO-cascade invariant) | DELETE. ADR-005 Phase 0 invariant retired by the flip. Replace with cascade-progression test: when the per-model `completionFn` registry exposes A→fails / B→succeeds, both adapters are called. |
| 396 | Same shape as above for `AnthropicConfigError` | DELETE / replace identically. |
| 416–425 | “different prompts route to the same model (Phase 0 single-member)” | DELETE. Phase 1.5 chain order is prompt-sensitive via scoring. |
The tests for happy-path RouteResult shape (AC1, AC3, AC4–AC6, AC10–AC11, AC15–AC17) all stay as-is — the contract honours the same RouteResult/FallbackAttempt/RouteOptions shapes, and attempts[0].model === 'claude' remains true on the typical “scoring picks claude, claude fails” path (the chain order has claude first because the default Phase 1.5 scoring with no candidates snapshot still returns claude=1.0).
The “default dispatcher” tests (using a real fetchFn mock) continue to exercise the Claude adapter path because the scoring’s empty-snapshot fallback keeps claude as the winner, and the first chain member is therefore claude. Those tests remain unmodified.
11. Heritage references
- P0.5.2 dispatch packet (`docs/guides/implementation/task-prompts/p0.5-delta-router.md` §P0.5.2): 8-model fallback + CB; namespace flipped (`AMS_MODEL_*` → `COLIBRI_*`).
- ADR-005 §Decision §Implementation step 3: flip `ROUTER_PHASE_0_SHAPE` literals as the trip-wire; signal intentional Phase boundary crossing.
- δ concept doc (`docs/3-world/social/llm.md` §Fallback chain): chain ordering, ζ recording per attempt (ζ recording itself is out of scope for P1.5.5; it lands in P1.5.7’s `router_fallback` tool emission per W6).
- Existing P0.5.2 contract / packet (`docs/contracts/p0-5-2-fallback-contract.md` / `docs/packets/p0-5-2-fallback-packet.md`): Phase 0 17-AC list; the post-flip ACs replace ACs 8, 9, 12, 13, 14 wholesale.
12. Risks
| Risk | Likelihood | Mitigation |
|---|---|---|
| `Promise.race` leaks if adapter resolves post-timeout | High without `AbortController` | Pass `signal` to adapter; abort unconditionally on timer fire. |
| Tests using Jest fake timers cross-leak into circuit state | Medium | Inject `nowFn` per call; default `Date.now` only used in production path. |
| `ROUTER_PHASE_0_SHAPE` literal flip breaks downstream consumers in unrelated test files | Low (no current callers outside `fallback.test.ts`) | Grep for `ROUTER_PHASE_0_SHAPE`; only the single test file references it. |
| `defaultCompletionFn` dispatcher gets routed to non-claude adapter but uses Claude-shaped tools | Low | P1.5.5 keeps tools-path dispatch Claude-only (matches existing AC15). |
| Adapter registry init runs at module load and forces all adapter imports to evaluate | Low | The adapter modules are pure-import safe: their `process.env` reads happen at call-time per the W3 adapters. |
| Fold-in re-export name collision (`CompletionResult` exported by three sources) | Medium | OpenAI exports `CompletionResult` directly; the others re-export from `../integrations/claude.js`. Wave 3 fold-in: use a Kimi-first `export *` ordering so the symbol resolves via the Claude integration’s original; deconflict by named subset re-export if TypeScript reports an ambient conflict. Verified empirically during Step 4. |
13. Acceptance criteria (from dispatch + P1.5.5 staging)
- `routeRequest` signature + return shape unchanged.
- Chain order from `scoreIntent` (descending).
- Per-attempt 30s timeout via `COLIBRI_MODEL_TIMEOUT`.
- CB: 3 consecutive fails → 60s open; time-bound reset; per-model.
- All-tripped → `FallbackChainExhaustedError` with N attempts.
- `ROUTER_PHASE_0_SHAPE` literals flipped (members, hasCircuitBreaker, modelsSupported).
- `getCircuitBreakerState()` + `resetCircuitBreaker(modelId?)` exports present.
- Wave-3 fold-in: `src/domains/router/index.ts` re-exports adapters.
- `npm run build && npm run lint && npm test` green.
14. Audit close
Inventory complete. Ready to write the behavioral contract (Step 2).