P1.5.5 — Execution Packet

1. File-by-file change manifest

1.1. Create src/domains/router/circuit.ts

Content sketch: module-level Map<ModelId, MutableCircuitState> with the seven exports from contract §1.1. nowFn is injectable in every public function that reads the clock (recordFailure, isOpen, resetIfElapsed). The map is created with new Map() at module load and never replaced, so re-importing the module reuses the singleton state across the test process, except where tests call resetCircuitBreaker().

Public surface (concrete signatures):

export const CIRCUIT_FAILURE_THRESHOLD = 3 as const;
export const CIRCUIT_COOLDOWN_MS = 60_000 as const;

export interface CircuitState {
  readonly failures: number;
  readonly openedAt: number | null;
}

export interface CircuitBreakerNowOptions {
  readonly nowFn?: () => number;
}

export function recordFailure(modelId: ModelId, opts?: CircuitBreakerNowOptions): void;
export function recordSuccess(modelId: ModelId): void;
export function isOpen(modelId: ModelId, opts?: CircuitBreakerNowOptions): boolean;
export function resetIfElapsed(modelId: ModelId, opts?: CircuitBreakerNowOptions): void;
export function resetCircuitBreaker(modelId?: ModelId): void;
export function snapshot(): ReadonlyMap<ModelId, CircuitState>;
export function getCircuitBreakerState(): ReadonlyMap<ModelId, CircuitState>;

Internal type:

interface MutableCircuitState {
  failures: number;
  openedAt: number | null;
}

The module returns frozen CircuitState views via Object.freeze({ failures, openedAt }) at the snapshot / iteration boundary. The internal map stores mutable shapes for fast update.
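A minimal sketch of how the module could hang together under the signatures above. Three things are assumptions, not taken from the contract: ModelId is aliased to string as a stand-in, isOpen treats the circuit as open only while the cooldown is still running, and getCircuitBreakerState is an alias of snapshot.

```typescript
// Hypothetical stand-in for the router's ModelId type.
type ModelId = string;

export const CIRCUIT_FAILURE_THRESHOLD = 3 as const;
export const CIRCUIT_COOLDOWN_MS = 60_000 as const;

export interface CircuitState {
  readonly failures: number;
  readonly openedAt: number | null;
}

export interface CircuitBreakerNowOptions {
  readonly nowFn?: () => number;
}

interface MutableCircuitState {
  failures: number;
  openedAt: number | null;
}

// Created once at module load and never replaced.
const states = new Map<ModelId, MutableCircuitState>();

const readNow = (opts?: CircuitBreakerNowOptions): number => (opts?.nowFn ?? Date.now)();

export function recordFailure(modelId: ModelId, opts?: CircuitBreakerNowOptions): void {
  const state: MutableCircuitState = states.get(modelId) ?? { failures: 0, openedAt: null };
  states.set(modelId, state);
  state.failures += 1;
  // Trip once: later failures leave openedAt anchored at the first trip.
  if (state.failures >= CIRCUIT_FAILURE_THRESHOLD && state.openedAt === null) {
    state.openedAt = readNow(opts);
  }
}

export function recordSuccess(modelId: ModelId): void {
  const state = states.get(modelId);
  // Clears the count but not openedAt: a success during the open window
  // does not bypass the cooldown.
  if (state !== undefined) state.failures = 0;
}

export function isOpen(modelId: ModelId, opts?: CircuitBreakerNowOptions): boolean {
  const openedAt = states.get(modelId)?.openedAt ?? null;
  // Assumption: open means "tripped and still inside the cooldown".
  return openedAt !== null && readNow(opts) < openedAt + CIRCUIT_COOLDOWN_MS;
}

export function resetIfElapsed(modelId: ModelId, opts?: CircuitBreakerNowOptions): void {
  const openedAt = states.get(modelId)?.openedAt ?? null;
  if (openedAt !== null && readNow(opts) >= openedAt + CIRCUIT_COOLDOWN_MS) {
    states.delete(modelId);
  }
}

export function resetCircuitBreaker(modelId?: ModelId): void {
  if (modelId === undefined) states.clear();
  else states.delete(modelId);
}

export function snapshot(): ReadonlyMap<ModelId, CircuitState> {
  const view = new Map<ModelId, CircuitState>();
  // Copy and freeze at the boundary; the internal entries stay mutable.
  states.forEach((s, id) => {
    view.set(id, Object.freeze({ failures: s.failures, openedAt: s.openedAt }));
  });
  return view;
}

// Assumption: the observability getter is an alias of snapshot().
export const getCircuitBreakerState = snapshot;
```

In this sketch a caller is expected to run resetIfElapsed before isOpen on each attempt, so that a stale openedAt is cleared before new failures are counted.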

1.2. Modify src/domains/router/fallback.ts

Diff scope:

  1. Header docblock — replace the Phase 0 single-call narrative with the P1.5.5 N-member narrative. Keep the canonical references list; add the new R92 audit/contract/packet citations.
  2. Imports — add named imports for the three new adapters (createKimiCompletion, createCodexCompletion, createOpenAiCompletion) from ./adapters/*.js. Add named imports for recordFailure, recordSuccess, isOpen, resetIfElapsed, snapshot, resetCircuitBreaker, getCircuitBreakerState, CIRCUIT_FAILURE_THRESHOLD, CIRCUIT_COOLDOWN_MS from ./circuit.js. Add a type-only import of CircuitState from ./circuit.js.
  3. Phase 0 invariants list — replace I1–I13 with the I1–I19 list from contract §6.
  4. ROUTER_PHASE_0_SHAPE — flip literals to { members: 6, hasCircuitBreaker: true, modelsSupported: [...] } per contract §2.
  5. New error classes — add RouterTimeoutError, CircuitOpenError, NoAdapterError (definitions from contract §5).
  6. RouteOptions — add completionFnRegistry?: Partial<Record<ModelId, CompletionFn>> and nowFn?: () => number fields (test seams). Existing fields unchanged.
  7. Adapter registry — module-level Partial<Record<ModelId, CompletionFn>> per contract §4.3.
  8. projectUpstreamOptions — extend to accept a modelId arg. Conservative model-id passthrough per contract §4.4.
  9. defaultCompletionFn — narrow to “default adapter for Claude with tools” only. Renamed to defaultClaudeWithToolsCompletion to avoid confusion.
  10. routeRequest — replace the body. Algorithm from contract §4.1.
  11. readTimeoutEnv + raceWithTimeout + orderedChain + defaultAdapterFor — new internal helpers.
  12. Re-exports of CIRCUIT_FAILURE_THRESHOLD, CIRCUIT_COOLDOWN_MS, getCircuitBreakerState, resetCircuitBreaker from ./circuit.js, plus the three new error types.
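
The two timeout helpers from item 11 could look roughly like the sketch below. The helper names come from the list above; everything else is an assumption: the env value is read as milliseconds, invalid values fall back to the 30 s default, the environment is passed in as a plain record (process.env in the real module), and RouterTimeoutError is given a hypothetical timeoutMs field because its definition lives in contract §5.

```typescript
// Hypothetical shape; the real class is defined in contract §5.
class RouterTimeoutError extends Error {
  readonly timeoutMs: number;
  constructor(timeoutMs: number) {
    super(`model call timed out after ${timeoutMs} ms`);
    this.name = 'RouterTimeoutError';
    this.timeoutMs = timeoutMs;
  }
}

const DEFAULT_TIMEOUT_MS = 30_000;

// Assumption: COLIBRI_MODEL_TIMEOUT holds a positive number of milliseconds.
function readTimeoutEnv(env: Record<string, string | undefined>): number {
  const raw = env['COLIBRI_MODEL_TIMEOUT'];
  if (raw === undefined) return DEFAULT_TIMEOUT_MS;
  const parsed = Number(raw);
  // Empty, non-numeric, zero, and negative values all fall back to the default.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_TIMEOUT_MS;
}

// Settles with whichever finishes first: the work or the timer.
// The timer is always cleared, so no handle outlives the call.
function raceWithTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new RouterTimeoutError(timeoutMs)), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error: unknown) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

Note that this shape only stops waiting; it does not cancel the underlying adapter call, which keeps running until it settles on its own.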

1.3. Modify src/domains/router/index.ts

Append three lines at the end, in alphabetical order:

export * from './adapters/codex.js';
export * from './adapters/kimi.js';
export * from './adapters/openai.js';

(Existing two export * lines unchanged.)

1.4. Create src/__tests__/domains/router/circuit.test.ts

Test plan:

  1. snapshot() is empty before any state mutation.
  2. recordFailure once → failures === 1, openedAt === null, isOpen === false.
  3. recordFailure three times → failures === 3, openedAt === now(), isOpen === true.
  4. recordFailure four+ times during open → openedAt unchanged (anchored at first trip).
  5. recordSuccess after two failures → failures === 0, openedAt === null, isOpen === false.
  6. recordSuccess after a trip → failures === 0, openedAt === <unchanged>, isOpen === true (success during open does not bypass cooldown).
  7. resetIfElapsed with now < openedAt + 60_000 → state unchanged.
  8. resetIfElapsed with now >= openedAt + 60_000 → state cleared.
  9. isOpen returns false when state.openedAt === null even if failures > 0.
  10. Per-model isolation: model A trip does not affect model B.
  11. resetCircuitBreaker('claude') clears one model; others untouched.
  12. resetCircuitBreaker() (no arg) clears all models.
  13. snapshot() returns a ReadonlyMap whose iterator yields frozen CircuitState values.
  14. nowFn injection determines all clock reads (verified by passing () => 0 then () => 60_000 and observing transitions).

1.5. Rewrite src/__tests__/domains/router/fallback.test.ts

Preserved tests (refactored only where the new export changes signatures):

  • routeRequest — happy path (4 tests)
  • routeRequest — scoring integration (2 tests) — still asserts scoring called once; the second test’s “default scoreIntent” already lands on claude via the empty-snapshot fallback.
  • routeRequest — upstream forwarding (5 tests)
  • routeRequest — failure wrapping (6 tests) — attempts.length === 1 assertion REPLACED with attempts.length >= 1 and a new assertion that the first attempt’s model matches the scoring winner.
  • FallbackChainExhaustedError — message format (3 tests)
  • routeRequest — non-Error thrown values (1 test)
  • routeRequest — tools passthrough (2 tests)
  • routeRequest — default dispatcher (2 tests)
  • routeRequest — determinism first test (preserved). Second test (“different prompts route to the same model”) REMOVED — Phase 1.5 chain order is scoring-sensitive.

Removed:

  • ROUTER_PHASE_0_SHAPE marker block (4 tests asserting Phase 0 literals).
  • routeRequest — ZERO cascade invariant block (2 tests asserting calls.length === 1).
  • One determinism test (see above).

Added:

  • ROUTER_PHASE_0_SHAPE — Phase 1.5 literals: 4 tests asserting the new literal values.
  • routeRequest — cascade: 4 tests (fail-then-succeed; first-success-stops-cascade; both-fail; per-model registry routing).
  • routeRequest — circuit breaker: 5 tests (trip after 3 fails; tripped model skipped on 4th call; time-bound reset; manual reset; per-model isolation).
  • routeRequest — timeout: 3 tests (default 30s; COLIBRI_MODEL_TIMEOUT override; timeout treated as failure → next adapter tried).
  • routeRequest — all-open: 1 test (every model tripped → FallbackChainExhaustedError with CircuitOpenError attempts).
  • routeRequest — observability: 2 tests (getCircuitBreakerState returns frozen snapshot; resetCircuitBreaker clears).
  • routeRequest — fold-in smoke: 1 test (the W3 adapters are importable via the barrel — import { createKimiCompletion } from '../../../domains/router/index.js' does not throw at module load).

Net change: ~+250 lines / -120 lines.
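
The behaviour the added cascade, circuit-breaker, and all-open tests target can be sketched as a single loop. This is an illustration under stated assumptions, not the contract §4.1 algorithm: cascade, Attempt, and Breaker are hypothetical names, the error classes are simplified stand-ins for the contract §5 definitions, and timeouts are left out.

```typescript
// Hypothetical stand-ins; the real types and error classes come from the contract.
type ModelId = string;
type CompletionFn = (prompt: string) => Promise<string>;

interface Attempt {
  readonly model: ModelId;
  readonly error: Error;
}

class CircuitOpenError extends Error {
  constructor(model: ModelId) {
    super(`circuit open for ${model}`);
    this.name = 'CircuitOpenError';
  }
}

class NoAdapterError extends Error {
  constructor(model: ModelId) {
    super(`no adapter registered for ${model}`);
    this.name = 'NoAdapterError';
  }
}

class FallbackChainExhaustedError extends Error {
  readonly attempts: ReadonlyArray<Attempt>;
  constructor(attempts: ReadonlyArray<Attempt>) {
    super(`all ${attempts.length} chain members failed`);
    this.name = 'FallbackChainExhaustedError';
    this.attempts = attempts;
  }
}

// The slice of the circuit module the loop needs, passed in for testability.
interface Breaker {
  isOpen(model: ModelId): boolean;
  recordFailure(model: ModelId): void;
  recordSuccess(model: ModelId): void;
}

async function cascade(
  chain: ReadonlyArray<ModelId>,
  adapters: Partial<Record<ModelId, CompletionFn>>,
  breaker: Breaker,
  prompt: string,
): Promise<{ model: ModelId; text: string }> {
  const attempts: Attempt[] = [];
  for (const model of chain) {
    // A tripped model is skipped but still recorded as an attempt.
    if (breaker.isOpen(model)) {
      attempts.push({ model, error: new CircuitOpenError(model) });
      continue;
    }
    const adapter = adapters[model];
    if (adapter === undefined) {
      attempts.push({ model, error: new NoAdapterError(model) });
      continue;
    }
    try {
      const text = await adapter(prompt);
      breaker.recordSuccess(model);
      return { model, text }; // first success stops the cascade
    } catch (thrown: unknown) {
      breaker.recordFailure(model);
      // Non-Error thrown values are normalised so attempts stay uniform.
      const error = thrown instanceof Error ? thrown : new Error(String(thrown));
      attempts.push({ model, error });
    }
  }
  throw new FallbackChainExhaustedError(attempts);
}
```

Passing the breaker as a parameter is a choice made for this sketch only; the plan above imports the circuit functions directly and resets shared state between tests.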

2. Implementation order

  1. Write src/domains/router/circuit.ts (greenfield; no deps).
  2. Write src/__tests__/domains/router/circuit.test.ts.
  3. Run npm test -- --testPathPattern='router/circuit' → expect all green.
  4. Rewrite src/domains/router/fallback.ts per §1.2.
  5. Rewrite src/__tests__/domains/router/fallback.test.ts per §1.5.
  6. Add fold-in lines to src/domains/router/index.ts.
  7. Run npm run build. Fix any type errors.
  8. Run npm run lint. Fix any lint warnings.
  9. Run full npm test. Investigate any flake.
  10. Commit Step 4 as a single commit: feat(p1-5-5-fallback-cb): N-member fallback + circuit breaker (real impl) + wave-3 fold-in re-exports.
  11. Write docs/verification/p1-5-5-fallback-cb-verification.md. Commit.

3. Notes for the implementation

3.1. Sorting Object.keys(scores) with stable tie-break

ECMAScript has guaranteed a stable Array.prototype.sort since ES2019. The chain order:

const chainOrder = (Object.keys(decision.scores) as ModelId[]).sort((a, b) => {
  const sa = decision.scores[a] ?? 0;
  const sb = decision.scores[b] ?? 0;
  if (sa !== sb) return sb - sa;          // score descending
  return a < b ? -1 : a > b ? 1 : 0;       // ASCII ascending tie-break
});

The empty-snapshot path returns { claude: 1.0, ...rest: 0 } so the chain order is ['claude', 'claude-haiku-3-5', 'claude-sonnet-3-5', 'gpt-4o', 'gpt-4o-mini', 'kimi-k2', ...] (all the zero-scored models sorted ASCII-ascending, with 'claude' first because its score is 1.0).
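
As a worked check of that ordering, the comparator can be run against an empty-snapshot score table. The orderedChain signature here is an assumption (only the helper name appears in §1.2), ModelId is aliased to string, and the table contains just the six model ids named above, deliberately listed out of order.

```typescript
// Hypothetical stand-in for the router's ModelId type.
type ModelId = string;

// Assumed signature; only the helper name is given in §1.2.
function orderedChain(scores: Readonly<Record<ModelId, number>>): ModelId[] {
  return Object.keys(scores).sort((a, b) => {
    const sa = scores[a] ?? 0;
    const sb = scores[b] ?? 0;
    if (sa !== sb) return sb - sa; // score descending
    return a < b ? -1 : a > b ? 1 : 0; // ASCII ascending tie-break
  });
}

// Empty-snapshot scores: claude at 1.0, every other model at 0.
const emptySnapshotScores: Record<ModelId, number> = {
  'kimi-k2': 0,
  'gpt-4o-mini': 0,
  claude: 1.0,
  'gpt-4o': 0,
  'claude-sonnet-3-5': 0,
  'claude-haiku-3-5': 0,
};

const chain = orderedChain(emptySnapshotScores);
// chain: ['claude', 'claude-haiku-3-5', 'claude-sonnet-3-5', 'gpt-4o', 'gpt-4o-mini', 'kimi-k2']
```

Because the comparator never returns 0 for two distinct keys, the result does not depend on key insertion order, so sort stability is a safeguard here rather than a requirement.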

3.2. Adapter registry initialisation

The registry is a const module-level value. The four adapter import statements force module evaluation at fallback.ts load — that is the standard ES module behavior. The adapters themselves do NOT touch process.env at module load (verified — they read env at call-time inside createCompletion / createKimiCompletion / etc.) so this is safe and does not cause side-effects.

3.3. Tools passthrough

The Phase 0 defaultCompletionFn(tools) function returned either createCompletion or createCompletionWithTools based on tools.length. The Phase 1.5 generalisation:

function defaultAdapterFor(modelId: ModelId, tools?: ReadonlyArray<AnthropicTool>): CompletionFn | undefined {
  if (tools && tools.length > 0) {
    if (modelId === 'claude' || modelId === 'claude-sonnet-3-5' || modelId === 'claude-haiku-3-5') {
      const toolsArr = tools as AnthropicTool[];
      return (prompt, opts) => createCompletionWithTools(prompt, toolsArr, opts);
    }
    // Non-Claude adapter with tools — tools dropped for P1.5.5 (logged warning).
  }
  return REGISTRY[modelId];
}

The warning log only fires when the chain actually attempts that model with tools (not at every call). It uses the logger from options.logger if present, else console.error.
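
One way to get the "fires only on an actual attempt" behaviour is to wrap the adapter so the log call sits inside the returned function rather than in the resolution step. The helper name withDroppedToolsWarning, the LogFn type, and the message text are hypothetical; the log function is passed in explicitly here, where the real code would pick options.logger or console.error.

```typescript
// Hypothetical stand-ins for the router's types.
type ModelId = string;
type CompletionFn = (prompt: string) => Promise<string>;
type LogFn = (message: string) => void;

const CLAUDE_MODELS: ReadonlyArray<ModelId> = ['claude', 'claude-sonnet-3-5', 'claude-haiku-3-5'];

// Returns the adapter unchanged when nothing is dropped; otherwise wraps it
// so the warning is emitted when the chain calls it, not when it is resolved.
function withDroppedToolsWarning(
  modelId: ModelId,
  adapter: CompletionFn,
  toolCount: number,
  log: LogFn,
): CompletionFn {
  if (toolCount === 0 || CLAUDE_MODELS.indexOf(modelId) !== -1) return adapter;
  return (prompt) => {
    log(`router: tools are not forwarded to ${modelId} in P1.5.5; dropping ${toolCount} tool(s)`);
    return adapter(prompt);
  };
}
```

Resolving the adapter for a chain member that is never reached therefore logs nothing.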

3.4. completionFn legacy seam

The Phase 0 tests inject options.completionFn as a single global mock. We preserve that seam — when present, it is used for every chain member that does not have a per-model override. This means the Phase 0 tests that inject a single failing completionFn will now have completionFn called once per chain member (6× by default). The Phase 0 “ZERO cascade invariant” tests are deleted (they tested a Phase 0 invariant that no longer holds). Other Phase 0 tests with a single-success completionFn continue to pass because the first chain member returns success → loop exits.
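
The precedence described above (per-model override, then the legacy completionFn, then the default registry) can be written as one expression. resolveCompletionFn and SeamOptions are hypothetical names for this sketch; the two option fields are the ones listed in §1.2 item 6, and CompletionFn is simplified to a prompt-only signature.

```typescript
// Hypothetical stand-ins for the router's types.
type ModelId = string;
type CompletionFn = (prompt: string) => Promise<string>;

// Only the two test-seam fields from RouteOptions are modelled here.
interface SeamOptions {
  readonly completionFn?: CompletionFn;
  readonly completionFnRegistry?: Partial<Record<ModelId, CompletionFn>>;
}

function resolveCompletionFn(
  modelId: ModelId,
  options: SeamOptions,
  defaults: Partial<Record<ModelId, CompletionFn>>,
): CompletionFn | undefined {
  // 1. per-model override, 2. legacy global mock, 3. default adapter registry
  return options.completionFnRegistry?.[modelId] ?? options.completionFn ?? defaults[modelId];
}
```

With only completionFn set, every chain member resolves to the same mock, which is why a single failing mock is called once per member.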

3.5. CB state cleanup between tests

Tests MUST call resetCircuitBreaker() in beforeEach or afterEach. The fallback test file adds this hook at module-test scope. The circuit test file already controls state explicitly via its own assertions and is its own clean island.

3.6. Env var hygiene

Tests modifying process.env['COLIBRI_MODEL_TIMEOUT'] MUST restore it in afterEach (or use a per-test backup + restore). Standard Jest pattern with const original = process.env['COLIBRI_MODEL_TIMEOUT']; afterEach(() => { ... }).
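
A small backup-and-restore helper makes the pattern hard to get wrong. captureEnv is a hypothetical name, and the environment is passed as a parameter so the sketch does not depend on Node typings; in a test file it would be called with process.env and the returned function handed to afterEach.

```typescript
type Env = Record<string, string | undefined>;

// Captures the current value of one variable and returns a function
// that puts the environment back exactly as it was.
function captureEnv(env: Env, key: string): () => void {
  const original = env[key];
  return () => {
    if (original === undefined) {
      // Delete rather than assign: process.env coerces assigned values to
      // strings, so assigning undefined would store the text "undefined".
      delete env[key];
    } else {
      env[key] = original;
    }
  };
}

// Intended use inside a Jest file (shown as comments to stay self-contained):
//   const restoreTimeout = captureEnv(process.env, 'COLIBRI_MODEL_TIMEOUT');
//   afterEach(restoreTimeout);
```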

3.7. Default dispatcher tests

The two “default dispatcher” tests in the existing fallback test file use a stub fetchFn that returns a canned Anthropic response. They exercise the registry path (no completionFn injected). Under Phase 1.5, the chain starts at 'claude' (winner of the empty-snapshot fallback), the Claude adapter is hit with fetchFn, returns success, and the loop exits — same as Phase 0. These tests need no change.

3.8. CB-isolated test runs

Because the CB state is a module singleton, it is shared by every test that loads the same module instance. Jest workers are separate processes, and Jest also gives each test file its own module registry, so state does not leak between workers or between circuit.test.ts and fallback.test.ts (which imports the CB module transitively). The remaining risk is leakage between tests within one file, so each test file’s afterEach calls resetCircuitBreaker() to clear it.

3.9. Lint compliance

The codebase uses strict TS + @typescript-eslint. Anticipated rule edges:

  • @typescript-eslint/no-explicit-any — none in new code.
  • @typescript-eslint/no-unsafe-* — handled via narrow type annotations on adapter registry, error normalize, and projectUpstreamOptions.
  • import/order — standard ordering: node-builtins → external → internal absolute → relative. The fallback module is pure-relative imports.
  • @typescript-eslint/consistent-type-imports — use import type { ... } for type-only imports (e.g. import type { CircuitState } from './circuit.js').

4. Test execution baseline

Baseline on 94ce7f8c (per dispatch): 3231 tests across the full suite. P1.5.5 should:

  • Delete 4 Phase 0 ROUTER_PHASE_0_SHAPE assertions (-4 tests).
  • Delete 2 ZERO-cascade-invariant tests (-2 tests).
  • Delete 1 “different prompts route to same model” determinism test (-1 test).
  • Add 20 new fallback tests per §1.5 (Phase 1.5 literals × 4, cascade × 4, CB × 5, timeout × 3, all-open × 1, observability × 2, fold-in smoke × 1); recount during Step 4 (+~20 tests).
  • Add 14 circuit tests (+14 tests).

Net: ~+27 tests. Expected post-P1.5.5: ~3258 tests passing.

The dispatch packet says “Some Phase 0 fallback assertions are EXPECTED to fail and be rewritten — verify the rewrites land in the same PR.” ✓ done in §1.5 above.

5. Risk-mitigation checks during Step 4

  • After deleting the ZERO-cascade tests, no Phase 1 behavior is silently regressed elsewhere. Grep \.toHaveLength(1) across test files → only intended tests still use it.
  • ROUTER_PHASE_0_SHAPE is not imported by any non-test file outside fallback.ts. Grep ROUTER_PHASE_0_SHAPE → only fallback.ts + fallback.test.ts.
  • No setTimeout outside raceWithTimeout.
  • No Date.now() outside circuit.ts, where it serves only as the default for the injectable nowFn. readTimeoutEnv reads the environment, not the clock, and the fallback chain reads the clock only via the CB module.
  • Adapter env vars not read at module load.
  • Re-export ordering in index.ts matches alphabetical and does not produce duplicate-symbol TS errors.
  • npm run build clean.
  • npm run lint clean.
  • npm test clean modulo known flakes (consensus parity, reputation tools — retry-clean per dispatch).

6. Packet close

Execution plan complete. Cleared to start Step 4 (Implement).



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
