P1.5.8 — Execution Packet (Step 3 of 5)

Bridges contract → implementation. Pins fixture shapes, file layout, and per-adapter wire bodies. Step 4 implements; Step 5 verifies.

1. File layout (final)

src/__tests__/domains/router/
├── parity.test.ts          (NEW — driven by DRIVERS array)
└── parity-helpers.ts       (NEW — fixtures + driver implementations)

No edits to existing files. No changes to production code.

2. parity-helpers.ts — module sketch

// SECTION A — imports
import {
  createCompletion,
  createCompletionWithTools,
  AnthropicApiError,
  AnthropicConfigError,
  type AnthropicTool,
  type CompletionOptions,
  type CompletionResult,
} from '../../../domains/integrations/claude.js';
import {
  createKimiCompletion,
  createKimiCompletionWithTools,
  KimiApiError,
  KimiConfigError,
  type KimiCompletionOptions,
} from '../../../domains/router/adapters/kimi.js';
import {
  createCodexCompletion,
  createCodexCompletionWithTools,
  CodexApiError,
  CodexConfigError,
  type CodexCompletionOptions,
} from '../../../domains/router/adapters/codex.js';
import {
  createOpenAiCompletion,
  createOpenAiCompletionWithTools,
  OpenAiApiError,
  OpenAiConfigError,
  type OpenAiCompletionOptions,
} from '../../../domains/router/adapters/openai.js';

// SECTION B — shared constants
export const FAKE_PROMPT = 'Say hello.';
export const FAKE_API_KEY = 'sk-parity-test-fake-key';
export const FAKE_MODEL = 'parity-test-model';

export const SAMPLE_TOOL: AnthropicTool = { ... };

// SECTION C — mock fetch (extends the per-adapter pattern)
export type FetchCall = [string | URL | Request, RequestInit | undefined];
export function makeMockFetch(responses: Array<{
  ok: boolean;
  status: number;
  body?: unknown;
  throwErr?: Error;
}>): { fetchFn: typeof fetch; calls: FetchCall[] };

// SECTION D — silent logger
export function makeSilentLogger(): {
  logger: (...args: unknown[]) => void;
  lines: string[];
};

// SECTION E — instant delay
export function makeInstantDelay(): {
  delayFn: (ms: number) => Promise<void>;
  calls: number[];
};

// SECTION F — driver interface
export type AdapterName = 'claude' | 'kimi' | 'codex' | 'openai';
export interface ParityCallSeams {
  apiKey: string;
  model: string;
  fetchFn: typeof fetch;
  logger: (...args: unknown[]) => void;
  delayFn: (ms: number) => Promise<void>;
}
export interface ParityDriver {
  readonly name: AdapterName;
  callPlain(prompt: string, seams: ParityCallSeams): Promise<CompletionResult>;
  callWithTools(prompt: string, tools: AnthropicTool[], seams: ParityCallSeams): Promise<CompletionResult>;
  callMissingApiKey(prompt: string, fetchFn: typeof fetch): Promise<CompletionResult>;
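  // Constructor types for instanceof checks. ErrorConstructor also requires a
  // call signature, which custom error classes do not have.
  readonly errorClass: new (...args: never[]) => Error;
  readonly configErrorClass: new (...args: never[]) => Error;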
  readonly apiErrorCode: string;
  readonly retriesExhaustedCode: string;
  readonly configErrorCode: string;
  makeSuccessResponseBody(): unknown;
  makeToolUseResponseBody(): unknown;
  makeMissingUsageResponseBody(): unknown;
  makeMultiToolResponseBody(): unknown;
  successStopReason(): string;     // what success result should produce
  toolUseStopReason(): string;     // what tool_use result should produce
  expectedStopReasonForLength(): string; // 'max_tokens' or 'length' depending on adapter
}

// SECTION G — driver implementations (4 factories)
export function makeClaudeDriver(): ParityDriver;
export function makeKimiDriver(): ParityDriver;
export function makeCodexDriver(): ParityDriver;
export function makeOpenAiDriver(): ParityDriver;

// SECTION H — frozen array of drivers
export const DRIVERS: ReadonlyArray<ParityDriver>;
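Sections C through E are declared above as signatures only. A minimal sketch of possible bodies follows. It assumes that adapters read only ok, status, json() and text() from a response, that an exhausted response queue should fail loudly, and that logger arguments are joined with spaces. The packet fixes none of these choices, and MockResponseSpec is a hypothetical name for the inline type in Section C.

```typescript
// Hypothetical bodies for Sections C-E; the packet only pins the signatures.
export type FetchCall = [string | URL | Request, RequestInit | undefined];

export interface MockResponseSpec {
  ok: boolean;
  status: number;
  body?: unknown;
  throwErr?: Error;
}

export function makeMockFetch(responses: MockResponseSpec[]): {
  fetchFn: typeof fetch;
  calls: FetchCall[];
} {
  const queue = [...responses]; // copy so the caller's fixture array is not consumed
  const calls: FetchCall[] = [];
  const fetchFn = (async (input: string | URL | Request, init?: RequestInit) => {
    calls.push([input, init]);
    const next = queue.shift();
    if (next === undefined) {
      throw new Error('mock fetch called more often than responses were queued');
    }
    if (next.throwErr !== undefined) throw next.throwErr;
    const payload = next.body ?? {};
    // Assumption: a plain object exposing ok, status, json() and text() is enough.
    return {
      ok: next.ok,
      status: next.status,
      json: async () => payload,
      text: async () => JSON.stringify(payload),
    } as unknown as Response;
  }) as typeof fetch;
  return { fetchFn, calls };
}

export function makeSilentLogger(): {
  logger: (...args: unknown[]) => void;
  lines: string[];
} {
  const lines: string[] = [];
  // Captures output instead of printing it, so tests can inspect what was logged.
  const logger = (...args: unknown[]): void => {
    lines.push(args.map((arg) => String(arg)).join(' '));
  };
  return { logger, lines };
}

export function makeInstantDelay(): {
  delayFn: (ms: number) => Promise<void>;
  calls: number[];
} {
  const calls: number[] = [];
  // Records the requested wait and resolves immediately; no timer is scheduled.
  const delayFn = (ms: number): Promise<void> => {
    calls.push(ms);
    return Promise.resolve();
  };
  return { delayFn, calls };
}
```

Each factory returns its recording array alongside the seam, so a test can assert on what the adapter did (URLs fetched, lines logged, delays requested) without touching globals.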

3. Per-adapter wire fixtures

3.1 Claude success body (Anthropic Messages API)

{
  "id": "msg_parity_001",
  "type": "message",
  "role": "assistant",
  "model": "parity-test-model",
  "content": [{ "type": "text", "text": "Hello!" }],
  "stop_reason": "end_turn",
  "usage": { "input_tokens": 10, "output_tokens": 5 }
}

3.2 Kimi success body (OpenAI Chat Completions, Anthropic stop_reason mapping)

{
  "id": "cmpl_parity_001",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "parity-test-model",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Hello!" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15 }
}

3.3 Codex success body (identical wire shape to Kimi)

Same structure as Kimi. Codex normalizes finish_reason: 'stop' to 'end_turn' via normalizeFinishReason (codex.ts:208-224).

3.4 OpenAI success body (identical wire shape to Kimi)

Same structure as Kimi. OpenAI passes finish_reason through verbatim (openai.ts:441-443), so its stopReason for a 'stop' fixture is 'stop', not 'end_turn'. This is the divergence documented in P4.2: the parity suite asserts that the type of stopReason (string) is uniform across adapters, while the contract allows the value to differ.
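The difference between the normalizing and pass-through paths can be sketched as below. Only the 'stop' to 'end_turn' mapping is stated in this packet. The 'length' and 'tool_calls' entries and the pass-through fallback for unknown values are assumptions, and the real normalizeFinishReason in codex.ts may behave differently.

```typescript
// Hypothetical mapping table. Only stop -> end_turn is confirmed by the packet.
const FINISH_TO_STOP = new Map<string, string>([
  ['stop', 'end_turn'],
  ['length', 'max_tokens'], // assumption
  ['tool_calls', 'tool_use'], // assumption
]);

// Normalizing path (Codex-style): map known values, pass unknown ones through.
function normalizeFinishReason(finishReason: string): string {
  return FINISH_TO_STOP.get(finishReason) ?? finishReason;
}

// Pass-through path (OpenAI-style): the wire value becomes the stopReason.
function passThroughFinishReason(finishReason: string): string {
  return finishReason;
}

const codexStop = normalizeFinishReason('stop');
const openAiStop = passThroughFinishReason('stop');

// The values differ, but both are strings, which is all the cross-cutting check needs.
console.log(codexStop, openAiStop, typeof codexStop === typeof openAiStop);
```

This is why each driver exposes successStopReason(): the per-driver tests compare against the adapter's own expected value, and only the cross-cutting test compares types.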

3.5 Per-adapter tool-use response

Each adapter's tool-use fixture uses the vocabulary native to its wire format: Anthropic for Claude, OpenAI-style for Kimi, Codex, and OpenAI. Each driver's makeToolUseResponseBody() returns that adapter-native shape; the parity assertion then inspects the parsed CompletionResult.content, which is expected to hold a JSON-stringified, Anthropic-shape tool_use[] array.
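For the three OpenAI-wire adapters, the conversion the assertion relies on might look like the sketch below. toToolUseContent is a hypothetical helper, since the adapters' real conversion code is not shown in this packet. The input follows the Chat Completions tool_calls layout and the output follows the Anthropic tool_use block layout.

```typescript
// One entry of choices[0].message.tool_calls on the OpenAI Chat Completions wire.
interface OpenAiToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string }; // arguments is a JSON string on the wire
}

// Anthropic-shape tool_use content block.
interface ToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: unknown;
}

// Hypothetical helper: convert wire tool calls into the stringified tool_use[] form.
function toToolUseContent(toolCalls: OpenAiToolCall[]): string {
  const blocks = toolCalls.map((call): ToolUseBlock => ({
    type: 'tool_use',
    id: call.id,
    name: call.function.name,
    input: JSON.parse(call.function.arguments) as unknown,
  }));
  return JSON.stringify(blocks);
}

// What a P5 or C4 style assertion could inspect after parsing content.
const content = toToolUseContent([
  {
    id: 'call_parity_001',
    type: 'function',
    function: { name: 'get_weather', arguments: '{"city":"Oslo"}' },
  },
]);
const parsed = JSON.parse(content) as ToolUseBlock[];
console.log(parsed.length, parsed[0]?.type, parsed[0]?.name);
```

The tool name and id used here are illustrative only; the packet's SAMPLE_TOOL definition is elided and is not reproduced.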

3.6 Per-adapter error fixtures

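  • 401 path: makeMockFetch([{ ok: false, status: 401, body: {} }]).
  • Retries-exhausted path: makeMockFetch with four consecutive { ok: false, status: 500 } entries.
  • Network-error path: makeMockFetch([{ throwErr: new TypeError('network down') }]). As declared in Section C, ok and status are required, so this entry needs placeholder values for them unless those fields are made optional.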
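The three response lists can be built as plain data before being handed to makeMockFetch. The sketch below is one way to do it; MockResponseSpec is a hypothetical name for the inline type from Section C, and the status 0 placeholder on the network entry is an assumption made only to satisfy the required fields.

```typescript
// Hypothetical name for the element type of makeMockFetch's argument.
interface MockResponseSpec {
  ok: boolean;
  status: number;
  body?: unknown;
  throwErr?: Error;
}

// 401 path: a single auth failure.
const unauthorized: MockResponseSpec[] = [{ ok: false, status: 401, body: {} }];

// Retries-exhausted path: four independent 500 entries, one per attempt.
const serverErrors: MockResponseSpec[] = Array.from({ length: 4 }, () => ({
  ok: false,
  status: 500,
}));

// Network path: fetch itself rejects, so ok and status are placeholders (assumption).
const networkDown: MockResponseSpec[] = [
  { ok: false, status: 0, throwErr: new TypeError('network down') },
];

console.log(unauthorized.length, serverErrors.length, networkDown.length);
```

Array.from with a factory callback creates a fresh object per entry, so a test that mutates one queued response cannot affect the others.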

4. parity.test.ts — top-level structure

import { describe, test, expect } from '@jest/globals';
import { DRIVERS, ... } from './parity-helpers.js';

describe.each(DRIVERS)('δ parity — $name adapter', (driver) => {
  describe('P1 — shape parity', () => { ... 3 tests ... });
  describe('P2 — determinism', () => { ... 1 test ... });
  describe('P3 — token accounting', () => { ... 3 tests ... });
  describe('P4 — stop-reason mapping', () => { ... 3 tests ... });
  describe('P5 — tool-use mapping', () => { ... 4 tests ... });
  describe('P6 — error mapping', () => { ... 5 tests ... });
  describe('P7 — latency', () => { ... 3 tests ... });
  describe('P8 — injection seams', () => { ... 4 tests ... });
});

describe('δ parity — cross-cutting (all 4 adapters)', () => {
  test('C1 all 4 adapters return structurally equal CompletionResult shape on success', ...);
  test('C2 all 4 adapters yield identical token counts given equivalent mocked usage', ...);
  test('C3 all 4 adapters return string-typed stopReason given native success fixture', ...);
  test('C4 all 4 adapters emit JSON-stringified tool_use[] on tool-use response', ...);
});

5. Test naming pattern

Each driver-parity test name is prefixed P<n>.<m> — <description> to match the contract invariant labels.

Cross-cutting tests use C<n> prefix.

6. Expected test count delta

Per-driver tests, by block:

  • P1: 3
  • P2: 1
  • P3: 3
  • P4: 3
  • P5: 4
  • P6: 5
  • P7: 3
  • P8: 4

Total per driver = 26. Four drivers → 104 driver-parity tests.

Plus 4 cross-cutting tests.

Expected delta: +108. Final count estimate: 3353 + 108 = 3461. Step 5 records the actual delta.
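The count above can be re-derived mechanically, which is useful if a block gains or loses a test during Step 4. The constant names below are illustrative; the numbers are the ones stated in this section.

```typescript
// Test counts per describe block, as listed in this section.
const TESTS_PER_BLOCK: Readonly<Record<string, number>> = {
  P1: 3,
  P2: 1,
  P3: 3,
  P4: 3,
  P5: 4,
  P6: 5,
  P7: 3,
  P8: 4,
};
const DRIVER_COUNT = 4;
const CROSS_CUTTING_TESTS = 4;
const BASELINE_TESTS = 3353;

// Sum the per-block counts, multiply by drivers, then add the cross-cutting tests.
const perDriver = Object.values(TESTS_PER_BLOCK).reduce((sum, n) => sum + n, 0);
const delta = perDriver * DRIVER_COUNT + CROSS_CUTTING_TESTS;
const expectedTotal = BASELINE_TESTS + delta;

console.log({ perDriver, delta, expectedTotal });
// -> { perDriver: 26, delta: 108, expectedTotal: 3461 }
```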

7. ESLint considerations

  • Where the test file does not name an imported type, the helper file consumes that import explicitly (a void-style reference) so that the unused-vars rule does not fire.
  • Each driver function returns a frozen object (Object.freeze({...})).
  • All async test bodies await every driver call, so no promise is left pending and Jest reports no open handles.
  • delayFn is the only way these tests sleep; jest.useFakeTimers() is NOT used (the contract states injection-based determinism, not fake timers — matches the existing per-adapter test pattern).
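The injection-based approach in the last bullet can be illustrated with a stand-in retry loop. retryWithDelay, its attempt limit, and its backoff base are hypothetical and do not describe any adapter's real retry policy. The point is only that an injected delayFn makes the waits observable and instantaneous without fake timers.

```typescript
type DelayFn = (ms: number) => Promise<void>;

// Hypothetical retry loop standing in for an adapter's internal backoff.
async function retryWithDelay<T>(
  op: () => Promise<T>,
  delayFn: DelayFn,
  maxAttempts = 4,
  baseMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastErr = err;
      // Wait between attempts only; there is nothing to wait for after the last one.
      if (attempt < maxAttempts) await delayFn(baseMs * 2 ** (attempt - 1));
    }
  }
  throw lastErr;
}

// Injected delay: records each requested wait and resolves immediately.
const waits: number[] = [];
const instantDelay: DelayFn = (ms) => {
  waits.push(ms);
  return Promise.resolve();
};

// An operation that fails twice, then succeeds on the third attempt.
let attempts = 0;
const outcome = retryWithDelay(async () => {
  attempts += 1;
  if (attempts < 3) throw new Error('transient');
  return 'ok';
}, instantDelay);

void outcome.then((value) => console.log(value, attempts, waits));
```

Because the waits land in an array, a test can assert on the backoff sequence directly and still finish in well under a millisecond of real delay.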

8. Build + lint expectations

  • TypeScript: strict mode, exactOptionalPropertyTypes: true. The helper file therefore omits seams from options bags rather than assigning them undefined.
  • ESLint: eslint src covers src/__tests__/. The no-restricted-globals rule includes fetch, but mock fetch is local-scoped and never reads the global, so this is clean by construction.
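Under exactOptionalPropertyTypes, an optional property declared as `fetchFn?: F` rejects an explicit undefined, so options bags have to leave absent seams out entirely. A minimal sketch of that pattern follows, using a hypothetical SketchOptions shape rather than the real CompletionOptions.

```typescript
type FetchLike = (url: string) => Promise<unknown>;
type DelayFn = (ms: number) => Promise<void>;

// Hypothetical options shape; the real CompletionOptions may differ.
interface SketchOptions {
  apiKey: string;
  model: string;
  fetchFn?: FetchLike;
  delayFn?: DelayFn;
}

// Builds an options bag in which absent seams are omitted, never set to undefined.
function buildOptions(
  apiKey: string,
  model: string,
  seams: { fetchFn?: FetchLike; delayFn?: DelayFn },
): SketchOptions {
  return {
    apiKey,
    model,
    // Conditional spread: the key exists only when a value was supplied.
    ...(seams.fetchFn !== undefined ? { fetchFn: seams.fetchFn } : {}),
    ...(seams.delayFn !== undefined ? { delayFn: seams.delayFn } : {}),
  };
}

const bare = buildOptions('sk-parity-test-fake-key', 'parity-test-model', {});
const withDelay = buildOptions('sk-parity-test-fake-key', 'parity-test-model', {
  delayFn: () => Promise.resolve(),
});
console.log(Object.keys(bare), Object.keys(withDelay));
```

The distinction also matters at runtime: code that checks `'fetchFn' in options` treats a key set to undefined as present, which omission avoids.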

9. Kimi latency flake — decision

Per slice override: leave the existing kimi.test.ts § 7 flake unfixed, because editing that file is out of slice. The parity suite is the right successor: it asserts only latencyMs >= 0 and Number.isFinite(latencyMs), not a lower bound on elapsed delay. Documented in §10 of the verification doc.
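The latency check described above reduces to a small predicate. isSaneLatency is a hypothetical name, and the sample values are illustrative rather than measured.

```typescript
// True when latencyMs is a usable measurement: finite and non-negative.
// Deliberately no lower bound, so an instant (0 ms) mocked call still passes.
function isSaneLatency(latencyMs: number): boolean {
  return Number.isFinite(latencyMs) && latencyMs >= 0;
}

const samples = [0, 0.4, 12, Number.NaN, Number.POSITIVE_INFINITY, -1];
console.log(samples.map(isSaneLatency));
// -> [ true, true, true, false, false, false ]
```

A lower-bound assertion would depend on how much real time the mocked call takes, which is the source of the flake; this predicate does not.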

10. Implementation order

  1. Create parity-helpers.ts — shared constants, mock fetch, drivers.
  2. Create parity.test.ts — driver-parity blocks + cross-cutting.
  3. Run npm run build to catch type errors.
  4. Run npm run lint to catch style/correctness.
  5. Run npm test to validate all parity tests + no regression on the 3353 baseline.

11. Forbiddens recap

  • ✗ src/domains/router/fallback.ts (sibling P1.5.10 race).
  • ✗ Any adapter source file.
  • ✗ Any production code under src/domains/router/ or src/domains/integrations/.
  • ✗ Real network calls.
  • ✗ Reading AMS_* env vars.
  • ✗ MCP tool registration.
  • ✗ ζ integration (P1.5.10 scope).
  • ✗ Using jest.useFakeTimers() (use injected delayFn instead).

12. Gate approval

Step 4 (implement) is authorized when this packet is committed. Per CLAUDE.md §6 the gate rule is “the packet gates implementation”.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
