P1.5.8 — Audit (Step 1 of 5)
Round: R92 Wave 7 parallel slice 1/2. Sibling: P1.5.10 ζ integration (file-disjoint — modifies
src/domains/router/fallback.ts; this slice MUST NOT touch that file). Base: origin/main@6cfd269b (post-P1.5.7 #258 merge).
1. Scope statement
Inventory the four δ adapters (Claude, Kimi, Codex, OpenAI), their shared
CompletionResult contract, and the existing per-adapter test surface, in
order to identify which parity invariants are already covered per-adapter
and which require a new cross-model suite.
The deliverable of P1.5.8 is a single integration test file (or test
subdir) that exercises ALL four adapters through the same fixtures and
asserts structural equality of their CompletionResult shape across
equivalent mocked responses. That gives the router a load-bearing parity
proof: adapters are interchangeable at the CompletionFn boundary.
2. Adapters in scope (4)
| Adapter | Public entry points | Source file | API protocol | Test file |
|---|---|---|---|---|
| Claude | createCompletion · createCompletionWithTools | src/domains/integrations/claude.ts | Anthropic Messages | src/__tests__/domains/integrations/claude.test.ts |
| Kimi | createKimiCompletion · createKimiCompletionWithTools | src/domains/router/adapters/kimi.ts | OpenAI Chat Completions | src/__tests__/domains/router/adapters/kimi.test.ts |
| Codex | createCodexCompletion · createCodexCompletionWithTools | src/domains/router/adapters/codex.ts | OpenAI Chat Completions | src/__tests__/domains/router/adapters/codex.test.ts |
| OpenAI | createOpenAiCompletion · createOpenAiCompletionWithTools | src/domains/router/adapters/openai.ts | OpenAI Chat Completions | src/__tests__/domains/router/adapters/openai.test.ts |
All four return CompletionResult (re-exported from claude.ts:134-141):

    interface CompletionResult {
      readonly content: string;
      readonly model: string;
      readonly promptTokens: number;
      readonly completionTokens: number;
      readonly latencyMs: number;
      readonly stopReason: string;
    }

This shape is structurally identical across all four adapters by design
(P1.5.2 / P1.5.3 / P1.5.4 each cite “shape parity with claude.ts” as
an invariant — see kimi.ts:24-43, codex.ts:11-37, openai.ts:11-79). The
parity invariant is therefore declared by every adapter individually;
P1.5.8’s contribution is to verify it under one fixture set, so the
router’s CompletionFn boundary is provably interchangeable.
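One way to make "structurally identical" checkable is to reduce each result to a signature of field names and primitive types, then compare signatures across adapters. The sketch below is illustrative only: shapeSignature and shapeMismatches are hypothetical helper names, and the sample results are made-up values standing in for real adapter output.

```typescript
// Hypothetical helper: reduce a result to its sorted "field:type" signature so
// that values (model name, token counts, latency) do not affect the comparison.
function shapeSignature(result: object): string {
  return Object.entries(result)
    .map(([field, value]) => `${field}:${typeof value}`)
    .sort()
    .join(",");
}

// Returns the names of the adapters whose signature differs from the reference.
function shapeMismatches(
  reference: object,
  others: Record<string, object>,
): string[] {
  const expected = shapeSignature(reference);
  return Object.entries(others)
    .filter(([, result]) => shapeSignature(result) !== expected)
    .map(([adapter]) => adapter);
}

// Illustrative values only; real results would come from the four adapters.
const claudeResult = {
  content: "hello",
  model: "claude-example",
  promptTokens: 12,
  completionTokens: 5,
  latencyMs: 3,
  stopReason: "end_turn",
};
const kimiResult = { ...claudeResult, model: "kimi-example", latencyMs: 7 };
const brokenResult = { ...claudeResult, promptTokens: "12" }; // wrong type on purpose

// Only the adapter with the mistyped field is reported.
console.log(shapeMismatches(claudeResult, { kimi: kimiResult, broken: brokenResult }));
```

Because the signature includes every own field, an adapter that adds an extra field is flagged as well as one that drops or mistypes a field.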
3. Shared types
Sourced from src/domains/integrations/claude.ts:
- CompletionResult (lines 134-141): return shape; identical across all 4 adapters.
- AnthropicTool (lines 95-99): tool descriptor; structurally re-exported by every adapter (openai.ts:182-186 aliases it as OpenAiTool, but it is the same shape).
Per-adapter CompletionOptions shapes:
- Claude: CompletionOptions, no baseUrl.
- Kimi: KimiCompletionOptions, adds baseUrl.
- Codex: CodexCompletionOptions, adds baseUrl.
- OpenAI: OpenAiCompletionOptions, adds baseUrl.
Each adapter accepts injectable seams: fetchFn, logger, delayFn,
apiKey. This is the injection point P1.5.8 will use.
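A minimal sketch of how these seams might be filled in a test follows. Everything here is an assumption made for illustration: the FetchFn contract is a cut-down stand-in for whatever type the adapters actually declare, createMockFetch and instantDelay are hypothetical helper names, and the logger shape is a guess.

```typescript
// Assumed minimal fetch contract; the adapters' real fetchFn type may differ.
interface FetchInit {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}
interface FetchResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  text(): Promise<string>;
}
type FetchFn = (url: string, init?: FetchInit) => Promise<FetchResponseLike>;

interface RecordedCall {
  url: string;
  init: FetchInit | undefined;
}

// Builds a fetchFn that always answers with the given status and JSON body,
// and records every call so a test can inspect URLs, headers and payloads.
function createMockFetch(status: number, body: unknown) {
  const calls: RecordedCall[] = [];
  const fetchFn: FetchFn = (url, init) => {
    calls.push({ url, init });
    return Promise.resolve({
      ok: status >= 200 && status < 300,
      status,
      json: () => Promise.resolve(body),
      text: () => Promise.resolve(JSON.stringify(body)),
    });
  };
  return { fetchFn, calls };
}

// A no-op delayFn keeps retry/backoff paths from sleeping in tests.
const instantDelay = (_ms: number): Promise<void> => Promise.resolve();

// Hypothetical options bag using the four seams; the logger shape is assumed.
const mock = createMockFetch(401, { error: { message: "invalid key" } });
const seams = {
  fetchFn: mock.fetchFn,
  delayFn: instantDelay,
  apiKey: "test-key",
  logger: { info: () => {}, warn: () => {}, error: () => {} },
};
```

Passing the same seams object to each adapter would keep the only per-adapter difference in the mocked response body, which is what a parity comparison needs.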
4. Existing test coverage (per-adapter, pre-P1.5.8)
Each adapter has its own dedicated test file with hand-crafted fixtures:
| Adapter | Tests (file LoC) | Determinism | Tool-use | 401 error | 500 error | Retry/timeout |
|---|---|---|---|---|---|---|
| Claude | ~580 LoC | yes | yes | yes | yes | yes |
| Kimi | ~620 LoC | yes | yes | yes | yes | yes |
| Codex | ~590 LoC | yes | yes | yes | yes | yes |
| OpenAI | ~610 LoC | yes | yes | yes | yes | yes |
Gap: every adapter has its own assertion fixtures. There is no test
that runs the same fixture through all four and asserts the
CompletionResult shape is structurally equal. That gap is what P1.5.8
closes.
5. Router boundary tests (already exist)
- src/__tests__/domains/router/fallback.test.ts: exercises routeRequest through completionFn / completionFnRegistry injection. Each test injects a CompletionFn stub, not a real adapter. This confirms that the router contract is shape-agnostic but does NOT prove that the four real adapters all SATISFY the shape with equivalent mocked HTTP responses.
- src/__tests__/domains/router/tools.test.ts: MCP tool surface tests for router_score, router_call, router_fallback, router_stats.
6. Sibling-race constraint
P1.5.10 ζ integration runs in parallel and modifies:
- src/domains/router/fallback.ts (to emit ζ trail events per router call)
- src/domains/router/tools.ts (to ensure the 4 MCP tools emit shape)
The slice override therefore forbids this slice from touching either of
those files. Parity tests may IMPORT from them (e.g. import routeRequest
from ../../../domains/router/fallback.js) but not edit them.
7. Pre-existing flakes (acknowledged, not in scope)
- kimi.test.ts § injection seams › 7. latency measurement: 50ms delay → latencyMs >= 50 (timer imprecision under load). The override states: optional fix, may leave with note.
- consensus/parity-harness.test.ts › G7.1 (perf budget flake: 10000 iterations < 5s). Pre-existing R89 Phase B issue; not in scope.
- server.test.ts › startup chain (pre-existing pre-R75 flake); not in scope.
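The kimi latency flake suggests keeping wall-clock values out of exact or lower-bound assertions. A minimal sketch of two hypothetical helpers follows: one compares two results while ignoring latencyMs, the other checks only that latencyMs is a finite, non-negative number. The helper names and sample values are assumptions for illustration.

```typescript
interface CompletionResult {
  readonly content: string;
  readonly model: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
  readonly stopReason: string;
}

// Every field except the wall-clock one.
const STABLE_FIELDS = [
  "content",
  "model",
  "promptTokens",
  "completionTokens",
  "stopReason",
] as const;

// True when two results agree on every field that should be deterministic.
function sameIgnoringLatency(a: CompletionResult, b: CompletionResult): boolean {
  return STABLE_FIELDS.every((field) => a[field] === b[field]);
}

// Deliberately loose: asserts sanity, not a timing budget.
function hasValidLatency(result: CompletionResult): boolean {
  return Number.isFinite(result.latencyMs) && result.latencyMs >= 0;
}

// Illustrative values: two runs of the same fixture with different timings.
const firstRun: CompletionResult = {
  content: "hello",
  model: "kimi-example",
  promptTokens: 12,
  completionTokens: 5,
  latencyMs: 49,
  stopReason: "end_turn",
};
const secondRun: CompletionResult = { ...firstRun, latencyMs: 53 };

console.log(sameIgnoringLatency(firstRun, secondRun)); // true
console.log(hasValidLatency(firstRun)); // true
```

A check of this kind passes whether the timer fires slightly early or late, which is the situation the existing kimi test is sensitive to.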
8. What the parity suite must prove
For each of the 4 adapters, given a uniform set of mocked HTTP responses expressed in the adapter’s native wire shape:
- Shape parity: every successful CompletionResult has the 6 fields listed in §2 with the correct types.
- Determinism: same fixture twice → structurally equal result (excluding latencyMs, which is wall-clock).
- Token-accounting parity: promptTokens and completionTokens are populated from the wire’s native location (Anthropic usage.input_tokens / usage.output_tokens; OpenAI/Kimi/Codex usage.prompt_tokens / usage.completion_tokens).
- Stop-reason parity: Anthropic vocabulary on the result side (end_turn / tool_use / max_tokens / etc.) regardless of the underlying wire vocabulary.
- Tool-use mapping parity: when a tool-use response is mocked, the result’s content is a JSON-stringified Anthropic-shape tool_use[] array.
- Error mapping parity: 401 / 500 / network errors raise the adapter-specific error class with the expected code discriminant.
- Latency parity: every result has a finite non-negative latencyMs. (The suite uses >= 0, not >= delay, to dodge the kimi flake.)
- Injection-seam parity: every adapter accepts fetchFn, logger, delayFn, apiKey from the options bag (all four are tested via their existing per-adapter suites; the parity suite ASSUMES this and uses the four uniformly).
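To keep the mocked responses "uniform" while still using each adapter’s native wire shape, one option is to describe a logical response once and render it into both wire formats. The sketch below is an assumption-laden illustration: LogicalFixture, toAnthropicWire and toChatCompletionsWire are hypothetical names, the stop-reason mapping (end_turn ↔ stop, max_tokens ↔ length) is assumed rather than taken from the adapters, and real fixture bodies may need further fields such as id.

```typescript
// One logical response, described once and rendered into each wire shape.
interface LogicalFixture {
  text: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  stopReason: "end_turn" | "max_tokens";
}

// Assumed mapping from Anthropic stop reasons to Chat Completions finish reasons.
const FINISH_REASON = { end_turn: "stop", max_tokens: "length" } as const;

// Anthropic Messages style body: tokens under usage.input_tokens / output_tokens.
function toAnthropicWire(f: LogicalFixture) {
  return {
    type: "message",
    role: "assistant",
    model: f.model,
    content: [{ type: "text", text: f.text }],
    stop_reason: f.stopReason,
    usage: { input_tokens: f.promptTokens, output_tokens: f.completionTokens },
  };
}

// Chat Completions style body: tokens under usage.prompt_tokens / completion_tokens.
function toChatCompletionsWire(f: LogicalFixture) {
  return {
    object: "chat.completion",
    model: f.model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content: f.text },
        finish_reason: FINISH_REASON[f.stopReason],
      },
    ],
    usage: {
      prompt_tokens: f.promptTokens,
      completion_tokens: f.completionTokens,
      total_tokens: f.promptTokens + f.completionTokens,
    },
  };
}

// Illustrative fixture; the model name is a placeholder.
const helloFixture: LogicalFixture = {
  text: "hello",
  model: "example-model",
  promptTokens: 12,
  completionTokens: 5,
  stopReason: "end_turn",
};

const anthropicBody = toAnthropicWire(helloFixture);
const chatBody = toChatCompletionsWire(helloFixture);
```

With fixtures built this way, the expected CompletionResult values (content, token counts, stop reason) can be derived from the same LogicalFixture for all four adapters instead of being hand-written per adapter.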
9. Out of scope
- ANY edit to src/domains/router/fallback.ts (sibling P1.5.10 territory).
- ANY edit to adapter source files (sibling Wave 3 territory; P1.5.2/3/4 closed).
- ζ Decision Trail recording (P1.5.10 scope).
- Real network calls (forbidden; all fetchFn injected).
- Multi-run flake-detection loop (the prompt’s “5 repeat runs” is a manual local check, not a test-loop construct).
- Wire-byte parity (different providers wrap tokens differently; we test structural parity, not byte parity).
10. Path forward
Per CLAUDE.md §6, next steps are:
- Step 2 (contract): pin parity invariants precisely as acceptance criteria, mapping each to a planned test name.
- Step 3 (packet): execution plan covering test file layout, fixture registry, mocked-fetch helper, and the 4-adapter × N-invariant matrix.
- Step 4 (implement): write src/__tests__/domains/router/parity.test.ts.
- Step 5 (verify): record test count delta + parity matrix.