P1.5.3 — Codex Adapter — Step 1 Audit
Round: R92, Wave 3 (parallel slice 2/3) — p1-5-3-codex-adapter
Branch: feature/p1-5-3-codex-adapter
Worktree: .worktrees/claude/p1-5-3-codex-adapter
Base SHA: 89adef66 (post-P1.5.1 feat(p1-5-1-scoring) merge — real scorer live)
Step: 1 of 5 (audit)
Author tier: T3 executor
Authoring agent: Claude Code (Opus 4.7 1M)
§1. Goal of P1.5.3
Ship a Codex completion adapter under src/domains/router/adapters/codex.ts
that mirrors the surface of the Phase 0 Claude wrapper
(src/domains/integrations/claude.ts) and stays consistent with the sibling Kimi
adapter (P1.5.2). The Kimi adapter is not yet merged at our base SHA, so we
proceed structurally from the Claude reference, per the dispatch packet: “mirror
the Claude adapter; consistency emerges from both sides mirroring the same
reference”.
The adapter is library-only. No MCP tool is registered. Re-export
coordination through src/domains/router/index.ts is explicitly
forbidden for this slice (sibling parallel T3 executors race on that
file; a fold-in commit lands re-exports between Wave 3 and Wave 4).
The dispatch packet imposes a tighter writeback than the staging file:
REAL implementation, NO STUBS. Per T0 autonomous mandate, the adapter
issues a real Codex API call shape with full fetchFn injection — not a
placeholder that throws “not implemented”.
§2. Surfaces in scope
2.1 Files to create (this slice)
| Path | Role | Lines (est) |
|---|---|---|
| src/domains/router/adapters/codex.ts | Adapter module | ~400 |
| src/__tests__/domains/router/adapters/codex.test.ts | Parity tests | ~500 |
| docs/audits/p1-5-3-codex-adapter-audit.md | This file | n/a |
| docs/contracts/p1-5-3-codex-adapter-contract.md | Step 2 | n/a |
| docs/packets/p1-5-3-codex-adapter-packet.md | Step 3 | n/a |
| docs/verification/p1-5-3-codex-adapter-verification.md | Step 5 | n/a |
2.2 Files explicitly NOT in scope
| Path | Reason |
|---|---|
| src/domains/router/index.ts | Sibling parallel race (P1.5.2 Kimi + P1.5.4 OpenAI). Fold-in commit owns this. |
| src/domains/router/scoring.ts | δ scoring (P1.5.1, already shipped). Adapter does not reach back. |
| src/domains/router/fallback.ts | δ fallback chain (P0.5.2). Adapter is a leaf; fallback wiring is P1.5.5+. |
| src/domains/router/scoring-weights.ts | κ-shim. Pure; nothing to do with adapters. |
| src/domains/integrations/claude.ts | Reference template only. Read, do not modify. |
| src/config.ts | COLIBRI_CODEX_* is call-time from process.env, not added to the Zod schema. Same pattern as ANTHROPIC_API_KEY. |
| package.json | No new dependency. Codex adapter uses global fetch. |
2.3 Files referenced (read-only inputs to the audit)
| Path | Role in audit |
|---|---|
| src/domains/integrations/claude.ts | The structural template. Lines 1–391 read in full. |
| src/domains/router/scoring.ts | Defines ModelId union; Codex is not a member (see §6 for the consequence). |
| src/domains/router/fallback.ts | Defines CompletionFn + CompletionFnOptions + RouteResult. The adapter’s exports must be assignable to CompletionFn. |
| src/__tests__/domains/integrations/claude.test.ts | Test patterns to mirror: makeMockFetch, makeSilentLogger, makeInstantDelay, baseOptions. |
| src/config.ts | Confirms ANTHROPIC_API_KEY is z.string().optional(), the same call-time validation pattern Codex follows. |
§3. Phase 0 Claude adapter — surface inventory
The Claude adapter (src/domains/integrations/claude.ts) is the
structural template. Its public surface, captured by direct reading:
3.1 Exported public types
| Symbol | Kind | Notes |
|---|---|---|
| AnthropicTool | interface | { name: string; description: string; input_schema: Record<string, unknown> }. Reused by the Codex adapter unchanged; the router’s tool-shape is Anthropic’s. |
| CompletionOptions | interface | Optional model, maxTokens, systemPrompt, fetchFn, logger, delayFn, apiKey. The Codex adapter declares its own CodexCompletionOptions with the same fields (no inheritance; copy-paste-rename is the cited pattern). |
| CompletionResult | interface | { content, model, promptTokens, completionTokens, latencyMs, stopReason }, all readonly, all strings/numbers. Adapter MUST return this shape byte-identical so the router is provider-agnostic. |
3.2 Exported error classes
| Symbol | Code | Notes |
|---|---|---|
| AnthropicConfigError | 'ANTHROPIC_CONFIG_ERROR' | Thrown call-time when ANTHROPIC_API_KEY is absent. |
| AnthropicApiError | 'ANTHROPIC_API_ERROR' \| 'ANTHROPIC_RETRIES_EXHAUSTED' | Thrown on terminal HTTP error or exhausted retries. Carries optional status. |
The Codex adapter ships parallel classes: CodexConfigError /
CodexApiError with codes CODEX_CONFIG_ERROR /
CODEX_API_ERROR / CODEX_RETRIES_EXHAUSTED.
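A minimal sketch of what the parallel classes might look like. Only the class names, the three codes, and the optional status field come from this audit; the constructor signatures and the `CodexApiErrorCode` alias are assumptions made for illustration.

```typescript
// Assumed alias for the two API-level codes named in the audit.
type CodexApiErrorCode = 'CODEX_API_ERROR' | 'CODEX_RETRIES_EXHAUSTED';

// Thrown at call-time when no API key can be resolved.
class CodexConfigError extends Error {
  readonly code = 'CODEX_CONFIG_ERROR' as const;

  constructor(message: string) {
    super(message);
    this.name = 'CodexConfigError';
  }
}

// Thrown on a terminal HTTP error, a network-level failure, or exhausted retries.
// `status` stays undefined when no HTTP response was received.
class CodexApiError extends Error {
  readonly code: CodexApiErrorCode;
  readonly status: number | undefined;

  constructor(message: string, code: CodexApiErrorCode, status?: number) {
    super(message);
    this.name = 'CodexApiError';
    this.code = code;
    this.status = status;
  }
}
```

Callers would then branch on `err.code` rather than on the message text, which is the same discrimination the Anthropic classes allow.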
3.3 Exported entry points
| Symbol | Sig | Notes |
|---|---|---|
| createCompletion(prompt, options?) | (string, CompletionOptions) → Promise<CompletionResult> | Plain text completion. |
| createCompletionWithTools(prompt, tools, options?) | (string, AnthropicTool[], CompletionOptions) → Promise<CompletionResult> | Tool-use completion; empty tools ⇒ degrades to plain. |
Codex adapter parallels: createCodexCompletion /
createCodexCompletionWithTools with identical positional signatures.
3.4 Internal constants
| Constant | Value | Codex equivalent |
|---|---|---|
| ANTHROPIC_API_BASE | 'https://api.anthropic.com/v1' | CODEX_API_BASE (see §4.1) |
| ANTHROPIC_API_VERSION | '2023-06-01' (header anthropic-version) | Codex has no equivalent; header omitted |
| DEFAULT_MAX_TOKENS | 1024 | Same |
| MAX_RETRIES | 3 | Same |
| BASE_DELAY_MS | 100 | Same |
3.5 Internal helpers (will be re-implemented in codex.ts)
| Helper | Purpose | Reuse strategy |
|---|---|---|
| isRetryable(status) | True for 429 / 5xx | Copy verbatim (same HTTP semantics) |
| sleep(ms) | Real setTimeout wrapper | Copy verbatim |
| buildRequestBody(...) | Builds Anthropic Messages API JSON body | Replaced: Codex uses chat-completion shape (messages: [{role, content}] differs; tool_choice handling differs) |
| parseResult(json, startMs) | Extracts text or stringified tool_use from Anthropic content array | Replaced: Codex returns OpenAI-style choices[0].message.content + tool_calls |
| attemptWithRetry(...) | Core retry loop, same flow | Cloned with Codex-specific URL + headers + body builder + parser |
§4. Codex API — divergences from Anthropic Messages API
Codex (the OpenAI Codex API surface) follows the OpenAI Chat Completions
v1 protocol — the same wire-shape used by gpt-4o, gpt-4o-mini, and
o*-family models. The reference assumed in this audit is the public
OpenAI Chat Completions API as of the dispatch packet date (2026-05-13);
divergences listed below are the load-bearing differences the adapter
must compensate for.
4.1 Endpoint + auth
| Aspect | Anthropic | Codex | Adapter handling |
|---|---|---|---|
| Base URL | https://api.anthropic.com/v1 | https://api.openai.com/v1 (default) | COLIBRI_CODEX_BASE_URL env override; default constant CODEX_API_BASE |
| Endpoint path | /messages | /chat/completions | Adapter constant: CODEX_CHAT_COMPLETIONS_PATH = '/chat/completions' |
| Auth header | x-api-key: <key> | Authorization: Bearer <key> | Different header construction |
| Version header | anthropic-version: 2023-06-01 | None | Omit |
| Content-Type | application/json | application/json | Same |
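
The endpoint and header differences above can be sketched as a small pure helper. `CODEX_API_BASE` and `CODEX_CHAT_COMPLETIONS_PATH` are the constants named in this audit; the helper name `buildCodexRequestTarget`, its return shape, and the trailing-slash trimming are assumptions for illustration, not the adapter's confirmed behaviour.

```typescript
const CODEX_API_BASE = 'https://api.openai.com/v1';
const CODEX_CHAT_COMPLETIONS_PATH = '/chat/completions';

// Hypothetical return shape: the URL to POST to plus the request headers.
interface CodexRequestTarget {
  url: string;
  headers: Record<string, string>;
}

function buildCodexRequestTarget(
  apiKey: string,
  baseUrl: string = CODEX_API_BASE,
): CodexRequestTarget {
  // Assumption: tolerate a trailing slash on an overridden base URL.
  const base = baseUrl.replace(/\/+$/, '');
  return {
    url: `${base}${CODEX_CHAT_COMPLETIONS_PATH}`,
    headers: {
      'Content-Type': 'application/json',
      // Bearer auth replaces Anthropic's x-api-key; no version header is sent.
      Authorization: `Bearer ${apiKey}`,
    },
  };
}
```
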
4.2 Request body
| Aspect | Anthropic (Messages) | Codex (Chat Completions) | Adapter handling |
|---|---|---|---|
| Model key | model | model | Same |
| Max-tokens key | max_tokens | max_tokens (legacy) or max_completion_tokens (newer) | Use max_tokens for compat with Codex / GPT-4o legacy chat path |
| System prompt | Top-level system: "..." | First element of messages array with role: "system" | Conditional message prepend (see §4.2.1) |
| Messages array | messages: [{role: 'user', content: '...'}] | Same shape: messages: [{role: 'user', content: '...'}] | Same skeleton; system role differs in placement |
| Tools array | tools: [{name, description, input_schema}] | tools: [{type: 'function', function: {name, description, parameters}}] | Translation required: AnthropicTool[] → OpenAI tool shape (see §4.4) |
| Tool choice | Implicit (model decides); explicit via tool_choice | Same key tool_choice, different values ('auto' vs 'none' vs {type: 'function', function: {name: 'X'}}) | Phase 1.5: omit, so the model decides. Could expand in Phase 2. |
| Stream | stream: false (default) | stream: false (default) | Both adapters non-streaming in Phase 1.5 |
4.2.1 System prompt placement
The Anthropic API takes a top-level system: "..." field. Codex takes
the system instruction as the first message in the messages array
({role: 'system', content: '...'}), with the user message following.
When no system prompt is provided, the Claude adapter omits the system field
and the Codex adapter omits the system message entirely.
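A sketch of the conditional prepend, under stated assumptions: `DEFAULT_MAX_TOKENS` uses the value from §3.4, while the helper name `buildCodexRequestBody`, its parameter order, and the two interfaces are hypothetical (the real body builder's signature is not shown in this audit).

```typescript
const DEFAULT_MAX_TOKENS = 1024;

interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

interface CodexRequestBody {
  model: string;
  max_tokens: number;
  messages: ChatMessage[];
}

function buildCodexRequestBody(
  prompt: string,
  model: string,
  systemPrompt?: string,
  maxTokens: number = DEFAULT_MAX_TOKENS,
): CodexRequestBody {
  const messages: ChatMessage[] = [];
  // Prepend the system message only when a system prompt was supplied.
  if (systemPrompt !== undefined) {
    messages.push({ role: 'system', content: systemPrompt });
  }
  // The user message always comes last.
  messages.push({ role: 'user', content: prompt });
  return { model, max_tokens: maxTokens, messages };
}
```

Note that the body never carries a top-level system key; the system instruction exists only as the first array element.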
4.3 Response body
| Aspect | Anthropic | Codex | Adapter handling |
|---|---|---|---|
| Content shape | content: [{type: 'text', text: '...'}] (array of blocks) | choices: [{message: {role: 'assistant', content: '...'}, finish_reason: '...'}] | Different extraction path |
| Token usage | usage: {input_tokens, output_tokens} | usage: {prompt_tokens, completion_tokens, total_tokens} | Key rename: prompt_tokens → promptTokens, completion_tokens → completionTokens |
| Model echo | model: 'claude-X-Y' | model: 'gpt-X-Y' (echoed from request) | Same (pass through) |
| Stop reason | stop_reason: 'end_turn' \| 'tool_use' \| 'max_tokens' \| 'stop_sequence' | choices[0].finish_reason: 'stop' \| 'tool_calls' \| 'length' \| 'content_filter' | Normalisation required (see §4.5) |
4.4 Tool-use mapping
This is the single most divergent surface between the two adapters. The
router-facing contract is Anthropic-shape — the AnthropicTool[]
input and the response content-as-JSON-string-of-AnthropicTool-use-blocks
output. The Codex adapter is responsible for the bi-directional translation.
4.4.1 Tool declaration (input)

    AnthropicTool                          OpenAI tool
    ─────────────────────────────────────  ──────────────────────────────────
    {                                      {
      name: "get_weather",                   type: "function",
      description: "...",                    function: {
      input_schema: {                          name: "get_weather",
        type: "object",                        description: "...",
        properties: { ... },                   parameters: {
        required: ["location"]                   type: "object",
      }                                          properties: { ... },
    }                                            required: ["location"]
                                               }
                                             }
                                           }

The translation is mechanical:
- AnthropicTool.name → tool.function.name
- AnthropicTool.description → tool.function.description
- AnthropicTool.input_schema → tool.function.parameters
- type: 'function' is added as a literal constant
- The whole tool moves inside a function: { ... } sub-object
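The mapping steps listed above can be sketched as a single pure function. `AnthropicTool` follows the shape given in §3.1; the `OpenAITool` interface name and the helper name `toOpenAITool` are assumptions for illustration.

```typescript
interface AnthropicTool {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Hypothetical name for the Chat Completions tool-declaration shape.
interface OpenAITool {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

function toOpenAITool(tool: AnthropicTool): OpenAITool {
  return {
    type: 'function', // literal constant added by the adapter
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.input_schema, // moved and renamed; the schema itself is untouched
    },
  };
}
```

A whole tool list would then translate as `tools.map(toOpenAITool)`, with the tools key presumably omitted from the request when the list is empty.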
4.4.2 Tool call (output)

    Anthropic content block                  OpenAI tool_calls element
    ───────────────────────────────────────  ─────────────────────────────────────────
    {                                        {
      type: "tool_use",                        id: "call_001",
      id: "tool_001",                          type: "function",
      name: "get_weather",                     function: {
      input: { location: "London" }              name: "get_weather",
    }                                            arguments: "{\"location\":\"London\"}"
                                               }
                                             }

Key differences:
- Codex puts tool calls in choices[0].message.tool_calls[], not in the content array (Codex’s content is null when tools are used).
- Codex’s function.arguments is a JSON string, not a parsed object. The adapter parses it before re-projecting to Anthropic shape.
- Codex tool IDs are prefixed call_; Anthropic tool IDs are prefixed toolu_ or vary. The adapter preserves the Codex ID verbatim, since IDs are opaque to downstream callers.
4.4.3 Response content normalisation
The Claude adapter’s parseResult extracts content[0].text for text
responses and stringifies the whole content array for tool-use
responses. The Codex adapter mirrors this contract by synthesising an
Anthropic-shape content array when Codex returns tool_calls:

    const anthropicShapeContent = toolCalls.map((c) => ({
      type: 'tool_use',
      id: c.id,
      name: c.function.name,
      input: JSON.parse(c.function.arguments),
    }));
    return JSON.stringify(anthropicShapeContent);

This keeps the CompletionResult.content field byte-shape-identical to
what the Claude adapter returns for a tool-use response.
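A fuller sketch of this projection, folding in the parse-failure fallback described in the §12 risk register (keep the raw argument string and log a warning). The interface names, the helper name `toAnthropicToolUseContent`, the `warn` callback, and the warning text are all assumptions for illustration.

```typescript
interface CodexToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface AnthropicToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: unknown;
}

function toAnthropicToolUseContent(
  toolCalls: CodexToolCall[],
  warn: (message: string) => void = () => {},
): string {
  const blocks = toolCalls.map((call): AnthropicToolUseBlock => {
    let input: unknown;
    try {
      // Codex sends arguments as a JSON string; Anthropic shape wants an object.
      input = JSON.parse(call.function.arguments);
    } catch {
      // Fallback: keep the raw string rather than failing the whole call.
      warn(`[codex] WARN unparseable tool arguments for call ${call.id}`);
      input = call.function.arguments;
    }
    // The Codex call id is passed through verbatim.
    return { type: 'tool_use', id: call.id, name: call.function.name, input };
  });
  return JSON.stringify(blocks);
}
```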
4.5 Finish-reason normalisation
The CompletionResult.stopReason field uses the Anthropic vocabulary
('end_turn', 'tool_use', 'max_tokens', 'stop_sequence'). Codex
returns OpenAI vocabulary; we normalise:
| Codex finish_reason | Normalised stopReason |
|---|---|
| 'stop' | 'end_turn' |
| 'tool_calls' | 'tool_use' |
| 'length' | 'max_tokens' |
| 'content_filter' | 'content_filter' (passed through; no Anthropic equivalent) |
| 'function_call' (deprecated) | 'tool_use' |
| absent / null | 'unknown' |
The dispatch packet requires this (“Document any Codex-vs-Anthropic API divergences in the contract doc + tool-use mapping table.”), so this table is reproduced in the contract doc §4.
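The normalisation table in this section can be sketched as a lookup. The helper name `normaliseFinishReason` is hypothetical, and mapping an unrecognised non-null value to 'unknown' is an assumption; the table only specifies the absent / null case.

```typescript
// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const FINISH_REASON_TO_STOP_REASON = new Map<string, string>([
  ['stop', 'end_turn'],
  ['tool_calls', 'tool_use'],
  ['length', 'max_tokens'],
  ['content_filter', 'content_filter'], // passed through, no Anthropic equivalent
  ['function_call', 'tool_use'], // deprecated OpenAI value
]);

function normaliseFinishReason(finishReason: string | null | undefined): string {
  if (finishReason === null || finishReason === undefined) {
    return 'unknown';
  }
  // Assumption: values outside the table also collapse to 'unknown'.
  return FINISH_REASON_TO_STOP_REASON.get(finishReason) ?? 'unknown';
}
```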
4.6 Error response shape
| Aspect | Anthropic | Codex | Adapter handling |
|---|---|---|---|
| 4xx body | {type: 'error', error: {type, message}} | {error: {message, type, param, code}} | The adapter does NOT introspect either body; both throw with HTTP status only. Phase 1.5 keeps error parity at the HTTP-status level (4xx terminal, 429 retry, 5xx retry). |
| Rate-limit signalling | 429 Too Many Requests | 429 Too Many Requests | Same: isRetryable returns true. |
§5. Environment variables
The adapter reads exactly two env vars at call-time:
| Variable | Required | Default | Purpose |
|---|---|---|---|
| COLIBRI_CODEX_API_KEY | call-time required | none | Bearer token for Authorization header. If absent, throws CodexConfigError. |
| COLIBRI_CODEX_BASE_URL | optional | 'https://api.openai.com/v1' (constant) | Override for self-hosted or alternative gateways. |
Both variables follow the call-time-validate pattern (CLAUDE.md §T0
decision 3), matching how ANTHROPIC_API_KEY is consumed by
createCompletion. Neither is added to src/config.ts’s Zod schema —
the Phase 0 schema floor stays minimal, and the adapter reads directly
from process.env (or accepts an injected apiKey / baseUrl override
in CodexCompletionOptions for tests).
The AMS_* namespace is forbidden by assertNoDonorNamespace in
config.ts; the adapter never touches AMS_* keys.
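A sketch of the call-time resolution order (injected override, then environment, then constant default). The helper name `resolveCodexConfig` and its signature are hypothetical, and treating an empty-string key as absent is an assumption. The environment is passed in as a parameter so the block stays self-contained.

```typescript
const CODEX_API_BASE = 'https://api.openai.com/v1';

class CodexConfigError extends Error {
  readonly code = 'CODEX_CONFIG_ERROR' as const;
  constructor(message: string) {
    super(message);
    this.name = 'CodexConfigError';
  }
}

interface ResolvedCodexConfig {
  apiKey: string;
  baseUrl: string;
}

function resolveCodexConfig(
  env: Record<string, string | undefined>,
  overrides: { apiKey?: string; baseUrl?: string } = {},
): ResolvedCodexConfig {
  // An injected apiKey wins over the environment variable.
  const apiKey = overrides.apiKey ?? env['COLIBRI_CODEX_API_KEY'];
  if (apiKey === undefined || apiKey === '') {
    throw new CodexConfigError('COLIBRI_CODEX_API_KEY is not set');
  }
  // Injected baseUrl, then env override, then the constant default.
  const baseUrl = overrides.baseUrl ?? env['COLIBRI_CODEX_BASE_URL'] ?? CODEX_API_BASE;
  return { apiKey, baseUrl };
}
```

In the adapter this would presumably be invoked with process.env inside the entry point on every call, never at module load, so importing the module with no key set does not throw.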
§6. ModelId membership
The δ ModelId union (src/domains/router/scoring.ts:110-119) does
not currently include 'codex'. The 9 members are:

    'claude' | 'claude-sonnet-3-5' | 'claude-haiku-3-5' | 'gpt-4o' |
    'gpt-4o-mini' | 'gemini-1-5-pro' | 'llama-3-3-70b' | 'mixtral-8x22b' |
    'kimi-k2'

Consequence: the Codex adapter does NOT take a ModelId parameter.
It is structurally a CompletionFn-compatible callable that returns a
CompletionResult. The router’s ModelId-keyed dispatch (which lands
later, in P1.5.5) will need a way to associate a ModelId value with the
Codex adapter; that mapping is out of scope for P1.5.3.
For Phase 1.5 staging, we treat Codex as the OpenAI Codex API surface
— same wire protocol as OpenAI Chat Completions — and accept that
extending ModelId to include 'codex' is a future hygiene PR
(P1.5.4 OpenAI may or may not subsume Codex; that decision is also out
of scope here).
The adapter’s default model id (when options.model is absent) is
'gpt-4o-mini' — the lowest-cost member of the Codex/OpenAI cohort per
the δ candidate cohort table (docs/3-world/social/llm.md §candidate
cohort). The adapter does NOT hardcode this as const CODEX_VERSION;
it falls back to options.model ?? DEFAULT_CODEX_MODEL.
Note on default selection: Production callers should pass
options.model explicitly. The default is a safety net to prevent the
adapter from issuing a request with a missing model field (which Codex
would reject with HTTP 400).
§7. Injection seams
The adapter mirrors the Claude adapter’s three injection seams (fetchFn, logger, delayFn) and its apiKey override, and adds a baseUrl override:
| Seam | Purpose | Default |
|---|---|---|
| fetchFn | Mock HTTP transport in tests | Global fetch (Node ≥ 20) |
| logger | Stderr-only logger | console.error |
| delayFn | Sleep override for deterministic retry tests | setTimeout-backed sleep |
| apiKey | Override process.env.COLIBRI_CODEX_API_KEY | process.env.COLIBRI_CODEX_API_KEY |
| baseUrl | Override process.env.COLIBRI_CODEX_BASE_URL | process.env.COLIBRI_CODEX_BASE_URL ?? CODEX_API_BASE |
The baseUrl seam is new (not present in the Claude adapter).
Rationale: the Claude API base URL is effectively a constant (Anthropic
runs no alternate gateway), but Codex’s OpenAI-protocol surface is
served by many compatible gateways (Azure OpenAI, vLLM, llama.cpp,
LM Studio, Ollama with /v1/chat/completions shim, etc.). Test parity
also benefits: tests can point at a localhost mock without monkey-patching
the fetchFn.
§8. Logging
Per CLAUDE.md §9 + integrations §3 — stderr only. Never process.stdout
(donor bug mitigation — StdioServerTransport owns stdout). Log line
format mirrors the Claude adapter:
[codex] model=<resolved-model> prompt_tokens=N completion_tokens=N latency_ms=N
The prefix [codex] (lower-case, in square brackets) distinguishes
Codex log lines from [claude] lines so a multi-provider log file
remains greppable.
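A small sketch of how the log line might be produced through the injected logger seam. The function name `logCodexCompletion` and the `Logger` type are assumptions; only the line format and the stderr-only rule come from this audit.

```typescript
// The adapter's logger seam; per §7 the default would be console.error (stderr).
type Logger = (line: string) => void;

function logCodexCompletion(
  logger: Logger,
  model: string,
  promptTokens: number,
  completionTokens: number,
  latencyMs: number,
): void {
  // Never write to stdout here: the stdio transport owns it.
  logger(
    `[codex] model=${model} prompt_tokens=${promptTokens} ` +
      `completion_tokens=${completionTokens} latency_ms=${latencyMs}`,
  );
}
```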
§9. Retry policy
Identical to the Claude adapter:
- Retry on 429 + 5xx (isRetryable)
- Max 3 retries (4 total attempts)
- Exponential backoff: 100 ms → 200 ms → 400 ms
- On retries exhausted: throw CodexApiError with code CODEX_RETRIES_EXHAUSTED
- On network-level error (DNS fail, ECONNREFUSED): throw CodexApiError with status: undefined and code CODEX_API_ERROR
No deviation from the Claude policy is justified — the adapter pair must behave identically under load.
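The policy above can be sketched as a loop with an injected transport and delay. `MAX_RETRIES`, `BASE_DELAY_MS`, `isRetryable`, and `attemptWithRetry` are names from §3; the signature shown here, the `MinimalResponse` type, the `backoffDelayMs` helper, and treating a network-level error as terminal (no retry) are assumptions for illustration.

```typescript
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 100;

class CodexApiError extends Error {
  constructor(
    message: string,
    readonly code: 'CODEX_API_ERROR' | 'CODEX_RETRIES_EXHAUSTED',
    readonly status?: number,
  ) {
    super(message);
    this.name = 'CodexApiError';
  }
}

// Hypothetical subset of a fetch Response, enough for this sketch.
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay before retry number retryIndex (0-based): 100, 200, 400 ms.
function backoffDelayMs(retryIndex: number): number {
  return BASE_DELAY_MS * Math.pow(2, retryIndex);
}

async function attemptWithRetry(
  send: () => Promise<MinimalResponse>,
  delayFn: (ms: number) => Promise<void>,
): Promise<unknown> {
  let lastStatus: number | undefined;
  // One initial attempt plus up to MAX_RETRIES retries.
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    let res: MinimalResponse;
    try {
      res = await send();
    } catch (err) {
      // Network-level failure: no HTTP status is available.
      throw new CodexApiError(`network error: ${String(err)}`, 'CODEX_API_ERROR');
    }
    if (res.ok) return res.json();
    if (!isRetryable(res.status)) {
      throw new CodexApiError(`HTTP ${res.status}`, 'CODEX_API_ERROR', res.status);
    }
    lastStatus = res.status;
    if (attempt < MAX_RETRIES) await delayFn(backoffDelayMs(attempt));
  }
  throw new CodexApiError('retries exhausted', 'CODEX_RETRIES_EXHAUSTED', lastStatus);
}
```

Passing a no-op `delayFn` in tests makes the backoff schedule observable without real waiting, which is the purpose of the delayFn seam in §7.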
§10. Test plan (forward-look — fully spec’d in Step 3 packet)
Test file: src/__tests__/domains/router/adapters/codex.test.ts
Parallel to src/__tests__/domains/integrations/claude.test.ts —
re-pointed at the Codex endpoint with OpenAI-shape response bodies.
Test groups:
- createCodexCompletion: success path (3–4 tests)
  - Returns CompletionResult with correct fields
  - POSTs to /chat/completions
  - Includes Authorization: Bearer <key> header (no Anthropic-version)
  - Request body shape: model + max_tokens + messages
  - System prompt placement: prepended to messages array
- createCodexCompletionWithTools: success path (2–3 tests)
  - Translates AnthropicTool[] to OpenAI tool shape
  - Empty tools array omits tools key
  - Tool-use response translated back to Anthropic content shape
- API key validation (2 tests)
  - Missing COLIBRI_CODEX_API_KEY ⇒ CodexConfigError
  - Injected apiKey overrides process.env
- Retry logic (1–2 tests)
  - 429 retry with exponential backoff
- Finish-reason normalisation (1 test, table-driven)
  - All 5 mappings table-tested
- Tool-use mapping (1 test)
  - Codex tool_calls projected to Anthropic content shape; function.arguments (JSON string) is parsed
Total: 10–13 tests. The dispatch packet budgets “5–10 parity tests”; we exceed the upper bound deliberately because the tool-use mapping requires its own coverage layer that the Claude adapter does not need.
§11. Acceptance criteria (from dispatch packet, restated)
- createCodexCompletion(prompt, options) → Promise<CompletionResult> matches Claude shape
- createCodexCompletionWithTools(prompt, tools, options) → Promise<CompletionResult> matches Claude shape
- Reads COLIBRI_CODEX_API_KEY at call-time (not import-time)
- Reads COLIBRI_CODEX_BASE_URL with default to OpenAI Chat Completions URL
- Translates Codex tool_calls response into Anthropic-SDK tool-shape
- Injection seams fetchFn, logger, delayFn present (+ apiKey, baseUrl)
- CodexApiError + CodexConfigError extend Error with shape parity to AnthropicApiError / AnthropicConfigError
- 5–10 parity tests (this slice: 10–13)
- No MCP tool registration
- No mutation of src/domains/router/index.ts (CRITICAL OVERRIDE)
- npm run build && npm run lint && npm test green
- Zero regression vs main 89adef66 (3153 tests baseline)
§12. Risk register
| Risk | Severity | Mitigation |
|---|---|---|
| Sibling parallel race on index.ts | HIGH | Adapter file is the only src/domains/router/ change. Re-export deferred to fold-in commit per dispatch packet CRITICAL OVERRIDE. |
| process.env.COLIBRI_CODEX_API_KEY leakage in test env | LOW | Tests inject apiKey explicitly; never rely on real env var. Test fixture uses 'sk-codex-test-fake-key'. |
| Codex base URL leakage to production | LOW | baseUrl override is opt-in; default constant is the public OpenAI endpoint. |
| JSON.parse on Codex tool args raising | MEDIUM | Adapter catches parse error, leaves arguments as the raw string (input: <string> instead of <object>). Logged at WARN. Documented behaviour in contract §4.4.3. |
| Network error tests flaky on Windows | LOW | Tests inject fetchFn for all network paths. No real fetch ever fires in the test suite. |
| ModelId lacks 'codex' | LOW | Adapter is ModelId-free at this layer. Router wiring is P1.5.5+ and not in scope. |
| enabled = 0 in candidate seed for OpenAI/Codex rows | LOW | The adapter is callable regardless of seed flag; the router chooses whether to call it. Not the adapter’s concern. |
| Test count baseline drift (3153 → 3153 + Δ) | LOW | Expected. New tests add to the suite; no existing tests should fail. |
§13. Tool-use mapping table (will be repeated verbatim in the contract doc)
| Direction | Field on Anthropic side | Field on Codex side | Conversion |
|---|---|---|---|
| Req: tool def | tools[i] flat | tools[i].function nested | Wrap inside {type:'function', function:{...}} |
| Req: tool def | tools[i].name | tools[i].function.name | Move nested |
| Req: tool def | tools[i].description | tools[i].function.description | Move nested |
| Req: tool def | tools[i].input_schema | tools[i].function.parameters | Move + rename |
| Req: system prompt | top-level system key | first message {role:'system'} | Restructure |
| Resp: text content | content[i].text (where type:'text') | choices[0].message.content | Direct extract |
| Resp: tool call | content[i] (where type:'tool_use') | choices[0].message.tool_calls[j] | Different array location |
| Resp: tool call id | content[i].id | tool_calls[j].id | Pass-through (Codex prefix preserved) |
| Resp: tool call name | content[i].name | tool_calls[j].function.name | Un-nest |
| Resp: tool call args | content[i].input (object) | tool_calls[j].function.arguments (JSON string) | JSON.parse |
| Resp: finish reason | stop_reason | choices[0].finish_reason | Table §4.5 |
| Resp: prompt tokens | usage.input_tokens | usage.prompt_tokens | Rename |
| Resp: completion tokens | usage.output_tokens | usage.completion_tokens | Rename |
§14. Step 1 exit gate
This audit is complete. Findings:
- The Codex adapter’s surface is fully derivable from the Claude reference, modulo the OpenAI-vs-Anthropic wire-shape divergences documented in §4 + §13.
- No imports outside process.env reads are required.
- No new runtime dependency.
- No src/domains/router/index.ts mutation.
- No MCP tool registration.
Step 2 (contract) may proceed.
Commit message (per CLAUDE.md §6 template):
audit(p1-5-3-codex-adapter): inventory adapter surface + Codex API divergences