P1.5.3 — Codex Adapter — Step 1 Audit

Round: R92, Wave 3 (parallel slice 2/3), p1-5-3-codex-adapter
Branch: feature/p1-5-3-codex-adapter
Worktree: .worktrees/claude/p1-5-3-codex-adapter
Base SHA: 89adef66 (post-P1.5.1 feat(p1-5-1-scoring) merge; real scorer live)
Step: 1 of 5 (audit)
Author tier: T3 executor
Authoring agent: Claude Code (Opus 4.7 1M)


§1. Goal of P1.5.3

Ship a Codex completion adapter under src/domains/router/adapters/codex.ts that mirrors the surface of the Phase 0 Claude wrapper (src/domains/integrations/claude.ts) and ALSO mirrors the sibling Kimi adapter (P1.5.2 — not yet merged at our base SHA; we proceed structurally from the Claude reference per dispatch packet “mirror the Claude adapter; consistency emerges from both sides mirroring the same reference”).

The adapter is library-only. No MCP tool is registered. Re-export coordination through src/domains/router/index.ts is explicitly forbidden for this slice (sibling parallel T3 executors race on that file; a fold-in commit lands re-exports between Wave 3 and Wave 4).

The dispatch packet imposes a tighter writeback than the staging file: REAL implementation, NO STUBS. Per T0 autonomous mandate, the adapter issues a real Codex API call shape with full fetchFn injection — not a placeholder that throws “not implemented”.


§2. Surfaces in scope

2.1 Files to create (this slice)

| Path | Role | Lines (est) |
| --- | --- | --- |
| src/domains/router/adapters/codex.ts | Adapter module | ~400 |
| src/__tests__/domains/router/adapters/codex.test.ts | Parity tests | ~500 |
| docs/audits/p1-5-3-codex-adapter-audit.md | This file | |
| docs/contracts/p1-5-3-codex-adapter-contract.md | Step 2 | |
| docs/packets/p1-5-3-codex-adapter-packet.md | Step 3 | |
| docs/verification/p1-5-3-codex-adapter-verification.md | Step 5 | |

2.2 Files explicitly NOT in scope

| Path | Reason |
| --- | --- |
| src/domains/router/index.ts | Sibling parallel race (P1.5.2 Kimi + P1.5.4 OpenAI). Fold-in commit owns this. |
| src/domains/router/scoring.ts | δ scoring (P1.5.1, already shipped). Adapter does not reach back. |
| src/domains/router/fallback.ts | δ fallback chain (P0.5.2). Adapter is a leaf; fallback wiring is P1.5.5+. |
| src/domains/router/scoring-weights.ts | κ-shim. Pure; nothing to do with adapters. |
| src/domains/integrations/claude.ts | Reference template only. Read, do not modify. |
| src/config.ts | COLIBRI_CODEX_* is read at call-time from process.env, not added to the Zod schema. Same pattern as ANTHROPIC_API_KEY. |
| package.json | No new dependency. Codex adapter uses global fetch. |

2.3 Files referenced (read-only inputs to the audit)

| Path | Role in audit |
| --- | --- |
| src/domains/integrations/claude.ts | The structural template. Lines 1–391 read in full. |
| src/domains/router/scoring.ts | Defines the ModelId union; Codex is not a member (see §6 for the consequence). |
| src/domains/router/fallback.ts | Defines CompletionFn + CompletionFnOptions + RouteResult. The adapter’s exports must be assignable to CompletionFn. |
| src/__tests__/domains/integrations/claude.test.ts | Test patterns to mirror: makeMockFetch, makeSilentLogger, makeInstantDelay, baseOptions. |
| src/config.ts | Confirms ANTHROPIC_API_KEY is z.string().optional(), the same call-time validation pattern Codex follows. |

§3. Phase 0 Claude adapter — surface inventory

The Claude adapter (src/domains/integrations/claude.ts) is the structural template. Its public surface, captured by direct reading:

3.1 Exported public types

| Symbol | Kind | Notes |
| --- | --- | --- |
| AnthropicTool | interface | { name: string; description: string; input_schema: Record<string, unknown> }. Reused by the Codex adapter unchanged, because the router’s tool-shape is Anthropic’s. |
| CompletionOptions | interface | Optional model, maxTokens, systemPrompt, fetchFn, logger, delayFn, apiKey. The Codex adapter declares its own CodexCompletionOptions with the same fields (no inheritance; copy-paste-rename is the cited pattern). |
| CompletionResult | interface | { content, model, promptTokens, completionTokens, latencyMs, stopReason }; all readonly, all strings/numbers. The adapter MUST return this shape byte-identical so the router is provider-agnostic. |

3.2 Exported error classes

| Symbol | Code | Notes |
| --- | --- | --- |
| AnthropicConfigError | 'ANTHROPIC_CONFIG_ERROR' | Thrown at call-time when ANTHROPIC_API_KEY is absent. |
| AnthropicApiError | 'ANTHROPIC_API_ERROR' \| 'ANTHROPIC_RETRIES_EXHAUSTED' | Thrown on terminal HTTP error or exhausted retries. Carries optional status. |

The Codex adapter ships parallel classes: CodexConfigError / CodexApiError with codes CODEX_CONFIG_ERROR / CODEX_API_ERROR / CODEX_RETRIES_EXHAUSTED.
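A minimal sketch of what the parallel classes could look like. The error codes come from the paragraph above; the constructor signatures and the optional status field layout are assumptions for illustration, not copied from claude.ts.

```typescript
// Assumed layout: the real AnthropicApiError constructor may order its arguments differently.
type CodexApiErrorCode = 'CODEX_API_ERROR' | 'CODEX_RETRIES_EXHAUSTED';

class CodexConfigError extends Error {
  readonly code = 'CODEX_CONFIG_ERROR' as const;

  constructor(message: string) {
    super(message);
    this.name = 'CodexConfigError';
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class CodexApiError extends Error {
  readonly code: CodexApiErrorCode;
  // undefined for network-level failures that never produced an HTTP status.
  readonly status: number | undefined;

  constructor(message: string, code: CodexApiErrorCode, status?: number) {
    super(message);
    this.name = 'CodexApiError';
    this.code = code;
    this.status = status;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
```

Callers can then branch on err.code rather than on message text, which is the property the parity tests would most likely assert.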

3.3 Exported entry points

| Symbol | Sig | Notes |
| --- | --- | --- |
| createCompletion(prompt, options?) | (string, CompletionOptions) → Promise<CompletionResult> | Plain text completion. |
| createCompletionWithTools(prompt, tools, options?) | (string, AnthropicTool[], CompletionOptions) → Promise<CompletionResult> | Tool-use completion; empty tools ⇒ degrades to plain. |

Codex adapter parallels: createCodexCompletion / createCodexCompletionWithTools with identical positional signatures.

3.4 Internal constants

| Constant | Value | Codex equivalent |
| --- | --- | --- |
| ANTHROPIC_API_BASE | 'https://api.anthropic.com/v1' | CODEX_API_BASE (see §4.1) |
| ANTHROPIC_API_VERSION | '2023-06-01' (header anthropic-version) | Codex has no equivalent; header omitted |
| DEFAULT_MAX_TOKENS | 1024 | Same |
| MAX_RETRIES | 3 | Same |
| BASE_DELAY_MS | 100 | Same |

3.5 Internal helpers (will be re-implemented in codex.ts)

| Helper | Purpose | Reuse strategy |
| --- | --- | --- |
| isRetryable(status) | True for 429 / 5xx | Copy verbatim (same HTTP semantics) |
| sleep(ms) | Real setTimeout wrapper | Copy verbatim |
| buildRequestBody(...) | Builds Anthropic Messages API JSON body | Replaced: Codex uses chat-completion shape (messages: [{role, content}] differs; tool_choice handling differs) |
| parseResult(json, startMs) | Extracts text or stringified tool_use from Anthropic content array | Replaced: Codex returns OpenAI-style choices[0].message.content + tool_calls |
| attemptWithRetry(...) | Core retry loop, same flow | Cloned with Codex-specific URL + headers + body builder + parser |

§4. Codex API — divergences from Anthropic Messages API

Codex (the OpenAI Codex API surface) follows the OpenAI Chat Completions v1 protocol — the same wire-shape used by gpt-4o, gpt-4o-mini, and o*-family models. The reference assumed in this audit is the public OpenAI Chat Completions API as of the dispatch packet date (2026-05-13); divergences listed below are the load-bearing differences the adapter must compensate for.

4.1 Endpoint + auth

| Aspect | Anthropic | Codex | Adapter handling |
| --- | --- | --- | --- |
| Base URL | https://api.anthropic.com/v1 | https://api.openai.com/v1 (default) | COLIBRI_CODEX_BASE_URL env override; default constant CODEX_API_BASE |
| Endpoint path | /messages | /chat/completions | Adapter constant: CODEX_CHAT_COMPLETIONS_PATH = '/chat/completions' |
| Auth header | x-api-key: <key> | Authorization: Bearer <key> | Different header construction |
| Version header | anthropic-version: 2023-06-01 | None | Omit |
| Content-Type | application/json | application/json | Same |
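
The endpoint and header differences above could be sketched as follows. CODEX_API_BASE and CODEX_CHAT_COMPLETIONS_PATH are the constants named in the table; the helper names buildCodexUrl and buildCodexHeaders are hypothetical, and trimming trailing slashes from an override is an assumption rather than documented behaviour.

```typescript
const CODEX_API_BASE = 'https://api.openai.com/v1';
const CODEX_CHAT_COMPLETIONS_PATH = '/chat/completions';

// Hypothetical helper: joins the (possibly overridden) base URL and the fixed path.
function buildCodexUrl(baseUrl: string = CODEX_API_BASE): string {
  // Assumption: tolerate overrides written with a trailing slash.
  return baseUrl.replace(/\/+$/, '') + CODEX_CHAT_COMPLETIONS_PATH;
}

// Hypothetical helper: Bearer auth, JSON content type, no version header.
function buildCodexHeaders(apiKey: string): Record<string, string> {
  return {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}`,
  };
}
```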

4.2 Request body

| Aspect | Anthropic (Messages) | Codex (Chat Completions) | Adapter handling |
| --- | --- | --- | --- |
| Model key | model | model | Same |
| Max-tokens key | max_tokens | max_tokens (legacy) or max_completion_tokens (newer) | Use max_tokens for compat with Codex / GPT-4o legacy chat path |
| System prompt | Top-level system: "..." | First element of messages array with role: "system" | Conditional message prepend (see §4.2.1) |
| Messages array | messages: [{role: 'user', content: '...'}] | Same shape: messages: [{role: 'user', content: '...'}] | Same skeleton; system role differs in placement |
| Tools array | tools: [{name, description, input_schema}] | tools: [{type: 'function', function: {name, description, parameters}}] | Translation required: AnthropicTool[] → OpenAI tool shape (see §4.4) |
| Tool choice | Implicit (model decides); explicit via tool_choice | Same key tool_choice, different values ('auto' vs 'none' vs {type: 'function', function: {name: 'X'}}) | Phase 1.5: omit, so the model decides. Could expand in Phase 2. |
| Stream | stream: false (default) | stream: false (default) | Both adapters non-streaming in Phase 1.5 |

4.2.1 System prompt placement

The Anthropic API takes a top-level system: "..." field. Codex takes the system instruction as the first message in the messages array ({role: 'system', content: '...'}), with the user message following.

When no system prompt is provided, the Claude adapter omits the system field and the Codex adapter omits the system message entirely.
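
As an illustration of the conditional prepend, here is a small sketch of a Codex request-body builder. The function name buildCodexRequestBody and its positional parameters are hypothetical; only the wire keys (model, max_tokens, messages, role, content) come from §4.2.

```typescript
interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

interface CodexRequestBody {
  model: string;
  max_tokens: number;
  messages: ChatMessage[];
}

// Hypothetical builder: the system message is prepended only when a prompt was supplied.
function buildCodexRequestBody(
  prompt: string,
  model: string,
  maxTokens: number,
  systemPrompt?: string,
): CodexRequestBody {
  const messages: ChatMessage[] = [];
  if (systemPrompt !== undefined) {
    messages.push({ role: 'system', content: systemPrompt });
  }
  messages.push({ role: 'user', content: prompt });
  return { model, max_tokens: maxTokens, messages };
}
```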

4.3 Response body

| Aspect | Anthropic | Codex | Adapter handling |
| --- | --- | --- | --- |
| Content shape | content: [{type: 'text', text: '...'}] (array of blocks) | choices: [{message: {role: 'assistant', content: '...'}, finish_reason: '...'}] | Different extraction path |
| Token usage | usage: {input_tokens, output_tokens} | usage: {prompt_tokens, completion_tokens, total_tokens} | Key rename: prompt_tokens → promptTokens, completion_tokens → completionTokens |
| Model echo | model: 'claude-X-Y' | model: 'gpt-X-Y' (echoed from request) | Same, pass through |
| Stop reason | stop_reason: 'end_turn' \| 'tool_use' \| 'max_tokens' \| 'stop_sequence' | choices[0].finish_reason: 'stop' \| 'tool_calls' \| 'length' \| 'content_filter' | Normalisation required (see §4.5) |
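
The extraction path and key renames for a plain text response might look roughly like this. The function name extractTextAndUsage is hypothetical, the stop reason is deliberately left out (it is covered by §4.5), and returning an empty string when choices is empty or content is null is an assumption.

```typescript
// Subset of the Chat Completions response that this sketch reads.
interface CodexTextResponse {
  model: string;
  choices: Array<{ message: { content: string | null } }>;
  usage: { prompt_tokens: number; completion_tokens: number };
}

interface TextAndUsage {
  content: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical helper: pulls text from choices[0] and renames the usage keys.
function extractTextAndUsage(json: CodexTextResponse): TextAndUsage {
  const first = json.choices[0];
  return {
    // Assumption: a missing choice or null content collapses to an empty string.
    content: first?.message.content ?? '',
    model: json.model,
    promptTokens: json.usage.prompt_tokens,
    completionTokens: json.usage.completion_tokens,
  };
}
```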

4.4 Tool-use mapping

This is the single most divergent surface between the two adapters. The router-facing contract is Anthropic-shaped: the input is AnthropicTool[], and the response content is a JSON string of Anthropic tool_use blocks. The Codex adapter is responsible for the bi-directional translation.

4.4.1 Tool declaration (input)

AnthropicTool                                  OpenAI tool
─────────────────────────────────────         ──────────────────────────────────
{                                              {
  name: "get_weather",                           type: "function",
  description: "...",                            function: {
  input_schema: {                                  name: "get_weather",
    type: "object",                                description: "...",
    properties: { ... },                           parameters: {
    required: ["location"]                            type: "object",
  }                                                   properties: { ... },
}                                                     required: ["location"]
                                                   }
                                                 }
                                               }

The translation is mechanical:

  • AnthropicTool.name → tool.function.name
  • AnthropicTool.description → tool.function.description
  • AnthropicTool.input_schema → tool.function.parameters
  • type: 'function' is added as a literal constant
  • The whole tool moves inside a function: { ... } sub-object
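
The mechanical translation listed above can be sketched as a single pure function. The AnthropicTool shape is taken from §3.1; the OpenAiTool interface and the name toOpenAiTool are hypothetical labels chosen for this example.

```typescript
interface AnthropicTool {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Illustrative name for the Chat Completions tool declaration shape.
interface OpenAiTool {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps the flat Anthropic tool in the nested function object.
function toOpenAiTool(tool: AnthropicTool): OpenAiTool {
  return {
    type: 'function',
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.input_schema, // input_schema is renamed to parameters
    },
  };
}
```

Applied over a list, this would be tools.map(toOpenAiTool); per §3.3, an empty tools array would skip the tools key altogether.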

4.4.2 Tool call (output)

Anthropic content block                       OpenAI tool_calls element
───────────────────────────────────────      ─────────────────────────────────────────
{                                              {
  type: "tool_use",                              id: "call_001",
  id: "tool_001",                                type: "function",
  name: "get_weather",                           function: {
  input: { location: "London" }                    name: "get_weather",
}                                                  arguments: "{\"location\":\"London\"}"
                                                 }
                                               }

Key differences:

  • Codex puts tool calls in choices[0].message.tool_calls[], not in the content array (Codex’s content is null when tools are used).
  • Codex’s function.arguments is a JSON string, not a parsed object. The adapter parses it before re-projecting to Anthropic shape.
  • Codex tool IDs are prefixed call_; Anthropic tool IDs are typically prefixed toolu_ but may vary. The adapter preserves the Codex ID verbatim, since IDs are opaque to downstream callers.

4.4.3 Response content normalisation

The Claude adapter’s parseResult extracts content[0].text for text responses and stringifies the whole content array for tool-use responses. The Codex adapter mirrors this contract by synthesising an Anthropic-shape content array when Codex returns tool_calls:

const anthropicShapeContent = toolCalls.map((c) => ({
  type: 'tool_use',
  id: c.id,
  name: c.function.name,
  input: JSON.parse(c.function.arguments),
}));
return JSON.stringify(anthropicShapeContent);

This keeps the CompletionResult.content field byte-shape-identical to what the Claude adapter returns for a tool-use response.
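
The snippet above calls JSON.parse directly, while the risk register (§12) says a malformed arguments string should fall back to the raw string instead of throwing. A hedged sketch of that variant follows; the names parseToolArguments and toAnthropicContent are hypothetical, and the WARN log mentioned in §12 is left out to keep the example self-contained.

```typescript
interface CodexToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface AnthropicToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: unknown;
}

// Hypothetical helper: returns the parsed object, or the raw string if parsing fails.
function parseToolArguments(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}

// Hypothetical helper: projects Codex tool_calls onto a stringified Anthropic-shape array.
function toAnthropicContent(toolCalls: CodexToolCall[]): string {
  const blocks = toolCalls.map(
    (call): AnthropicToolUseBlock => ({
      type: 'tool_use',
      id: call.id, // Codex ID preserved verbatim
      name: call.function.name,
      input: parseToolArguments(call.function.arguments),
    }),
  );
  return JSON.stringify(blocks);
}
```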

4.5 Finish-reason normalisation

The CompletionResult.stopReason field uses the Anthropic vocabulary ('end_turn', 'tool_use', 'max_tokens', 'stop_sequence'). Codex returns OpenAI vocabulary; we normalise:

| Codex finish_reason | Normalised stopReason |
| --- | --- |
| 'stop' | 'end_turn' |
| 'tool_calls' | 'tool_use' |
| 'length' | 'max_tokens' |
| 'content_filter' | 'content_filter' (passed through; no Anthropic equivalent) |
| 'function_call' (deprecated) | 'tool_use' |
| absent / null | 'unknown' |

Per the dispatch packet (“Document any Codex-vs-Anthropic API divergences in the contract doc + tool-use mapping table.”), this table is reproduced in the contract doc §4.
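
The §4.5 mapping is small enough to express as a switch. This is a sketch only: the function name normaliseFinishReason is hypothetical, and sending unrecognised non-null values to 'unknown' is an assumption, since the table only specifies the absent / null case.

```typescript
// Hypothetical helper mirroring the §4.5 table.
function normaliseFinishReason(finishReason: string | null | undefined): string {
  switch (finishReason) {
    case 'stop':
      return 'end_turn';
    case 'tool_calls':
    case 'function_call': // deprecated OpenAI value, same meaning
      return 'tool_use';
    case 'length':
      return 'max_tokens';
    case 'content_filter':
      return 'content_filter'; // passed through; no Anthropic equivalent
    default:
      // Covers null and undefined per the table; other strings land here by assumption.
      return 'unknown';
  }
}
```

A table-driven test can iterate over the six rows and compare each output against the right-hand column.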

4.6 Error response shape

| Aspect | Anthropic | Codex | Adapter handling |
| --- | --- | --- | --- |
| 4xx body | {type: 'error', error: {type, message}} | {error: {message, type, param, code}} | The adapter does NOT introspect either body; both throw with HTTP status only. Phase 1.5 keeps error parity at the HTTP-status level (4xx terminal, 429 retry, 5xx retry). |
| Rate-limit signalling | 429 Too Many Requests | 429 Too Many Requests | Same: isRetryable returns true. |

§5. Environment variables

The adapter reads exactly two env vars at call-time:

| Variable | Required | Default | Purpose |
| --- | --- | --- | --- |
| COLIBRI_CODEX_API_KEY | call-time required | none | Bearer token for Authorization header. If absent, throws CodexConfigError. |
| COLIBRI_CODEX_BASE_URL | optional | 'https://api.openai.com/v1' (constant) | Override for self-hosted or alternative gateways. |

Both variables follow the call-time-validate pattern (CLAUDE.md §T0 decision 3), matching how ANTHROPIC_API_KEY is consumed by createCompletion. Neither is added to src/config.ts’s Zod schema — the Phase 0 schema floor stays minimal, and the adapter reads directly from process.env (or accepts an injected apiKey / baseUrl override in CodexCompletionOptions for tests).
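
A sketch of the call-time resolution order described above: injected override first, then the environment, then the constant default for the base URL. The function name resolveCodexConfig is hypothetical; it takes the environment as a parameter (the adapter would pass process.env) so the example stays self-contained, and it throws a plain Error where the adapter would throw CodexConfigError. Treating an empty-string key as absent is an assumption.

```typescript
type EnvLike = Record<string, string | undefined>;

interface ResolvedCodexConfig {
  apiKey: string;
  baseUrl: string;
}

// Hypothetical helper: runs on every call, never at import time.
function resolveCodexConfig(
  env: EnvLike,
  overrides: { apiKey?: string; baseUrl?: string } = {},
): ResolvedCodexConfig {
  const apiKey = overrides.apiKey ?? env['COLIBRI_CODEX_API_KEY'];
  if (apiKey === undefined || apiKey === '') {
    // The adapter would throw CodexConfigError here.
    throw new Error('CODEX_CONFIG_ERROR: COLIBRI_CODEX_API_KEY is not set');
  }
  const baseUrl =
    overrides.baseUrl ?? env['COLIBRI_CODEX_BASE_URL'] ?? 'https://api.openai.com/v1';
  return { apiKey, baseUrl };
}
```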

The AMS_* namespace is forbidden by assertNoDonorNamespace in config.ts; the adapter never touches AMS_* keys.


§6. ModelId membership

The δ ModelId union (src/domains/router/scoring.ts:110-119) does not currently include 'codex'. The 9 members are:

'claude' | 'claude-sonnet-3-5' | 'claude-haiku-3-5' | 'gpt-4o' |
'gpt-4o-mini' | 'gemini-1-5-pro' | 'llama-3-3-70b' | 'mixtral-8x22b' |
'kimi-k2'

Consequence: the Codex adapter does NOT take a ModelId parameter. It is structurally a CompletionFn-compatible callable that returns a CompletionResult. The router’s ModelId-keyed dispatch (which lands later, in P1.5.5) will need a way to associate a ModelId value with the Codex adapter; that mapping is out of scope for P1.5.3.

For Phase 1.5 staging, we treat Codex as the OpenAI Codex API surface — same wire protocol as OpenAI Chat Completions — and accept that extending ModelId to include 'codex' is a future hygiene PR (P1.5.4 OpenAI may or may not subsume Codex; that decision is also out of scope here).

The adapter’s default model id (when options.model is absent) is 'gpt-4o-mini' — the lowest-cost member of the Codex/OpenAI cohort per the δ candidate cohort table (docs/3-world/social/llm.md §candidate cohort). The adapter does NOT hardcode this as const CODEX_VERSION; it falls back to options.model ?? DEFAULT_CODEX_MODEL.

Note on default selection: Production callers should pass options.model explicitly. The default is a safety net to prevent the adapter from issuing a request with a missing model field (which Codex would reject with HTTP 400).


§7. Injection seams

The adapter mirrors the Claude adapter’s three injection seams plus an additional URL-override:

| Seam | Purpose | Default |
| --- | --- | --- |
| fetchFn | Mock HTTP transport in tests | Global fetch (Node ≥ 20) |
| logger | Stderr-only logger | console.error |
| delayFn | Sleep override for deterministic retry tests | setTimeout-backed sleep |
| apiKey | Override process.env.COLIBRI_CODEX_API_KEY | process.env.COLIBRI_CODEX_API_KEY |
| baseUrl | Override process.env.COLIBRI_CODEX_BASE_URL | process.env.COLIBRI_CODEX_BASE_URL ?? CODEX_API_BASE |

The baseUrl seam is new (not present in the Claude adapter). Rationale: the Claude API base URL is effectively a constant (Anthropic runs no alternate gateway), but Codex’s OpenAI-protocol surface is served by many compatible gateways (Azure OpenAI, vLLM, llama.cpp, LM Studio, Ollama with /v1/chat/completions shim, etc.). Test parity also benefits: tests can point at a localhost mock without monkey-patching the fetchFn.


§8. Logging

Per CLAUDE.md §9 + integrations §3 — stderr only. Never process.stdout (donor bug mitigation — StdioServerTransport owns stdout). Log line format mirrors the Claude adapter:

[codex] model=<resolved-model> prompt_tokens=N completion_tokens=N latency_ms=N

The prefix [codex] (lower-case, in square brackets) distinguishes Codex log lines from [claude] lines so a multi-provider log file remains greppable.
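
For illustration, the log line could be produced by a small formatter and then handed to the injected logger seam. The function name formatCodexLogLine is hypothetical; the field names and order follow the format shown above.

```typescript
// Hypothetical formatter: returns the line, leaving the stderr write to the logger seam.
function formatCodexLogLine(
  model: string,
  promptTokens: number,
  completionTokens: number,
  latencyMs: number,
): string {
  return (
    `[codex] model=${model} prompt_tokens=${promptTokens} ` +
    `completion_tokens=${completionTokens} latency_ms=${latencyMs}`
  );
}
```

Keeping the formatter pure means a test can assert on the string without capturing stderr.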


§9. Retry policy

Identical to the Claude adapter:

  • Retry on 429 + 5xx (isRetryable)
  • Max 3 retries (4 total attempts)
  • Exponential backoff: 100 ms → 200 ms → 400 ms
  • On retries exhausted: throw CodexApiError with code CODEX_RETRIES_EXHAUSTED
  • On network-level error (DNS fail, ECONNREFUSED): throw CodexApiError with status: undefined and code CODEX_API_ERROR

No deviation from the Claude policy is justified — the adapter pair must behave identically under load.
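
The policy above can be sketched as a loop over an injected send function and an injected delayFn. This is an assumption-laden outline, not the attemptWithRetry from claude.ts: the send callback stands in for the fetchFn call, plain Error messages stand in for CodexApiError codes, and network-level rejections are not modelled (they simply propagate).

```typescript
interface HttpResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type SendFn = () => Promise<HttpResponseLike>;
type DelayFn = (ms: number) => Promise<void>;

const MAX_RETRIES = 3;
const BASE_DELAY_MS = 100;

function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Up to 1 + MAX_RETRIES attempts, sleeping 100, 200, 400 ms between them.
async function attemptWithRetry(send: SendFn, delayFn: DelayFn): Promise<unknown> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    const res = await send();
    if (res.ok) {
      return res.json();
    }
    if (!isRetryable(res.status)) {
      // Terminal 4xx: the adapter would throw CodexApiError with CODEX_API_ERROR.
      throw new Error(`CODEX_API_ERROR: HTTP ${res.status}`);
    }
    if (attempt < MAX_RETRIES) {
      await delayFn(BASE_DELAY_MS * 2 ** attempt);
    }
  }
  throw new Error('CODEX_RETRIES_EXHAUSTED');
}
```

Passing a delayFn that records its argument and resolves immediately gives deterministic retry tests without real sleeps.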


§10. Test plan (forward-look — fully spec’d in Step 3 packet)

Test file: src/__tests__/domains/router/adapters/codex.test.ts

Parallel to src/__tests__/domains/integrations/claude.test.ts — re-pointed at the Codex endpoint with OpenAI-shape response bodies.

Test groups:

  1. createCodexCompletion — success path (3–4 tests)
    • Returns CompletionResult with correct fields
    • POSTs to /chat/completions
    • Includes Authorization: Bearer <key> header (no Anthropic-version)
    • Request body shape: model + max_tokens + messages
    • System prompt placement: prepended to messages array
  2. createCodexCompletionWithTools — success path (2–3 tests)
    • Translates AnthropicTool[] to OpenAI tool shape
    • Empty tools array omits tools key
    • Tool-use response translated back to Anthropic content shape
  3. API key validation (2 tests)
    • Missing COLIBRI_CODEX_API_KEY → CodexConfigError
    • Injected apiKey overrides process.env
  4. Retry logic (1–2 tests)
    • 429 retry with exponential backoff
  5. Finish-reason normalisation (1 test, table-driven)
    • Every mapping in the §4.5 table is table-tested
  6. Tool-use mapping (1 test)
    • Codex tool_calls projected to Anthropic content shape; function.arguments (JSON string) is parsed

Total: 10–13 tests. The dispatch packet budgets “5–10 parity tests”; we exceed the upper bound deliberately because the tool-use mapping requires its own coverage layer that the Claude adapter does not need.


§11. Acceptance criteria (from dispatch packet, restated)

  • createCodexCompletion(prompt, options) → Promise<CompletionResult> matches Claude shape
  • createCodexCompletionWithTools(prompt, tools, options) → Promise<CompletionResult> matches Claude shape
  • Reads COLIBRI_CODEX_API_KEY at call-time (not import-time)
  • Reads COLIBRI_CODEX_BASE_URL with default to OpenAI Chat Completions URL
  • Translates Codex tool_calls response into Anthropic-SDK tool-shape
  • Injection seams fetchFn, logger, delayFn present (+ apiKey, baseUrl)
  • CodexApiError + CodexConfigError extend Error with shape parity to AnthropicApiError / AnthropicConfigError
  • 5–10 parity tests (this slice: 10–13)
  • No MCP tool registration
  • No mutation of src/domains/router/index.ts (CRITICAL OVERRIDE)
  • npm run build && npm run lint && npm test green
  • Zero regression vs main 89adef66 (3153 tests baseline)

§12. Risk register

| Risk | Severity | Mitigation |
| --- | --- | --- |
| Sibling parallel race on index.ts | HIGH | Adapter file is the only src/domains/router/ change. Re-export deferred to fold-in commit per dispatch packet CRITICAL OVERRIDE. |
| process.env.COLIBRI_CODEX_API_KEY leakage in test env | LOW | Tests inject apiKey explicitly; never rely on real env var. Test fixture uses 'sk-codex-test-fake-key'. |
| Codex base URL leakage to production | LOW | baseUrl override is opt-in; default constant is the public OpenAI endpoint. |
| JSON.parse on Codex tool args raising | MEDIUM | Adapter catches parse error, leaves arguments as the raw string (input: <string> instead of <object>). Logged at WARN. Documented behaviour in contract §4.4.3. |
| Network error tests flaky on Windows | LOW | Tests inject fetchFn for all network paths. No real fetch ever fires in the test suite. |
| ModelId lacks 'codex' | LOW | Adapter is ModelId-free at this layer. Router wiring is P1.5.5+ and not in scope. |
| enabled = 0 in candidate seed for OpenAI/Codex rows | LOW | The adapter is callable regardless of seed flag; the router chooses whether to call it. Not the adapter’s concern. |
| Test count baseline drift (3153 → 3153 + Δ) | LOW | Expected. New tests add to the suite; no existing tests should fail. |

§13. Tool-use mapping table (will be repeated verbatim in the contract doc)

| Direction | Field on Anthropic side | Field on Codex side | Conversion |
| --- | --- | --- | --- |
| Req: tool def | tools[i] flat | tools[i].function nested | Wrap inside {type:'function', function:{...}} |
| Req: tool def | tools[i].name | tools[i].function.name | Move nested |
| Req: tool def | tools[i].description | tools[i].function.description | Move nested |
| Req: tool def | tools[i].input_schema | tools[i].function.parameters | Move + rename |
| Req: system prompt | top-level system key | first message {role:'system'} | Restructure |
| Resp: text content | content[i].text (where type:'text') | choices[0].message.content | Direct extract |
| Resp: tool call | content[i] (where type:'tool_use') | choices[0].message.tool_calls[j] | Different array location |
| Resp: tool call id | content[i].id | tool_calls[j].id | Pass-through (Codex prefix preserved) |
| Resp: tool call name | content[i].name | tool_calls[j].function.name | Un-nest |
| Resp: tool call args | content[i].input (object) | tool_calls[j].function.arguments (JSON string) | JSON.parse |
| Resp: finish reason | stop_reason | choices[0].finish_reason | Table §4.5 |
| Resp: prompt tokens | usage.input_tokens | usage.prompt_tokens | Rename |
| Resp: completion tokens | usage.output_tokens | usage.completion_tokens | Rename |

§14. Step 1 exit gate

This audit is complete. Findings:

  • The Codex adapter’s surface is fully derivable from the Claude reference, modulo the OpenAI-vs-Anthropic wire-shape divergences documented in §4 + §13.
  • No imports outside process.env reads are required.
  • No new runtime dependency.
  • No src/domains/router/index.ts mutation.
  • No MCP tool registration.

Step 2 (contract) may proceed.


Commit message (per CLAUDE.md §6 template): audit(p1-5-3-codex-adapter): inventory adapter surface + Codex API divergences



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
