P1.5.2 Kimi K2 Adapter — Audit

Round: R92, Phase 1.5, Wave 3 (parallel slice 1/3)
Base SHA: 89adef66 (post-P1.5.1 #252 merge)
Branch: feature/p1-5-2-kimi-adapter
Worktree: .worktrees/claude/p1-5-2-kimi-adapter

1. Goal

Inventory:

  1. The Phase 0 Claude adapter (src/domains/integrations/claude.ts) as the reference surface.
  2. The router’s CompletionFn contract (src/domains/router/fallback.ts) which the new Kimi adapter must satisfy.
  3. Kimi K2’s documented HTTP API (Moonshot AI) — the divergences from the Anthropic Messages API.

This sets up the contract (Step 2), packet (Step 3), and implementation (Step 4) for src/domains/router/adapters/kimi.ts.

2. Reference adapter — src/domains/integrations/claude.ts

2.1 Public surface

| Symbol | Kind | Notes |
| --- | --- | --- |
| AnthropicConfigError | class | Thrown when ANTHROPIC_API_KEY is absent at call-time. code = 'ANTHROPIC_CONFIG_ERROR'. |
| AnthropicApiError | class | Thrown on terminal HTTP error or network failure. Carries status and code: 'ANTHROPIC_API_ERROR' \| 'ANTHROPIC_RETRIES_EXHAUSTED'. |
| AnthropicTool | interface | { name, description, input_schema }; matches Anthropic Messages API tools[]. |
| CompletionOptions | interface | Options bag: model, maxTokens, systemPrompt, fetchFn, logger, delayFn, apiKey. |
| CompletionResult | interface | { content, model, promptTokens, completionTokens, latencyMs, stopReason }. |
| createCompletion(prompt, options) | async fn | POST /v1/messages with a single user message. |
| createCompletionWithTools(prompt, tools, options) | async fn | POST /v1/messages with a tools array. |

2.2 Design invariants (mirrored verbatim for parity)

  1. Library-only. No MCP tools registered (P1.5.7 scope).
  2. No new runtime dependency. Uses global fetch (Node ≥ 20).
  3. Stderr-only logging. console.error / injected logger. Never process.stdout (donor bug mitigation — StdioServerTransport owns stdout).
  4. Injection seams. fetchFn, logger, delayFn all injectable.
  5. API key validation at call-time, not module-load-time. Server boots without it; throws KimiConfigError on first call if absent.
  6. Retry policy. 429 + 5xx → exponential backoff, max 3 retries, base delay 100ms (doubling).

2.3 Retry mechanics

  • isRetryable(status) → status === 429 || (status >= 500 && status <= 599).
  • MAX_RETRIES = 3, BASE_DELAY_MS = 100, geometric doubling.
  • Network-level errors (DNS, conn refused) → AnthropicApiError immediately, NOT retried.
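
The retry mechanics above can be sketched as follows. This is a minimal illustration, not the code in claude.ts: requestWithRetry, FetchLike and HttpResponseLike are hypothetical names, and the 100/200/400 ms schedule is an assumed reading of "base delay 100ms (doubling)".

```typescript
// Minimal shapes so the sketch does not depend on DOM or Node typings.
interface HttpResponseLike {
  readonly status: number;
}
type FetchLike = () => Promise<HttpResponseLike>;
type DelayFn = (ms: number) => Promise<void>;

const MAX_RETRIES = 3;
const BASE_DELAY_MS = 100;

function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Hypothetical helper: retries retryable statuses only.
// Non-retryable responses (2xx and terminal 4xx alike) are returned to the caller.
async function requestWithRetry(doFetch: FetchLike, delayFn: DelayFn): Promise<HttpResponseLike> {
  for (let attempt = 0; ; attempt++) {
    // A rejected doFetch() (network-level failure) propagates immediately, with no retry.
    const response = await doFetch();
    if (!isRetryable(response.status)) return response;
    if (attempt >= MAX_RETRIES) {
      throw new Error(`retries exhausted after ${MAX_RETRIES} retries (last status ${response.status})`);
    }
    // Assumed schedule: 100 ms, 200 ms, 400 ms.
    await delayFn(BASE_DELAY_MS * 2 ** attempt);
  }
}
```

Passing delayFn in rather than calling setTimeout directly is what lets tests run the full retry ladder instantly, which is the purpose of the injection seams in §2.2.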

2.4 Logging shape

[claude] model=<model> prompt_tokens=<n> completion_tokens=<n> latency_ms=<ms>

Kimi adapter mirrors this with [kimi] prefix.
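
One way to keep the two log lines identical apart from the prefix is a shared formatter. The sketch below is illustrative only: formatCompletionLog and CompletionLogFields are hypothetical names, not symbols from claude.ts.

```typescript
// Hypothetical subset of CompletionResult needed for the log line.
interface CompletionLogFields {
  readonly model: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
}

// Builds the documented line shape; the caller passes it to the injected logger
// (stderr only, never stdout).
function formatCompletionLog(prefix: 'claude' | 'kimi', fields: CompletionLogFields): string {
  return (
    `[${prefix}] model=${fields.model}` +
    ` prompt_tokens=${fields.promptTokens}` +
    ` completion_tokens=${fields.completionTokens}` +
    ` latency_ms=${fields.latencyMs}`
  );
}
```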

3. Router contract — src/domains/router/fallback.ts

3.1 CompletionFn signature

export type CompletionFn = (
  prompt: string,
  options: CompletionFnOptions,
) => Promise<CompletionResult>;

export interface CompletionFnOptions {
  readonly model?: string;
  readonly maxTokens?: number;
  readonly systemPrompt?: string;
  readonly apiKey?: string;
  readonly fetchFn?: typeof fetch;
  readonly logger?: (...args: unknown[]) => void;
  readonly delayFn?: (ms: number) => Promise<void>;
}

The Kimi adapter’s createKimiCompletion and createKimiCompletionWithTools must satisfy this so they are drop-in replacements for the Claude adapter in the fallback chain (Phase 1.5 W4 P1.5.5).

3.2 CompletionResult shape (from claude.ts)

export interface CompletionResult {
  readonly content: string;
  readonly model: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
  readonly stopReason: string;
}

Kimi adapter returns the SAME CompletionResult (imported via type re-export from ../integrations/claude.js).

4. Kimi K2 API — divergences from Anthropic

Kimi K2 (Moonshot AI) exposes an OpenAI-compatible Chat Completions endpoint. Documented base URL: https://api.moonshot.ai/v1. Path: /chat/completions relative to that base URL (full path /v1/chat/completions, so the /v1 segment must not be appended twice). Auth: Authorization: Bearer <key> header (NOT x-api-key).

4.1 Request shape

| Anthropic (/v1/messages) | Kimi K2 (/v1/chat/completions) |
| --- | --- |
| model: string | model: string (e.g. kimi-k2-0905-preview) |
| max_tokens: number (required) | max_tokens: number (optional; default 1024) |
| system: string (top-level field) | messages[0] with role: 'system' |
| messages: [{role,content}] | messages: [{role,content}] |
| tools: AnthropicTool[] ({name, description, input_schema}) | tools: [{type: 'function', function: {name, description, parameters}}] |
| header x-api-key | header Authorization: Bearer <key> |
| header anthropic-version: 2023-06-01 | (none) |
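
The request-side differences can be made concrete with a small builder. This is a sketch under assumptions: buildKimiRequest, KimiRequestOptions and KimiHttpRequest are hypothetical names, and the model is taken as a required argument here because the audit does not fix the documented default.

```typescript
interface KimiMessage {
  readonly role: 'system' | 'user';
  readonly content: string;
}

// Hypothetical options bag; field names loosely follow CompletionFnOptions.
interface KimiRequestOptions {
  readonly model: string;
  readonly apiKey: string;
  readonly baseUrl: string; // e.g. https://api.moonshot.ai/v1
  readonly maxTokens?: number;
  readonly systemPrompt?: string;
}

interface KimiHttpRequest {
  readonly url: string;
  readonly headers: Record<string, string>;
  readonly body: string;
}

function buildKimiRequest(prompt: string, options: KimiRequestOptions): KimiHttpRequest {
  // The system prompt becomes messages[0] instead of a top-level field.
  const messages: KimiMessage[] = [];
  if (options.systemPrompt !== undefined) {
    messages.push({ role: 'system', content: options.systemPrompt });
  }
  messages.push({ role: 'user', content: prompt });

  // max_tokens is optional on the Kimi side, so it is only sent when provided.
  const payload: Record<string, unknown> = { model: options.model, messages };
  if (options.maxTokens !== undefined) payload['max_tokens'] = options.maxTokens;

  return {
    // The base URL already ends in /v1; strip any trailing slash before appending the path.
    url: `${options.baseUrl.replace(/\/+$/, '')}/chat/completions`,
    headers: {
      'content-type': 'application/json',
      authorization: `Bearer ${options.apiKey}`, // no x-api-key, no anthropic-version
    },
    body: JSON.stringify(payload),
  };
}
```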

4.2 Response shape

| Anthropic | Kimi K2 |
| --- | --- |
| content: [{type:'text',text}] or [{type:'tool_use',name,input,id}] | choices[0].message.content: string (text) and/or choices[0].message.tool_calls: [{id,type:'function',function:{name,arguments}}] |
| stop_reason: 'end_turn'\|'tool_use'\|... | choices[0].finish_reason: 'stop'\|'tool_calls'\|'length'\|... |
| usage.input_tokens | usage.prompt_tokens |
| usage.output_tokens | usage.completion_tokens |
| model: <id> | model: <id> |

4.3 Tool-use mapping (THE hard part)

Request side — AnthropicTool[] → KimiTool[]:

// Input: AnthropicTool { name, description, input_schema }
// Output: KimiTool { type: 'function', function: { name, description, parameters } }
{ type: 'function', function: { name, description, parameters: input_schema } }
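
As a typed sketch of the request-side conversion: the field types below (string name and description, an opaque object for the schema) are assumptions, and toKimiTools is a hypothetical function name.

```typescript
// Assumed field types; the audit only lists the field names.
interface AnthropicTool {
  readonly name: string;
  readonly description: string;
  readonly input_schema: Record<string, unknown>;
}

interface KimiTool {
  readonly type: 'function';
  readonly function: {
    readonly name: string;
    readonly description: string;
    readonly parameters: Record<string, unknown>;
  };
}

// The JSON Schema is carried over untouched; only the envelope changes.
function toKimiTools(tools: readonly AnthropicTool[]): KimiTool[] {
  return tools.map((tool) => ({
    type: 'function',
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.input_schema,
    },
  }));
}
```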

Response side (Kimi tool_calls → content string, Anthropic-shape):

The Claude adapter today returns tool_use responses as JSON.stringify(content) where content is an array of {type:'tool_use', id, name, input} objects. The Kimi adapter must produce the SAME shape so downstream callers (the router, and future tool-orchestration logic) cannot tell adapters apart.

Mapping per tool_call:

// Input: { id, type: 'function', function: { name, arguments: <JSON-string> } }
// Output: { type: 'tool_use', id, name, input: JSON.parse(arguments) }
  • arguments field is a JSON-encoded string per OpenAI spec — must be JSON.parsed into an object for the input field.
  • Unknown tool names: per the ready-to-paste prompt, log via options.logger and skip rather than throw. In practice the adapter passes every tool_call through unchanged; the “skip” handling lives one level up, where the router consumes the parsed tool_calls.
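
The per-call mapping, including the defensive parse noted in the §5 table, might look like the sketch below. toToolUseBlock and toContentString are hypothetical names; the { _parse_error, _raw } fallback shape comes from §5.

```typescript
interface KimiToolCall {
  readonly id: string;
  readonly type: 'function';
  readonly function: { readonly name: string; readonly arguments: string };
}

interface ToolUseBlock {
  readonly type: 'tool_use';
  readonly id: string;
  readonly name: string;
  readonly input: unknown;
}

function toToolUseBlock(call: KimiToolCall): ToolUseBlock {
  let input: unknown;
  try {
    // arguments arrives as a JSON-encoded string and must become an object.
    input = JSON.parse(call.function.arguments);
  } catch (err) {
    // Malformed arguments are surfaced in-band instead of throwing.
    input = {
      _parse_error: err instanceof Error ? err.message : String(err),
      _raw: call.function.arguments,
    };
  }
  return { type: 'tool_use', id: call.id, name: call.function.name, input };
}

// Serialises the blocks the same way the Claude adapter is described as doing.
function toContentString(calls: readonly KimiToolCall[]): string {
  return JSON.stringify(calls.map(toToolUseBlock));
}
```

No tool-name filtering happens here, consistent with the note above that unknown names are handled by the router rather than the adapter.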

4.4 Status code parity

Both Anthropic and Kimi K2 use standard HTTP semantics:

  • 401/403 → terminal (auth error).
  • 429 → retryable.
  • 5xx → retryable.
  • 4xx (other) → terminal.

Retry policy parity: 3 retries, exponential backoff, base delay 100ms. Same constants, same retryable predicate.

4.5 Env namespace

| Provider | Env vars |
| --- | --- |
| Anthropic | ANTHROPIC_API_KEY (vendor canonical name, kept) |
| Kimi | COLIBRI_KIMI_API_KEY (Colibri-namespaced, not a vendor product name like MOONSHOT_API_KEY, because the router treats the underlying provider as opaque) |
| Kimi base URL | COLIBRI_KIMI_BASE_URL (optional; default https://api.moonshot.ai/v1) |

Validation point: call-time (matches Design Invariant 5). Module load must NOT throw on missing key.
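
Call-time validation can be sketched as a resolver that runs inside the completion call, never at import. Assumptions in this sketch: the error code 'KIMI_CONFIG_ERROR' (by analogy with ANTHROPIC_CONFIG_ERROR), an explicit apiKey option taking precedence over the environment, and an empty string counting as absent. resolveKimiConfig is a hypothetical name.

```typescript
class KimiConfigError extends Error {
  // Assumed code, mirroring ANTHROPIC_CONFIG_ERROR.
  readonly code = 'KIMI_CONFIG_ERROR';
  constructor(message: string) {
    super(message);
    this.name = 'KimiConfigError';
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

const DEFAULT_KIMI_BASE_URL = 'https://api.moonshot.ai/v1';

type Env = Readonly<Record<string, string | undefined>>;

interface KimiConfig {
  readonly apiKey: string;
  readonly baseUrl: string;
}

// Called on every completion; nothing here runs at module load.
function resolveKimiConfig(env: Env, apiKeyOverride?: string): KimiConfig {
  const apiKey = apiKeyOverride ?? env['COLIBRI_KIMI_API_KEY'];
  if (apiKey === undefined || apiKey === '') {
    throw new KimiConfigError('COLIBRI_KIMI_API_KEY is not set');
  }
  return {
    apiKey,
    baseUrl: env['COLIBRI_KIMI_BASE_URL'] ?? DEFAULT_KIMI_BASE_URL,
  };
}
```

In the adapter the first argument would typically be process.env; taking it as a parameter keeps the sketch testable without mutating global state.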

5. Tool-use mapping table — Kimi → Anthropic-shape

| Kimi field | Anthropic-shape field | Notes |
| --- | --- | --- |
| choices[0].message.content | content[].text (when type === 'text') | String. Coerce to '' if null/absent. |
| choices[0].message.tool_calls[i].id | content[].id (when type === 'tool_use') | String, used as the tool_use ID. |
| choices[0].message.tool_calls[i].function.name | content[].name (when type === 'tool_use') | Tool name. |
| choices[0].message.tool_calls[i].function.arguments | content[].input | JSON-string in Kimi, object in Anthropic. Must JSON.parse. Defensive: catch parse errors → { _parse_error: <msg>, _raw: <string> }. |
| choices[0].finish_reason: 'stop' | stop_reason: 'end_turn' | Mapped. |
| choices[0].finish_reason: 'tool_calls' | stop_reason: 'tool_use' | Mapped. |
| choices[0].finish_reason: 'length' | stop_reason: 'max_tokens' | Mapped. |
| choices[0].finish_reason: 'content_filter' | stop_reason: 'refusal' | Mapped (best-effort). |
| choices[0].finish_reason: <other> | stop_reason: <other> | Passed through. |
| usage.prompt_tokens | promptTokens | Same semantics. |
| usage.completion_tokens | completionTokens | Same semantics. |
| model | model | Passed through. |

Output content string when tool_calls are present: JSON.stringify(toolUseBlocks) where toolUseBlocks is the array of Anthropic-shape {type:'tool_use', id, name, input} objects — matching exactly what the Claude adapter returns today for tool_use responses (claude.ts:213).
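
The remaining rows of the table (finish_reason, usage, model, and the content choice) can be sketched as one normaliser. Assumptions: toCompletionResult and KimiChatResponse are hypothetical names, missing usage falls back to 0, and the tool_use blocks are passed in already converted so this sketch stays focused on the envelope.

```typescript
interface CompletionResult {
  readonly content: string;
  readonly model: string;
  readonly promptTokens: number;
  readonly completionTokens: number;
  readonly latencyMs: number;
  readonly stopReason: string;
}

// Only the response fields this sketch reads.
interface KimiChatResponse {
  readonly model: string;
  readonly choices: ReadonlyArray<{
    readonly finish_reason: string;
    readonly message: { readonly content?: string | null };
  }>;
  readonly usage?: { readonly prompt_tokens: number; readonly completion_tokens: number };
}

const STOP_REASON_BY_FINISH_REASON = new Map<string, string>([
  ['stop', 'end_turn'],
  ['tool_calls', 'tool_use'],
  ['length', 'max_tokens'],
  ['content_filter', 'refusal'], // best-effort
]);

// Unmapped values pass through unchanged.
function mapFinishReason(finishReason: string): string {
  return STOP_REASON_BY_FINISH_REASON.get(finishReason) ?? finishReason;
}

function toCompletionResult(
  response: KimiChatResponse,
  toolUseBlocks: readonly unknown[],
  latencyMs: number,
): CompletionResult {
  const choice = response.choices[0];
  if (choice === undefined) throw new Error('Kimi response contained no choices');
  return {
    // Tool calls win over text; otherwise null/absent text is coerced to ''.
    content: toolUseBlocks.length > 0 ? JSON.stringify(toolUseBlocks) : (choice.message.content ?? ''),
    model: response.model,
    promptTokens: response.usage?.prompt_tokens ?? 0, // assumed fallback
    completionTokens: response.usage?.completion_tokens ?? 0, // assumed fallback
    latencyMs,
    stopReason: mapFinishReason(choice.finish_reason),
  };
}
```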

6. Files surveyed

  • src/domains/integrations/claude.ts (392 lines) — reference adapter.
  • src/__tests__/domains/integrations/claude.test.ts (lines 1–500) — reference test patterns for makeMockFetch, makeSilentLogger, makeInstantDelay, baseOptions.
  • src/domains/router/fallback.ts (lines 1–356) — CompletionFn, CompletionFnOptions.
  • src/domains/router/scoring.ts (lines 100–137) — ModelId union (already includes 'kimi-k2').
  • src/domains/router/index.ts — barrel; NOT TOUCHED in this slice per parallel-T3 race override.
  • src/config.ts (lines 60–100) — call-time vs. module-load-time pattern.

7. Sibling-race scope override

The staging file (L241, L319-321 of p1.5-delta-router-graduation.md) instructs adding export * from './adapters/kimi.js'; to src/domains/router/index.ts. Two sibling parallel T3s (P1.5.3 Codex + P1.5.4 OpenAI) are editing the same file at the same spot concurrently. The dispatch override (L19-25 of the task prompt) explicitly forbids touching index.ts in this slice. Re-export coordination lands in a fold-in commit between Wave 3 and Wave 4.

Consequence: tests in src/__tests__/domains/router/adapters/kimi.test.ts import via the relative path ../../../../domains/router/adapters/kimi.js, NOT via the barrel.

8. Forbiddens (echoed)

  • No MCP tool registration (P1.5.7 scope).
  • No AMS_* env vars — COLIBRI_KIMI_* only.
  • No edit of src/domains/router/index.ts (parallel-T3 race).
  • No edit of src/domains/router/scoring.ts (P1.5.1 scope).
  • No edit of src/domains/router/fallback.ts (P0.5.2 scope).
  • No edit of src/domains/integrations/claude.ts (P0.9.2 scope).
  • No hardcoded model version — take from options.model with documented default.
  • No fallback chain logic in the adapter (P1.5.5 scope).

9. Exit criteria for audit step

  • Reference adapter surface inventoried (§2).
  • Router CompletionFn contract documented (§3).
  • Kimi K2 API divergences listed (§4).
  • Tool-use mapping table drafted (§5).
  • Sibling-race scope override documented (§7).
  • Forbiddens echoed (§8).

Next: Step 2 (contract).



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
