P1.5.6 — Cost Accounting — Execution Packet

Round: R92 Wave 5 of 7
Branch: feature/p1-5-6-cost
Base: origin/main @ c284ad22
Audit: p1-5-6-cost-audit.md
Contract: p1-5-6-cost-contract.md

1. Files to create / modify

| Path | Operation | Approx. LOC | Notes |
|---|---|---|---|
| src/domains/router/cost.ts | CREATE | ~280 | Module-level state map; computeCostUsd, recordRouterCall, getRouterStats, resetRouterStats, ROUTER_LATENCY_RING_SIZE. |
| src/__tests__/domains/router/cost.test.ts | CREATE | ~400 | ~25 tests covering all I-COST-* and I-AGG-* invariants. |
| src/domains/router/fallback.ts | MODIFY | +50/-10 | Append costUsd + modelsAttempted to RouteResult; wire recordRouterCall in 2 sites; track modelsAttempted across the chain walk; populate costUsd from cost.ts. |
| src/__tests__/domains/router/fallback.test.ts | MODIFY | +120 | New describe block “RouteResult cost + modelsAttempted” with ~6 tests. |
| src/domains/router/index.ts | MODIFY | +1 | Barrel re-export of ./cost.js. |

2. Execution sequence

Step 4.A — Create src/domains/router/cost.ts

The module exports:

  • Constant ROUTER_LATENCY_RING_SIZE = 1000.
  • Type RouterStats (frozen-shape).
  • Type RouterCallRecord (input to recordRouterCall).
  • Function computeCostUsd(modelId, promptTokens, completionTokens, candidatesSnapshot?) → number.
  • Function recordRouterCall(modelId, record) → void.
  • Function getRouterStats() → { models: Readonly<Record<ModelId, RouterStats>> }.
  • Function resetRouterStats(modelId?) → void.

Module-level state:

interface MutableAgg {
  calls_total: number;
  successes: number;
  failures: number;
  total_cost_bps_int: bigint;          // sum across successes; pre-divided by 1000 (kilotokens), still in bps
  latencies: number[];                 // length capped at ROUTER_LATENCY_RING_SIZE
  ringHead: number;                    // next write index, 0-indexed
  ringFilled: boolean;                 // true after first wraparound
}
const aggMap: Map<ModelId, MutableAgg> = new Map();

getOrCreateAgg(modelId) lazily initialises an entry.
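
A minimal sketch of how getOrCreateAgg might lazily initialise an entry, assuming the MutableAgg shape above. The ModelId alias is a stand-in for the real type, and the zeroed initial values are an assumption, not something the packet states.

```typescript
// Stand-in for the real ModelId type (assumption for this sketch).
type ModelId = string;

interface MutableAgg {
  calls_total: number;
  successes: number;
  failures: number;
  total_cost_bps_int: bigint;
  latencies: number[];
  ringHead: number;
  ringFilled: boolean;
}

const aggMap: Map<ModelId, MutableAgg> = new Map();

// Return the existing aggregate for modelId, or create a zeroed one on first use.
function getOrCreateAgg(modelId: ModelId): MutableAgg {
  let agg = aggMap.get(modelId);
  if (agg === undefined) {
    agg = {
      calls_total: 0,
      successes: 0,
      failures: 0,
      total_cost_bps_int: BigInt(0),
      latencies: [],
      ringHead: 0,
      ringFilled: false,
    };
    aggMap.set(modelId, agg);
  }
  return agg;
}
```

Because the entry is created on first use, a model that was never recorded has no key in aggMap at all, which is why getRouterStats only reports models that have seen at least one call.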

Step 4.B — Implement computeCostUsd

export function computeCostUsd(
  modelId: ModelId,
  promptTokens: number,
  completionTokens: number,
  candidatesSnapshot?: ReadonlyArray<ModelCandidate>,
): number {
  if (
    !Number.isFinite(promptTokens) ||
    !Number.isFinite(completionTokens) ||
    promptTokens < 0 ||
    completionTokens < 0
  ) {
    return 0;
  }
  const totalTokens = promptTokens + completionTokens;
  if (totalTokens <= 0) {return 0;}
  if (candidatesSnapshot === undefined) {return 0;}
  const c = candidatesSnapshot.find((row) => row.model_id === modelId);
  if (c === undefined) {return 0;}
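  const bps = c.cost_bps_per_kilotoken;
  if (!Number.isFinite(bps) || bps <= 0) {return 0;}
  // BigInt() throws a RangeError on non-integer numbers, so fractional
  // inputs are treated like the other invalid cases and cost 0.
  if (!Number.isInteger(totalTokens) || !Number.isInteger(bps)) {return 0;}
  // Integer-bps math; single divide at the edge.
  //   bps_int = (tokens * bps_per_kilotoken) / 1000n
  //   usd     = bps_int / 10000
  const bpsInt = (BigInt(totalTokens) * BigInt(bps)) / 1000n;
  return Number(bpsInt) / 10000;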
}
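
The integer-bps arithmetic is easier to check with concrete numbers. The sketch below isolates it in a hypothetical helper, bpsToUsd, which is not part of the module and assumes non-negative integer inputs.

```typescript
// Hypothetical helper that isolates the integer-bps arithmetic from computeCostUsd.
// Inputs are assumed to be non-negative integers.
function bpsToUsd(totalTokens: number, bpsPerKilotoken: number): number {
  const bpsInt = (BigInt(totalTokens) * BigInt(bpsPerKilotoken)) / BigInt(1000);
  return Number(bpsInt) / 10000;
}

// 1,500 tokens at 300 bps per kilotoken: (1500 * 300) / 1000 = 450 bps = 0.045 USD.
const typical = bpsToUsd(1500, 300);

// BigInt division truncates: (1 * 300) / 1000 = 0 bps, so a 1-token call costs 0 USD.
const tiny = bpsToUsd(1, 300);

console.log(typical, tiny);
```

The truncation in the second case is a property of BigInt division, so very small calls can legitimately report a cost of 0 even when the rate is positive.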

Step 4.C — Implement recordRouterCall

export function recordRouterCall(
  modelId: ModelId,
  record: RouterCallRecord,
): void {
  const agg = getOrCreateAgg(modelId);
  agg.calls_total += 1;
  if (record.success) {
    agg.successes += 1;
    if (
      Number.isFinite(record.promptTokens) &&
      Number.isFinite(record.completionTokens) &&
      record.promptTokens >= 0 &&
      record.completionTokens >= 0
    ) {
      const totalTokens = record.promptTokens + record.completionTokens;
      if (totalTokens > 0 && record.candidatesSnapshot !== undefined) {
        const c = record.candidatesSnapshot.find((r) => r.model_id === modelId);
        if (c !== undefined) {
          const bps = c.cost_bps_per_kilotoken;
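          // Number.isInteger keeps BigInt() from throwing on fractional input.
          if (
            Number.isInteger(bps) &&
            bps > 0 &&
            Number.isInteger(totalTokens)
          ) {
            agg.total_cost_bps_int +=
              (BigInt(totalTokens) * BigInt(bps)) / 1000n;
          }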
        }
      }
    }
  } else {
    agg.failures += 1;
  }

  // Ring buffer write (always — both success and failure latencies count).
  const lat =
    Number.isFinite(record.latencyMs) && record.latencyMs >= 0
      ? record.latencyMs
      : 0;
  if (agg.latencies.length < ROUTER_LATENCY_RING_SIZE) {
    agg.latencies.push(lat);
  } else {
    agg.latencies[agg.ringHead] = lat;
  }
  agg.ringHead = (agg.ringHead + 1) % ROUTER_LATENCY_RING_SIZE;
  if (agg.ringHead === 0 && agg.latencies.length === ROUTER_LATENCY_RING_SIZE) {
    agg.ringFilled = true;
  }
}
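
To make the ring-buffer write easier to follow, here is a sketch of the same append-then-overwrite pattern with a hypothetical ring size of 3 (the packet uses ROUTER_LATENCY_RING_SIZE = 1000). The Ring type and ringWrite helper are illustrative names only.

```typescript
// Hypothetical small ring size so the wraparound is easy to see.
const RING_SIZE = 3;

interface Ring {
  latencies: number[];
  ringHead: number;
}

// Same write pattern as recordRouterCall: append until full, then overwrite at ringHead.
function ringWrite(ring: Ring, latencyMs: number): void {
  if (ring.latencies.length < RING_SIZE) {
    ring.latencies.push(latencyMs);
  } else {
    ring.latencies[ring.ringHead] = latencyMs;
  }
  ring.ringHead = (ring.ringHead + 1) % RING_SIZE;
}

const ring: Ring = { latencies: [], ringHead: 0 };
for (const ms of [10, 20, 30, 40]) {
  ringWrite(ring, ms);
}

// The fourth write overwrites slot 0, the oldest sample:
// latencies is now [40, 20, 30] and ringHead is 1.
console.log(ring.latencies, ring.ringHead);
```

After a wraparound the array is no longer in chronological order. That is harmless here because the only consumer, computeP50, sorts a copy before reading it.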

Step 4.D — Implement getRouterStats

export function getRouterStats(): {
  readonly models: Readonly<Record<ModelId, RouterStats>>;
} {
  const out: Partial<Record<ModelId, RouterStats>> = {};
  for (const [modelId, agg] of aggMap) {
    const avg =
      agg.successes === 0
        ? 0
        : Number(agg.total_cost_bps_int) / 10000 / agg.successes;
    const successRate =
      agg.calls_total === 0 ? 0 : agg.successes / agg.calls_total;
    const p50 = computeP50(agg.latencies);
    out[modelId] = Object.freeze({
      calls_total: agg.calls_total,
      successes: agg.successes,
      failures: agg.failures,
      avg_cost_usd: avg,
      p50_latency_ms: p50,
      success_rate: successRate,
    });
  }
  return Object.freeze({
    models: Object.freeze(out as Readonly<Record<ModelId, RouterStats>>),
  });
}
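
A worked instance of the two derived fields, using hypothetical aggregate values chosen so the floating-point results are exact. It repeats the derivations from getRouterStats on a plain object and does not touch aggMap.

```typescript
// Hypothetical aggregate values for one model.
const agg = {
  calls_total: 4,
  successes: 2,
  failures: 2,
  total_cost_bps_int: BigInt(5000), // 5000 bps summed over the two successes
};

// Same derivations as getRouterStats: bps to USD at the edge, then per-success average.
const avgCostUsd =
  agg.successes === 0
    ? 0
    : Number(agg.total_cost_bps_int) / 10000 / agg.successes;
const successRate =
  agg.calls_total === 0 ? 0 : agg.successes / agg.calls_total;

const stats = Object.freeze({
  calls_total: agg.calls_total,
  successes: agg.successes,
  failures: agg.failures,
  avg_cost_usd: avgCostUsd, // 5000 / 10000 / 2 = 0.25
  success_rate: successRate, // 2 / 4 = 0.5
});

console.log(stats.avg_cost_usd, stats.success_rate, Object.isFrozen(stats));
```

Note that the average divides by successes, not calls_total, since failed calls contribute no cost to total_cost_bps_int.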

computeP50(latencies):

function computeP50(latencies: ReadonlyArray<number>): number {
  if (latencies.length === 0) {return 0;}
  const sorted = [...latencies].sort((a, b) => a - b);
  // Lower-of-two median (concept doc gotcha: identical-across-arbiters).
  const idx =
    sorted.length % 2 === 1
      ? Math.floor(sorted.length / 2)
      : sorted.length / 2 - 1;
  return sorted[idx] ?? 0;
}
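
The lower-of-two rule only matters for even-length inputs. The sketch below mirrors computeP50 under a hypothetical name, lowerMedian, so that it runs on its own, and shows the odd, even, and empty cases.

```typescript
// Mirrors the lower-of-two median from computeP50; renamed so the example is standalone.
function lowerMedian(latencies: ReadonlyArray<number>): number {
  if (latencies.length === 0) { return 0; }
  const sorted = latencies.slice().sort((a, b) => a - b);
  const idx =
    sorted.length % 2 === 1
      ? Math.floor(sorted.length / 2)
      : sorted.length / 2 - 1;
  return sorted[idx] ?? 0;
}

const odd = lowerMedian([50, 10, 30]); // sorted [10, 30, 50], middle element is 30
const even = lowerMedian([40, 10, 30, 20]); // sorted [10, 20, 30, 40], lower middle is 20, not 25
const empty = lowerMedian([]); // 0 by convention

console.log(odd, even, empty);
```

Picking an element that actually occurred, instead of averaging the two middle values, keeps the result an observed latency and avoids any floating-point averaging.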

Step 4.E — Implement resetRouterStats

export function resetRouterStats(modelId?: ModelId): void {
  if (modelId === undefined) {
    aggMap.clear();
    return;
  }
  aggMap.delete(modelId);
}

Step 4.F — Modify src/domains/router/fallback.ts

Steps in order:

  1. Import recordRouterCall, computeCostUsd from ./cost.js.
  2. Append costUsd: number + modelsAttempted: ReadonlyArray<ModelId> to RouteResult interface.
  3. Inside routeRequest, initialise const modelsAttempted: ModelId[] = []; adjacent to attempts.
  4. On each chain iteration, push modelId into modelsAttempted immediately after the CB/NoAdapter pre-flight checks pass (i.e. on every iteration that proceeds to the adapter call, whether that call succeeds or fails).
  5. Capture attemptStart before the adapter call.
  6. On success: call recordRouterCall(modelId, {…, success: true, candidatesSnapshot: options.candidatesSnapshot}) BEFORE the return; compute costUsd = computeCostUsd(modelId, upstream.promptTokens, upstream.completionTokens, options.candidatesSnapshot); include in the frozen result.
  7. On failure: compute measuredMs = (nowFn ?? Date.now)() - attemptStart; call recordRouterCall(modelId, {…, success: false, candidatesSnapshot: options.candidatesSnapshot}); push to attempts.
  8. On CircuitOpenError / NoAdapterError short-circuit: do NOT push to modelsAttempted, do NOT call recordRouterCall. The model never reached the adapter — neither attempted nor costed.
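
The ordering in steps 3 to 8 can be sketched as a simplified chain walk. Everything here is hypothetical apart from the names modelsAttempted and recordRouterCall: walkChain is not the real routeRequest, canAttempt stands in for the CircuitOpenError / NoAdapterError pre-flight checks, recordRouterCall is a logging stub, and passing latencyMs on success is an assumption. The costUsd computation and candidatesSnapshot are omitted.

```typescript
type ModelId = string; // stand-in for the real type (assumption)

interface CallRecord {
  success: boolean;
  latencyMs: number;
}

// Stub that stands in for recordRouterCall from cost.ts; it only logs the calls.
const recorded: Array<{ modelId: ModelId; record: CallRecord }> = [];
function recordRouterCall(modelId: ModelId, record: CallRecord): void {
  recorded.push({ modelId, record });
}

interface WalkResult {
  readonly winner: ModelId | undefined;
  readonly modelsAttempted: ReadonlyArray<ModelId>;
}

// Hypothetical, simplified chain walk; callAdapter throws on failure.
function walkChain(
  chain: ReadonlyArray<ModelId>,
  canAttempt: (id: ModelId) => boolean,
  callAdapter: (id: ModelId) => void,
  nowFn: () => number = Date.now,
): WalkResult {
  const modelsAttempted: ModelId[] = [];
  for (const modelId of chain) {
    if (!canAttempt(modelId)) {
      continue; // step 8: skipped models are neither attempted nor recorded
    }
    modelsAttempted.push(modelId); // step 4
    const attemptStart = nowFn(); // step 5
    try {
      callAdapter(modelId);
    } catch {
      // step 7: record the failure, then move on to the next model
      recordRouterCall(modelId, { success: false, latencyMs: nowFn() - attemptStart });
      continue;
    }
    // step 6: record the success before returning the frozen result
    recordRouterCall(modelId, { success: true, latencyMs: nowFn() - attemptStart });
    return Object.freeze({ winner: modelId, modelsAttempted: Object.freeze(modelsAttempted.slice()) });
  }
  return Object.freeze({ winner: undefined, modelsAttempted: Object.freeze(modelsAttempted.slice()) });
}

// Chain of three: "a" is skipped at pre-flight, "b" fails, "c" succeeds.
const result = walkChain(
  ['a', 'b', 'c'],
  (id) => id !== 'a',
  (id) => { if (id === 'b') { throw new Error('upstream failure'); } },
);
console.log(result.winner, result.modelsAttempted, recorded.length);
```

In this run modelsAttempted ends up as ["b", "c"] and two calls are recorded, one failure for "b" and one success for "c". The skipped model "a" appears in neither, which is the behaviour step 8 asks for.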

Step 4.G — Modify src/__tests__/domains/router/fallback.test.ts

Add a new describe block at the end of the file (just before the final }); of the outer describe):

describe('routeRequest — cost + modelsAttempted (P1.5.6)', () => {
  beforeEach(() => { resetRouterStats(); });
  afterEach(() => { resetRouterStats(); });

  test('happy path: RouteResult.costUsd set from candidate snapshot', /* ... */);
  test('happy path: modelsAttempted === [winner] on single-attempt success', /* ... */);
  test('cascade: modelsAttempted lists both attempts in chain order', /* ... */);
  test('cascade: costUsd reflects winner\'s row, not failed attempt\'s row', /* ... */);
  test('circuit-open skip does NOT contribute to modelsAttempted', /* ... */);
  test('RouteResult is frozen including new fields', /* ... */);
});

Step 4.H — Modify src/domains/router/index.ts

export * from './scoring.js';
export * from './fallback.js';
export * from './adapters/codex.js';
export * from './adapters/kimi.js';
export * from './adapters/openai.js';
export * from './cost.js';  // P1.5.6 addition

Step 4.I — Create src/__tests__/domains/router/cost.test.ts

Test file covers I-COST-1..9 and I-AGG-1..11. Skeleton:

import {
  ROUTER_LATENCY_RING_SIZE,
  computeCostUsd,
  recordRouterCall,
  getRouterStats,
  resetRouterStats,
  type RouterStats,
} from '../../../domains/router/cost.js';
import type { ModelCandidate, ModelId } from '../../../domains/router/scoring.js';

const SONNET: ModelCandidate = {
  model_id: 'claude-sonnet-3-5',
  provider: 'anthropic',
  context_window_tokens: 200_000,
  latency_tier: 'balanced',
  cost_bps_per_kilotoken: 300,
  domain_fit_profile: 139,
  enabled: true,
};

const KIMI: ModelCandidate = {
  model_id: 'kimi-k2',
  provider: 'moonshot',
  context_window_tokens: 200_000,
  latency_tier: 'balanced',
  cost_bps_per_kilotoken: 120,
  domain_fit_profile: 73,
  enabled: false,
};

const SNAPSHOT: ReadonlyArray<ModelCandidate> = Object.freeze([SONNET, KIMI]);

beforeEach(() => resetRouterStats());
afterEach(() => resetRouterStats());

// ... ~25 tests

3. Test count delta

Estimated additions:

  • cost.test.ts: ~25 new tests.
  • fallback.test.ts: ~6 new tests in the cost block.
  • No deletions (additive change).

Expected: 3275 → ~3306. Final count reported in verification doc.

4. Risk register (delta vs. audit §12)

No new risks beyond what the audit captured.

5. Out-of-scope reminder

  • router_stats MCP tool → P1.5.7.
  • Cost parity across arbiters → P1.5.8.
  • ζ trail recording → P1.5.10.
  • DB persistence → Phase 2+.
  • fallbackDepth field → not in dispatch.

6. Test invocation

cd .worktrees/claude/p1-5-6-cost
npm run build && npm run lint && npm test

All three commands are gates (per CLAUDE.md §5). If the pre-existing flake (startup — subprocess smoke) hits, rerun until it passes cleanly.

7. Commit sequence (5 chain commits)

| # | Subject | Files |
|---|---|---|
| 1 | audit(p1-5-6-cost): inventory cost-accounting surface | docs/audits/p1-5-6-cost-audit.md |
| 2 | contract(p1-5-6-cost): behavioral contract for cost accounting | docs/contracts/p1-5-6-cost-contract.md |
| 3 | packet(p1-5-6-cost): execution plan | docs/packets/p1-5-6-cost-packet.md |
| 4 | feat(p1-5-6-cost): real cost accounting on RouteResult (no stubs) | src/domains/router/cost.ts + src/__tests__/domains/router/cost.test.ts + src/domains/router/fallback.ts + src/__tests__/domains/router/fallback.test.ts + src/domains/router/index.ts |
| 5 | verify(p1-5-6-cost): test evidence | docs/verification/p1-5-6-cost-verification.md |


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
