P1.5.6 — Cost Accounting — Execution Packet
Round: R92 Wave 5 of 7
Branch: feature/p1-5-6-cost
Base: origin/main @ c284ad22
Audit: p1-5-6-cost-audit.md
Contract: p1-5-6-cost-contract.md
1. Files to create / modify
| Path | Operation | Approx. LOC | Notes |
|---|---|---|---|
| src/domains/router/cost.ts | CREATE | ~280 | Module-level state map; computeCostUsd, recordRouterCall, getRouterStats, resetRouterStats, ROUTER_LATENCY_RING_SIZE. |
| src/__tests__/domains/router/cost.test.ts | CREATE | ~400 | ~25 tests covering all I-COST-* and I-AGG-* invariants. |
| src/domains/router/fallback.ts | MODIFY | +50/-10 | Append costUsd + modelsAttempted to RouteResult; wire recordRouterCall in 2 sites; track modelsAttempted across walk; populate from cost.ts. |
| src/__tests__/domains/router/fallback.test.ts | MODIFY | +120 | New describe block “RouteResult cost + modelsAttempted” with ~6 tests. |
| src/domains/router/index.ts | MODIFY | +1 | Barrel re-export of ./cost.js. |
2. Execution sequence
Step 4.A — Create src/domains/router/cost.ts
The module exports:
- Constant ROUTER_LATENCY_RING_SIZE = 1000.
- Type RouterStats (frozen-shape).
- Type RouterCallRecord (input to recordRouterCall).
- Function computeCostUsd(modelId, p, c, snap?) → number.
- Function recordRouterCall(modelId, record) → void.
- Function getRouterStats() → { models: Readonly<Record<ModelId, RouterStats>> }.
- Function resetRouterStats(modelId?) → void.
Module-level state:
interface MutableAgg {
  calls_total: number;
  successes: number;
  failures: number;
  total_cost_bps_int: bigint; // sum across successes; pre-divided by 1000 (kilotokens), still in bps
  latencies: number[]; // length capped at ROUTER_LATENCY_RING_SIZE
  ringHead: number; // next write index, 0-indexed
  ringFilled: boolean; // true after first wraparound
}
const aggMap: Map<ModelId, MutableAgg> = new Map();
getOrCreateAgg(modelId) lazily initialises an entry.
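The packet names getOrCreateAgg but does not show it. A minimal sketch of the lazy initialisation follows, assuming ModelId is a plain string alias (the real type comes from scoring.ts) and that a fresh entry starts with zeroed counters and an empty ring:

```typescript
// Assumption: ModelId is a plain string alias here; the real type lives in scoring.ts.
type ModelId = string;

interface MutableAgg {
  calls_total: number;
  successes: number;
  failures: number;
  total_cost_bps_int: bigint;
  latencies: number[];
  ringHead: number;
  ringFilled: boolean;
}

const aggMap: Map<ModelId, MutableAgg> = new Map();

// Returns the aggregate for modelId, creating a zeroed entry on first use.
function getOrCreateAgg(modelId: ModelId): MutableAgg {
  let agg = aggMap.get(modelId);
  if (agg === undefined) {
    agg = {
      calls_total: 0,
      successes: 0,
      failures: 0,
      total_cost_bps_int: 0n,
      latencies: [],
      ringHead: 0,
      ringFilled: false,
    };
    aggMap.set(modelId, agg);
  }
  return agg;
}
```

Because the entry is created on first touch, a model that was never recorded does not appear in aggMap at all, which keeps getRouterStats output limited to models that actually saw traffic.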
Step 4.B — Implement computeCostUsd
export function computeCostUsd(
  modelId: ModelId,
  promptTokens: number,
  completionTokens: number,
  candidatesSnapshot?: ReadonlyArray<ModelCandidate>,
): number {
  // Token counts must be non-negative integers: BigInt() throws a RangeError
  // on fractional input, and isSafeInteger also rejects NaN and ±Infinity.
  if (
    !Number.isSafeInteger(promptTokens) ||
    !Number.isSafeInteger(completionTokens) ||
    promptTokens < 0 ||
    completionTokens < 0
  ) {
    return 0;
  }
  const totalTokens = promptTokens + completionTokens;
  if (totalTokens <= 0) {return 0;}
  if (candidatesSnapshot === undefined) {return 0;}
  const c = candidatesSnapshot.find((row) => row.model_id === modelId);
  if (c === undefined) {return 0;}
  const bps = c.cost_bps_per_kilotoken;
  // Same integer requirement as the token counts, for the same BigInt() reason.
  if (!Number.isSafeInteger(bps) || bps <= 0) {return 0;}
  // Integer-bps math; the only floating-point divide is the final bps → USD step.
  //   bps_int = (tokens * bps_per_kilotoken) / 1000n   (truncating BigInt division)
  //   usd     = bps_int / 10000
  const bpsInt = (BigInt(totalTokens) * BigInt(bps)) / 1000n;
  return Number(bpsInt) / 10000;
}
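To make the integer-bps arithmetic concrete, here is a small worked sketch. The helper name costUsdFromBps is hypothetical: it takes the per-kilotoken rate directly instead of looking it up in a candidates snapshot, and it assumes non-negative integer inputs.

```typescript
// Hypothetical helper: same arithmetic as computeCostUsd, minus the snapshot lookup.
// Assumes tokens and bpsPerKilotoken are non-negative integers.
function costUsdFromBps(tokens: number, bpsPerKilotoken: number): number {
  // BigInt division truncates, so any sub-bps remainder is dropped.
  const bpsInt = (BigInt(tokens) * BigInt(bpsPerKilotoken)) / 1000n;
  // The packet converts with usd = bps_int / 10000.
  return Number(bpsInt) / 10000;
}

// 1,500 tokens at 300 bps per kilotoken: 450,000 / 1000 = 450 bps = 0.045 USD.
const typical = costUsdFromBps(1500, 300);

// 3 tokens at 300 bps per kilotoken: 900 / 1000 truncates to 0 bps = 0 USD.
const tiny = costUsdFromBps(3, 300);

console.log(typical, tiny);
```

The second case shows a consequence of truncating per call: very small calls cost exactly zero, so an aggregate built from per-call values can slightly undercount compared with dividing the summed token total once.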
Step 4.C — Implement recordRouterCall
export function recordRouterCall(
  modelId: ModelId,
  record: RouterCallRecord,
): void {
  const agg = getOrCreateAgg(modelId);
  agg.calls_total += 1;
  if (record.success) {
    agg.successes += 1;
    // Cost is accumulated only for non-negative integer token counts;
    // BigInt() would throw a RangeError on fractional input.
    if (
      Number.isSafeInteger(record.promptTokens) &&
      Number.isSafeInteger(record.completionTokens) &&
      record.promptTokens >= 0 &&
      record.completionTokens >= 0
    ) {
      const totalTokens = record.promptTokens + record.completionTokens;
      if (totalTokens > 0 && record.candidatesSnapshot !== undefined) {
        const c = record.candidatesSnapshot.find((r) => r.model_id === modelId);
        if (c !== undefined) {
          const bps = c.cost_bps_per_kilotoken;
          if (Number.isSafeInteger(bps) && bps > 0) {
            agg.total_cost_bps_int +=
              (BigInt(totalTokens) * BigInt(bps)) / 1000n;
          }
        }
      }
    }
  } else {
    agg.failures += 1;
  }
  // Ring buffer write (always: both success and failure latencies count).
  // Invalid latencies are recorded as 0 rather than skipped.
  const lat =
    Number.isFinite(record.latencyMs) && record.latencyMs >= 0
      ? record.latencyMs
      : 0;
  if (agg.latencies.length < ROUTER_LATENCY_RING_SIZE) {
    agg.latencies.push(lat);
  } else {
    agg.latencies[agg.ringHead] = lat;
  }
  agg.ringHead = (agg.ringHead + 1) % ROUTER_LATENCY_RING_SIZE;
  if (agg.ringHead === 0 && agg.latencies.length === ROUTER_LATENCY_RING_SIZE) {
    agg.ringFilled = true;
  }
}
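The ring-buffer write at the end of recordRouterCall is easier to follow with a small capacity. The sketch below isolates that write path; RING_SIZE, Ring, and pushLatency are illustrative names, and the capacity of 3 is an assumption chosen only so the wraparound is visible (the packet uses 1000).

```typescript
// Assumption: a tiny ring size so the wraparound is visible; the packet uses 1000.
const RING_SIZE = 3;

interface Ring {
  latencies: number[];
  ringHead: number; // next write index
}

// Mirrors the write path above: append until full, then overwrite the oldest slot.
function pushLatency(ring: Ring, lat: number): void {
  if (ring.latencies.length < RING_SIZE) {
    ring.latencies.push(lat);
  } else {
    ring.latencies[ring.ringHead] = lat;
  }
  ring.ringHead = (ring.ringHead + 1) % RING_SIZE;
}

const ring: Ring = { latencies: [], ringHead: 0 };
for (const lat of [10, 20, 30, 40]) {
  pushLatency(ring, lat);
}

// After four writes the oldest sample (10) has been overwritten by 40.
console.log(ring.latencies, ring.ringHead);
```

Slot order inside the array no longer matches arrival order once the ring wraps, which is harmless here because the p50 computation sorts a copy before reading from it.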
Step 4.D — Implement getRouterStats
export function getRouterStats(): {
  readonly models: Readonly<Record<ModelId, RouterStats>>;
} {
  const out: Partial<Record<ModelId, RouterStats>> = {};
  for (const [modelId, agg] of aggMap) {
    const avg =
      agg.successes === 0
        ? 0
        : Number(agg.total_cost_bps_int) / 10000 / agg.successes;
    const successRate =
      agg.calls_total === 0 ? 0 : agg.successes / agg.calls_total;
    const p50 = computeP50(agg.latencies);
    out[modelId] = Object.freeze({
      calls_total: agg.calls_total,
      successes: agg.successes,
      failures: agg.failures,
      avg_cost_usd: avg,
      p50_latency_ms: p50,
      success_rate: successRate,
    });
  }
  return Object.freeze({
    models: Object.freeze(out as Readonly<Record<ModelId, RouterStats>>),
  });
}
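As a worked illustration of the two derived fields, the sketch below applies the same formulas to one hypothetical aggregate. The numbers are made up for the example and are not taken from the test suite.

```typescript
// Hypothetical aggregate values for one model: 3 calls, 2 successes,
// and 900 bps of accumulated cost across those successes.
const agg = { calls_total: 3, successes: 2, failures: 1, total_cost_bps_int: 900n };

// Average cost is taken over successes only, converting bps to USD first.
const avgCostUsd =
  agg.successes === 0 ? 0 : Number(agg.total_cost_bps_int) / 10000 / agg.successes;

// Success rate is taken over all recorded calls.
const successRate = agg.calls_total === 0 ? 0 : agg.successes / agg.calls_total;

// Object.freeze is shallow, but every field here is a primitive,
// so the snapshot is effectively immutable.
const snapshot = Object.freeze({ avg_cost_usd: avgCostUsd, success_rate: successRate });

console.log(snapshot); // avg_cost_usd ≈ 0.045, success_rate ≈ 0.667
```

Failed calls lower success_rate but do not dilute avg_cost_usd, since the denominator for cost is successes rather than calls_total.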
computeP50(latencies):
function computeP50(latencies: ReadonlyArray<number>): number {
  if (latencies.length === 0) {return 0;}
  const sorted = [...latencies].sort((a, b) => a - b);
  // Lower-of-two median (concept doc gotcha: identical-across-arbiters).
  const idx =
    sorted.length % 2 === 1
      ? Math.floor(sorted.length / 2)
      : sorted.length / 2 - 1;
  return sorted[idx] ?? 0;
}
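A short sketch of what "lower-of-two median" means in practice. The function is restated under the illustrative name lowerMedian so the block runs on its own; the sample values are arbitrary.

```typescript
// Standalone restatement of the lower-of-two median, under an illustrative name.
function lowerMedian(values: ReadonlyArray<number>): number {
  if (values.length === 0) { return 0; }
  // Sort a copy numerically; the default sort would compare as strings.
  const sorted = [...values].sort((a, b) => a - b);
  const idx =
    sorted.length % 2 === 1
      ? Math.floor(sorted.length / 2)
      : sorted.length / 2 - 1;
  return sorted[idx] ?? 0;
}

const odd = lowerMedian([30, 10, 20]);      // middle element: 20
const even = lowerMedian([40, 10, 30, 20]); // lower of the two middle elements: 20, not 25
const empty = lowerMedian([]);              // no samples: 0

console.log(odd, even, empty);
```

Picking the lower middle element instead of averaging the two means the reported p50 is always a latency that was actually observed, and it avoids any floating-point averaging step.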
Step 4.E — Implement resetRouterStats
export function resetRouterStats(modelId?: ModelId): void {
  if (modelId === undefined) {
    aggMap.clear();
    return;
  }
  aggMap.delete(modelId);
}
Step 4.F — Modify src/domains/router/fallback.ts
Steps in order:
- Import recordRouterCall, computeCostUsd from ./cost.js.
- Append costUsd: number + modelsAttempted: ReadonlyArray<ModelId> to the RouteResult interface.
- Inside routeRequest, initialise const modelsAttempted: ModelId[] = []; adjacent to attempts.
- On each chain iteration, push the modelId into modelsAttempted immediately after the CB/NoAdapter pre-flight checks pass (i.e. on every iteration that proceeds to the adapter call, success or failure).
- Capture attemptStart before the adapter call.
- On success: call recordRouterCall(modelId, {…, success: true, candidatesSnapshot: options.candidatesSnapshot}) BEFORE the return; compute costUsd = computeCostUsd(modelId, upstream.promptTokens, upstream.completionTokens, options.candidatesSnapshot); include it in the frozen result.
- On failure: compute measuredMs = (nowFn ?? Date.now)() - attemptStart; call recordRouterCall(modelId, {…, success: false, candidatesSnapshot: options.candidatesSnapshot}); push to attempts.
- On CircuitOpenError/NoAdapterError short-circuit: do NOT push to modelsAttempted, do NOT call recordRouterCall. The model never reached the adapter, so it is neither attempted nor costed.
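The bookkeeping rules above can be sketched as a simplified, synchronous chain walk. Everything here except the modelsAttempted name is a hypothetical stand-in: the real routeRequest is not shown in this packet, the preflight field replaces the CircuitOpenError/NoAdapterError checks, recordRouterCallStub replaces recordRouterCall, and returning an undefined winner on exhaustion is an assumption made only to keep the example short.

```typescript
// Assumption: ModelId is a plain string alias for the purposes of this sketch.
type ModelId = string;

type Preflight = 'ok' | 'circuit-open' | 'no-adapter';

interface ChainEntry {
  modelId: ModelId;
  preflight: Preflight; // stands in for the CircuitOpenError / NoAdapterError checks
  call: () => string;   // stands in for the adapter call; throws on failure
}

interface CallLog { modelId: ModelId; success: boolean; }

// Stand-in for recordRouterCall: just logs what would have been recorded.
const recorded: CallLog[] = [];
function recordRouterCallStub(modelId: ModelId, success: boolean): void {
  recorded.push({ modelId, success });
}

function walkChain(chain: ReadonlyArray<ChainEntry>): {
  winner: ModelId | undefined;
  modelsAttempted: ReadonlyArray<ModelId>;
} {
  const modelsAttempted: ModelId[] = [];
  for (const entry of chain) {
    // Pre-flight bailout: neither attempted nor recorded.
    if (entry.preflight !== 'ok') { continue; }
    // The model will reach its adapter, so it counts as attempted.
    modelsAttempted.push(entry.modelId);
    try {
      entry.call();
      recordRouterCallStub(entry.modelId, true); // recorded before returning
      return { winner: entry.modelId, modelsAttempted: Object.freeze(modelsAttempted) };
    } catch {
      recordRouterCallStub(entry.modelId, false);
    }
  }
  return { winner: undefined, modelsAttempted: Object.freeze(modelsAttempted) };
}

const result = walkChain([
  { modelId: 'a', preflight: 'circuit-open', call: () => 'unused' },
  { modelId: 'b', preflight: 'ok', call: () => { throw new Error('upstream failure'); } },
  { modelId: 'c', preflight: 'ok', call: () => 'ok' },
]);

console.log(result.winner, result.modelsAttempted, recorded);
```

In this run model a is skipped entirely, b is attempted and recorded as a failure, and c is attempted and recorded as the success, so modelsAttempted lists b then c in chain order.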
Step 4.G — Modify src/__tests__/domains/router/fallback.test.ts
Add a new describe block at the end of the file (just before the final }); of the outer describe):
describe('routeRequest — cost + modelsAttempted (P1.5.6)', () => {
  beforeEach(() => { resetRouterStats(); });
  afterEach(() => { resetRouterStats(); });
  test('happy path: RouteResult.costUsd set from candidate snapshot', /* ... */);
  test('happy path: modelsAttempted === [winner] on single-attempt success', /* ... */);
  test('cascade: modelsAttempted lists both attempts in chain order', /* ... */);
  test('cascade: costUsd reflects winner\'s row, not failed attempt\'s row', /* ... */);
  test('circuit-open skip does NOT contribute to modelsAttempted', /* ... */);
  test('RouteResult is frozen including new fields', /* ... */);
});
Step 4.H — Modify src/domains/router/index.ts
export * from './scoring.js';
export * from './fallback.js';
export * from './adapters/codex.js';
export * from './adapters/kimi.js';
export * from './adapters/openai.js';
export * from './cost.js'; // P1.5.6 addition
Step 4.I — Create src/__tests__/domains/router/cost.test.ts
Test file covers I-COST-1..9 and I-AGG-1..11. Skeleton:
import {
  ROUTER_LATENCY_RING_SIZE,
  computeCostUsd,
  recordRouterCall,
  getRouterStats,
  resetRouterStats,
  type RouterStats,
} from '../../../domains/router/cost.js';
import type { ModelCandidate, ModelId } from '../../../domains/router/scoring.js';

const SONNET: ModelCandidate = {
  model_id: 'claude-sonnet-3-5',
  provider: 'anthropic',
  context_window_tokens: 200_000,
  latency_tier: 'balanced',
  cost_bps_per_kilotoken: 300,
  domain_fit_profile: 139,
  enabled: true,
};

const KIMI: ModelCandidate = {
  model_id: 'kimi-k2',
  provider: 'moonshot',
  context_window_tokens: 200_000,
  latency_tier: 'balanced',
  cost_bps_per_kilotoken: 120,
  domain_fit_profile: 73,
  enabled: false,
};

const SNAPSHOT: ReadonlyArray<ModelCandidate> = Object.freeze([SONNET, KIMI]);

beforeEach(() => resetRouterStats());
afterEach(() => resetRouterStats());
// ... ~25 tests
3. Test count delta
Estimated additions:
- cost.test.ts: ~25 new tests.
- fallback.test.ts: ~6 new tests in the cost block.
- No deletions (additive change).
Expected: 3275 → ~3306. Final count reported in verification doc.
4. Risk register (delta vs. audit §12)
No new risks beyond what the audit captured.
5. Out-of-scope reminder
- router_stats MCP tool → P1.5.7.
- Cost parity across arbiters → P1.5.8.
- ζ trail recording → P1.5.10.
- DB persistence → Phase 2+.
- fallbackDepth field → not in dispatch.
6. Test invocation
cd .worktrees/claude/p1-5-6-cost
npm run build && npm run lint && npm test
All three are gates (CLAUDE.md §5). If the pre-existing flake (startup — subprocess smoke) hits, rerun; it passes clean on retry.
7. Commit sequence (5 chain commits)
| # | Subject | Files |
|---|---|---|
| 1 | audit(p1-5-6-cost): inventory cost-accounting surface | docs/audits/p1-5-6-cost-audit.md |
| 2 | contract(p1-5-6-cost): behavioral contract for cost accounting | docs/contracts/p1-5-6-cost-contract.md |
| 3 | packet(p1-5-6-cost): execution plan | docs/packets/p1-5-6-cost-packet.md |
| 4 | feat(p1-5-6-cost): real cost accounting on RouteResult (no stubs) | src/domains/router/cost.ts + src/__tests__/domains/router/cost.test.ts + src/domains/router/fallback.ts + src/__tests__/domains/router/fallback.test.ts + src/domains/router/index.ts |
| 5 | verify(p1-5-6-cost): test evidence | docs/verification/p1-5-6-cost-verification.md |