R76.P2 Verification — δ Phase 1.5 Graduation Plan

Step 5 of the 5-step chain. Evidence that all contract invariants hold.

1. Invariant-by-invariant evidence

A1 — Sub-task count = 10

$ grep -c "^## P1\.5\." docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md
10

Pass. P1.5.1 through P1.5.10 present.

A2 — Every sub-task has all 6 required sections

Manual review of each of the 10 sub-task blocks confirms every block has:

Header with spec source + ADR anchor + worktree + branch command + effort + depends-on + unblocks.
Files to create or Files to modify list.
Acceptance criteria (checkbox list of 6+ items).
Pre-flight reading list.
Ready-to-paste agent prompt inside a text fence.
Verification checklist (for reviewer agent).
Writeback template (YAML).
Common gotchas.

Pass.

A3 — Sub-task scopes match R76.P2 brief 1:1

| Sub-task | Scope in brief | Scope in prompts file | Match |
|----------|----------------|-----------------------|-------|
| P1.5.1 | Real intent scoring (7-dim formula) | Real 7-dim intent scoring | ✓ |
| P1.5.2 | Adapter: Kimi K2 | Adapter: Kimi K2 | ✓ |
| P1.5.3 | Adapter: Codex | Adapter: Codex | ✓ |
| P1.5.4 | Adapter: OpenAI (GPT-4o family) | Adapter: OpenAI (GPT-4o family) | ✓ |
| P1.5.5 | N-member fallback + circuit breaker | N-member Fallback Chain + Circuit Breaker | ✓ |
| P1.5.6 | Cost accounting | Cost Accounting | ✓ |
| P1.5.7 | router_* MCP tools | router_* MCP Tools (4 tools) | ✓ |
| P1.5.8 | Cross-model parity test suite | Cross-Model Parity Test Suite | ✓ |
| P1.5.9 | Model candidates table | Model Candidates Table Population | ✓ |
| P1.5.10 | ζ decision-trail integration | ζ Decision-Trail Integration | ✓ |

Pass.

A4 — Phase 0 interface freeze respected

Every sub-task block enumerates its Forbiddens and repeats the “signature frozen” rule. P1.5.1 (scoring), P1.5.5 (fallback), P1.5.6 (cost) are the three sub-tasks that touch existing files; each one explicitly lists the exports whose signatures must not change:

scoreIntent, ModelId, ScoreContext, IntentScore.
routeRequest, FallbackChainExhaustedError, RouteOptions, RouteResult, FallbackAttempt, ROUTER_PHASE_0_SHAPE, CompletionFn, ScoringFn.

ModelId widens additively (P1.5.1); ROUTER_PHASE_0_SHAPE literals flip (P1.5.5 — explicit signal); RouteResult gains appended fields only (P1.5.6). No signature is rewritten.

Pass.

A5 — `COLIBRI_*` namespace rigour

$ grep -n "AMS_MODEL\|AMS_KIMI\|AMS_CODEX\|AMS_OPENAI" docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md
827:- **`COLIBRI_MODEL_TIMEOUT`, not `AMS_MODEL_TIMEOUT`.** Donor namespace is forbidden.

The one hit is an intentional negative reference in P1.5.5’s Common Gotchas (“use COLIBRI_MODEL_TIMEOUT, not AMS_MODEL_TIMEOUT”) — teaching the executor that the donor namespace is forbidden. No live AMS_* var is referenced as a valid env var anywhere in the file. Prompts use:

COLIBRI_KIMI_API_KEY, COLIBRI_KIMI_BASE_URL.
COLIBRI_CODEX_API_KEY, COLIBRI_CODEX_BASE_URL.
COLIBRI_OPENAI_API_KEY, COLIBRI_OPENAI_BASE_URL.
COLIBRI_MODEL_TIMEOUT.

Pass.

A6 — 7-dimension scoring table matches concept doc

P1.5.1 embeds the exact 7-row dimension table with the same column order (Dimension, Formula, Default weight) and the same 7 dimensions named:

task_domain_match (0.20) · context_window_fit (0.15) · cost_efficiency (0.15) · latency_fit (0.15) · reliability (0.15) · skill_match (0.15) · operator_preference (0.05).

Weights sum to 1.0. Tie-break order (reliability, cost, alphabetical) reproduced verbatim. Worked example (Sonnet 0.87, GPT-4o 0.79, Haiku 0.58) reproduced as the golden-path test vector in P1.5.1 acceptance.
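
The weight sum and tie-break order above can be spot-checked with a short sketch. This is illustrative only, not the P1.5.1 implementation: the `Candidate` shape, the `[0, 1]` score range, the epsilon tolerance, and the reading of the "cost" tie-break as "higher `cost_efficiency` wins" are all assumptions; only the dimension names and default weights come from the table.

```typescript
type Dimension =
  | "task_domain_match" | "context_window_fit" | "cost_efficiency" | "latency_fit"
  | "reliability" | "skill_match" | "operator_preference";

// Default weights as listed in the P1.5.1 dimension table.
const WEIGHTS: Record<Dimension, number> = {
  task_domain_match: 0.20,
  context_window_fit: 0.15,
  cost_efficiency: 0.15,
  latency_fit: 0.15,
  reliability: 0.15,
  skill_match: 0.15,
  operator_preference: 0.05,
};
const DIMENSIONS = Object.keys(WEIGHTS) as Dimension[];
const EPSILON = 1e-9; // tolerance for floating-point comparisons (assumed)

// Hypothetical candidate shape: one score per dimension, assumed to lie in [0, 1].
interface Candidate {
  modelId: string;
  scores: Record<Dimension, number>;
}

function weightSum(): number {
  return DIMENSIONS.reduce((acc, d) => acc + WEIGHTS[d], 0);
}

function totalScore(c: Candidate): number {
  return DIMENSIONS.reduce((acc, d) => acc + WEIGHTS[d] * c.scores[d], 0);
}

// Highest total first; ties fall through reliability, then cost, then model id.
// Assumption: the "cost" tie-break prefers the higher cost_efficiency score.
function compareCandidates(a: Candidate, b: Candidate): number {
  const byTotal = totalScore(b) - totalScore(a);
  if (Math.abs(byTotal) > EPSILON) return byTotal;
  const byReliability = b.scores.reliability - a.scores.reliability;
  if (Math.abs(byReliability) > EPSILON) return byReliability;
  const byCost = b.scores.cost_efficiency - a.scores.cost_efficiency;
  if (Math.abs(byCost) > EPSILON) return byCost;
  return a.modelId < b.modelId ? -1 : a.modelId > b.modelId ? 1 : 0;
}

function rank(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort(compareCandidates);
}
```

Comparing totals through a tolerance rather than with `===` keeps two candidates whose weighted sums differ only by rounding error on the tie-break path, which is where the reliability, cost, alphabetical order matters.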

Pass.

A7 — 8-model candidate cohort matches concept doc

P1.5.9 enumerates all 8 candidates from the concept doc §Phase 1.5 candidate cohort: Claude 3.5 Sonnet, Claude 3.5 Haiku, GPT-4o, GPT-4o mini, Gemini 1.5 Pro, Llama 3.3 70B, Mixtral 8x22B, Kimi K2. The audit doc §4 carries the full indicative-cost/latency table; the prompts file references it at the candidate-seed acceptance level.

Pass.

A8 — ζ decision-trail shape matches concept doc

P1.5.10 embeds the exact JSON shape from the concept doc §Decision-trail recording, all 8 fields verbatim:

type, routing_mode, chosen_model_id, candidates_considered, scores,
fallback_attempts, rule_version_hash, decision_hash

decision_hash = SHA-256(inputs || chosen) rule preserved; routing_mode enum preserved (single | ensemble | pipeline | fail).
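
As a rough illustration of the record shape and hash rule above, the sketch below builds a decision record with Node's built-in `crypto` module. The field names and the `routing_mode` enum come from the list above; everything else is an assumption made for the example: the field types, the idea that `inputs` arrives as an already-serialised string, the reading of `chosen` as `chosen_model_id`, and hex output for the digest.

```typescript
import { createHash } from "node:crypto";

type RoutingMode = "single" | "ensemble" | "pipeline" | "fail";

// The 8 fields named above; the TypeScript types are assumed, not taken from the concept doc.
interface RoutingDecision {
  type: string;
  routing_mode: RoutingMode;
  chosen_model_id: string;
  candidates_considered: string[];
  scores: Record<string, number>;
  fallback_attempts: unknown[];
  rule_version_hash: string;
  decision_hash: string;
}

// SHA-256 over the byte concatenation inputs || chosen, rendered as lowercase hex.
function decisionHash(inputs: string, chosen: string): string {
  return createHash("sha256")
    .update(inputs, "utf8")
    .update(chosen, "utf8")
    .digest("hex");
}

// Hypothetical helper: fills in decision_hash from the other fields.
function buildDecision(
  fields: Omit<RoutingDecision, "decision_hash">,
  inputs: string,
): RoutingDecision {
  return { ...fields, decision_hash: decisionHash(inputs, fields.chosen_model_id) };
}
```

Because the two parts are concatenated without a separator, different `(inputs, chosen)` pairs can produce the same byte string, so a real implementation would likely want a canonical, unambiguous serialisation of `inputs`; the concept doc is the authority on that detail.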

Pass.

A9 — 4 `router_*` tools in P1.5.7

$ grep -o "router_score\|router_call\|router_fallback\|router_stats" \
   docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md \
   | sort | uniq -c
     11 router_call
     10 router_fallback
      8 router_score
     16 router_stats

All 4 tools present; each referenced multiple times (in group summary, P1.5.7 block, and cross-references from P1.5.8 parity suite + P1.5.10 ζ emission).

Pass.

A10 — Circuit-breaker rules exact

P1.5.5 block states:

Trip: 3 consecutive failures on the same model_id.
Window: 60s cooldown.
Reset: time-bound, per-model, not success-bound (explicit Common Gotcha warns against reading an older “resets on success” reference).
Per-attempt timeout: 30s default via COLIBRI_MODEL_TIMEOUT.
In-memory state (not DB-backed in this round).
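
The rules above can be sketched as a small in-memory breaker. This is a minimal illustration under stated assumptions, not the P1.5.5 code: the class and method names are hypothetical, the clock is injected so the cooldown can be exercised without waiting, and the handling of a success while the breaker is still closed (it ends the failure streak, since the trip rule counts consecutive failures) is this sketch's reading rather than something the prompts file is quoted as saying. The one behaviour taken directly from the list is that an open breaker closes only when the 60s window elapses, never because a call succeeded.

```typescript
const TRIP_THRESHOLD = 3;      // consecutive failures on the same model_id
const COOLDOWN_MS = 60 * 1000; // 60s cooldown window

interface BreakerState {
  consecutiveFailures: number;
  openedAt: number | null; // timestamp of the trip, or null while closed
}

class CircuitBreaker {
  private readonly states = new Map<string, BreakerState>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Per-model state, created lazily; an elapsed cooldown is cleared on access.
  private stateFor(modelId: string): BreakerState {
    let s = this.states.get(modelId);
    if (!s) {
      s = { consecutiveFailures: 0, openedAt: null };
      this.states.set(modelId, s);
    }
    if (s.openedAt !== null && this.now() - s.openedAt >= COOLDOWN_MS) {
      s.openedAt = null; // time-bound reset
      s.consecutiveFailures = 0;
    }
    return s;
  }

  isOpen(modelId: string): boolean {
    return this.stateFor(modelId).openedAt !== null;
  }

  recordFailure(modelId: string): void {
    const s = this.stateFor(modelId);
    s.consecutiveFailures += 1;
    if (s.openedAt === null && s.consecutiveFailures >= TRIP_THRESHOLD) {
      s.openedAt = this.now();
    }
  }

  recordSuccess(modelId: string): void {
    const s = this.stateFor(modelId);
    // Assumption: a success ends a streak only while closed; it never closes an open breaker.
    if (s.openedAt === null) s.consecutiveFailures = 0;
  }
}
```

A caller would check `isOpen(modelId)` before each attempt and skip to the next chain member while it returns `true`; because state lives in a `Map`, it is lost on process restart, consistent with the in-memory rule.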

Pass.

A11 — `index.md` updated with Phase 1.5 section

docs/guides/implementation/task-prompts/index.md now has:

## Phase 1.5 — δ Multi-Model Router (R91+, planned)

Once Phase 1 κ Rule Engine ships and the four ADR-005 §Implementation
trigger conditions hold, the prompts below graduate δ from Phase 0 library
stubs to multi-model routing. ...

| Group | File | Sub-tasks |
|-------|------|-----------|
| P1.5 — δ Router Graduation | [p1.5-delta-router-graduation.md](...) | P1.5.1 – P1.5.10 |

The 28-task Phase 0 table is untouched.

Pass.

A12 — ADR-005 Wave I postscript updated

Appended one line before the closing signature line:

**Phase 1.5 graduation prompts:** see
[`docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md`](...)
for the 10-sub-task plan (P1.5.1–P1.5.10) activating at R91+.

No other ADR content changed.

Pass.

A13 — No `src/` changes

$ git status --porcelain | grep -E "^\s?[MA]\s+src/"
(no output)

Untracked / modified files are all under docs/. Zero src/ touches.

Pass.

A14 — Test gates

$ npm run build
> tsc
(clean; no output)

$ npm run lint
> eslint src
(clean; no output)

$ npm test
Test Suites: 1 failed, 25 passed, 26 total
Tests:       1 failed, 1084 passed, 1085 total
Time:        42.714 s

The one failure is the pre-existing startup.test.ts subprocess smoke flake under full-suite load — confirmed non-regression by isolation run:

$ npm test -- --testPathPattern="startup.test.ts"
Test Suites: 1 passed, 1 total
Tests:       40 passed, 40 total
Time:        15.001 s

The same flake is documented in C:\Users\Kamal\.claude\projects\E--AMS\memory\MEMORY.md (“Pre-existing startup — subprocess smoke flakiness under full-suite load — predates Wave H”) and is not introduced by this round’s docs-only edits.

Pass (with documented pre-existing flake).

2. Deliverable tree

All files are present in the worktree:

docs/
├── audits/r76-p2-delta-15-planning-audit.md                  ← step 1 (written)
├── contracts/r76-p2-delta-15-planning-contract.md            ← step 2 (written)
├── packets/r76-p2-delta-15-planning-packet.md                ← step 3 (written)
├── guides/implementation/task-prompts/
│   ├── p1.5-delta-router-graduation.md                       ← step 4 (NEW, 10 sub-tasks)
│   └── index.md                                              ← step 4 (edited)
├── architecture/decisions/
│   └── ADR-005-multi-model-defer.md                          ← step 4 (1-line append)
└── verification/r76-p2-delta-15-planning-verification.md     ← step 5 (this file)

3. Summary

All 14 contract invariants hold.
npm run build && npm run lint && npm test exit statuses: 0 / 0 / 1 (1 pre-existing flake; 1084/1085 passing, same as e455b00a baseline — no regression).
10 sub-task prompts ready for R91+ dispatch.
ADR-005 now points readers directly at the graduation plan.
No src/ modifications in this round.

R76.P2 complete. Proceeding to commit + PR per the task brief.

Written 2026-04-18 as step 5 of the R76.P2 5-step chain.
