R76.P2 Verification — δ Phase 1.5 Graduation Plan
Step 5 of the 5-step chain. Evidence that all contract invariants hold.
1. Invariant-by-invariant evidence
A1 — Sub-task count = 10
$ grep -c "^## P1\.5\." docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md
10
Pass. P1.5.1 through P1.5.10 present.
A2 — Every sub-task has all 6 required sections
Manual review of each of the 10 sub-task blocks confirms every block has:
- Header with spec source + ADR anchor + worktree + branch command + effort + depends-on + unblocks.
- Files to create or Files to modify list.
- Acceptance criteria (checkbox list of 6+ items).
- Pre-flight reading list.
- Ready-to-paste agent prompt inside a `text` fence.
- Verification checklist (for reviewer agent).
- Writeback template (YAML).
- Common gotchas.
Pass.
A3 — Sub-task scopes match R76.P2 brief 1:1
| Sub-task | Scope in brief | Scope in prompts file | Match |
|---|---|---|---|
| P1.5.1 | Real intent scoring (7-dim formula) | Real 7-dim intent scoring | ✓ |
| P1.5.2 | Adapter: Kimi K2 | Adapter: Kimi K2 | ✓ |
| P1.5.3 | Adapter: Codex | Adapter: Codex | ✓ |
| P1.5.4 | Adapter: OpenAI (GPT-4o family) | Adapter: OpenAI (GPT-4o family) | ✓ |
| P1.5.5 | N-member fallback + circuit breaker | N-member Fallback Chain + Circuit Breaker | ✓ |
| P1.5.6 | Cost accounting | Cost Accounting | ✓ |
| P1.5.7 | router_* MCP tools | router_* MCP Tools (4 tools) | ✓ |
| P1.5.8 | Cross-model parity test suite | Cross-Model Parity Test Suite | ✓ |
| P1.5.9 | Model candidates table | Model Candidates Table Population | ✓ |
| P1.5.10 | ζ decision-trail integration | ζ Decision-Trail Integration | ✓ |
Pass.
A4 — Phase 0 interface freeze respected
Every sub-task block enumerates its Forbiddens and repeats the “signature frozen” rule. P1.5.1 (scoring), P1.5.5 (fallback), P1.5.6 (cost) are the three sub-tasks that touch existing files; each one explicitly lists the exports whose signatures must not change:
- scoreIntent, ModelId, ScoreContext, IntentScore.
- routeRequest, FallbackChainExhaustedError, RouteOptions, RouteResult, FallbackAttempt, ROUTER_PHASE_0_SHAPE, CompletionFn, ScoringFn.
ModelId widens additively (P1.5.1); ROUTER_PHASE_0_SHAPE literals flip
(P1.5.5 — explicit signal); RouteResult gains appended fields only
(P1.5.6). No signature is rewritten.
Pass.
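As an illustration of what "additive" means for A4, the TypeScript sketch below widens a union and appends a field without altering any existing signature. Every member name and field here (`"claude-sonnet"`, `output`, `costUsd`) is a hypothetical placeholder rather than the real Phase 0 definition, and marking the appended field optional is an assumption of this sketch.

```typescript
// Hypothetical Phase 0 shapes; the real definitions live in the router source.
type ModelIdPhase0 = "claude-sonnet";

// Additive widening: every Phase 0 value is still a valid ModelId.
type ModelId = ModelIdPhase0 | "kimi-k2" | "gpt-4o";

interface RouteResultPhase0 {
  modelId: ModelId;
  output: string;
}

// Appended fields only: existing fields keep their names and types.
// Assumption: the new field is optional so older producers still type-check.
interface RouteResult extends RouteResultPhase0 {
  costUsd?: number; // hypothetical field name
}

// A consumer written against the Phase 0 shape keeps compiling unchanged.
function describe(result: RouteResultPhase0): string {
  return `${result.modelId}: ${result.output}`;
}

const widened: RouteResult = { modelId: "kimi-k2", output: "ok", costUsd: 0.002 };
const legacy: RouteResult = { modelId: "claude-sonnet", output: "ok" };
```

Because `describe` accepts both `widened` and `legacy`, callers that predate the widening need no edits, which is the property the freeze is meant to protect.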
A5 — COLIBRI_* namespace rigour
$ grep -n "AMS_MODEL\|AMS_KIMI\|AMS_CODEX\|AMS_OPENAI" docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md
827:- **`COLIBRI_MODEL_TIMEOUT`, not `AMS_MODEL_TIMEOUT`.** Donor namespace is forbidden.
The one hit is an intentional negative reference in P1.5.5’s Common
Gotchas (“use COLIBRI_MODEL_TIMEOUT, not AMS_MODEL_TIMEOUT”) — teaching
the executor that the donor namespace is forbidden. No live AMS_* var is
referenced as a valid env var anywhere in the file. Prompts use:
- COLIBRI_KIMI_API_KEY, COLIBRI_KIMI_BASE_URL.
- COLIBRI_CODEX_API_KEY, COLIBRI_CODEX_BASE_URL.
- COLIBRI_OPENAI_API_KEY, COLIBRI_OPENAI_BASE_URL.
- COLIBRI_MODEL_TIMEOUT.
Pass.
A6 — 7-dimension scoring table matches concept doc
P1.5.1 embeds the exact 7-row dimension table with the same column order (Dimension, Formula, Default weight) and the same 7 dimensions named:
task_domain_match (0.20) · context_window_fit (0.15) · cost_efficiency
(0.15) · latency_fit (0.15) · reliability (0.15) · skill_match (0.15) ·
operator_preference (0.05).
Weights sum to 1.0. Tie-break order (reliability, cost, alphabetical)
reproduced verbatim. Worked example (Sonnet 0.87, GPT-4o 0.79, Haiku 0.58)
reproduced as the golden-path test vector in P1.5.1 acceptance.
Pass.
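The weighted sum and tie-break order checked in A6 can be sketched as follows. This is a minimal illustration, not the P1.5.1 implementation: the weights come from the list above, but the type names, the `rank` helper, the rounding step, and the reading of "cost" as "higher cost_efficiency wins" are assumptions.

```typescript
// Default weights as listed in A6.
const WEIGHTS = {
  task_domain_match: 0.2,
  context_window_fit: 0.15,
  cost_efficiency: 0.15,
  latency_fit: 0.15,
  reliability: 0.15,
  skill_match: 0.15,
  operator_preference: 0.05,
} as const;

type Dimension = keyof typeof WEIGHTS;
type DimensionScores = Record<Dimension, number>; // each assumed to lie in [0, 1]

interface Candidate {
  modelId: string;
  scores: DimensionScores;
}

// Weighted sum over the seven dimensions, rounded to 6 decimals so that
// floating-point noise cannot mask a genuine tie.
function totalScore(scores: DimensionScores): number {
  const raw = (Object.keys(WEIGHTS) as Dimension[]).reduce<number>(
    (sum, dim) => sum + WEIGHTS[dim] * scores[dim],
    0,
  );
  return Math.round(raw * 1e6) / 1e6;
}

// Sort by total (descending), then tie-break: reliability, cost, alphabetical.
// Assumption: "cost" is compared via the cost_efficiency score, higher first.
function rank(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort((a, b) => {
    const byTotal = totalScore(b.scores) - totalScore(a.scores);
    if (byTotal !== 0) return byTotal;
    const byReliability = b.scores.reliability - a.scores.reliability;
    if (byReliability !== 0) return byReliability;
    const byCost = b.scores.cost_efficiency - a.scores.cost_efficiency;
    if (byCost !== 0) return byCost;
    return a.modelId < b.modelId ? -1 : a.modelId > b.modelId ? 1 : 0;
  });
}
```

The per-dimension inputs behind the Sonnet / GPT-4o / Haiku worked example are not reproduced in this report, so the sketch does not attempt to regenerate those totals.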
A7 — 8-model candidate cohort matches concept doc
P1.5.9 enumerates all 8 candidates from the concept doc §Phase 1.5 candidate cohort: Claude 3.5 Sonnet, Claude 3.5 Haiku, GPT-4o, GPT-4o mini, Gemini 1.5 Pro, Llama 3.3 70B, Mixtral 8x22B, Kimi K2. The audit doc §4 carries the full indicative-cost/latency table; the prompts file references it at the candidate-seed acceptance level.
Pass.
A8 — ζ decision-trail shape matches concept doc
P1.5.10 embeds the exact JSON shape from the concept doc §Decision-trail recording, all 8 fields verbatim:
type, routing_mode, chosen_model_id, candidates_considered, scores,
fallback_attempts, rule_version_hash, decision_hash
decision_hash = SHA-256(inputs || chosen) rule preserved; routing_mode
enum preserved (single | ensemble | pipeline | fail).
Pass.
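For orientation, the eight-field record and the hash rule from A8 might look like the sketch below. Only the field names and the `routing_mode` enum come from the text above; the field types, the `computeDecisionHash` helper, the choice of hex output, and the idea that `inputs` arrives already serialized as a string are assumptions. The concept doc's actual serialization of `inputs` is not restated in this report.

```typescript
import { createHash } from "node:crypto";

type RoutingMode = "single" | "ensemble" | "pipeline" | "fail";

// Field names follow A8; the value types are guesses for illustration.
interface RoutingDecision {
  type: string;
  routing_mode: RoutingMode;
  chosen_model_id: string;
  candidates_considered: string[];
  scores: Record<string, number>;
  fallback_attempts: unknown[];
  rule_version_hash: string;
  decision_hash: string;
}

// decision_hash = SHA-256(inputs || chosen), with "||" read as byte
// concatenation of the two UTF-8 strings. Successive update() calls feed the
// same hash state, so this equals hashing the concatenated string once.
function computeDecisionHash(serializedInputs: string, chosenModelId: string): string {
  return createHash("sha256")
    .update(serializedInputs, "utf8")
    .update(chosenModelId, "utf8")
    .digest("hex");
}
```

If the real inputs are an object rather than a string, a canonical serialization (stable key order) would be needed before hashing so that equal inputs always yield equal hashes.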
A9 — 4 router_* tools in P1.5.7
$ grep -o "router_score\|router_call\|router_fallback\|router_stats" \
docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md \
| sort | uniq -c
11 router_call
10 router_fallback
8 router_score
16 router_stats
All 4 tools present; each referenced multiple times (in group summary, P1.5.7 block, and cross-references from P1.5.8 parity suite + P1.5.10 ζ emission).
Pass.
A10 — Circuit-breaker rules exact
P1.5.5 block states:
- Trip: 3 consecutive failures on the same model_id.
- Window: 60s cooldown.
- Reset: time-bound, per-model, not success-bound (explicit Common Gotcha warns against reading an older “resets on success” reference).
- Per-attempt timeout: 30s default via COLIBRI_MODEL_TIMEOUT.
- In-memory state (not DB-backed in this round).
Pass.
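One way to read the A10 rules is the in-memory breaker sketched below. The class name, method names, and injectable clock are hypothetical, and one interpretation is assumed: a success breaks the failure streak while the breaker is closed, but never closes an open breaker early, since reset is time-bound. The per-attempt timeout is left out because the unit of COLIBRI_MODEL_TIMEOUT is not stated here.

```typescript
type Clock = () => number; // returns milliseconds

class CircuitBreaker {
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: Clock;
  // Per-model state, held in memory only.
  private readonly failures = new Map<string, number>();
  private readonly openedAt = new Map<string, number>();

  constructor(threshold = 3, cooldownMs = 60000, now: Clock = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True while the model is inside its cooldown window.
  isOpen(modelId: string): boolean {
    const opened = this.openedAt.get(modelId);
    if (opened === undefined) return false;
    if (this.now() - opened >= this.cooldownMs) {
      // Time-bound reset: the window elapsed, so close and clear the streak.
      this.openedAt.delete(modelId);
      this.failures.set(modelId, 0);
      return false;
    }
    return true;
  }

  recordFailure(modelId: string): void {
    const count = (this.failures.get(modelId) ?? 0) + 1;
    this.failures.set(modelId, count);
    if (count >= this.threshold && !this.openedAt.has(modelId)) {
      this.openedAt.set(modelId, this.now()); // trip
    }
  }

  recordSuccess(modelId: string): void {
    // Assumption: a success only breaks the streak while closed;
    // it does not reset an already-open breaker.
    if (!this.openedAt.has(modelId)) this.failures.set(modelId, 0);
  }
}
```

Injecting the clock keeps the 60s window testable without real waiting, for example `new CircuitBreaker(3, 60000, () => fakeTime)`.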
A11 — index.md updated with Phase 1.5 section
docs/guides/implementation/task-prompts/index.md now has:
## Phase 1.5 — δ Multi-Model Router (R91+, planned)
Once Phase 1 κ Rule Engine ships and the four ADR-005 §Implementation
trigger conditions hold, the prompts below graduate δ from Phase 0 library
stubs to multi-model routing. ...
| Group | File | Sub-tasks |
|-------|------|-----------|
| P1.5 — δ Router Graduation | [p1.5-delta-router-graduation.md](...) | P1.5.1 – P1.5.10 |
The 28-task Phase 0 table is untouched.
Pass.
A12 — ADR-005 Wave I postscript updated
Appended one line before the closing signature line:
**Phase 1.5 graduation prompts:** see
[`docs/guides/implementation/task-prompts/p1.5-delta-router-graduation.md`](...)
for the 10-sub-task plan (P1.5.1–P1.5.10) activating at R91+.
No other ADR content changed.
Pass.
A13 — No src/ changes
$ git status --porcelain | grep -E "^\s?[MA]\s+src/"
(no output)
Untracked / modified files are all under docs/. Zero src/ touches.
Pass.
A14 — Test gates
$ npm run build
> tsc
(clean; no output)
$ npm run lint
> eslint src
(clean; no output)
$ npm test
Test Suites: 1 failed, 25 passed, 26 total
Tests: 1 failed, 1084 passed, 1085 total
Time: 42.714 s
The one failure is the pre-existing startup.test.ts subprocess smoke flake
under full-suite load — confirmed non-regression by isolation run:
$ npm test -- --testPathPattern="startup.test.ts"
Test Suites: 1 passed, 1 total
Tests: 40 passed, 40 total
Time: 15.001 s
The same flake is documented in
C:\Users\Kamal\.claude\projects\E--AMS\memory\MEMORY.md
(“Pre-existing startup — subprocess smoke flakiness under full-suite load —
predates Wave H”) and is not introduced by this round’s docs-only edits.
Pass (with documented pre-existing flake).
2. Deliverable tree
All files are present in the worktree:
docs/
├── audits/r76-p2-delta-15-planning-audit.md ← step 1 (written)
├── contracts/r76-p2-delta-15-planning-contract.md ← step 2 (written)
├── packets/r76-p2-delta-15-planning-packet.md ← step 3 (written)
├── guides/implementation/task-prompts/
│ ├── p1.5-delta-router-graduation.md ← step 4 (NEW, 10 sub-tasks)
│ └── index.md ← step 4 (edited)
├── architecture/decisions/
│ └── ADR-005-multi-model-defer.md ← step 4 (1-line append)
└── verification/r76-p2-delta-15-planning-verification.md ← step 5 (this file)
3. Summary
- All 14 contract invariants hold.
- `npm run build && npm run lint && npm test` exit statuses: 0 / 0 / 1 (1 pre-existing flake; 1084/1085 passing, same as e455b00a baseline; no regression).
- 10 sub-task prompts ready for R91+ dispatch.
- ADR-005 now points readers directly at the graduation plan.
- No src/ modifications in this round.
R76.P2 complete. Proceeding to commit + PR per the task brief.
Written 2026-04-18 as step 5 of the R76.P2 5-step chain.