Contract — P1.5.9 Model Candidates Table Population

Task: P1.5.9 — Phase 1.5 graduation Wave 1 (foundation). Branch: feature/p1-5-9-candidates. Audit reference: p1-5-9-candidates-audit.md. Spec source: docs/3-world/social/llm.md §”The candidate table” + §”Phase 1.5 candidate cohort”.

1. Table shape

The migration 009_model_candidates.sql creates one table:

CREATE TABLE IF NOT EXISTS mcp_model_candidates (
  model_id               TEXT PRIMARY KEY,
  provider               TEXT NOT NULL,
  context_window_tokens  INTEGER NOT NULL,
  latency_tier           TEXT NOT NULL CHECK (latency_tier IN ('fast','balanced','slow')),
  cost_bps_per_kilotoken INTEGER NOT NULL CHECK (cost_bps_per_kilotoken >= 0),
  domain_fit_profile     INTEGER NOT NULL CHECK (domain_fit_profile >= 0 AND domain_fit_profile <= 255),
  enabled                INTEGER NOT NULL DEFAULT 0 CHECK (enabled IN (0, 1))
);

Notes on column choices:

  • No FK on provider. There is no mcp_providers lookup table in Phase 0/1/2; following the same pattern as reputations.node_id (no FK to a ξ identity table).
  • context_window_tokens is bounded only by SQLite’s signed-64-bit INTEGER range. No upper-bound CHECK because the spec explicitly allows future million-token windows (Gemini 1.5 Pro = 1,000,000).
  • latency_tier CHECK enforces the three-value vocabulary from the concept doc §”candidate table”.
  • cost_bps_per_kilotoken ≥ 0 because cost is non-negative; no upper bound because the spec authorises post-task callback updates that may push values arbitrarily high.
  • domain_fit_profile ∈ [0, 255] because it is an 8-bit bitmask over the 8 ξ domains (see audit §3.4 and identity.md §”The eight domains”).
  • enabled is INTEGER 0/1 because SQLite has no native boolean type — same pattern as mcp_witnesses.weight_cap / reputations.score (other INTEGER-CHECK columns in the codebase).

The schema CREATE uses IF NOT EXISTS so the migration is self-idempotent if re-executed against an already-migrated DB.

2. Seed rows

The migration seeds 8 rows via INSERT OR IGNORE. INSERT OR IGNORE means: if a row with the same model_id (primary key) already exists, do nothing. Combined with the IF NOT EXISTS table create, this makes the entire migration body re-runnable without error.

2.1. Row table (post-migration state)

# model_id provider context_window_tokens latency_tier cost_bps_per_kilotoken domain_fit_profile (hex) domain_fit_profile (dec) enabled
1 claude-sonnet-3-5 anthropic 200000 balanced 300 0x8B 139 1
2 claude-haiku-3-5 anthropic 200000 fast 80 0x42 66 0
3 gpt-4o openai 128000 balanced 250 0x23 35 0
4 gpt-4o-mini openai 128000 fast 15 0x41 65 0
5 gemini-1-5-pro google 1000000 slow 125 0xA2 162 0
6 llama-3-3-70b meta 128000 balanced 50 0x91 145 0
7 mixtral-8x22b mistral 64000 fast 60 0x05 5 0
8 kimi-k2 moonshot 200000 balanced 120 0x49 73 0

2.2. model_id slug rules

  • Kebab-case.
  • Version is embedded with hyphens (3-5 not 3.5).
  • No provider prefix — the provider column carries that information.
  • Matches the staging file’s row table at L1328-L1334 exactly for the 7 new rows.
  • The Claude Sonnet row’s model_id = "claude-sonnet-3-5" matches the concept doc’s worked example at llm.md:108 ("Claude Sonnet" in prose; kebab-cased here).

2.3. domain_fit_profile bitmask layout

Bits map to the 8 ξ domains per identity.md §”The eight domains”:

Bit Mask Domain
0 0x01 BUILDER
1 0x02 JUDGE
2 0x04 INVESTOR
3 0x08 MENTOR
4 0x10 GUARDIAN
5 0x20 INNOVATOR
6 0x40 DIPLOMAT
7 0x80 STEWARD

Per-model bitmask justifications (mapping concept-doc “strengths” to the 8 domains):

  • Claude 3.5 Sonnet (0x8B): code review (BUILDER bit 0 + JUDGE bit 1), structured output / mentorship (MENTOR bit 3), long-context reasoning over archives (STEWARD bit 7). 1 + 2 + 8 + 128 = 139.
  • Claude 3.5 Haiku (0x42): classification / triage (JUDGE bit 1), high-throughput communication (DIPLOMAT bit 6). 2 + 64 = 66.
  • GPT-4o (0x23): multimodal authoring (BUILDER bit 0), general reasoning judgement (JUDGE bit 1), novel-task innovation (INNOVATOR bit 5). 1 + 2 + 32 = 35.
  • GPT-4o mini (0x41): simple-task authoring (BUILDER bit 0), cheap-fallback messaging (DIPLOMAT bit 6). 1 + 64 = 65.
  • Gemini 1.5 Pro (0xA2): huge-context judgement (JUDGE bit 1), exploratory synthesis (INNOVATOR bit 5), archival curation (STEWARD bit 7). 2 + 32 + 128 = 162.
  • Llama 3.3 70B (0x91): customizable authoring (BUILDER bit 0), data-sovereign enforcement (GUARDIAN bit 4), long-term self-hosted curation (STEWARD bit 7). 1 + 16 + 128 = 145.
  • Mixtral 8x22B (0x05): code-heavy authoring (BUILDER bit 0), permissive-license resource allocation (INVESTOR bit 2). 1 + 4 = 5.
  • Kimi K2 (0x49): authoring (BUILDER bit 0), cross-lingual mentorship (MENTOR bit 3), CN/EN parity diplomacy (DIPLOMAT bit 6). 1 + 8 + 64 = 73.

All values are non-zero (so scoring’s domain ∈ profile check has at least one bit to match) and pairwise distinct (so no two models share a profile, which would degenerate the tie-breaker into pure alphabetical sort). The values are deterministic and reproducible.

2.4. cost_bps_per_kilotoken choice rationale

Unit: basis points per 1000 tokens. 1 bps = 0.0001 USD. So 300 bps/kilotoken = $0.030/1k = $30/1M tokens. The values lie inside the publicly published rate-card ranges for each model as of 2026-05; per the spec they are indicative and authoritative values arrive via the post-task callback.

Bucket (concept doc) bps Models
high 300 Claude 3.5 Sonnet
high 250 GPT-4o
medium 125 Gemini 1.5 Pro
medium 120 Kimi K2
low 80 Claude 3.5 Haiku
low (self-host) 50 Llama 3.3 70B
low 60 Mixtral 8x22B
low (frontier mini) 15 GPT-4o mini

Distinct numeric values per row (no ties); inside the concept-doc qualitative buckets. The two “high”s differ (300 vs 250) only enough to keep the rows pairwise distinct on this column.

2.5. enabled semantics

  • enabled = 1 on exactly one row (claude-sonnet-3-5). This row is the Phase 0 single-member fallback chain; it’s the active model before P1.5.2–P1.5.4 adapters land.
  • enabled = 0 on the 7 other rows. The router scoring layer (P1.5.1) reads only enabled rows, so until each adapter ships and flips its row’s enabled to 1, the model is invisible to scoring.
  • Flipping enabled is a future per-adapter operation, not part of this migration.

Rationale for shipping Claude Sonnet at enabled = 1: the audit established that no migration before 009 creates the table, so this migration is responsible for putting the Phase 0 active row into a clean state. P1.5.1’s scoring stub still returns claude for every input (see Phase 0 invariants I1/I2 in scoring.ts); the table contents do not change that, because Phase 0’s scoring does not read the table.

3. Idempotency contract

The migration body is idempotent at two layers:

  1. Migration-runner layer. PRAGMA user_version = 9 after a successful run. Subsequent boots skip the file (see index.ts migration loop: if (m.version <= currentVersion) continue;).
  2. Statement layer. Within the file itself:
    • CREATE TABLE IF NOT EXISTS ... — no-op if table exists.
    • INSERT OR IGNORE INTO mcp_model_candidates ... — no-op per row if the primary key already exists.

These two layers compose: even if an operator manually re-runs the SQL against an already-migrated DB (e.g. via sqlite3 colibri.db < 009.sql), the file produces no errors and no row duplication. The verification step asserts this with an idempotency test.

4. Reversibility contract

The migration carries its DOWN SQL as a -- DOWN commented block at the bottom of the file:

-- DOWN (manual; the runner is forward-only):
--   DELETE FROM mcp_model_candidates WHERE model_id IN (
--     'claude-sonnet-3-5', 'claude-haiku-3-5', 'gpt-4o', 'gpt-4o-mini',
--     'gemini-1-5-pro', 'llama-3-3-70b', 'mixtral-8x22b', 'kimi-k2'
--   );
--   DROP TABLE IF EXISTS mcp_model_candidates;
--   PRAGMA user_version = 8;

This follows the convention established by 002 / 007 / 008 (DOWN as a commented-out block in the migration file; no programmatic rollback in Phase 0/1/2).

5. Test obligations (Step 5 will verify)

Tests live at src/__tests__/db/migrations/009_model_candidates.test.ts. The test opens a fresh temp DB via initDb, then asserts:

  1. Table exists. SELECT name FROM sqlite_master WHERE type='table' AND name='mcp_model_candidates' returns one row.
  2. 8 rows total. SELECT COUNT(*) FROM mcp_model_candidates = 8.
  3. All 8 expected model_ids present. Each of the 8 model_id slugs returns exactly 1 row when queried by PK.
  4. Exactly Claude Sonnet is enabled. SELECT model_id FROM mcp_model_candidates WHERE enabled = 1 returns [{model_id: 'claude-sonnet-3-5'}].
  5. 7 rows are disabled. SELECT COUNT(*) FROM mcp_model_candidates WHERE enabled = 0 = 7.
  6. Column values match the contract. For each row, the values in columns provider, context_window_tokens, latency_tier, cost_bps_per_kilotoken, domain_fit_profile match this contract’s §2.1 table exactly.
  7. latency_tier CHECK enforced. INSERT INTO ... latency_tier = 'instant' throws.
  8. enabled CHECK enforced. INSERT INTO ... enabled = 2 throws.
  9. domain_fit_profile out-of-range CHECK enforced. INSERT INTO ... domain_fit_profile = 256 throws.
  10. Idempotency. Re-applying the migration body (as captured by fs.readFileSync of the migration file) against the migrated DB succeeds and the row count is still 8. (We re-run the SQL inside the test by reading the file’s body, stripping comments, and calling db.exec.)
  11. PRAGMA user_version is bumped to 9. After initDb, the DB reports user_version = 9 (since adding migration 009 makes that the highest applied version).

Additionally the pre-existing db-init.test.ts:166 assertion (expect user_version === countRealMigrations()) auto-tracks the new file because countRealMigrations() reads the directory at runtime.

6. Invariants (non-negotiable)

  • I1. Exactly 8 rows post-migration.
  • I2. Exactly one row has enabled = 1 and its model_id = 'claude-sonnet-3-5'.
  • I3. Re-running the migration body is idempotent (row count stays at 8, no errors).
  • I4. domain_fit_profile is in [0, 255] and non-zero for every row.
  • I5. latency_tier ∈ {'fast', 'balanced', 'slow'} for every row.
  • I6. cost_bps_per_kilotoken ≥ 0 for every row.
  • I7. The migration file uses only -- line comments (no block comments — the runner does not strip those).
  • I8. No Phase 0 code path mutates the table. Confirmed in audit §3.1 (zero grep hits) and reinforced by leaving src/domains/router/ untouched in this task.
  • I9. Forward-compat: model_id is the table’s PK; future migrations may flip enabled rows on individually as adapters land, without re-creating the table.

7. Out of scope (for P1.5.9)

  • ❌ Scoring logic (P1.5.1).
  • ❌ Model adapters / routeRequest (P1.5.2 / .3 / .4).
  • ❌ Cost-USD computation (P1.5.5).
  • ❌ MCP tool surface (router_score, router_call, …) (P1.5.7).
  • ❌ ζ Decision-trail emission (P1.5.10).
  • ❌ Mutating src/domains/router/scoring.ts or fallback.ts — the Phase 0 invariants on those modules (I1–I7 in scoring.ts) are not modified by this task.
  • ❌ Adding any MCP tool.

8. Files this contract authorises

Action Path
CREATE src/db/migrations/009_model_candidates.sql
CREATE src/__tests__/db/migrations/009-model-candidates.test.ts
CREATE docs/audits/p1-5-9-candidates-audit.md (Step 1, done)
CREATE docs/contracts/p1-5-9-candidates-contract.md (this file)
CREATE docs/packets/p1-5-9-candidates-packet.md (Step 3)
CREATE docs/verification/p1-5-9-candidates-verification.md (Step 5)

Anything outside this list is out of scope. In particular, no source file in src/db/, src/domains/router/, or anywhere else is modified in Phase 1 (P1.5.9’s commit will be CREATE-only).


Back to top

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.

This site uses Just the Docs, a documentation theme for Jekyll.