P0.6.2 Skills-Loader ESM Deadlock — Verification

Task: 109b31c3-6419-4019-a14a-2fb1fb26f355
Branch: feature/p0-6-2-skills-deadlock-fix @ origin/main 6345ba7a
Worktree: .worktrees/claude/p0-6-2-skills-deadlock-fix
Audit / Contract / Packet: see frontmatter.

ζ chain so far (this task):

  1. plan: 4b95bb8b80989ed8fd634c29646812ef119641a1ba285b7bd337b83e4cec3117 — initial commitment.
  2. analysis: 2e7e56f9399679560c88ce9e7b3ce93c9bc92dd6a5a88c1eeda724c4ebc6c47d — audit findings.
  3. decision: 189f598adfc7f7feb6170aa12e7079b064e948a44216066b7ac99104fa35adbf — implementation choices.
  4. reflection (final writeback): pending — recorded after this doc lands and before task_update(status="DONE") per CLAUDE.md §7.

1. Acceptance criteria — gate-by-gate

ID Criterion Result
A-1 tsc clean against patched source npm run build exit 0; postbuild copied 6 migrations
A-2 ESLint clean (npm run lint) lint exit=0; zero errors after curly-brace fix on the new test
A-3 Full Jest suite passes — npm test. No new failures, no regressions 1410 tests passed across 31 suites in 23.976 s. Pre-fix baseline was 1409; the +1 is the new regression test (A-7)
A-4 Standalone smoke prints [Startup] Skills: loaded=N before timeout ✅ See §2.1 below — prints loaded=23, skipped=0, pruned=0 and [Startup] Complete in 59ms
A-5 Live MCP server_health returns skill_count > 0 after Claude Desktop relaunch with patched dist 🟡 Pending deploy step (replace main dist/ + tray-quit + relaunch) — covered by todo #8
A-6 Live MCP skill_list returns total_count > 0 and populated skills[] 🟡 Pending same deploy step
A-7 Regression test exercises production injection path; FAILS on pre-fix code, PASSES on post-fix code ✅ See §2.2 below — subprocess test asserts total_count > 0 from a real node dist/server.js spawn

A-1 through A-4 and A-7 are satisfied within this worktree. A-5/A-6 require the user-controlled tray-quit + relaunch and are tracked separately as the deploy step.

2. Evidence

2.1. Standalone smoke (A-4)

Command (run from worktree root):

COLIBRI_DB_PATH=":memory:" \
  COLIBRI_SKILLS_ROOT="$(pwd)/.agents/skills" \
  COLIBRI_MODE=FULL \
  NODE_ENV=production \
  timeout 5 node dist/server.js < /dev/null

Output:

[Startup] Phase 1: transport...
[colibri] starting in mode= FULL version= 0.0.1
[colibri] ready
[Startup] Phase 1 ready
[Startup] Phase 2: heavy-init...
[colibri] skills loaded: 23, skipped: 0, pruned: 0
[Startup] Skills: loaded=23, skipped=0, pruned=0
[Startup] Complete in 59ms

The two log lines [colibri] skills loaded: … (from loadSkillsFromDisk’s logger) and [Startup] Skills: loaded=… (from startup.ts:301) are exactly the lines that never appeared in the pre-fix repro — confirming Phase 2 now completes instead of deadlocking. [Startup] Complete in 59ms is startup.ts’s post-Phase-2 timing log; it likewise never fired pre-fix.

2.2. Regression test (A-7)

src/__tests__/scripts/script-invocation-skills.test.ts — new, 152 lines.

It spawns node dist/server.js as a subprocess with COLIBRI_DB_PATH=:memory: and COLIBRI_SKILLS_ROOT pointing at the live .agents/skills/ corpus, then drives a real MCP initializetools/call skill_list handshake over stdio and parses the JSON-RPC reply. Asserts structuredContent.data.total_count > 0 and skills.length === total_count.

Test result, isolated:

=== TEST (new file only) ===
Test Suites: 1 passed, 1 total
Tests:       1 passed, 1 total
Snapshots:   0 total
Time:        13.315 s
test exit=0

Pre-fix failure mode (verified during the audit step): the subprocess hangs at Phase 2 forever; the skill_list handshake never completes; the test would time out at HANDSHAKE_TIMEOUT_MS=8000 ms with the diagnostic message "skill_list call timed out [...]; Phase 2 likely deadlocked". Post-fix: handshake completes in well under 1 second of Phase 2 wall time (the 13.3 s test wall is dominated by Jest start-up, not the spawn).

The test honors N-1 (it’s not just loadSkillsFromDisk in isolation — it spawns the real entry script) and the contract’s stated requirement that production exercise the same path as tests via options.loadSkillsFn injection.

2.3. Full test suite (A-3, no regressions)

=== FULL TEST ===
Test Suites: 31 passed, 31 total
Tests:       1410 passed, 1410 total
Snapshots:   0 total
Time:        23.976 s, estimated 32 s
test exit=0

Pre-fix baseline (per memory + R83 post-batch SHA 6345ba7a): 1409 passing across 30 suites. Post-fix: 1410 / 31 — exactly +1 test in +1 suite, matching the new src/__tests__/scripts/script-invocation-skills.test.ts. No existing test changed status.

2.4. Build (A-1)

=== BUILD ===
> colibri@0.0.1 build
> tsc

> colibri@0.0.1 postbuild
> node scripts/copy-migrations.mjs

copy-migrations: copied 6 migration(s) E:\AMS\.worktrees\claude\p0-6-2-skills-deadlock-fix\src\db\migrations -> E:\AMS\.worktrees\claude\p0-6-2-skills-deadlock-fix\dist\db\migrations

tsc produced no diagnostics; the postbuild script copied 6 migrations into dist/db/migrations/ as expected.

2.5. Lint (A-2)

=== LINT ===
> colibri@0.0.1 lint
> eslint src
lint exit=0

A first lint pass surfaced a one-line-if-without-curly-braces error inside the new regression test (script-invocation-skills.test.ts:82). Fixed in-place to use a block body — the project’s ESLint config enforces curly: 'all' and that rule has no exceptions. Re-run was clean.

3. Invariant audit (contract §2)

Invariant How verified
I-1 Production-test parity for the skill-load code path Production now goes through options.loadSkillsFn — the same seam tests use. Confirmed by reading server.ts:606 post-patch and startup.ts:296-298 resolution chain
I-2 No ESM dynamic imports during script-invocation top-level await loadSkillsFromDisk is now resolved via the static import on server.ts:50 (extended); no new dynamic imports added; startup.ts:298’s dynamic import is now unreachable from the script-invocation path
I-3 options.loadSkillsFn seam stays first-class Untouched in startup.ts:111 and startup.ts:296; both tests and production use it
I-4 Dynamic-import fallback at startup.ts:298 MAY remain as defense-in-depth Kept in place; comment block above it (startup.ts:281-291) updated to truthfully describe its status as defense-in-depth and document why production no longer reaches it
I-5 No new public API surface No new exports, no new env vars, no new MCP tools, no new error codes. Confirmed by git diff scope: only 3 source files modified + 1 test file added
I-6 Idempotent on repeat boots Smoke test re-run confirmed identical output (loaded=23, skipped=0, pruned=0 in both runs); loadSkillsFromDisk’s upsert+prune logic is unchanged

4. Risk register (contract §5) — post-implementation

Risk Status
Patch regresses Jest test that asserts await import fires when options.loadSkillsFn === undefined No such test exists in the suite (verified — full suite still green at 1410); the existing startup-suite tests inject options.loadSkillsFn themselves and never enter the dynamic-import branch
Regression test asserts on log strings, which are flaky Mitigated as planned — the new test asserts on parsed JSON-RPC total_count, not on stderr text. The “skill_list call timed out” diagnostic message in the test only appears if the test fails
Future contributor reintroduces dynamic import without realizing the deadlock Mitigated — anchor comment now lives at server.ts:600-619 citing the audit + contract by path with a DO NOT remove directive; the regression test would also catch reintroduction
Live deploy: replacing dist/ while Claude Desktop holds it open on Windows Open — handled in deploy step. Standard tray-quit + relaunch flow. Already exercised twice in this session for earlier config changes

5. What is NOT covered by this verification

  • A-5 / A-6 (live MCP): require deploying the patched dist/ to the user’s main checkout (or merging the PR and rebuilding from main), then a Claude Desktop tray-quit + relaunch. Out of scope for this verification doc — tracked as the deploy step.
  • Other domain modules (tasks/, trail/, proof/) also have static cycles back to server.ts (each domain repository imports registerColibriTool). They do not currently deadlock because none of them go through a startup.ts dynamic import — their tools register synchronously inside bootstrap(). This patch does not change that. If any future Phase 2 work added a await import('./domains/<other>/repository.js') it would hit the same deadlock; the inline anchor comment + this audit + contract pair flag the pattern for any such future contributor.
  • CHANGELOG.md or release-notes prose. Not in the Phase 0 doc-loop scope for a hotfix; if a hotfix release rolls up, the commit message in docs/packets/§5 is structured to extract verbatim.

6. Goes to writeback

The next steps, per CLAUDE.md §7:

  1. Commit on feature/p0-6-2-skills-deadlock-fix (audit + contract + packet + verification + 3 source edits + 1 test).
  2. mcp__colibri__thought_record with type: "reflection" recording the final state — task id, branch, worktree, commit SHA, tests, summary, blockers (none for this task; A-5/A-6 deploy is a follow-up).
  3. mcp__colibri__task_update with status: "DONE". The hard-block at src/domains/tasks/writeback.ts:97 will release because thought_records ≥ 1 exist for this task (we have 4 by then).
  4. Push the branch + (optional, T0 decision) open a PR.
  5. Deploy: replace main’s dist/ (or rebuild from main after merge) → user tray-quits + relaunches Claude Desktop → live mcp__colibri__server_health.skill_count: 23 and mcp__colibri__skill_list.total_count: 23 close A-5 + A-6.

Back to top

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.

This site uses Just the Docs, a documentation theme for Jekyll.