Contract — debug-startup-smoke-flake

Step 2 of the 5-step executor chain. Codifies the new behavioral contract for the test "startup — subprocess smoke › tsx src/server.ts boots and logs [Startup] Phase 1".

Purpose

src/__tests__/startup.test.ts describe block 7 (startup — subprocess smoke) verifies that running node --import tsx src/server.ts invokes the script-mode IIFE at src/server.ts:596–607, which in turn calls startup({ bootstrapFn, stopFn }) and emits the canonical Phase-1 log line. This is the only end-to-end coverage of the import.meta.url === pathToFileURL(arg1).href script-detection branch and the dynamic import('./startup.js') it triggers; it MUST remain in the test surface.
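As an illustration of the script-detection branch described above, the comparison can be sketched as a small pure function. The helper name isScriptEntry is hypothetical, and the commented IIFE is an assumed shape rather than a copy of src/server.ts; in particular, treating startup as a named export of ./startup.js is an assumption.

```typescript
import { pathToFileURL } from "node:url";

// Hypothetical helper: true when the module at `moduleUrl` is the file that
// Node was asked to run (process.argv[1]), false when it was merely imported.
function isScriptEntry(moduleUrl: string, argv1: string | undefined): boolean {
  if (argv1 === undefined) return false; // e.g. REPL or `node -e`
  return moduleUrl === pathToFileURL(argv1).href;
}

// Assumed shape of the script-mode branch (not copied from src/server.ts):
//
//   if (isScriptEntry(import.meta.url, process.argv[1])) {
//     void (async () => {
//       const { startup } = await import("./startup.js"); // named export assumed
//       await startup({ bootstrapFn, stopFn });
//     })();
//   }
```

Because the branch only fires when the file is run as a script, an in-process import of src/server.ts would not reach it, which is why the subprocess test is the only coverage of this path.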

Behavioral contract (post-fix)

The test MUST pass deterministically under two conditions:

  1. Isolation: npm test -- --testPathPattern=startup (1 worker, 1 file).
  2. Full-suite parallel: npm test (default Jest worker count — typically 7 on the dev box, varies by CI).

Passing only under --runInBand does not satisfy this contract: serializing the suite would mask the contention without fixing the underlying budget gap. (The contract does not forbid --runInBand; it requires that the test also pass without it.)

The test MUST exercise the same end-to-end path it currently does:

  • A real subprocess launched via spawnSync(process.execPath, ['--import', 'tsx', SERVER_MODULE_FS_PATH], …).
  • tsx ESM loader engaged.
  • src/server.ts evaluated as a script (not imported).
  • [colibri] starting and [Startup] Phase 1 log lines emitted to stderr and observed.
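The invocation listed above can be sketched as follows. This is a minimal sketch, not the test's actual code: the runNode helper and the SmokeResult type are hypothetical names, and the argument list is left to the caller so the snippet can be exercised without tsx installed.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape: only the fields the contract cares about.
interface SmokeResult {
  stderr: string;
  status: number | null;
  signal: NodeJS.Signals | null;
  error?: Error;
}

// Runs `node <args>` synchronously as a real subprocess and captures stderr as text.
function runNode(args: string[], timeoutMs: number): SmokeResult {
  const result = spawnSync(process.execPath, args, {
    encoding: "utf8",   // return stdout/stderr as strings instead of Buffers
    timeout: timeoutMs, // spawnSync kills the child when this deadline passes
  });
  return {
    stderr: result.stderr ?? "", // guard against a missing stream if the spawn itself failed
    status: result.status,
    signal: result.signal,
    error: result.error,
  };
}

// In the real test the arguments would be ['--import', 'tsx', SERVER_MODULE_FS_PATH].
```

The test would then assert that the returned stderr contains both the [colibri] starting line and the [Startup] Phase 1 line.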

The test MUST NOT:

  • Be marked .skip() or .todo().
  • Replace the subprocess invocation with an in-process equivalent (per task constraints).
  • Modify src/server.ts, src/startup.ts, or src/db/index.ts (per task constraints — no production code changes are warranted by the audit).

Timing budget

spawnSync({ timeout }) is the binding constraint on success and MUST be wide enough to comfortably accommodate cold tsx start + Windows AV scanning + ~7-worker Jest contention.

Concrete budget (derived from the audit’s standalone-cold measurement of 4205 ms with a ~3× safety margin for parallel-load amplification, which gives roughly 12.6 s; the 30 s deadline below sits well above that figure):

  • STARTUP_SUBPROCESS_TIMEOUT_MS = 30_000 ms — spawnSync deadline.
  • Jest test timeout = STARTUP_SUBPROCESS_TIMEOUT_MS + 5_000 = 35_000 ms — outer safety net so the inner spawnSync timeout stays the binding constraint and a violation surfaces as a meaningful diagnostic instead of “Jest test timed out”.

Both values MUST be defined as named constants (not magic numbers) and MUST be paired with a comment naming the rationale (cold tsx + Windows + parallel Jest load).
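One way the two constants and their rationale comments might look is sketched below. STARTUP_SUBPROCESS_TIMEOUT_MS comes from this contract; the name STARTUP_JEST_TIMEOUT_MS is an assumption made for illustration.

```typescript
// Rationale: cold tsx start + Windows AV scanning + parallel Jest load.
// The audit measured 4205 ms standalone-cold; 30 s leaves wide headroom.
const STARTUP_SUBPROCESS_TIMEOUT_MS = 30_000;

// Outer safety net (hypothetical name). It must exceed the spawnSync deadline
// so the inner timeout fires first and the failure carries subprocess
// diagnostics rather than a bare Jest timeout.
const STARTUP_JEST_TIMEOUT_MS = STARTUP_SUBPROCESS_TIMEOUT_MS + 5_000;
```

The first constant would be passed as the timeout option of spawnSync, and the second as the optional timeout argument that Jest's it() accepts after the test function.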

Diagnostic enrichment

When the test fails, the failure message MUST include:

  • The captured stderr, truncated to a fixed cap (for example 4 KB).
  • result.signal if the child was killed.
  • result.status and result.error?.message if either is set.

Rationale: a future regression (tsx cold-start exceeding 30 s; or a real production-side hang inside bootstrap() / start()) must be diagnosable from the CI log alone. The current bare expect(stderr).toMatch(...) discards the actionable detail.
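A failure message meeting these requirements could be assembled roughly as follows. This is a sketch under stated assumptions: describeFailure, tail, SpawnOutcome, and STDERR_CAP_CHARS are hypothetical names, the cap is counted in characters (which approximates bytes for ASCII logs), and keeping the end of stderr rather than the start is a design choice made here, not something the contract specifies.

```typescript
// Hypothetical cap: 4 KB, counted in characters.
const STDERR_CAP_CHARS = 4 * 1024;

// The subset of a spawnSync result needed for diagnostics.
interface SpawnOutcome {
  stderr: string;
  status: number | null;
  signal: string | null;
  error?: Error;
}

// Keeps at most the last `cap` characters, noting how much was dropped.
function tail(text: string, cap: number): string {
  if (text.length <= cap) return text;
  return `[... ${text.length - cap} chars dropped]\n` + text.slice(-cap);
}

// Builds a message containing signal, status, error message (each only when
// set) and the capped stderr.
function describeFailure(r: SpawnOutcome): string {
  const lines: string[] = [];
  if (r.signal !== null) lines.push(`signal: ${r.signal}`);
  if (r.status !== null) lines.push(`status: ${r.status}`);
  if (r.error?.message) lines.push(`error: ${r.error.message}`);
  lines.push(`stderr (last ${STDERR_CAP_CHARS} chars max):`, tail(r.stderr, STDERR_CAP_CHARS));
  return lines.join("\n");
}
```

The test could then check the pattern itself and throw new Error(describeFailure(result)) on a miss, so the CI log shows why the subprocess failed. Keeping the tail favors the most recent output, which tends to show where startup stalled.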

Pass criteria for verification

To gate this contract as fulfilled:

  • 5 sequential npm test runs, each starting from a state representative of normal contention (no special flags), MUST produce 5/5 green runs.
  • A single 10-run pass MAY be additionally captured for stronger evidence; flake rate MUST be 0 % across that pass.
  • npm run build MUST succeed (TypeScript type-checks the test edits).
  • npm run lint MUST succeed (ESLint validates the test edits).
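The repeated-run criterion above could be automated with a small script along these lines. Both function names are hypothetical, and the runner is injected so the counting logic can be checked without launching the suite.

```typescript
import { spawnSync } from "node:child_process";

// Runs `runOnce` `runs` times in sequence and returns the number of green runs.
function countGreenRuns(runs: number, runOnce: () => boolean): number {
  let green = 0;
  for (let i = 0; i < runs; i++) {
    if (runOnce()) green++;
  }
  return green;
}

// One full-suite run with no special flags. Running through the shell lets
// Windows resolve npm.cmd.
function npmTestOnce(): boolean {
  const r = spawnSync("npm test", { stdio: "inherit", shell: true });
  return r.status === 0;
}

// Usage (not executed here):
//   const green = countGreenRuns(5, npmTestOnce);
//   process.exitCode = green === 5 ? 0 : 1;
```

The same helper covers the optional 10-run pass by changing the first argument.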

Out of scope

  • Changing the production startup ordering (Phase 1 vs Phase 2, log emission order). The current ordering is correct; the audit confirmed it.
  • Replacing tsx with a precompiled dist/server.js invocation. While that would eliminate the cold-start cost, it adds a TypeScript build step before testing — a much larger change than the timing fix and a separate decision.
  • Tuning Jest worker count globally. The other 1356 tests are happy with the default; this one test should not dictate suite-wide concurrency policy.
  • Improving tsx cold-start time. That is upstream work in the tsx project.

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
