Contract — debug-startup-smoke-flake
Step 2 of the 5-step executor chain. Codifies the new behavioral contract for the test
startup — subprocess smoke › tsx src/server.ts boots and logs [Startup] Phase 1.
Purpose
src/__tests__/startup.test.ts describe block 7 (startup — subprocess smoke) verifies that running node --import tsx src/server.ts invokes the script-mode IIFE at src/server.ts:596–607, which in turn calls startup({ bootstrapFn, stopFn }) and emits the canonical Phase-1 log line. This is the only end-to-end coverage of the import.meta.url === pathToFileURL(arg1).href script-detection branch and the dynamic import('./startup.js') it triggers; it MUST remain in the test surface.
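The script-detection comparison named above can be sketched as a small predicate. This is an illustration only: the helper name isScriptEntry is hypothetical and is not taken from src/server.ts, which is described here as using an inline comparison inside an IIFE.

```typescript
import { pathToFileURL } from "node:url";

// Hypothetical helper (not from src/server.ts): returns true when the module
// identified by `metaUrl` is the file Node was asked to run (argv[1]),
// i.e. it is being executed as a script rather than imported.
function isScriptEntry(metaUrl: string, arg1: string | undefined): boolean {
  if (!arg1) return false; // no entry file, e.g. REPL or `node -e`
  return metaUrl === pathToFileURL(arg1).href;
}

// Assumed shape of the call site in an ES module (not executed here):
//   if (isScriptEntry(import.meta.url, process.argv[1])) {
//     const { startup } = await import("./startup.js");
//     await startup({ bootstrapFn, stopFn });
//   }
```

Because the comparison depends on process.argv[1], it can only be exercised end to end by launching the file as the entry point, which is why the subprocess test carries this coverage.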
Behavioral contract (post-fix)
The test MUST pass deterministically under two conditions:
- Isolation: npm test -- --testPathPattern=startup (1 worker, 1 file).
- Full-suite parallel: npm test (default Jest worker count, typically 7 on the dev box, varies by CI).
Passing only under --runInBand does not satisfy this contract: that would mask the contention without fixing the underlying budget gap. (Note: this contract does not forbid --runInBand; it only requires that the test pass without it.)
The test MUST exercise the same end-to-end path it currently does:
- A real subprocess launched via spawnSync(process.execPath, ['--import', 'tsx', SERVER_MODULE_FS_PATH], …).
- tsx ESM loader engaged.
- src/server.ts evaluated as a script (not imported).
- [colibri] starting and [Startup] Phase 1 log lines emitted to stderr and observed.
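A minimal sketch of the subprocess path listed above, under stated assumptions: the helper names runNodeSubprocess and sawStartupLogs are hypothetical, and the options object contains only encoding and timeout because the contract elides the real options. In the actual test the argument list would be ['--import', 'tsx', SERVER_MODULE_FS_PATH].

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: runs the current Node binary with the given arguments
// and returns the captured result.
function runNodeSubprocess(args: string[], timeoutMs: number) {
  return spawnSync(process.execPath, args, {
    encoding: "utf8", // stdout and stderr are returned as strings
    timeout: timeoutMs, // the child is killed if it runs past this deadline
  });
}

// Hypothetical check: both expected log lines were observed on stderr.
function sawStartupLogs(stderr: string): boolean {
  return (
    stderr.includes("[colibri] starting") &&
    stderr.includes("[Startup] Phase 1")
  );
}
```

Keeping the observation step as a separate predicate leaves the spawn itself untouched, so the test still goes through a real child process rather than an in-process equivalent.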
The test MUST NOT:
- Be marked .skip() or .todo().
- Replace the subprocess invocation with an in-process equivalent (per task constraints).
- Modify src/server.ts, src/startup.ts, or src/db/index.ts (per task constraints; no production code changes are warranted by the audit).
Timing budget
spawnSync({ timeout }) is the binding constraint on success and MUST be wide enough to comfortably accommodate cold tsx start + Windows AV scanning + ~7-worker Jest contention.
Concrete budget (derived from the audit’s standalone-cold measurement of 4205 ms + ~3× safety margin for parallel-load amplification):
- STARTUP_SUBPROCESS_TIMEOUT_MS = 30_000 ms: the spawnSync deadline.
- Jest test timeout = STARTUP_SUBPROCESS_TIMEOUT_MS + 5_000 = 35_000 ms: an outer safety net so the inner spawnSync timeout stays the binding constraint and a violation surfaces as a meaningful diagnostic instead of “Jest test timed out”.
Both values MUST be defined as named constants (not magic numbers) and MUST be paired with a comment naming the rationale (cold tsx + Windows + parallel Jest load).
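One way the two named constants might look in the test file. STARTUP_SUBPROCESS_TIMEOUT_MS is the name given in this contract; STARTUP_TEST_TIMEOUT_MS is a hypothetical name for the Jest-level timeout, which the contract does not name.

```typescript
// Rationale for the budget: cold tsx start + Windows AV scanning + parallel
// Jest worker load. The audit measured 4205 ms for a standalone cold start.
const STARTUP_SUBPROCESS_TIMEOUT_MS = 30_000; // spawnSync deadline (binding constraint)

// Outer safety net (hypothetical name). It must exceed the spawnSync deadline
// so that a slow child is reported by spawnSync, not by a generic Jest timeout.
const STARTUP_TEST_TIMEOUT_MS = STARTUP_SUBPROCESS_TIMEOUT_MS + 5_000; // 35_000

// Assumed usage inside the Jest file (not executed here):
//   it("boots and logs [Startup] Phase 1", () => {
//     const result = spawnSync(process.execPath, args, {
//       timeout: STARTUP_SUBPROCESS_TIMEOUT_MS,
//     });
//     ...
//   }, STARTUP_TEST_TIMEOUT_MS);
```

Deriving the outer value from the inner one keeps the 5-second gap intact if the subprocess budget is ever retuned.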
Diagnostic enrichment
When the test fails, the failure message MUST include:
- The full captured stderr (truncated to a reasonable cap, say 4 KB).
- result.signal if the child was killed.
- result.status and result.error?.message if either is set.
Rationale: a future regression (tsx cold-start exceeding 30 s; or a real production-side hang inside bootstrap() / start()) must be diagnosable from the CI log alone. The current bare expect(stderr).toMatch(...) discards the actionable detail.
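A sketch of how such a failure message could be assembled. The names SubprocessResult, STDERR_CAP_CHARS, and describeFailure are hypothetical. Two further assumptions are not specified by this contract: the cap is counted in characters rather than bytes, and the tail of stderr is kept on the guess that the last lines before a hang are the most useful.

```typescript
// Minimal shape of the fields the contract names; it mirrors part of what
// child_process.spawnSync returns when an encoding is set.
interface SubprocessResult {
  stderr: string;
  status: number | null;
  signal: string | null;
  error?: Error;
}

const STDERR_CAP_CHARS = 4096; // assumption: "4 KB" approximated as 4096 characters

// Builds a multi-line diagnostic from whichever fields are set.
function describeFailure(result: SubprocessResult): string {
  const lines: string[] = [];
  if (result.signal) lines.push(`signal: ${result.signal}`);
  if (result.status !== null) lines.push(`status: ${result.status}`);
  if (result.error?.message) lines.push(`error: ${result.error.message}`);

  // Assumption: keep the tail of stderr when it exceeds the cap.
  const stderr =
    result.stderr.length > STDERR_CAP_CHARS
      ? `[truncated to last ${STDERR_CAP_CHARS} chars]\n` +
        result.stderr.slice(-STDERR_CAP_CHARS)
      : result.stderr;
  lines.push("stderr:", stderr);
  return lines.join("\n");
}
```

In a Jest test the returned string could be thrown as the message of an Error when the expected log line is missing, so the detail lands directly in the CI log.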
Pass criteria for verification
To gate this contract as fulfilled:
- 5 sequential npm test runs, each starting from a state representative of normal contention (no special flags), MUST produce 5/5 green runs.
- A single 10-run pass MAY be additionally captured for stronger evidence; flake rate MUST be 0 % across that pass.
- npm run build MUST succeed (TypeScript type-checks the test edits).
- npm run lint MUST succeed (ESLint validates the test edits).
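The repeated-run criterion can be tallied with a small harness. This is a sketch only: countGreenRuns is a hypothetical name, and the runner is injected so the counting logic stays independent of how npm test is actually launched.

```typescript
// Hypothetical harness: calls `runOnce` the given number of times in sequence.
// `runOnce` returns the exit code of one full-suite run (0 means green).
function countGreenRuns(
  runs: number,
  runOnce: () => number,
): { green: number; flakeRate: number } {
  let green = 0;
  for (let i = 0; i < runs; i++) {
    if (runOnce() === 0) green++;
  }
  return { green, flakeRate: (runs - green) / runs };
}

// Assumed wiring for a real run (not executed here):
//   import { spawnSync } from "node:child_process";
//   const runOnce = () =>
//     spawnSync("npm", ["test"], { stdio: "inherit", shell: true }).status ?? 1;
//   const { green, flakeRate } = countGreenRuns(5, runOnce);
```

Under this sketch the contract is met when green equals the number of runs, that is, when flakeRate is 0.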
Out of scope
- Changing the production startup ordering (Phase 1 vs Phase 2, log emission order). The current ordering is correct; the audit confirmed it.
- Replacing tsx with a precompiled dist/server.js invocation. While that would eliminate the cold-start cost, it adds a TypeScript build step before testing, which is a much larger change than the timing fix and a separate decision.
- Tuning Jest worker count globally. The other 1356 tests are happy with the default; this one test should not dictate suite-wide concurrency policy.
- Improving tsx cold-start time. That is upstream work in the tsx project.