P0.2.3 — Two-Phase Startup — Contract
1. Intent
Introduce a single async orchestrator, startup(), that sequences the
P0.2.1 MCP transport and the P0.2.2 SQLite initializer into two visible
phases with graceful shutdown. Phase 1 runs fast and makes the MCP
handshake answerable; Phase 2 opens the database. Heavy work is deferred
until after the transport is live so the MCP client cannot time out, which
is the donor-bug-#4 mitigation P0.2.1 established and this task upholds.
This contract fixes the public surface, the guarantees each phase owes its caller, and the error / shutdown flows. It gates Step 3 (packet).
2. Public surface
2.1. New module src/startup.ts
export interface StartupOptions {
  readonly createOptions?: CreateServerOptions;
  readonly dbPath?: string; // defaults to config.COLIBRI_DB_PATH
  readonly initDbFn?: (path: string) => Database.Database;
  readonly closeDbFn?: () => void;
  readonly bootstrapFn?: (opts: BootstrapOptions) => Promise<ColibriServerContext>;
  readonly stopFn?: (ctx: ColibriServerContext) => Promise<void>;
  readonly exit?: (code: number) => void;
  readonly registerSignalHandlers?: boolean; // default true
  readonly cleanupTimeoutMs?: number; // default 5000
  readonly nowMs?: () => number;
  readonly logger?: (...args: unknown[]) => void;
}

export interface StartupResult {
  readonly ctx: ColibriServerContext;
  readonly db: Database.Database;
  readonly elapsedMs: number;
}

export function startup(options?: StartupOptions): Promise<StartupResult>;
export function shutdown(reason: string): Promise<void>;

/** @internal Test-only reset so each test gets a fresh module state. */
export function __resetForTests(): void;
Nothing else is exported. The contract pins this list.
2.2. src/server.ts delta (owned by this task)
Replace the bottom-of-file IIFE:
// BEFORE (P0.2.1)
if (isInvokedAsScript()) {
  await bootstrap();
}

// AFTER (P0.2.3): identical guard, swaps bootstrap() for startup()
if (isInvokedAsScript()) {
  const { startup } = await import('./startup.js');
  await startup();
}
No other lines of src/server.ts change. bootstrap() stays exported,
stays Phase-1-only, and is still used by the in-process bootstrap tests.
3. Phase invariants
3.1. Phase 1 — Transport (fast)
- Composes bootstrapFn() (default bootstrap from ./server.js).
- On return: ctx.server.isConnected() === true, server_ping is registered, and the [colibri] ready log has been emitted by start().
- Database handle is not opened. getDb() still throws.
- Elapsed time at the Phase-1 boundary is recorded via ctx.nowMs() (the same clock P0.2.1 uses so durations are comparable).
- Any exception from bootstrapFn propagates out of startup(); the caller receives it. The injected exit is not called at this point: bootstrap's own internal catch already did that.
3.2. Phase 2 — Heavy init
- Runs only if Phase 1 resolved without throwing.
- Emits [Startup] Phase 2: heavy-init... on stderr before work.
- Calls initDbFn(dbPath) exactly once. Default initDbFn = initDb from ./db/index.js. Default dbPath = config.COLIBRI_DB_PATH.
- On success:
  - db is the return value, i.e. the live handle.
  - startup() resolves with { ctx, db, elapsedMs } where elapsedMs is Math.floor(nowMs() - phase1StartMs).
  - Emits [Startup] Complete in <N>ms.
- On failure:
  - Emits [Startup] Phase 2 failed: <message>.
  - Invokes the internal shutdown('phase-2-failed') which closes the transport and the DB (closeDbFn; safe no-op because initDb cleans up its own handle on throw).
  - Calls the injected exit(1). If exit does not terminate (tests), the promise rejects with the original error so the caller observes it.
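The happy-path ordering of the two phases can be sketched as follows. This is a minimal illustration under stated assumptions, not the real src/startup.ts: Ctx, Db, SketchOptions, and startupSketch are hypothetical stand-ins for ColibriServerContext, Database.Database, StartupOptions, and startup(), and the failure path, signal handling, and re-entrancy guard are left out.

```typescript
// Hypothetical stand-ins for ColibriServerContext and Database.Database.
interface Ctx { readonly connected: boolean }
interface Db { readonly path: string }

// Reduced options bag: every dependency is injected, none has a default here.
interface SketchOptions {
  readonly bootstrapFn: () => Promise<Ctx>;
  readonly initDbFn: (path: string) => Db;
  readonly dbPath: string;
  readonly nowMs: () => number;
  readonly logger: (...args: unknown[]) => void;
}

interface SketchResult {
  readonly ctx: Ctx;
  readonly db: Db;
  readonly elapsedMs: number;
}

// Happy path only: Phase 1 (transport) must resolve before Phase 2 (DB) starts.
async function startupSketch(o: SketchOptions): Promise<SketchResult> {
  const phase1StartMs = o.nowMs();

  o.logger('[Startup] Phase 1: transport...');
  const ctx = await o.bootstrapFn(); // a rejection propagates to the caller
  o.logger('[Startup] Phase 1 ready');

  o.logger('[Startup] Phase 2: heavy-init...');
  const db = o.initDbFn(o.dbPath); // called exactly once, after Phase 1
  const elapsedMs = Math.floor(o.nowMs() - phase1StartMs);
  o.logger(`[Startup] Complete in ${elapsedMs}ms`);

  return { ctx, db, elapsedMs };
}
```

Because the clock is injected, a test can drive nowMs deterministically and assert an exact elapsedMs instead of a range.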
3.3. Re-entrancy guard
startup() sets a module-level startupInvoked boolean at the very top.
A second invocation throws:
Error('startup() already invoked')
Reset is ONLY possible via __resetForTests(). Production callers are
intentionally locked out — a second startup would leak signal handlers
and duplicate-register server_ping.
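A minimal sketch of the guard, assuming the flag lives at module scope as described. guardedStartup and resetGuardForTests are hypothetical names used only for illustration; the real functions are startup() and __resetForTests().

```typescript
// Module-level flag, mirroring the startupInvoked boolean described above.
let startupInvoked = false;

// Hypothetical reduced startup(): only the guard is shown.
function guardedStartup(): void {
  if (startupInvoked) {
    throw new Error('startup() already invoked');
  }
  // Set first, before any phase runs, as the contract requires.
  startupInvoked = true;
  // ... Phase 1 and Phase 2 would run here ...
}

// Test-only escape hatch, analogous to __resetForTests().
function resetGuardForTests(): void {
  startupInvoked = false;
}
```

Setting the flag before any other work means a startup that fails partway still blocks a second attempt, which matches the intent of locking production callers out.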
4. Shutdown contract
4.1. shutdown(reason: string): Promise<void>
- Idempotent. A second call while the first is in flight returns the same in-flight promise; a call after completion is a no-op.
- Emits [Shutdown] <reason> on entry.
- Closes the transport by awaiting stopFn(ctx) (default stop from ./server.js). The cleanupTimeoutMs timeout (default 5000 ms) races transport cleanup; on timeout, emits [Shutdown] Forced after 5000ms timeout and continues.
- Closes the DB via closeDbFn() (default closeDb from ./db/index.js). Always safe: it is a sync no-op when the DB is not initialized.
- Removes the SIGINT and SIGTERM listeners this task's startup() installed.
- Emits [Shutdown] Clean on success.
shutdown() never throws. Internal errors from stopFn are logged via
logger with prefix [Shutdown] stop failed:, swallowed, and cleanup
continues. Rationale: during a shutdown we want both halves (transport +
DB) to close regardless of which half threw.
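The idempotency and never-throws rules can be sketched together. This is an illustrative reduction, not the module's code: makeShutdown is a hypothetical factory (the real module keeps this state at module scope), and the timeout race, listener removal, and the [Shutdown] Clean line are omitted.

```typescript
type Logger = (...args: unknown[]) => void;

// Hypothetical factory that returns a shutdown function closing over its state.
function makeShutdown(
  stopFn: () => Promise<void>,
  closeDbFn: () => void,
  logger: Logger,
): (reason: string) => Promise<void> {
  let inFlight: Promise<void> | null = null;

  async function run(reason: string): Promise<void> {
    logger(`[Shutdown] ${reason}`);
    try {
      await stopFn(); // transport first
    } catch (err) {
      logger('[Shutdown] stop failed:', err); // swallowed so the DB still closes
    }
    try {
      closeDbFn(); // DB second
    } catch (err) {
      logger('[Shutdown] close failed:', err);
    }
  }

  // Every caller, concurrent or late, receives the same promise, so the
  // cleanup body runs at most once.
  return (reason: string): Promise<void> => {
    if (inFlight === null) {
      inFlight = run(reason);
    }
    return inFlight;
  };
}
```

Returning the stored promise covers both cases in the contract: a concurrent caller awaits the in-flight cleanup, and a late caller gets an already-settled promise, which is effectively a no-op.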
4.2. Signal handling
Inside startup(), when registerSignalHandlers !== false:
- Register one-shot listeners on SIGINT and SIGTERM pointing at:

      async (): Promise<void> => {
        try {
          await shutdown(`signal-${signalName}`);
          exit(0);
        } catch (err) {
          logger('[Shutdown] signal handler failed:', err);
          exit(1);
        }
      };

- Listeners are stored at module scope so shutdown() can remove them via process.off(...). This is R-3 from the audit: signal handlers are installed inside startup(), never at module import time, so they do not leak into Jest workers.
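One way to install and later remove the listeners is sketched below. SignalSource is a hypothetical structural subset of Node's process object, introduced so the example stays self-contained; passing process itself is the assumed production usage. installSignalHandlers and the returned remove() are likewise illustrative names, not part of the pinned public surface.

```typescript
type SignalName = 'SIGINT' | 'SIGTERM';
type Handler = () => void;

// Hypothetical minimal subset of Node's process object.
interface SignalSource {
  once(signal: SignalName, handler: Handler): unknown;
  off(signal: SignalName, handler: Handler): unknown;
}

interface Installed {
  remove(): void;
}

function installSignalHandlers(
  source: SignalSource,
  shutdown: (reason: string) => Promise<void>,
  exit: (code: number) => void,
  logger: (...args: unknown[]) => void,
): Installed {
  const make = (signalName: SignalName): Handler => () => {
    // Fire-and-forget: the async body reports its own failures.
    void (async (): Promise<void> => {
      try {
        await shutdown(`signal-${signalName}`);
        exit(0);
      } catch (err) {
        logger('[Shutdown] signal handler failed:', err);
        exit(1);
      }
    })();
  };

  const onSigint = make('SIGINT');
  const onSigterm = make('SIGTERM');
  source.once('SIGINT', onSigint);
  source.once('SIGTERM', onSigterm);

  // Keep the exact function references so off() can find them later.
  return {
    remove(): void {
      source.off('SIGINT', onSigint);
      source.off('SIGTERM', onSigterm);
    },
  };
}
```

Holding on to the handler references is the important part: off() matches by function identity, so a listener created inline and not stored could never be removed.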
4.3. Shutdown ordering
emit "[Shutdown] <reason>"
await Promise.race([stopFn(ctx), sleep(cleanupTimeoutMs)])
closeDbFn()
process.off('SIGINT', ...)
process.off('SIGTERM', ...)
emit "[Shutdown] Clean" | "[Shutdown] Forced after 5000ms timeout"
Transport first, DB second, signals last. Rationale: an inbound MCP call in flight should receive its response (or at least a transport-level error) before the DB handle it was reading from goes away.
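The race in the second step of the ordering can be sketched as below. stopWithTimeout is a hypothetical helper, not a name from the contract, and returning a 'clean' or 'forced' tag is an assumption made here so the caller can pick the final log line. The sketch also clears the timer once the race settles, a detail the ordering above does not spell out.

```typescript
// Resolves to 'clean' if stopFn settles in time, 'forced' if the timer wins.
// A rejection from stopFn propagates to the caller, who is expected to log it.
async function stopWithTimeout(
  stopFn: () => Promise<void>,
  cleanupTimeoutMs: number,
): Promise<'clean' | 'forced'> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // The executor runs synchronously, so timer is set before the race starts.
  const timeout = new Promise<'forced'>((resolve) => {
    timer = setTimeout(() => resolve('forced'), cleanupTimeoutMs);
  });

  try {
    return await Promise.race([
      stopFn().then((): 'clean' => 'clean'),
      timeout,
    ]);
  } finally {
    // Without this, a fast shutdown would keep the event loop alive
    // until the full timeout elapsed.
    clearTimeout(timer);
  }
}
```

A caller would continue to closeDbFn() in either case, then emit [Shutdown] Clean or the forced-timeout line depending on the returned tag.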
5. Error modes
| Where | Condition | startup() behaviour | Injected exit |
|---|---|---|---|
| Phase 1 | bootstrapFn throws BEFORE its internal catch | Rethrows | NOT called by startup() (bootstrap already called it) |
| Phase 1 | bootstrapFn returns (its catch fired, it called exit internally) | Resolves as if Phase 1 succeeded but ctx._registeredToolNames may be empty; Phase 2 still proceeds because Phase 1 returned. | Called by bootstrap internally |
| Phase 2 | initDbFn throws | Rejects with the original error after running shutdown('phase-2-failed') | Called with code 1 |
| Shutdown | stopFn throws | Logs, continues to closeDbFn | Not called by shutdown itself |
| Shutdown | closeDbFn throws | Logs, continues | Not called by shutdown itself |
| Signal | SIGINT / SIGTERM received | Runs shutdown(signal) then exit(0) on success, exit(1) on error | Called by the signal handler |
| Re-entry | startup() called twice | Throws Error('startup() already invoked') | Not called |
Note on the Phase-1-bootstrap-already-exited case: bootstrap() has
its own try/catch that invokes exit(1) on Phase-1 failure. In
production, exit is process.exit.bind(process) so the process is gone
— Phase 2 never runs. In tests with a fake exit, bootstrap returns
a ctx whose server.isConnected() === false; we still proceed to Phase
2 because the contract treats “bootstrap returned without throwing” as
Phase-1 success. Tests that want the full Phase-1-failed path inject
a bootstrapFn that rejects.
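The Phase 2 row of the table can be sketched as a small wrapper. runPhase2 and FailureDeps are hypothetical names for illustration; the real logic lives inside startup(). The [Startup] Aborted after <N>ms line is omitted for brevity.

```typescript
// Hypothetical reduced dependency bag for the Phase-2 failure flow.
interface FailureDeps {
  readonly initDbFn: (path: string) => unknown;
  readonly shutdown: (reason: string) => Promise<void>;
  readonly exit: (code: number) => void;
  readonly logger: (...args: unknown[]) => void;
}

// Shows only the failure flow: log, shut down, exit(1), then rethrow.
async function runPhase2(dbPath: string, deps: FailureDeps): Promise<unknown> {
  try {
    return deps.initDbFn(dbPath);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    deps.logger(`[Startup] Phase 2 failed: ${message}`);
    await deps.shutdown('phase-2-failed'); // closes transport, then DB
    deps.exit(1); // process.exit in production never returns
    throw err; // reached only when exit is a test fake
  }
}
```

With a recording fake for exit, a test can observe both effects the table promises: exit was called with code 1, and the returned promise rejects with the original error object.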
6. Log format
All via logger (default console.error):
| Message | When |
|---|---|
| [Startup] Phase 1: transport... | Before bootstrapFn() |
| [Startup] Phase 1 ready | After bootstrapFn() resolves |
| [Startup] Phase 2: heavy-init... | Before initDbFn(dbPath) |
| [Startup] Complete in <N>ms | After Phase 2 success |
| [Startup] Phase 2 failed: <message> | In the catch block |
| [Startup] Aborted after <N>ms | Appended after the "failed" line |
| [Shutdown] <reason> | Entry to shutdown() |
| [Shutdown] stop failed: <err> | When stopFn throws |
| [Shutdown] close failed: <err> | When closeDbFn throws |
| [Shutdown] signal handler failed: <err> | When a signal handler's path throws |
| [Shutdown] Forced after 5000ms timeout | On the cleanup race timeout |
| [Shutdown] Clean | After clean cleanup |
Stderr only. Stdout is owned by the MCP stdio transport (donor bug #3 + S17).
7. Test seams
The StartupOptions interface is the sole dependency-injection seam.
Every default reads a binding exported by another module; every default
is overridable via the options bag. Test patterns required:
| Test path | Technique |
|---|---|
| Happy path | Inject fake bootstrapFn + initDbFn that resolve; assert StartupResult fields. |
| Phase 2 fails | Inject initDbFn that throws; assert shutdown ran, exit(1) was called, promise rejects. |
| Graceful SIGINT | Register signals, fire process.emit('SIGINT', 'SIGINT'), assert shutdown ran + exit(0). |
| Graceful SIGTERM | Same, with SIGTERM. |
| Re-entry guard | Call startup() twice, expect throw. Reset via __resetForTests. |
| Signal-handler install gated by option | Pass registerSignalHandlers: false, assert process.listenerCount unchanged. |
| Cleanup timeout | Inject a stopFn that never resolves with cleanupTimeoutMs: 50, assert [Shutdown] Forced log. |
| stopFn throws | Inject a throwing stopFn, assert [Shutdown] stop failed: log and DB still closed. |
| closeDbFn throws | Inject a throwing closeDbFn, assert log and shutdown completes. |
| idempotent shutdown | Call shutdown() twice in flight, assert stopFn called exactly once. |
__resetForTests() clears module state between tests:
export function __resetForTests(): void {
  startupInvoked = false;
  shutdownPromise = null;
  activeCtx = null;
  // Remove any still-installed signal listeners (defensive).
  if (sigintHandler !== null) {
    process.off('SIGINT', sigintHandler);
    sigintHandler = null;
  }
  if (sigtermHandler !== null) {
    process.off('SIGTERM', sigtermHandler);
    sigtermHandler = null;
  }
}
afterEach in the test file MUST call __resetForTests() so individual
tests do not leak state.
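Several of the test paths above need a logger and an exit that record instead of acting. A framework-agnostic sketch of such fakes follows; makeRecorder and appearsInOrder are hypothetical helpers, not part of this contract, and a real test file would wrap them in its own Jest assertions.

```typescript
interface Recorder {
  readonly lines: string[];
  readonly exitCodes: number[];
  readonly logger: (...args: unknown[]) => void;
  readonly exit: (code: number) => void;
}

// Builds fakes suitable for the logger and exit fields of the options bag.
function makeRecorder(): Recorder {
  const lines: string[] = [];
  const exitCodes: number[] = [];
  return {
    lines,
    exitCodes,
    logger: (...args) => { lines.push(args.map(String).join(' ')); },
    exit: (code) => { exitCodes.push(code); }, // records instead of terminating
  };
}

// True when every expected message appears in `lines` in the given order;
// other lines may be interleaved between them.
function appearsInOrder(
  lines: readonly string[],
  expected: readonly string[],
): boolean {
  let next = 0;
  for (const line of lines) {
    if (next < expected.length && line === expected[next]) next += 1;
  }
  return next === expected.length;
}
```

A log-order check then reduces to a single call such as appearsInOrder(rec.lines, ['[Startup] Phase 1 ready', '[Startup] Phase 2: heavy-init...']).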
8. Invariants — enumerated
- I-1. startup() MUST NOT call initDbFn before bootstrapFn resolves. Verified by order-of-calls test.
- I-2. startup() MUST emit Phase 1 logs BEFORE any Phase 2 log. Verified by log-order assertion in happy-path test.
- I-3. On Phase 2 failure, closeDbFn MUST be called EVEN IF the DB was never successfully opened. This is safe because closeDb is a no-op when initDb cleans up its own handle on throw.
- I-4. shutdown() MUST close the transport before the DB.
- I-5. shutdown() MUST NOT throw. Internal errors are logged and swallowed.
- I-6. startup() MUST reject with the original Phase 2 error when the injected exit does not terminate (tests). In production with process.exit, the reject is unreachable.
- I-7. startup() MUST register AT MOST ONE SIGINT listener and AT MOST ONE SIGTERM listener, regardless of how many times signals fire. (Node dispatches the listener once per signal; the listener itself guards against reentrancy via the shutdown() idempotency.)
- I-8. registerSignalHandlers: false MUST result in zero signal listeners added.
- I-9. bootstrap() and start() from src/server.ts remain exported unchanged. No test for this task modifies server.test.ts.
- I-10. Coverage on src/startup.ts ≥ 90% branch. All exported functions exercised. __resetForTests exercised implicitly via afterEach.
9. Backward compatibility
- bootstrap() as called by src/__tests__/server.test.ts is unchanged.
- The main() IIFE smoketest (server.test.ts describe 5) spawns tsx src/server.ts with NODE_ENV=test and asserts [colibri] starting on stderr. With the new tail calling startup(), that log line is still emitted by start() via bootstrap() via startup(). The assertion passes without modification.
- The isInvokedAsScript() returns false test in server.test.ts describe 8 asserts the import-only path is silent. Under the new tail, the if (isInvokedAsScript()) guard is preserved byte-for-byte; the only change inside the guard is await bootstrap() → await (await import('./startup.js')).startup(). Silence is preserved.
- package.json "main" remains src/server.ts. bin.colibri remains dist/server.js. No packaging changes.
10. What this task explicitly does NOT do
- Does not register any new MCP tools.
- Does not modify the 5-stage middleware chain (P0.2.4 owns the 11-stage split).
- Does not add a database-health MCP tool (that is the P0.2.5 follow-up, already flagged in the Phase 0 task list).
- Does not wire domain registration. The StartupResult exposes ctx and db; later P0.3+ tasks register their tools against the returned ctx.
- Does not change data/colibri.db or any DB schema.
- Does not modify src/config.ts: the DATABASE_PATH reference in the spec file is stale (the canonical key is COLIBRI_DB_PATH).
11. Exit criteria for step 2
- StartupOptions, StartupResult, startup(), shutdown(), __resetForTests() defined (§2.1).
- Exact edit to src/server.ts pinned (§2.2).
- Phase 1 + Phase 2 invariants (§3).
- Shutdown contract with ordering + signal handling (§4).
- All error modes tabulated (§5).
- Log format enumerated (§6).
- Test seams enumerated (§7).
- Invariants I-1 through I-10 (§8).
- Backward compatibility confirmed against existing server tests (§9).
- Non-goals listed (§10).
Ready for step 3 (packet).