P0.4.2 Graceful Shutdown — Audit

1. Objective

Inventory the existing shutdown-related surface in src/startup.ts and decide which implementation path to follow for the registerShutdownHandler API + src/shutdown.ts public module required by the task acceptance criteria.


2. Existing shutdown surface in src/startup.ts

2.1. Module-scope state (lines 130–160)

Symbol Type Purpose
shutdownPromise Promise<void> \| null Idempotency guard — second shutdown() call returns the same promise
activeCtx ColibriServerContext \| null Published by startup() after Phase 1; read by shutdown()
activeOptions ActiveOptions \| null Carries stopFn, closeDbFn, logger, exit, cleanupTimeoutMs for signal handlers
sigintHandler NodeJS.SignalsListener \| null Owned listener so shutdown() can remove it
sigtermHandler NodeJS.SignalsListener \| null Owned listener so shutdown() can remove it

2.2. shutdown(reason: string): Promise<void> (lines 322–396)

Exported public function. Fully implemented:

  • Idempotent: returns shutdownPromise if already in flight (line 323-325)
  • Step 1 — transport (lines 345-369): calls stopFn(ctx) via Promise.race with a cleanupTimeoutMs (default 5000 ms) timeout. Sets forced = true on timeout. Swallows stopFn errors (logged as [Shutdown] stop failed:).
  • Step 2 — DB (lines 371-377): calls closeDbFn(). Swallows errors (logged as [Shutdown] close failed:).
  • Step 3 — signals (lines 379-387): removes sigintHandler / sigtermHandler.
  • Emits [Shutdown] Forced after 5000ms timeout if forced (line 390).
  • Emits [Shutdown] Clean on completion (line 392).

2.3. gracefulSignalExit(signalName: string): Promise<void> (lines 412–423)

Private async function invoked by the signal handlers registered during Phase 1:

  • Calls shutdown(signal-${signalName}), then exit(0).
  • On unexpected throw: logs [Shutdown] signal handler failed:, calls exit(1).

2.4. Signal handler registration (lines 242–251)

Installed inside startup() after Phase 1 succeeds, conditional on registerSignalHandlers !== false. Uses process.on('SIGINT', ...) / process.on('SIGTERM', ...). Both invoke gracefulSignalExit.

2.5. StartupOptions DI seams (lines 84–117)

All shutdown-related dependencies are injectable for testing:

  • stopFn, closeDbFn, exit, cleanupTimeoutMs, logger.
  • Tests use registerSignalHandlers: false to prevent listener leakage into Jest.

2.6. Test surface (src/__tests__/startup.test.ts — 950 lines)

  • describe 3 (“shutdown — contract”): 9 tests covering idempotency, error swallowing, ordering, timeout, [Shutdown] Clean, pre-startup call.
  • describe 4 (“signal handlers”): 8 tests covering registration, deregistration, SIGINT/SIGTERM → exit(0), reset teardown, defense-in-depth exit(1).

3. Gap analysis against P0.4.2 acceptance criteria

AC Status Gap
registerShutdownHandler(fn): registers a cleanup function Missing Not implemented anywhere
SIGINT/SIGTERM: handlers called in reverse registration order Missing No handler list; existing shutdown calls stopFn/closeDbFn directly
DB connection closed before process exit Done closeDbFn() called in step 2 of shutdown()
In-flight MCP requests complete (max 5s timeout) Done Promise.race in shutdown() step 1
Exit code 0 clean, 1 error Done gracefulSignalExit calls exit(0) / exit(1)
src/shutdown.ts module Missing No such file
src/__tests__/shutdown.test.ts Missing No such file

Net: 2 of 7 criteria are fully met; 2 partially met; 3 missing.


4. Inline-vs-extract decision

Candidate paths

Path (a) — Full refactor: extract core shutdown into src/shutdown.ts, leaving startup.ts to import and delegate. Cleanest separation but requires touching every export/import reference and risks regressions in 950 lines of startup tests.

Path (b) — Inline + thin wrapper: keep core logic in startup.ts, create src/shutdown.ts as a small public surface re-exporting registerShutdownHandler and gracefulShutdown and delegating to internals.

Path (c) — Augment inline + barrel: enhance the existing inline shutdown to meet all AC (add handler list + LIFO invocation), expose registerShutdownHandler from startup.ts, create src/shutdown.ts as a re-export-only barrel module.

Decision: Path (c)

Rationale:

  1. Wave A lock says “prefer 5-stage middleware INLINE pattern” — the shutdown machinery is integral to the startup orchestrator, not a standalone service.

  2. Risk surface: startup.ts already has 950 lines of tests covering the shutdown path. A full refactor risks subtle regressions. Path (c) adds minimal new coupling.

  3. Minimal scope: the task requires adding one new exported function (registerShutdownHandler) and calling those handlers in LIFO order inside the existing shutdown(). This is a ~20-line delta to startup.ts.

  4. src/shutdown.ts spec compliance: a barrel module satisfies the letter of the task spec (the file exists, the public API lives there) while keeping the real implementation co-located with the transport/DB lifecycle machinery that owns it.

  5. Test isolation: src/__tests__/shutdown.test.ts can test registerShutdownHandler through the barrel’s re-exports without depending on startup.ts internals directly.


5. Planned changes summary

src/startup.ts additions

  • Module-scope handlers: Array<() => Promise<void> | void> = [] — LIFO handler list.
  • Export registerShutdownHandler(fn) — appends to handlers; returns deregister fn.
  • Export clearShutdownHandlers() — test-only reset (called from __resetForTests).
  • Inside shutdown() async body: after closeDbFn() step, iterate handlers in reverse and call each; swallow per-handler errors (same pattern as stopFn/closeDbFn).

src/shutdown.ts (new)

Barrel module re-exporting registerShutdownHandler and shutdown from ./startup.js. No implementation logic. Documents its status as a public surface.

src/__tests__/shutdown.test.ts (new)

Tests covering:

  1. registerShutdownHandler registers and calls handler on shutdown
  2. LIFO order: last-registered runs first
  3. Idempotency: second shutdown() call does not re-run handlers
  4. Handler throwing does not abort remaining handlers
  5. Handler throwing causes exit(1) via signal path? (no — swallowed; exit(1) only on gracefulSignalExit catch, which requires shutdown() itself to throw)
  6. SIGTERM → registerShutdownHandler callback called
  7. DB close happens after handlers (handlers before closeDbFn)
  8. clearShutdownHandlers resets state (test isolation)
  9. Multiple handlers: all called
  10. Deregister function removes a specific handler

6. Files touched

File Action
src/startup.ts Modify — add handler list + registerShutdownHandler + clearShutdownHandlers; extend shutdown() to call handlers
src/shutdown.ts Create — barrel re-export
src/__tests__/shutdown.test.ts Create — new test suite (≥10 tests)

No changes to: src/server.ts, src/db/, src/modes.ts, src/domains/, test files other than the new one.


7. Risk register

Risk Mitigation
Regression in existing startup.test.ts Handler list starts empty; existing tests never call registerShutdownHandler, so behavior is unchanged
LIFO order semantics confusion Test confirms reverse iteration with explicit index ordering assertion
Per-handler errors swallowed silently Logged with [Shutdown] handler failed: prefix; closeDbFn still runs
Handler list not reset between tests __resetForTests() extended to call clearShutdownHandlers()
Deregister function after shutdown Documented as no-op — once shutdown fires, deregister is meaningless

Back to top

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.

This site uses Just the Docs, a documentation theme for Jekyll.