Audit — R82.H docs/2-plugin/health.md vs live server_health payload
Scope. Non-mutating inventory of the drift between docs/2-plugin/health.md and the shipped server_health handler. Lists every field, parameter, and failure-mode claim that is documented but not actually produced by the Phase 0 code. Commit this file before the contract.
Target file: docs/2-plugin/health.md (121 lines, 7 kB).
Live implementation: src/tools/health.ts (175 lines).
Tool registration: src/server.ts:555 (registerHealthTool(ctx) after server_ping).
Task spec of record: docs/guides/implementation/task-breakdown.md §P0.2.4.
Manifest slice: R82 r82-phase-0-1-stabilization/manifest.md row H.
1. Live payload — exact source
The handler is buildHealthPayload(ctx: ColibriServerContext): HealthPayload at
src/tools/health.ts:130-139:
```ts
export function buildHealthPayload(ctx: ColibriServerContext): HealthPayload {
  return {
    status: 'ok',
    version: ctx.version,
    uptime_ms: Math.floor(ctx.nowMs() - ctx.bootStartMs),
    db_tables: countTables(ctx.db),
    phase: ctx.phase ?? 'phase1',
    mode: ctx.mode,
  };
}
```
The payload Zod schema is pinned at src/tools/health.ts:72-79:
```ts
export const healthPayloadSchema = z.object({
  status: z.literal('ok'),
  version: z.string().min(1),
  uptime_ms: z.number().int().nonnegative(),
  db_tables: z.number().int().nonnegative(),
  phase: z.enum(['phase1', 'phase2']),
  mode: z.enum(RUNTIME_MODES),
});
```
RUNTIME_MODES (src/modes.ts:31) = ['FULL', 'READONLY', 'TEST', 'MINIMAL'].
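The field rules pinned by the schema can be restated without Zod, which is handy when checking a captured payload in a script that has no access to the project's dependencies. The sketch below is a hand-written mirror of the six rules, not the project's schema; the names `isHealthPayload` and `isNonNegativeInt` are hypothetical.

```typescript
// Hand-written mirror of the six-field schema; NOT the project's Zod schema.
const RUNTIME_MODES = ["FULL", "READONLY", "TEST", "MINIMAL"] as const;
const PHASES = ["phase1", "phase2"] as const;

type HealthPayload = {
  status: "ok";
  version: string;
  uptime_ms: number;
  db_tables: number;
  phase: (typeof PHASES)[number];
  mode: (typeof RUNTIME_MODES)[number];
};

// Same intent as z.number().int().nonnegative().
function isNonNegativeInt(value: unknown): value is number {
  return (
    typeof value === "number" &&
    isFinite(value) &&
    Math.floor(value) === value &&
    value >= 0
  );
}

// Type guard: true only when every one of the six field rules holds.
function isHealthPayload(value: unknown): value is HealthPayload {
  if (typeof value !== "object" || value === null) return false;
  const p = value as Record<string, unknown>;
  return (
    p.status === "ok" &&
    typeof p.version === "string" &&
    p.version.length >= 1 &&
    isNonNegativeInt(p.uptime_ms) &&
    isNonNegativeInt(p.db_tables) &&
    (PHASES as readonly unknown[]).indexOf(p.phase) !== -1 &&
    (RUNTIME_MODES as readonly unknown[]).indexOf(p.mode) !== -1
  );
}
```

A payload carrying status "degraded" or mode "SAFE_MODE" fails this guard, which is the behaviour the literal and the enum in the schema imply.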
1.1 Live payload shape (ground truth for the contract)
| Field | Type | Source | Notes |
|---|---|---|---|
| status | literal "ok" | src/tools/health.ts:132 + schema L73 | Pinned as z.literal('ok'): the handler never throws, never returns “degraded” / “error”. The manifest’s T0 drift-inventory line lists the six field names only; it does not assert a multi-value status enum. |
| version | non-empty string | ctx.version at L133 | Populated from package.json.version at createServer time (server.ts:201-215). |
| uptime_ms | non-negative integer | Math.floor(ctx.nowMs() - ctx.bootStartMs) at L134 | performance.now() delta, floored to integer ms. |
| db_tables | non-negative integer | countTables(ctx.db) at L135 | Counts user tables via SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'. Returns 0 when ctx.db is undefined (Phase 1) or the query throws. |
| phase | "phase1" \| "phase2" | ctx.phase ?? 'phase1' at L136 | phase1 during transport-only boot; phase2 after startup.ts opens the DB. |
| mode | "FULL" \| "READONLY" \| "TEST" \| "MINIMAL" | ctx.mode at L137 | From detectMode(process.env) or CreateServerOptions.mode. |
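To see how the six rows combine, the handler body can be replayed against a stubbed context with a fixed clock. This is only a sketch: `StubContext`, `buildStubPayload`, and the `tableCount` field are hypothetical stand-ins (the real ColibriServerContext is not reproduced here, and `tableCount` replaces the countTables(ctx.db) call), and the version string is an illustrative value.

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

// Hypothetical stand-in for ColibriServerContext: only what the handler reads.
interface StubContext {
  version: string;
  mode: RuntimeMode;
  phase?: "phase1" | "phase2";
  bootStartMs: number;
  nowMs: () => number;
  tableCount: number; // assumption: replaces countTables(ctx.db)
}

// Same field-by-field construction as the quoted handler body.
function buildStubPayload(ctx: StubContext) {
  return {
    status: "ok" as const,
    version: ctx.version,
    uptime_ms: Math.floor(ctx.nowMs() - ctx.bootStartMs),
    db_tables: ctx.tableCount,
    phase: ctx.phase ?? "phase1",
    mode: ctx.mode,
  };
}

// Fixed clock: booted at 1000.25 ms, "now" is 1234.9 ms; phase left unset.
const payload = buildStubPayload({
  version: "0.0.0-example",
  mode: "TEST",
  bootStartMs: 1000.25,
  nowMs: () => 1234.9,
  tableCount: 0,
});

console.log(JSON.stringify(payload));
// {"status":"ok","version":"0.0.0-example","uptime_ms":234,"db_tables":0,"phase":"phase1","mode":"TEST"}
```

The fractional clock delta (234.65 ms) is floored to 234, and the unset phase falls back to "phase1", matching the Notes column.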
1.2 Live input schema
src/tools/health.ts:57:
```ts
const inputSchema = z.object({});
```
The tool takes zero arguments. No detail parameter, no enum. Non-object
args are rejected by stage-2 schema-validate middleware with the standard
INVALID_PARAMS envelope (covered by the describe 3 → "rejects non-object
args at stage 2" test at src/__tests__/tools/health.test.ts:388-411).
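The accept/reject behaviour of an empty object schema can be sketched without Zod. The function below imitates what a default (non-strict) z.object({}) does: non-object input is rejected, and unknown keys on an object are dropped rather than rejected. The result type is a hypothetical envelope; only the INVALID_PARAMS code is taken from this audit, and whether the middleware applies a stricter mode is not confirmed here.

```typescript
// Hypothetical envelope shape; only the INVALID_PARAMS code comes from the audit.
type ValidationResult =
  | { ok: true; args: Record<string, never> }
  | { ok: false; code: "INVALID_PARAMS"; message: string };

// Imitates a default z.object({}) parse: plain objects pass, unknown keys are dropped.
function validateHealthArgs(args: unknown): ValidationResult {
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return { ok: false, code: "INVALID_PARAMS", message: "expected an object" };
  }
  // Every key is unknown to an empty schema, so the parsed value is always {}.
  return { ok: true, args: {} };
}
```

Under these assumptions a call such as `validateHealthArgs({ detail: "full" })` succeeds with empty args, so a client still sending the documented detail parameter would be silently ignored rather than told the parameter does not exist.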
1.3 Live failure posture
src/tools/health.ts:20-27 is explicit: the handler never throws.
- ctx.db undefined / closed / locked → db_tables: 0 (via countTables try/catch at L104-111).
- No SAFE_MODE / DIAGNOSE mode flip inside the health handler; mode is a pure read-through of ctx.mode, which was fixed at createServer time.
- No 30-second setInterval check loop exists in the Phase 0 codebase (see grep -rn "setInterval" src/tools/ src/server.ts src/startup.ts → zero hits in health-related code).
- No event-loop lag probe, watcher-stall probe, or queue-depth probe.
- The status field is the literal "ok". There is no degraded-state reporting pathway in the handler.
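The first bullet, a table count that degrades to 0 instead of throwing, follows a common guard pattern. The sketch below shows that pattern under stated assumptions: `DbLike` is a hypothetical minimal interface shaped like a synchronous SQLite driver, `countTablesSafe` is an illustrative name, and the `AS n` alias is added here for readability. It is not the code at L104-111.

```typescript
// Hypothetical minimal DB surface, shaped like a synchronous SQLite driver.
interface DbLike {
  prepare(sql: string): { get(): unknown };
}

const COUNT_USER_TABLES =
  "SELECT COUNT(*) AS n FROM sqlite_master " +
  "WHERE type='table' AND name NOT LIKE 'sqlite_%'";

// Returns the user-table count, or 0 when the DB is absent or the query fails.
function countTablesSafe(db: DbLike | undefined): number {
  if (db === undefined) return 0; // Phase 1: no database opened yet
  try {
    const row = db.prepare(COUNT_USER_TABLES).get() as { n?: unknown } | undefined;
    return typeof row?.n === "number" ? row.n : 0;
  } catch {
    return 0; // closed or locked handle: report zero rather than throw
  }
}
```

One consequence worth keeping in mind for the contract: with this pattern, a caller cannot tell an empty database from an unreachable one by looking at db_tables alone.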
2. Documented surface — inventory of docs/2-plugin/health.md
File is 121 lines. Sections and their drift status:
| Line range | Section | Drift |
|---|---|---|
| L1-7 | Jekyll frontmatter | OK. Preserve. |
| L8-13 | Header + ADR-004 + src/domains/system/tools.ts path | Drift. Live path is src/tools/health.ts, not src/domains/system/tools.ts. The 19-tool phantom surface is also stale (Phase 0 ships 14 tools per R82 manifest + CLAUDE.md §10). |
| L14 | “Why MCP, not HTTP” | OK. True statement, no drift. |
| L16-27 | Tool contract: inputs + detail=summary\|full schema example | Drift. Live tool takes no arguments. detail does not exist in the input schema. The schema example cites “Zod 4”; actual is Zod v3.23 per CLAUDE.md §1. |
| L29-52 | Output shape (summary) JSON | Drift. Entire block is fictional. Uses server_version (not version), uptime_seconds (not uptime_ms), database.{path,reachable,integrity} object, memory.{heap_used_mb,heap_total_mb,rss_mb}, checks.{sqlite_integrity,memory_threshold,event_loop_lag_ms}. Zero of these field names appear in the live payload. |
| L54 | Output shape (full): “summary fields plus 20 entries of each check’s history” | Drift. No detail param; no history ring buffer. |
| L56-65 | Failure modes table (SAFE_MODE, DIAGNOSE, etc.) | Drift. None of these modes are written by the handler. SAFE_MODE / DIAGNOSE are donor heritage. |
| L65 | “never throws” | OK. Matches reality. |
| L69-82 | “Built-in periodic checks”: 30s setInterval, ring buffer, 5 checks | Drift. No setInterval, no ring buffer, no periodic checks in Phase 0. |
| L84-94 | “Other system-level tools”: unified_init, unified_vitals | Drift. These are donor tool names, not Phase 0 tools. Phase 0 system surface is server_ping + server_health only (2 tools). unified_* are heritage per R75 Wave H.2 purge + docs/reference/mcp-tools-phase-0.md. |
| L98-111 | Logging + COLIBRI_LOG_LEVEL | OK. Matches src/config.ts. |
| L115-121 | Cross-references | Mixed. ADR-004 link still true (but ADR-004 was reconciled Wave H to match 14 tools); boot.md, database.md, middleware.md, s18, s17 all still exist. |
3. Documented-but-not-real — strike list
The following fields and parameters appear in health.md but do not exist
in the live handler. Every one must be removed or moved to an explicit
“Phase 1+ future” callout per the R82.H acceptance criterion:
3.1 Input parameters
- detail=summary\|full request parameter (L20 prose + L22-27 schema block)
3.2 Payload fields (all fabricated)
- server_version → live name is version
- uptime_seconds → live name is uptime_ms
- mode: "phase-0-bootstrap" → live mode enum is FULL | READONLY | TEST | MINIMAL
- database.path object field
- database.reachable object field
- database.integrity object field
- memory.heap_used_mb
- memory.heap_total_mb
- memory.rss_mb
- checks.sqlite_integrity
- checks.memory_threshold
- checks.event_loop_lag_ms
- tool_count, tools, capability_set, middleware_registry, process_metrics, dependencies (called out in the manifest drift inventory; none are payload fields in the live handler and none appear in the current health.md prose, but a sweep still has to return zero for each keyword so drift cannot sneak in)
3.3 Failure-mode / operational claims (fabricated)
- SAFE_MODE payload transitions
- DIAGNOSE payload transitions
- Ring buffer of “most recent 20 entries of each check”
- 30-second setInterval periodic check loop
- Five built-in checks (SQLite integrity, memory threshold, event-loop lag, watcher stall, task queue depth)
- “Active ζ audit session ID if one is open” in full output
3.4 Peer-tool claims (fabricated heritage)
- unified_init as a Phase 0 peer tool
- unified_vitals as a Phase 0 peer tool

(Per R75 Wave H reconcile, unified_* was deleted from
docs/reference/mcp-tools-phase-0.md. health.md is the last surviving
citation site.)
3.5 Target-path drift
- src/domains/system/tools.ts (L10): live path is src/tools/health.ts
3.6 Tool-surface-count drift
- “19-tool surface” (L10) — Phase 0 ships 14 tools.
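
The strike list in section 3 lends itself to a mechanical sweep over the rewritten health.md. The following is a minimal sketch of such a check: `sweep`, `escapeRegExp`, and the `STRIKE_LIST` constant are hypothetical names, and the keyword array is only a subset of the list above. It reports how often each struck identifier still appears as a whole token.

```typescript
// Subset of the section 3 strike list; extend with the remaining entries as needed.
const STRIKE_LIST = [
  "server_version",
  "uptime_seconds",
  "database.path",
  "memory.rss_mb",
  "checks.event_loop_lag_ms",
  "SAFE_MODE",
  "DIAGNOSE",
  "unified_init",
  "unified_vitals",
];

// Escape regex metacharacters so "database.path" matches a literal dot.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Count whole-token occurrences of each keyword; only keywords with hits are returned.
function sweep(docText: string, keywords: string[]): Record<string, number> {
  const hits: Record<string, number> = {};
  for (const keyword of keywords) {
    const pattern = new RegExp("\\b" + escapeRegExp(keyword) + "\\b", "g");
    const count = (docText.match(pattern) || []).length;
    if (count > 0) hits[keyword] = count;
  }
  return hits;
}
```

An empty result means none of the listed identifiers survive. Any hit that sits inside an explicit “Phase 1+ future” callout would still need a manual look, since the acceptance criterion allows struck items to be moved there.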
4. What must survive the rewrite (keep list)
- Jekyll frontmatter block (L1-7) — nav_order, parent, tags, title.
- The “Why an MCP tool, not an HTTP endpoint” rationale paragraph (L14) — it states a design fact that holds: Phase 0 transport is stdio-only and there is no HTTP health endpoint.
- The “never throws” invariant — this is true and load-bearing for callers.
- The Logging + COLIBRI_LOG_LEVEL section (L98-111): matches src/config.ts verbatim per Wave H reconcile.
- The cross-references to ADR-004, boot.md, database.md, middleware.md, s18-stdio.md, s17-environment.md. The extractions/ links are heritage pointers and stay per CLAUDE.md zoning rules.
5. Dependents — who reads this file
Downstream readers identified via grep -rn "2-plugin/health\|server_health"
docs/ that may need to align after this slice ships:
- docs/index.md: no change expected (section index only).
- docs/2-plugin/index.md: links to health.md; no payload claims quoted.
- docs/2-plugin/boot.md §“Health probe”: describes when server_health is registered; payload shape not quoted (R82.F’s territory, not R82.H).
- docs/reference/mcp-tools-phase-0.md: lists server_health in the 14-tool table (R82.E’s territory, not R82.H).
- docs/architecture/decisions/ADR-004-tool-surface.md: already reconciled Wave H to match 14 tools.
- docs/spec/s17-environment.md, s18-stdio.md: linked but not payload dependent.

None of these downstream readers quote the payload JSON. R82.H is
self-contained at docs/2-plugin/health.md.
6. Size estimate for the rewrite
Current file: 121 lines, 7 kB. Post-rewrite target: roughly 80-100 lines, ~5 kB. Removes the fabricated payload + periodic-checks sections entirely; replaces them with a true 6-field payload JSON block and a short “Phase 1+ future” callout for the monitoring features that are aspirational.
7. Commit
audit(r82-h-health): inventory health.md vs live payload