Audit — R82.H docs/2-plugin/health.md vs live server_health payload

Scope. Non-mutating inventory of the drift between docs/2-plugin/health.md and the shipped server_health handler. Lists every field, parameter, and failure-mode claim documented but not actually produced by the Phase 0 code. Commit this file before the contract.

Target file: docs/2-plugin/health.md (121 lines, 7 kB).

  • Live implementation: src/tools/health.ts (175 lines).
  • Tool registration: src/server.ts:555 (registerHealthTool(ctx) after server_ping).
  • Task spec of record: docs/guides/implementation/task-breakdown.md §P0.2.4.
  • Manifest slice: R82 r82-phase-0-1-stabilization/manifest.md row H.


1. Live payload — exact source

The handler is buildHealthPayload(ctx: ColibriServerContext): HealthPayload at src/tools/health.ts:130-139:

export function buildHealthPayload(ctx: ColibriServerContext): HealthPayload {
  return {
    status: 'ok',
    version: ctx.version,
    uptime_ms: Math.floor(ctx.nowMs() - ctx.bootStartMs),
    db_tables: countTables(ctx.db),
    phase: ctx.phase ?? 'phase1',
    mode: ctx.mode,
  };
}
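
The two derived fields in the handler can be pinned in isolation. The following is a minimal sketch, assuming a trimmed-down context type: MinimalHealthContext and derivedHealthFields are hypothetical names, and the real ColibriServerContext is not reproduced in this audit. Injecting nowMs makes the floor-to-integer behaviour and the phase fallback easy to check with a fake clock.

```typescript
// Hypothetical subset of the server context; only the members this sketch needs.
interface MinimalHealthContext {
  bootStartMs: number;
  nowMs: () => number;
  phase?: 'phase1' | 'phase2';
}

// Mirrors the two derived expressions quoted from buildHealthPayload.
function derivedHealthFields(ctx: MinimalHealthContext) {
  return {
    // Fractional clock deltas are floored to whole milliseconds.
    uptime_ms: Math.floor(ctx.nowMs() - ctx.bootStartMs),
    // A context with no phase set reports 'phase1'.
    phase: ctx.phase ?? 'phase1',
  };
}

// A fake clock stands in for the real time source.
const fields = derivedHealthFields({ bootStartMs: 100.25, nowMs: () => 1350.9 });
// fields.uptime_ms === 1250, fields.phase === 'phase1'
```

Because the clock is a function on the context rather than a global call, a test can assert an exact uptime_ms instead of a range.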

The payload Zod schema is pinned at src/tools/health.ts:72-79:

export const healthPayloadSchema = z.object({
  status: z.literal('ok'),
  version: z.string().min(1),
  uptime_ms: z.number().int().nonnegative(),
  db_tables: z.number().int().nonnegative(),
  phase: z.enum(['phase1', 'phase2']),
  mode: z.enum(RUNTIME_MODES),
});

RUNTIME_MODES (src/modes.ts:31) = ['FULL', 'READONLY', 'TEST', 'MINIMAL'].

1.1 Live payload shape (ground truth for the contract)

| Field | Type | Source | Notes |
| --- | --- | --- | --- |
| status | literal "ok" | src/tools/health.ts:132 + schema L73 | Pinned as z.literal('ok'). The handler never throws and never returns “degraded” / “error”. The manifest’s T0 drift-inventory line lists the six field names only; it does not assert a multi-value status enum. |
| version | non-empty string | ctx.version at L133 | Populated from package.json.version at createServer time (server.ts:201-215). |
| uptime_ms | non-negative integer | Math.floor(ctx.nowMs() - ctx.bootStartMs) at L134 | performance.now() delta, floored to integer ms. |
| db_tables | non-negative integer | countTables(ctx.db) at L135 | Counts user tables via SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'. Returns 0 when ctx.db is undefined (Phase 1) or the query throws. |
| phase | "phase1" \| "phase2" | ctx.phase ?? 'phase1' at L136 | phase1 during transport-only boot; phase2 after startup.ts opens the DB. |
| mode | "FULL" \| "READONLY" \| "TEST" \| "MINIMAL" | ctx.mode at L137 | From detectMode(process.env) or CreateServerOptions.mode. |
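
For readers who want to check a captured payload against the six-field shape without pulling in Zod, here is a rough, dependency-free sketch. The guard isHealthPayload and the sample values are illustrative assumptions; the shipped code validates with healthPayloadSchema, not with this function.

```typescript
// Dependency-free mirror of the six-field payload; illustrative only.
const RUNTIME_MODES = ['FULL', 'READONLY', 'TEST', 'MINIMAL'] as const;
type RuntimeMode = (typeof RUNTIME_MODES)[number];

interface HealthPayload {
  status: 'ok';
  version: string;
  uptime_ms: number;
  db_tables: number;
  phase: 'phase1' | 'phase2';
  mode: RuntimeMode;
}

// True for whole numbers >= 0; rejects NaN, Infinity, and fractions.
const isNonNegativeInt = (v: unknown): v is number =>
  typeof v === 'number' && Number.isInteger(v) && v >= 0;

function isHealthPayload(value: unknown): value is HealthPayload {
  if (typeof value !== 'object' || value === null) return false;
  const p = value as Record<string, unknown>;
  return (
    p.status === 'ok' &&
    typeof p.version === 'string' &&
    p.version.length > 0 &&
    isNonNegativeInt(p.uptime_ms) &&
    isNonNegativeInt(p.db_tables) &&
    (p.phase === 'phase1' || p.phase === 'phase2') &&
    RUNTIME_MODES.some((m) => m === p.mode)
  );
}

// Made-up values that satisfy the shape (not captured from a real server).
const samplePayload: unknown = {
  status: 'ok',
  version: '0.0.0',
  uptime_ms: 1234,
  db_tables: 0,
  phase: 'phase1',
  mode: 'FULL',
};
```

The guard tolerates extra keys and only checks the six documented ones, so it is looser than a strict schema would be.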

1.2 Live input schema

src/tools/health.ts:57:

const inputSchema = z.object({});

The tool takes zero arguments. No detail parameter, no enum. Non-object args are rejected by stage-2 schema-validate middleware with the standard INVALID_PARAMS envelope (covered by the describe 3 → "rejects non-object args at stage 2" test at src/__tests__/tools/health.test.ts:388-411).
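
The zero-argument contract can be approximated as follows. This is a sketch only: checkHealthArgs and the result shape are hypothetical, the real stage-2 middleware and its INVALID_PARAMS envelope are not shown in this audit, and the treatment of unknown keys assumes the empty object schema is left in its default non-strict mode.

```typescript
// Hypothetical result type; the real envelope shape is not reproduced here.
type ArgCheck =
  | { ok: true; args: Record<string, never> }
  | { ok: false; code: 'INVALID_PARAMS'; reason: string };

function checkHealthArgs(raw: unknown): ArgCheck {
  // Only a non-null, non-array object counts as an argument object.
  const isPlainObject =
    typeof raw === 'object' && raw !== null && !Array.isArray(raw);
  if (!isPlainObject) {
    return { ok: false, code: 'INVALID_PARAMS', reason: 'expected an object' };
  }
  // Assumption: unknown keys such as { detail: 'full' } are dropped, not rejected.
  return { ok: true, args: {} };
}
```

Under that assumption a stale caller still sending detail would get the same six-field payload back rather than an error, which is worth confirming against the real schema before the contract is written.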

1.3 Live failure posture

src/tools/health.ts:20-27 is explicit: the handler never throws.

  • ctx.db undefined / closed / locked → db_tables: 0 (via countTables try/catch at L104-111).
  • No SAFE_MODE / DIAGNOSE mode flip inside the health handler — mode is a pure read-through of ctx.mode, which was fixed at createServer time.
  • No 30-second setInterval check loop exists in the Phase 0 codebase (see grep -rn "setInterval" src/tools/ src/server.ts src/startup.ts → zero hits in health-related code).
  • No event-loop lag probe, watcher-stall probe, or queue-depth probe.
  • The status field is the literal "ok". There is no degraded-state reporting pathway in the handler.
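
The never-throws posture for db_tables can be sketched like this. MinimalDb, queryRow, and countTablesSafe are hypothetical stand-ins, since the real ctx.db type and the body of countTables are not reproduced in this audit; the SQL text is the query quoted in section 1.1 with an added column alias.

```typescript
// Hypothetical minimal database surface used only by this sketch.
interface MinimalDb {
  // Runs a single-row query; may throw if the database is closed or locked.
  queryRow(sql: string): { n: number } | undefined;
}

const COUNT_USER_TABLES_SQL =
  "SELECT COUNT(*) AS n FROM sqlite_master " +
  "WHERE type='table' AND name NOT LIKE 'sqlite_%'";

function countTablesSafe(db: MinimalDb | undefined): number {
  // No database yet (transport-only boot): report zero tables.
  if (db === undefined) return 0;
  try {
    const row = db.queryRow(COUNT_USER_TABLES_SQL);
    return row?.n ?? 0;
  } catch {
    // Any query failure degrades to zero instead of propagating.
    return 0;
  }
}
```

The trade-off is that a caller cannot distinguish an empty database from an unreachable one by db_tables alone; per the list above, the live payload has no other field that reports it.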

2. Documented surface — inventory of docs/2-plugin/health.md

File is 121 lines. Sections and their drift status:

| Line range | Section | Drift |
| --- | --- | --- |
| L1-7 | Jekyll frontmatter | OK. Preserve. |
| L8-13 | Header + ADR-004 + src/domains/system/tools.ts path | Drift. Live path is src/tools/health.ts, not src/domains/system/tools.ts. The 19-tool phantom surface is also stale (Phase 0 ships 14 tools per R82 manifest + CLAUDE.md §10). |
| L14 | “Why MCP, not HTTP” | OK. True statement, no drift. |
| L16-27 | Tool contract: inputs + detail=summary\|full schema example | Drift. Live tool takes no arguments. detail does not exist in the input schema. The schema example cites “Zod 4”; actual is Zod v3.23 per CLAUDE.md §1. |
| L29-52 | Output shape (summary) JSON | Drift. Entire block is fictional. Uses server_version (not version), uptime_seconds (not uptime_ms), database.{path,reachable,integrity} object, memory.{heap_used_mb,heap_total_mb,rss_mb}, checks.{sqlite_integrity,memory_threshold,event_loop_lag_ms}. Zero of these field names appear in the live payload. |
| L54 | Output shape (full): “summary fields plus 20 entries of each check’s history” | Drift. No detail param; no history ring buffer. |
| L56-65 | Failure modes table (SAFE_MODE, DIAGNOSE, etc.) | Drift. None of these modes are written by the handler. SAFE_MODE / DIAGNOSE are donor heritage. |
| L65 | “never throws” | OK. Matches reality. |
| L69-82 | “Built-in periodic checks”: 30s setInterval, ring buffer, 5 checks | Drift. No setInterval, no ring buffer, no periodic checks in Phase 0. |
| L84-94 | “Other system-level tools”: unified_init, unified_vitals | Drift. These are donor tool names, not Phase 0 tools. Phase 0 system surface is server_ping + server_health only (2 tools). unified_* are heritage per R75 Wave H.2 purge + docs/reference/mcp-tools-phase-0.md. |
| L98-111 | Logging + COLIBRI_LOG_LEVEL | OK. Matches src/config.ts. |
| L115-121 | Cross-references | Mixed. ADR-004 link still true (but ADR-004 was reconciled Wave H to match 14 tools); boot.md, database.md, middleware.md, s18, s17 all still exist. |

3. Documented-but-not-real — strike list

The following fields and parameters appear in health.md but do not exist in the live handler. Every one must be removed or moved to an explicit “Phase 1+ future” callout per the R82.H acceptance criterion:

3.1 Input parameters

  • detail=summary|full request parameter (L20 prose + L22-27 schema block)

3.2 Payload fields (all fabricated)

  • server_version → live name is version
  • uptime_seconds → live name is uptime_ms
  • mode: "phase-0-bootstrap" → live mode enum is FULL | READONLY | TEST | MINIMAL
  • database.path object field
  • database.reachable object field
  • database.integrity object field
  • memory.heap_used_mb
  • memory.heap_total_mb
  • memory.rss_mb
  • checks.sqlite_integrity
  • checks.memory_threshold
  • checks.event_loop_lag_ms
  • tool_count, tools, capability_set, middleware_registry, process_metrics, dependencies (called out in the manifest drift inventory; none is a payload field in the live handler and none appears in the current health.md prose, but a keyword sweep of the rewritten file must still return zero hits for each so drift cannot sneak in)

3.3 Failure-mode / operational claims (fabricated)

  • SAFE_MODE payload transitions
  • DIAGNOSE payload transitions
  • Ring buffer of “most recent 20 entries of each check”
  • 30-second setInterval periodic check loop
  • Five built-in checks (SQLite integrity, memory threshold, event-loop lag, watcher stall, task queue depth)
  • “Active ζ audit session ID if one is open” in full output

3.4 Peer-tool claims (fabricated heritage)

  • unified_init as a Phase 0 peer tool
  • unified_vitals as a Phase 0 peer tool

(Per R75 Wave H reconcile, unified_* was deleted from docs/reference/mcp-tools-phase-0.md. health.md is the last surviving citation site.)

3.5 Target-path drift

  • src/domains/system/tools.ts (L10) — live path is src/tools/health.ts

3.6 Tool-surface-count drift

  • “19-tool surface” (L10) — Phase 0 ships 14 tools.
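
The strike list above lends itself to a mechanical sweep of the rewritten file. The sketch below is one possible shape, not an existing project script: sweepForStruckTerms and STRUCK_TERMS are hypothetical, and the term list is a hand-picked subset. The generic words tools and dependencies are left out because a plain substring match would flag legitimate prose, and terms such as SAFE_MODE may need an allowance if they survive inside a “Phase 1+ future” callout.

```typescript
// Illustrative subset of the strike list; extend or trim as the contract requires.
const STRUCK_TERMS: readonly string[] = [
  'server_version', 'uptime_seconds', 'phase-0-bootstrap',
  'heap_used_mb', 'heap_total_mb', 'rss_mb',
  'sqlite_integrity', 'memory_threshold', 'event_loop_lag_ms',
  'tool_count', 'capability_set', 'middleware_registry', 'process_metrics',
  'SAFE_MODE', 'DIAGNOSE', 'setInterval',
  'unified_init', 'unified_vitals',
  'src/domains/system/tools.ts', '19-tool',
];

interface SweepHit {
  term: string;
  line: number; // 1-based line number within the scanned text
}

// Returns one hit per (line, term) pair found by case-sensitive substring match.
function sweepForStruckTerms(
  markdown: string,
  terms: readonly string[] = STRUCK_TERMS,
): SweepHit[] {
  const hits: SweepHit[] = [];
  markdown.split('\n').forEach((text, index) => {
    for (const term of terms) {
      if (text.includes(term)) hits.push({ term, line: index + 1 });
    }
  });
  return hits;
}
```

An empty result is the pass condition; any hit names the term and the line to revisit.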

4. What must survive the rewrite (keep list)

  • Jekyll frontmatter block (L1-7) — nav_order, parent, tags, title.
  • The “Why an MCP tool, not an HTTP endpoint” rationale paragraph (L14) — it states a design fact that holds: Phase 0 transport is stdio-only and there is no HTTP health endpoint.
  • The “never throws” invariant — this is true and load-bearing for callers.
  • The Logging + COLIBRI_LOG_LEVEL section (L98-111) — matches src/config.ts verbatim per Wave H reconcile.
  • The cross-references to ADR-004, boot.md, database.md, middleware.md, s18-stdio.md, s17-environment.md. The extractions/ links are heritage pointers and stay per CLAUDE.md zoning rules.

5. Dependents — who reads this file

Downstream readers identified via grep -rn "2-plugin/health\|server_health" docs/ that may need to align after this slice ships:

  • docs/index.md — no change expected (section index only).
  • docs/2-plugin/index.md — links to health.md; no payload claims quoted.
  • docs/2-plugin/boot.md §“Health probe” describes when server_health is registered; payload shape not quoted (R82.F’s territory, not R82.H).
  • docs/reference/mcp-tools-phase-0.md — lists server_health in the 14-tool table (R82.E’s territory, not R82.H).
  • docs/architecture/decisions/ADR-004-tool-surface.md — already reconciled Wave H to match 14 tools.
  • docs/spec/s17-environment.md, s18-stdio.md — linked but not payload dependent.

None of these downstream readers quote the payload JSON. R82.H is self-contained at docs/2-plugin/health.md.


6. Size estimate for the rewrite

Current file: 121 lines, 7 kB. Post-rewrite target: roughly 80-100 lines, ~5 kB. Removes the fabricated payload + periodic-checks sections entirely; replaces them with a true 6-field payload JSON block and a short “Phase 1+ future” callout for the monitoring features that are aspirational.


7. Commit

audit(r82-h-health): inventory health.md vs live payload


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
