P0.2.1 — Step 2 Contract
Behavioral contract for src/server.ts (α System Core — MCP server bootstrap). Every invariant is verifiable. Step 3 (packet) may not expand scope without amending this document first. Step 5 (verify) MUST cite each section as pass/fail.
Upstream authority:
- docs/guides/implementation/task-breakdown.md §P0.2.1 (acceptance checklist)
- docs/spec/s17-mcp-surface.md §2, §4, §6, §8 (stdio-only, 5-stage chain, response envelope)
- docs/2-plugin/boot.md §1-6 (boot sequence, Promise gate, exit codes)
- docs/2-plugin/middleware.md Stage 1-5 (per-stage semantics)
- docs/2-plugin/modes.md §“Tool surface per mode” (admitted tools per mode)
- docs/architecture/decisions/ADR-004-tool-surface.md (19-tool inventory, server_ping name)
- Sigma dispatch brief (uniform envelope, AuditSink seam, donor-bug mitigation)
- Audit a7305e43 (surface inventory + spec-contradiction catalogue)
Where the dispatch brief tightens the spec, it does so because the expanded detail (uptime_ms, correlationId, AuditSink) feeds P0.2.3 + P0.7 directly.
§1. Module surface (exports of src/server.ts)
src/server.ts MUST export exactly these symbols, no more, no fewer:
1a. Types
- AuditSink: interface. Pluggable seam for ζ Decision Trail (P0.7). See §5 for the exact shape.
- ToolEnterEvent: the shape AuditSink.enter() receives. Frozen at runtime via Object.freeze at construction time.
- ToolExitEvent: the shape AuditSink.exit() receives. Frozen.
- ColibriToolConfig: the config object passed to registerColibriTool(): { title?: string; description?: string; inputSchema: ZodObject<any>; outputSchema?: ZodObject<any> }. inputSchema is required so every tool has a Zod schema per s17 §5 (“A tool without a Zod schema cannot be registered”); outputSchema is optional, matching the §4a signature.
- ColibriServerContext: the value returned by createServer(...). Opaque to callers except for registerColibriTool, start, stop, getVersion, getMode, isConnected.
1b. Functions
- createServer(options?: CreateServerOptions): ColibriServerContext. Factory; see §3 for inputs.
- registerColibriTool<I extends ZodRawShape, O extends ZodRawShape>(ctx: ColibriServerContext, name: string, config: ColibriToolConfig, handler: (args: z.infer<ZodObject<I>>) => Promise<unknown> | unknown): void. Registers a tool with the 5-stage chain; see §4.
- start(ctx: ColibriServerContext): Promise<void>. Runs boot steps 1-4 (server creation happens in createServer, but transport connect + handshake wait happen in start). See §6.
- stop(ctx: ColibriServerContext): Promise<void>. Graceful shutdown. Closes transport; no DB to flush in P0.2.1.
1c. Defaults (named exports)
- createNoOpAuditSink(): AuditSink. Production default for P0.2.1 (before ζ lands).
1d. Entry-point behavior
- main(): IIFE-style default entry that reads config / detectMode and calls createServer + start. Guarded by import.meta.url === pathToFileURL(process.argv[1]).href so importing src/server.ts in tests does NOT run main(). When the file is executed directly as the entry script (npm start → node dist/server.js), main() runs.
No default export. Every symbol is a named export (matches src/config.ts + src/modes.ts convention).
No additional symbols. The 5 middleware stages are INTERNAL to src/server.ts and not re-exported; P0.2.4 re-homes them into src/middleware/*.ts and the public API of src/server.ts does not change.
§2. OPEN QUESTION for T0 — tool-lock stage semantics
The audit (§7a) surfaced a spec contradiction:
- s17 §4 + middleware.md Stage 1: pure per-tool mutex. Concurrency control only. Never rejects a call.
- modes.md §“What ‘admitted’ means”: consults the active mode and rejects with ToolNotAdmittedError at the lock stage, before validation.
Contract’s proposed reading (pending T0 sign-off):
Stage 1 tool-lock in P0.2.1 is pure per-tool mutex, matching s17 + middleware.md. Capability-gating (which tools are admitted per mode) is a SEPARATE concern handled at TOOL REGISTRATION time: registerColibriTool inspects the current mode via capabilitiesFor(mode) and the per-tool requires set (added in a later task, P0.4.2 or P0.2.3), and refuses to register tools the current mode does not admit. In P0.2.1 there is only server_ping, which every mode admits per modes.md line 57, so no gating logic is exercised.
This reading aligns three of four specs (s17, middleware.md, boot.md) and defers the modes.md scope collision to a later task that can fix modes.md explicitly. T0 must sign off on this reading before the packet’s Step-4 implementation begins. If T0 directs the modes.md reading instead, the contract and packet must be amended to add capability-gating to stage 1.
The rest of this contract assumes the pure-mutex reading.
§3. createServer semantics
3a. Signature
export function createServer(options?: CreateServerOptions): ColibriServerContext;
export interface CreateServerOptions {
readonly auditSink?: AuditSink;
readonly transport?: Transport; // Transport from @modelcontextprotocol/sdk
readonly version?: string; // Override; default = read from package.json
readonly mode?: RuntimeMode; // Override; default = detectMode(process.env)
readonly nowMs?: () => number; // Override; default = () => performance.now()
readonly bootStartMs?: number; // Override; default = nowMs() at construction
readonly logger?: (...args: unknown[]) => void; // Override; default = console.error
}
All options have defaults; passing {} or omitting the argument is valid. Options are the dependency-injection seams tests use to exercise the server without real stdio, real filesystem reads, or real time.
3b. Default wiring (invoked when options are absent or partial)
- version ← readPackageJson(): reads ../package.json relative to import.meta.url (resolved via path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..', 'package.json')), parses it, and returns .version. Throws Error with message "Failed to read package.json version: <reason>" on failure. Called once at construction; the result is cached on the context.
- mode ← detectMode(process.env) from src/modes.ts. Throws on AMS_MODE or any invalid COLIBRI_MODE value.
- auditSink ← createNoOpAuditSink().
- transport ← new StdioServerTransport() (uses default process.stdin / process.stdout).
- nowMs ← () => performance.now() from node:perf_hooks.
- bootStartMs ← nowMs() at the moment createServer is called. Ping uptime_ms is calculated as floor(nowMs() - bootStartMs).
- logger ← console.error. Chosen because console.log writes to stdout, which is owned by the SDK’s stdio transport (donor bug #3); stderr is safe for our messages.
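The version lookup above can be sketched as follows. This is a minimal sketch rather than the idiom the packet locks in: readPackageVersion, its packageJsonPath parameter, and the injectable readText reader are all hypothetical names, and passing the path in (instead of deriving it from import.meta.url) is an assumption made so the example runs anywhere.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical reader seam: lets tests supply file contents without touching disk.
type ReadText = (path: string) => string;

// Hypothetical helper: returns the "version" field of a package.json file.
// Any failure (missing file, bad JSON, missing field) is rethrown with the
// message prefix the contract text specifies.
function readPackageVersion(
  packageJsonPath: string,
  readText: ReadText = (p) => readFileSync(p, "utf8"),
): string {
  try {
    const parsed: unknown = JSON.parse(readText(packageJsonPath));
    const version = (parsed as { version?: unknown } | null)?.version;
    if (typeof version !== "string" || version.length === 0) {
      throw new Error('missing or non-string "version" field');
    }
    return version;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`Failed to read package.json version: ${reason}`);
  }
}
```

Injecting the reader is one way to reach the failure branch in a unit test without mocking the filesystem; whether the packet adopts that seam is left open here.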
3c. What createServer DOES NOT do
- Does NOT connect the transport (no ctx.server.connect(transport) call). That happens in start(). This split lets tests observe the constructed-but-not-connected state.
- Does NOT call process.on(...). Global handlers are installed by start() (or by the main() entry). Tests that call createServer in isolation do NOT pollute Jest’s handlers.
- Does NOT read process.env directly. All env reads go through config (from src/config.ts) or detectMode (from src/modes.ts).
- Does NOT throw if auditSink is missing: the no-op default is used.
§4. registerColibriTool — the 5-stage middleware chain
4a. Signature
export function registerColibriTool<
I extends z.ZodRawShape,
O extends z.ZodRawShape,
>(
ctx: ColibriServerContext,
name: string,
config: {
readonly title?: string;
readonly description?: string;
readonly inputSchema: z.ZodObject<I>;
readonly outputSchema?: z.ZodObject<O>;
},
handler: (
args: z.infer<z.ZodObject<I>>,
) => Promise<unknown> | unknown,
): void;
Calls ctx.server.registerTool(name, sdkConfig, wrappedHandler) under the hood, where wrappedHandler is the composed 5-stage chain.
4b. The 5 stages (canonical order, matches s17 §4 + middleware.md)
Each stage is a thin async function. They compose as follows (pseudocode, not the implementation):
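wrappedHandler = async (args) => {
  // Stage 1: per-tool mutex around the whole inner chain.
  return await toolLock(name, async () => {
    // Stage 2: throws SchemaValidationError on failure.
    const validated = schemaValidate(config.inputSchema, args);
    const correlationId = uuidV4();
    const enterTs = nowMs();
    let result, error;
    try {
      // Stage 3: a throw here skips dispatch but still reaches finally.
      await auditEnter({ tool: name, args: validated, timestamp: enterTs, correlationId });
      // Stage 4: run the registered handler with the typed args.
      result = await dispatch(handler, validated);
      return { ok: true, data: result };
    } catch (e) {
      // Normalize non-Error throws so the sink always sees an Error (§5a).
      error = e instanceof Error ? e : new Error(String(e));
      throw error;
    } finally {
      // Stage 5: always runs; a sink failure is logged, never re-thrown (§4c).
      try {
        await auditExit({
          tool: name,
          correlationId,
          durationMs: Math.floor(nowMs() - enterTs),
          result,
          error,
        });
      } catch (exitError) {
        logger('[colibri] audit-exit failed:', exitError);
      }
    }
  });
};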
4c. Stage-by-stage contract
Stage 1 — tool-lock.
- Input: tool name (string).
- Behavior: per-tool mutex. The lock map lives on ColibriServerContext as a private Map<string, Promise<void>>. Acquiring the lock: read the current tail Promise for name (if any), synchronously replace it with a new Promise that resolves when the current call’s finally runs, then await the previous tail. Publishing the new tail before awaiting keeps two waiters from resuming on the same Promise and running concurrently.
- Output: runs the inner chain exactly once per call; calls that share a tool name are serialized.
- Failure handling: the lock itself cannot fail. If the inner chain throws, the mutex still releases (via the outer finally).
- In P0.2.1: does NOT consult mode. See §2 OPEN QUESTION.
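A per-tool mutex of this kind can be built by chaining promises. The sketch below is illustrative only: withToolLock and LockMap are hypothetical names, and removing an idle key from the map is an assumption, not a contract requirement.

```typescript
// Hypothetical lock map: tool name -> promise that resolves when the
// most recently queued call for that name has finished.
type LockMap = Map<string, Promise<void>>;

async function withToolLock<T>(
  locks: LockMap,
  name: string,
  inner: () => Promise<T>,
): Promise<T> {
  const previous = locks.get(name) ?? Promise.resolve();
  let release!: () => void;
  const current = new Promise<void>((resolve) => {
    release = resolve;
  });
  // Publish the new tail synchronously, before awaiting, so the next
  // caller queues behind this call rather than behind `previous`.
  locks.set(name, current);
  await previous;
  try {
    return await inner();
  } finally {
    // Release even when inner() throws.
    release();
    // Drop the entry if nobody queued behind us (assumption: keeps the map small).
    if (locks.get(name) === current) locks.delete(name);
  }
}
```

Because each tail promise only ever resolves, awaiting the previous tail cannot throw, which is consistent with the statement that the lock itself cannot fail.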
Stage 2 — schema-validate.
- Input: the Zod schema (from config.inputSchema), raw args (unknown).
- Behavior: config.inputSchema.safeParse(args). On .success === false, throw SchemaValidationError (an internal class) with the Zod error tree attached. The outer envelope maps this to { ok: false, error: { code: 'INVALID_PARAMS', message, details: { issues } } } at response-serialization time. On .success === true, the typed .data object is passed to stage 3 and stage 4.
- Failure handling: throw. The chain stops here; stage 3 (audit-enter) does NOT run. This matches middleware.md Stage 2: “a call that never validated never enters the decision trail.”
Stage 3 — audit-enter.
- Input: ToolEnterEvent = { tool, args, timestamp, correlationId }.
- Behavior: await ctx.auditSink.enter(event). In P0.2.1, the no-op sink does nothing; in P0.7, the ζ sink writes an audit_events row.
- Failure handling: per middleware.md Stage 3 (“Audit insert failure is a hard stop”), if auditSink.enter() throws, the chain stops and the error propagates. Stage 4 does NOT run for this call. Stage 5 (audit-exit), however, DOES run from the outer finally; see §4d for the ordering subtlety.
Stage 4 — dispatch.
- Input: the typed args (from stage 2).
- Behavior: await handler(args). The returned value is wrapped in { ok: true, data: <value> }.
- Failure handling: exceptions propagate to the outer finally. Stage 5 still runs.
Stage 5 — audit-exit.
- Input: ToolExitEvent = { tool, correlationId, durationMs, result, error }.
- Behavior: await ctx.auditSink.exit(event). Runs unconditionally in the chain-level finally block, regardless of whether stage 3 or stage 4 threw.
- Failure handling: if auditSink.exit() itself throws, the error is logged via ctx.logger but is NOT re-thrown (the ORIGINAL error from stage 3 or 4, if any, is preserved). This avoids double-fault patterns where a flaky sink hides the real handler error. Deviation from middleware.md: middleware.md says “exit-row insert failure is a hard stop” and ALSO says “the handler’s own result is dropped in favour of the insert error”. In P0.2.1 with a no-op sink the question is moot; when P0.7 lands a real sink, the contract here may need re-negotiation. Flagged to T0 as a secondary open question (§14, Q-2).
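Stages 3 to 5 and the log-and-continue rule for audit-exit can be sketched as below. This is a hedged illustration, not the module's implementation: auditedDispatch, its deps bundle, and the local Sink, EnterEvent and ExitEvent names are hypothetical, the lock and validation stages are omitted, and randomUUID from node:crypto stands in for whatever uuid v4 source the packet picks.

```typescript
import { randomUUID } from "node:crypto";

interface EnterEvent { readonly tool: string; readonly args: unknown; readonly timestamp: number; readonly correlationId: string }
interface ExitEvent { readonly tool: string; readonly correlationId: string; readonly durationMs: number; readonly result?: unknown; readonly error?: Error }
interface Sink {
  enter(event: EnterEvent): Promise<void> | void;
  exit(event: ExitEvent): Promise<void> | void;
}
interface Deps { sink: Sink; nowMs: () => number; logger: (...args: unknown[]) => void }

// Hypothetical helper covering stages 3-5 only.
async function auditedDispatch<A, R>(
  deps: Deps,
  tool: string,
  args: A,
  handler: (args: A) => Promise<R> | R,
): Promise<R> {
  const correlationId = randomUUID();
  const enterTs = deps.nowMs();
  let result: R | undefined;
  let error: Error | undefined;
  try {
    await deps.sink.enter({ tool, args, timestamp: enterTs, correlationId }); // stage 3
    result = await handler(args); // stage 4
    return result;
  } catch (thrown) {
    // Non-Error throws are wrapped so the sink always receives an Error.
    error = thrown instanceof Error ? thrown : new Error(String(thrown));
    throw error;
  } finally {
    try {
      // Stage 5: runs whether stage 3 or stage 4 succeeded or threw.
      await deps.sink.exit({
        tool,
        correlationId,
        durationMs: Math.floor(deps.nowMs() - enterTs),
        result,
        error,
      });
    } catch (exitError) {
      // Log-and-continue: a sink failure must not mask the original outcome.
      deps.logger("[colibri] audit-exit failed:", exitError);
    }
  }
}
```

With a sink whose exit() always throws, a successful handler still returns its value and a failing handler still surfaces its own error; only the log line records the sink failure.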
4d. Error propagation table
| Stage that throws | Outcome | What the client sees |
|---|---|---|
| 1 tool-lock | Impossible (acquisition-only) | n/a |
| 2 schema-validate | Chain stops; stages 3-4 skipped; stage 5 runs (records the validation error) | { ok: false, error: { code: 'INVALID_PARAMS', message, details: { issues } } } |
| 3 audit-enter | Chain stops; stage 4 skipped; stage 5 runs from outer finally (records the audit error) | { ok: false, error: { code: 'AUDIT_ENTER_FAILED', message } } |
| 4 dispatch (handler) | Stage 5 runs from outer finally (records the handler error) | { ok: false, error: { code: 'HANDLER_ERROR', message, details?: { stack in non-prod } } } |
| 5 audit-exit | Logged via ctx.logger; if stages 2-4 already produced an error, that error is what the client sees; otherwise the client sees a success envelope with a warning log on the server | Original stage-2-4 error if any; otherwise success |
All errors are mapped to JSON-RPC error codes per s17 §6 + docs/spec/s05-errors.md before returning. In P0.2.1 the mapping is simple (only HANDLER_ERROR and INVALID_PARAMS are reachable via server_ping); P0.3+ exercises the full table.
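One way to express the table's error-to-envelope mapping is a single translation function at the response boundary. The sketch below is an assumption-laden illustration: toFailureEnvelope, AuditEnterError and the includeStack flag are hypothetical names (the contract names only SchemaValidationError), and the JSON-RPC code mapping from s17 §6 is not shown.

```typescript
type ErrorCode = "INVALID_PARAMS" | "AUDIT_ENTER_FAILED" | "HANDLER_ERROR";

interface FailureEnvelope {
  readonly ok: false;
  readonly error: { readonly code: ErrorCode; readonly message: string; readonly details?: unknown };
}

// Internal error raised by stage 2; carries the validation issues.
class SchemaValidationError extends Error {
  readonly issues: readonly unknown[];
  constructor(message: string, issues: readonly unknown[]) {
    super(message);
    this.name = "SchemaValidationError";
    this.issues = issues;
  }
}

// Hypothetical marker class for a stage 3 (audit-enter) failure.
class AuditEnterError extends Error {}

function toFailureEnvelope(err: unknown, includeStack: boolean): FailureEnvelope {
  if (err instanceof SchemaValidationError) {
    return { ok: false, error: { code: "INVALID_PARAMS", message: err.message, details: { issues: err.issues } } };
  }
  if (err instanceof AuditEnterError) {
    return { ok: false, error: { code: "AUDIT_ENTER_FAILED", message: err.message } };
  }
  // Everything else is treated as a handler failure (stage 4).
  const e = err instanceof Error ? err : new Error(String(err));
  return {
    ok: false,
    error: includeStack
      ? { code: "HANDLER_ERROR", message: e.message, details: { stack: e.stack } }
      : { code: "HANDLER_ERROR", message: e.message },
  };
}
```

The includeStack flag models the "stack in non-prod" note in the table; how the real module decides between prod and non-prod is not specified here.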
4e. Registration-time invariants
- name MUST be a non-empty string matching /^[a-z_][a-z0-9_]*$/ (snake_case). The helper asserts this and throws Error('invalid tool name: <name>') otherwise.
- config.inputSchema MUST be a z.ZodObject<any>. The helper’s type signature enforces this at compile time; at runtime a defensive instanceof z.ZodObject check throws Error('inputSchema must be a Zod object') for JavaScript callers (belt-and-braces).
- Calling registerColibriTool with the same name twice throws Error('tool already registered: <name>'). This matches SDK behavior (the underlying McpServer.registerTool also throws on duplicate).
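The name and duplicate checks can be sketched in isolation as below. ToolNameRegistry is a hypothetical stand-in for whatever bookkeeping the helper and the SDK do between them; the Zod instanceof check is left out so the example has no dependencies.

```typescript
// Pattern and error messages are taken from the invariants above.
const TOOL_NAME_PATTERN = /^[a-z_][a-z0-9_]*$/;

// Hypothetical registry: tracks which tool names have been registered.
class ToolNameRegistry {
  private readonly names = new Set<string>();

  register(name: string): void {
    if (typeof name !== "string" || !TOOL_NAME_PATTERN.test(name)) {
      throw new Error(`invalid tool name: ${name}`);
    }
    if (this.names.has(name)) {
      throw new Error(`tool already registered: ${name}`);
    }
    this.names.add(name);
  }

  has(name: string): boolean {
    return this.names.has(name);
  }
}
```

Under this pattern 'server_ping' is accepted, while 'server-ping', 'server/ping', and the empty string are rejected.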
§5. AuditSink interface (the P0.7 seam)
export interface ToolEnterEvent {
readonly tool: string;
readonly args: unknown; // post-validation, typed at the stage 2 boundary
readonly timestamp: number; // nowMs() at chain-enter; ms since process start or since epoch — see note
readonly correlationId: string; // uuid v4
}
export interface ToolExitEvent {
readonly tool: string;
readonly correlationId: string;
readonly durationMs: number; // floor(nowMs() - enterTs)
readonly result?: unknown; // present on success
readonly error?: Error; // present on failure
}
export interface AuditSink {
enter(event: ToolEnterEvent): Promise<void> | void;
exit(event: ToolExitEvent): Promise<void> | void;
}
5a. Design rationale
- enter and exit are called at stage 3 and stage 5 respectively.
- Both methods return Promise<void> | void so a sync sink can omit the Promise; the chain always awaits, so sync sinks incur one microtask hop.
- correlationId is generated by stage 3 (NOT stage 2 or stage 1) because the identifier should only exist for calls that validated. This matches middleware.md Stage 3 (“a call that never validated never enters the decision trail”). The correlationId is the join key that audit_exit uses to close the entry row.
- timestamp semantics: performance.now() returns milliseconds since process start. This is fine for Phase 0 where the sink is process-local; P0.7 may need to switch to Date.now() for cross-session audit anchoring. The contract documents both options; the packet locks in performance.now() for P0.2.1 because it is monotonic and immune to system-clock skew. P0.7 will wrap both values into the audit row if needed.
- args is typed unknown, not a generic: the sink is tool-agnostic. The sink implementor (ζ) is responsible for stable-serializing and hashing.
- error is typed Error (not unknown) so sinks can read .stack without narrowing. A JavaScript throw "string" reaches stage 5 wrapped as new Error(String(value)) by the chain.
5b. createNoOpAuditSink()
export function createNoOpAuditSink(): AuditSink {
return Object.freeze({
enter(): void { /* no-op */ },
exit(): void { /* no-op */ },
});
}
Returns a frozen no-op sink. An implementation MAY hand back a module-scoped singleton instead of a fresh object, so multiple calls MAY return the same reference (implementation detail; tests MUST NOT assume referential identity, or non-identity, across calls).
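For tests that need to observe the seam (for example the enter/exit assertions in §8c), a recording sink is one option. The sketch below restates the §5 interfaces so it is self-contained; createRecordingAuditSink is a hypothetical test helper, not part of the module surface in §1.

```typescript
interface ToolEnterEvent { readonly tool: string; readonly args: unknown; readonly timestamp: number; readonly correlationId: string }
interface ToolExitEvent { readonly tool: string; readonly correlationId: string; readonly durationMs: number; readonly result?: unknown; readonly error?: Error }
interface AuditSink {
  enter(event: ToolEnterEvent): Promise<void> | void;
  exit(event: ToolExitEvent): Promise<void> | void;
}

// Hypothetical test helper: an AuditSink that stores every event it receives.
function createRecordingAuditSink(): {
  sink: AuditSink;
  enters: ToolEnterEvent[];
  exits: ToolExitEvent[];
} {
  const enters: ToolEnterEvent[] = [];
  const exits: ToolExitEvent[] = [];
  const sink: AuditSink = Object.freeze({
    enter(event: ToolEnterEvent): void {
      enters.push(event);
    },
    exit(event: ToolExitEvent): void {
      exits.push(event);
    },
  });
  return { sink, enters, exits };
}
```

A test would pass sink to createServer({ auditSink: sink }) and then inspect enters and exits, joining pairs on correlationId.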
5c. T0 sign-off asked on
- Field names (tool, args, timestamp, correlationId, durationMs, result, error): any rename must propagate to P0.7.
- Method names (enter, exit). Alternatives considered: onEnter / onExit (rejected: more verbose); open / close (rejected: overloaded with DB semantics).
- Return type Promise<void> | void (alternative: always Promise<void>). Current choice lets sync sinks skip the async keyword; asymmetric async cost is negligible.
§6. start and main — boot sequence
6a. start(ctx) signature
export function start(ctx: ColibriServerContext): Promise<void>;
Resolves once the transport is connected and the handshake has completed. The promise carries no value.
6b. start() behavior
- Install global handlers (if not already installed):
  - process.on('unhandledRejection', (reason) => { ctx.logger('[colibri] unhandledRejection:', reason); process.exit(1); })
  - process.on('uncaughtException', (err) => { ctx.logger('[colibri] uncaughtException:', err); process.exit(1); })
  - Installation is idempotent (checks process.listenerCount('unhandledRejection') === 0 before installing) so tests that invoke start() multiple times do not pile up handlers.
- Log one line: ctx.logger('[colibri] starting in mode=', ctx.mode, 'version=', ctx.version).
- await Promise.race([ ctx.server.connect(ctx.transport), new Promise((_, reject) => setTimeout(() => reject(new Error('startup timeout exceeded')), config.COLIBRI_STARTUP_TIMEOUT_MS)) ]). On timeout, exit with code 75 (resource) per boot.md §“Startup Timeout”.
- Log one line: ctx.logger('[colibri] ready').
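The connect-versus-timeout race in the steps above can be sketched as a small helper. withStartupTimeout is a hypothetical name, and clearing the timer once the race settles is an addition assumed here (so a successful connect does not leave a pending timer behind); mapping the rejection to exit code 75 is left to the caller.

```typescript
// Hypothetical helper: rejects with the contract's timeout message if
// `connect` has not settled within `timeoutMs`.
async function withStartupTimeout<T>(connect: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("startup timeout exceeded")), timeoutMs);
  });
  try {
    return await Promise.race([connect, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the event loop alive.
    clearTimeout(timer);
  }
}
```

A caller would wrap ctx.server.connect(ctx.transport) with this helper and translate the timeout rejection into the exit path that boot.md prescribes.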
6c. stop(ctx) signature and behavior
export function stop(ctx: ColibriServerContext): Promise<void>;
Calls ctx.server.close(). No global-handler cleanup (Node’s own process shutdown drains them). Returns when the transport is closed. Tests use this for teardown.
6d. main() — the entry-point IIFE
Runs only when the module is executed directly (via node dist/server.js), not when imported:
if (import.meta.url === pathToFileURL(process.argv[1] ?? '').href) {
await main();
}
async function main(): Promise<void> {
const ctx = createServer();
try {
await start(ctx);
// In P0.2.1 there are no domain handlers to load; start() is the whole lifecycle.
// Process stays alive because stdio is open.
} catch (err) {
ctx.logger('[colibri] fatal:', err);
process.exit(1);
}
}
Not strictly tested at the unit level (entry-point IIFEs are notoriously hard to unit-test without spawnSync). §8d specifies the coverage boundary.
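The direct-execution guard can be pulled into a pure function, which makes the comparison itself unit-testable even if main() is not. This is a sketch under assumptions: isEntryPoint is a hypothetical helper, and treating a missing argv[1] as "not the entry point" is a choice made here rather than something the contract states.

```typescript
import { resolve } from "node:path";
import { pathToFileURL } from "node:url";

// Hypothetical helper: true when the module identified by `moduleUrl`
// (normally import.meta.url) is the script Node was asked to run.
function isEntryPoint(moduleUrl: string, argv1: string | undefined): boolean {
  if (argv1 === undefined || argv1 === "") return false;
  // Resolve first so relative script paths compare equal to the module URL.
  return moduleUrl === pathToFileURL(resolve(argv1)).href;
}
```

In the module this would be called as isEntryPoint(import.meta.url, process.argv[1]) before awaiting main().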
§7. server_ping tool (the one registered tool)
7a. Name
server_ping (snake_case, no slash). Matches ADR-004 + S17. The task-breakdown + task-prompt text server/ping is treated as heritage draft and overridden (audit §7b).
7b. Input schema
const pingInput = z.object({});
No parameters. Stage 2 parses {} on every call.
7c. Output schema
Two levels:
- Handler return (the value handler() produces): { version: string; mode: RuntimeMode; uptime_ms: number }. Zod schema z.object({ version: z.string(), mode: z.enum(['FULL','READONLY','TEST','MINIMAL']), uptime_ms: z.number().int().nonnegative() }).
- Wire envelope (what the client sees): { ok: true, data: <handler return> } per s17 §6.
Failure envelope (if the handler somehow fails — should be unreachable for server_ping in P0.2.1): { ok: false, error: { code: 'HANDLER_ERROR', message } }.
7d. Handler implementation
async (_args) => ({
version: ctx.version,
mode: ctx.mode,
uptime_ms: Math.floor(ctx.nowMs() - ctx.bootStartMs),
});
No I/O, no async work, deterministic given ctx. nowMs is injected for test determinism.
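The determinism claim is easiest to see with a fake clock. In the sketch below, makePingHandler, PingContext and PingResult are hypothetical names introduced for illustration; the RuntimeMode union simply mirrors the four values in the §7c schema.

```typescript
type RuntimeMode = "FULL" | "READONLY" | "TEST" | "MINIMAL";

// Hypothetical slice of the server context that the ping handler reads.
interface PingContext {
  readonly version: string;
  readonly mode: RuntimeMode;
  readonly bootStartMs: number;
  readonly nowMs: () => number;
}

interface PingResult {
  readonly version: string;
  readonly mode: RuntimeMode;
  readonly uptime_ms: number;
}

// Builds the handler; uptime is whole milliseconds since bootStartMs.
function makePingHandler(ctx: PingContext): () => Promise<PingResult> {
  return async () => ({
    version: ctx.version,
    mode: ctx.mode,
    uptime_ms: Math.floor(ctx.nowMs() - ctx.bootStartMs),
  });
}
```

With bootStartMs set to 1000 and an injected clock returning 1234.9, the handler reports an uptime_ms of 234.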
7e. Response-time invariant
Per docs/reference/mcp-tools-phase-0.md (the server_ping row implicit in ADR-004 Phase-0 inventory): handler completes in under 100 ms end-to-end in FULL mode on a typical dev machine. Not asserted in tests (timing-sensitive), but the handler does no I/O so this is guaranteed by construction.
§8. Test contract (WHAT must be covered, not HOW)
The test file is src/__tests__/server.test.ts (per audit §2e). It MUST cover:
8a. createServer unit tests
- T-1 createServer() with all defaults returns a context with version matching package.json#version.
- T-2 createServer({ version: '9.9.9-test' }) overrides the version.
- T-3 createServer({ mode: 'READONLY' }) overrides the mode (bypasses detectMode).
- T-4 createServer() does NOT connect the transport (inject a fake transport and assert that its start() was never called).
- T-5 createServer() does NOT install unhandledRejection / uncaughtException handlers (count listeners before/after).
- T-6 createServer() reads config without throwing given a valid env (spawns a tsx subprocess OR uses the already-loaded config).
- T-7 createServer({ auditSink: customSink }) wires the custom sink (observable by registering a tool and triggering enter/exit).
8b. registerColibriTool unit tests
- T-8 registering server_ping succeeds while ctx.server.isConnected() === false; the tool is retrievable via ctx.server._registeredTools['server_ping'] (or equivalent SDK introspection).
- T-9 registering the same name twice throws.
- T-10 registering with an invalid name ('server-ping' with a dash) throws.
- T-11 registering with a non-Zod-object inputSchema throws.
8c. 5-stage middleware tests (end-to-end, via the SDK’s in-memory transport)
The SDK ships an InMemoryTransport (in node_modules/@modelcontextprotocol/sdk/dist/esm/inMemory.js) — NOT StdioServerTransport. Tests pass a pair of linked InMemoryTransport instances via createServer({ transport: serverHalf }) and client.connect(clientHalf), then use the client to call tools. This exercises the full chain without real stdio.
- T-12 calling server_ping returns { ok: true, data: { version, mode, uptime_ms } } with all three fields present and correctly typed.
- T-13 uptime_ms is non-negative.
- T-14 a custom AuditSink observes exactly one enter and one exit per successful call, with matching correlationId + tool='server_ping' + a non-negative durationMs (the floor can yield 0 for a fast call).
- T-15 a custom AuditSink sees exit.result is defined and exit.error is undefined on success.
- T-16 calling a handler that throws causes exit.error to be the thrown Error and exit.result to be undefined.
- T-17 calling server_ping twice concurrently (via Promise.all) serializes: the two enter events are observed in order (not interleaved with exit events in an incorrect pattern).
- T-18 a handler with a deliberately-failing Zod schema (register a test-only tool with z.object({ required: z.string() }) and call with {}) yields { ok: false, error: { code: 'INVALID_PARAMS' } } and the enter/exit sink sees exactly one pair (exit records the error).
8d. Boot-sequence tests
- T-19 start(ctx) with an injected stub transport resolves once transport.start() has been invoked.
- T-20 start(ctx) rejects with a timeout error if ctx.server.connect(ctx.transport) never resolves (override COLIBRI_STARTUP_TIMEOUT_MS via a test-local createServer({ ... }) override OR by using a very short fake timeout).
- T-21 start(ctx) installs unhandledRejection + uncaughtException listeners (assert count incremented by at least 1 for each).
- T-22 start(ctx) is idempotent w.r.t. listener installation: calling start twice does not double-register.
- T-23 stop(ctx) calls transport.close().
8e. Negative / regression tests
- T-24 createServer({}) does NOT override process.stdout.write. Assertion: process.stdout.write === <captured-before>. (Donor bug #3.)
- T-25 src/server.ts imports no HTTP/WebSocket SDK module. Assertion: grep the module source (or inspect the import graph).
- T-26 createServer() propagates a detectMode throw (e.g. AMS_MODE set) when called without an override; uses a tsx subprocess because src/config.ts eagerly reads env at module load.
8f. Coverage invariant
- src/server.ts reaches 100% statement / function / line and ≥90% branch coverage.
- src/config.ts + src/modes.ts coverage unchanged.
Total new tests target: 26 tests added. Combined with 15 (config) + 24 (modes) + 1 (smoke) = 66 tests total. Packet §2 finalizes the exact test names.
§9. Integration points — stability
After P0.2.1 lands, the following are the stable API:
- createServer(options?): signature frozen for Phase 0; adding optional fields to CreateServerOptions is non-breaking.
- registerColibriTool(ctx, name, config, handler): signature frozen for Phase 0; P0.3+ domain tools call this.
- AuditSink: interface frozen for Phase 0; P0.7 implements it.
The following are internal and may change without notice:
- The tool-lock Map shape (currently Map<string, Promise<void>>).
- The 5-stage composition implementation (may factor out into src/middleware/*.ts per P0.2.4).
- The exact wire format of JSON-RPC error codes (locked by s17 §6 eventually).
§10. Files touched
Exhaustive list (new files + modifications):
New files:
- src/server.ts: the server bootstrap module. ~350-450 lines.
- src/__tests__/server.test.ts: test file. ~500-650 lines.
Modified files:
- src/index.ts: REMAINS as export {}; (no change). The alternative, deleting it, is rejected for now. Contract decision: keep it as a placeholder. package.json#main points at dist/server.js, so src/index.ts serves no purpose after P0.2.1, but deleting it risks confusing future contributors who expect an index. Leave removal for a dedicated cleanup task.
NOT modified in P0.2.1:
- package.json: no new deps (SDK already in).
- tsconfig.json: existing config suffices.
- jest.config.ts: existing config suffices.
- .eslintrc.json: existing override for .test.ts suffices.
- .env.example: no new env vars in P0.2.1.
- Any of src/config.ts, src/modes.ts: read only.
§11. Out-of-scope — EXPLICIT LIST
P0.2.1 does NOT ship any of these (deferred to cited downstream tasks):
- SQLite DB open or migration (P0.2.2).
- Two-phase startup splitting start() into transport + heavy-init (P0.2.3).
- Health tool server_health: different task (P0.2.4 per task-breakdown, or P0.4.2 per modes.md).
- server_info, server_shutdown tools (P0.2.4+ / γ lifecycle). R75 Wave H update: neither tool shipped. server_info was struck as a phantom; server_shutdown was deferred. Capability reporting is folded into server_health; process teardown is SIGTERM/SIGINT-level.
- Middleware layer extraction into src/middleware/*.ts (P0.2.4).
- Any β task tool (task_create, etc.) (P0.3).
- Any ε skill tool (skill_list) (P0.6).
- Any ζ thought tool (thought_record, etc.) (P0.7).
- Any η merkle tool (P0.8).
- Any ν integration (P0.9).
- Real ζ AuditSink implementation (P0.7).
- HTTP transport (NEVER in Phase 0 per s17 §2).
- WebSocket transport (NEVER in Phase 0 per s17 §2).
- Multi-model routing / δ (Phase 1.5 per ADR-005).
- Capability-gating at tool-lock (pending T0 decision in §2 — either this task adds it or a later task does).
- Rate limiting, ACL, circuit breaker, retry (per middleware.md appendix — earned by later concepts).
§12. Contract acceptance criteria (the checklist Step 5 verify cites)
- §1: exports exactly the 11 symbols listed, no more.
- §2: tool-lock semantics reading is pure-mutex (pending T0 sign-off).
- §3: createServer({}) succeeds with defaults; every option is injectable.
- §4: 5 stages run in canonical order; error propagation table holds.
- §5: AuditSink interface matches the exact shape in §5.
- §6: start() installs global handlers idempotently; stop() closes the transport.
- §7: server_ping returns the { ok: true, data: { version, mode, uptime_ms } } envelope.
- §8: all 26 tests (T-1 through T-26) pass.
- §8f: coverage invariant holds.
- §10: only src/server.ts + src/__tests__/server.test.ts are new; src/index.ts unchanged.
- §11: no out-of-scope code snuck in.
- CI: docs-check + build-test-lint on Node 20 is green.
§13. Risks (surfaced to the packet)
- R-1 package.json resolution under ESM + ts-jest. path.resolve(fileURLToPath(import.meta.url), '..', '..') behaves differently at test time (ts-jest transforms .ts sources) vs run time (compiled dist/server.js). The packet §1 locks a resolution idiom that works in both.
- R-2 SDK registerTool type ergonomics. The SDK’s registerTool<OutputArgs, InputArgs> uses ZodRawShapeCompat, not ZodObject. The helper signature may need to unwrap ZodObject.shape at the boundary. Packet §1d specifies the exact adapter code.
- R-3 In-memory transport availability. If @modelcontextprotocol/sdk/inMemory.js does not expose what we need, tests may fall back to calling the wrapped handlers directly (bypassing the SDK’s JSON-RPC layer). Packet §2 specifies the fallback.
- R-4 Coverage threshold. Some error branches (e.g. Failed to read package.json) are hard to trigger without mocking fs. Packet §3 specifies the fs-injection strategy (either via CreateServerOptions.readPackageJson?: () => string or via jest.mock).
- R-5 Global handlers & Jest. Jest installs its own handlers. If start() installs ours on top, a test-level crash is now handled by OUR listener, which calls process.exit(1) and kills the test runner. Packet §2d specifies that test-time start() invocations override via CreateServerOptions.installGlobalHandlers?: boolean (defaulted to true).
§14. Open questions deferred to the packet
- Q-1 (§2): T0’s reading of tool-lock stage: pure mutex vs capability-gate?
- Q-2 (§4c): if auditSink.exit() throws, do we re-throw or just log? Contract proposes log-and-continue; middleware.md says hard-stop. P0.2.1 ships log-and-continue (the no-op sink cannot throw anyway); P0.7 may revisit.
- Q-3 (§6d): does main() get unit-test coverage? Contract defers to packet; likely answer is “covered via spawnSync tsx subprocess integration test” or “excluded via coverage pragma”.
- Q-4 (§10): keep or delete src/index.ts? Contract says keep.
- Q-5 (§8c): use the SDK’s InMemoryTransport or fake streams via PassThrough? Contract leaves this to the packet to decide based on what exactly the SDK exports.
P0.2.1 Step 2 Contract — 2026-04-16 — branch feature/p0-2-1-mcp-server. Audit: a7305e43. Gates Step 3 Packet.