Contract — R93 B1 outputSchema SDK Passthrough Envelope Mismatch
Round: R93 debug-sweep
Branch: feature/r93-b1-output-schema-envelope
Audit: docs/audits/r93-b1-output-schema-envelope-audit.md
β task: 6b2da36a-8f1d-4a90-95c1-ac549b3fd60e
§1. Behavioural invariants (post-fix)
| ID | Invariant | Verifier |
|---|---|---|
| I-1 | A tool registered with `outputSchema` returns a `{ok: true, data: <handler-result>}` envelope through the MCP SDK without triggering -32602. | Regression test: register `with_output_schema`, call via `Client`, assert `response.structuredContent.ok === true`. |
| I-2 | The eight live tools (`router_score`, `router_fallback`, `router_stats`, `consensus_propose`, `consensus_vote`, `consensus_finality`, `consensus_gossip`, `vrf_eval`) are reachable through the MCP wire and return their documented payloads. | Smoke probes against a freshly-bootstrapped server (manual verification + unit-level test asserting no -32602). |
| I-3 | `ColibriToolConfig.outputSchema` remains an accepted field; passing it does not throw, and the existing `accepts an optional description and outputSchema` test continues to pass unchanged. | Existing test at `src/__tests__/server.test.ts:441`. |
| I-4 | Handler return shapes are unchanged. The 8 affected handlers in `src/domains/router/tools.ts` + `src/domains/consensus/tools.ts` are not modified by this slice. | Source-level diff inspection. |
| I-5 | The Zod outputSchema exports (`RouterScoreOutputSchema`, `RouterFallbackOutputSchema`, `RouterStatsOutputSchema`, `ConsensusProposeOutputSchema`, `ConsensusVoteOutputSchema`, `ConsensusFinalityOutputSchema`, `ConsensusGossipOutputSchema`, `VrfEvalOutputSchema`) remain in their source files and continue to flow into TypeScript types via `z.infer<typeof X>`. | `rg "Output(Schema\|Type)" src/domains/{router,consensus}` returns unchanged sets. |
| I-6 | The α envelope contract on failure is unchanged: handler throws → `{ok:false, error:{code:"HANDLER_ERROR",…}}`; Zod input failure → `{ok:false, error:{code:"INVALID_PARAMS",…,details:{issues}}}`. | Existing middleware tests. |
| I-7 | The middleware contract documented in `docs/2-plugin/middleware.md` Stage 1-5 remains accurate. No new stage; no stage removed. | Manual cross-check; doc untouched. |
| I-8 | The 3492 currently-green tests continue to pass after the patch. | `npm test` exit 0; suite count preserved. |
| I-9 | `npm run build && npm run lint && npm test` is the gating triple. | CI + local verify run. |
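
The envelope shapes named in I-1 and I-6 can be illustrated with a minimal, dependency-free sketch. This is not the project's middleware: `runWithEnvelope`, `ParseResult`, `parseN`, and the `message` field are hypothetical names introduced here, the parser stands in for a Zod input schema, and the handler is synchronous for brevity.

```typescript
// Envelope shape per I-1 and I-6. Only ok, data, error.code and details.issues
// come from the contract; the message field is an assumption.
type Envelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details?: { issues: string[] } } };

// Hypothetical stand-in for a Zod safeParse result, so the sketch needs no imports.
type ParseResult<I> =
  | { success: true; value: I }
  | { success: false; issues: string[] };

// Validates the raw input, runs the handler, and always returns an envelope.
function runWithEnvelope<I, O>(
  parse: (raw: unknown) => ParseResult<I>,
  handler: (input: I) => O,
  raw: unknown,
): Envelope<O> {
  const parsed = parse(raw);
  if (parsed.success === false) {
    // Input failure maps to INVALID_PARAMS with the issues attached.
    return {
      ok: false,
      error: { code: "INVALID_PARAMS", message: "input validation failed", details: { issues: parsed.issues } },
    };
  }
  try {
    // The handler result is wrapped untouched; its shape is not inspected.
    return { ok: true, data: handler(parsed.value) };
  } catch (err) {
    // Any handler throw maps to HANDLER_ERROR.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: { code: "HANDLER_ERROR", message } };
  }
}

// Hypothetical input parser for a tool that takes { n: number }.
function parseN(raw: unknown): ParseResult<{ n: number }> {
  const n = (raw as { n?: unknown } | null)?.n;
  return typeof n === "number"
    ? { success: true, value: { n } }
    : { success: false, issues: ["n must be a number"] };
}

const success = runWithEnvelope(parseN, (input) => ({ doubled: input.n * 2 }), { n: 21 });
const badInput = runWithEnvelope(parseN, (input) => ({ doubled: input.n * 2 }), { n: "x" });
const thrown = runWithEnvelope(parseN, (): { doubled: number } => { throw new Error("boom"); }, { n: 1 });
```

In this sketch the success path never looks inside `data`, which matches the post-fix position in I-4 that handler return shapes pass through unchanged.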
§2. Non-invariants (explicitly out of scope)
- N-1. Output runtime validation. Post-fix, `outputSchema` is a TypeScript-only artifact for the listed tools. Handlers may return any value; the wire layer does not enforce shape. A future slice (call it B1-extended) may layer Zod-driven output validation inside the middleware with a typed `OUTPUT_VALIDATION_FAILED` error code; that work is not in this slice.
- N-2. Per-tool output documentation updates. The slice doc + tool description strings already describe the response shape; no doc rewrite is required.
- N-3. Existing handler logic. Any latent bug in a handler’s return shape is preserved by this slice; the regression test only verifies the SDK doesn’t reject a well-shaped handler return — it does not verify handler correctness.
§3. Required test coverage
| Test | File | Asserts |
|---|---|---|
| `outputSchema`-declared tool round-trips through the SDK without -32602 | `src/__tests__/server.test.ts` (new `it` inside `describe('5-stage middleware (InMemoryTransport end-to-end)')`) | Registers a tool with both `inputSchema` and `outputSchema`, invokes it via `Client.callTool`, asserts `response.structuredContent` matches `{ok:true, data:{<expected handler payload>}}` and that the call does not throw. |

This is the load-bearing regression test. It must FAIL on the pre-fix tree and PASS on the post-fix tree. The existing acceptance test at server.test.ts:441 is preserved unchanged.
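
Why the test should flip from FAIL to PASS can be shown with a toy model. The mechanism below is an assumption inferred from this contract's title and AC-1, not a reproduction of the MCP SDK: `toySdkCall`, `Schema`, `WireResponse`, and `scoreOutputSchema` are hypothetical, and the schema is a plain predicate rather than a Zod object. The model assumes the wire layer checks `structuredContent` against a forwarded `outputSchema`, while the schema describes the bare handler payload and not the `{ok, data}` envelope.

```typescript
// Toy schema: a predicate over unknown values (stand-in for a Zod schema).
type Schema = (value: unknown) => boolean;

type WireResponse =
  | { kind: "result"; structuredContent: unknown }
  | { kind: "error"; code: number; message: string };

// JSON-RPC "Invalid params" error code, the one named throughout this contract.
const INVALID_PARAMS_CODE = -32602;

// Hypothetical wire layer: validates structuredContent only when a schema was forwarded.
function toySdkCall(outputSchema: Schema | undefined, structuredContent: unknown): WireResponse {
  if (outputSchema !== undefined && !outputSchema(structuredContent)) {
    return {
      kind: "error",
      code: INVALID_PARAMS_CODE,
      message: "structured content does not match outputSchema",
    };
  }
  return { kind: "result", structuredContent };
}

// Handler-level schema: describes the bare payload { score: number }, not the envelope.
const scoreOutputSchema: Schema = (value) =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as { score?: unknown }).score === "number";

// What the middleware actually puts on the wire: the payload wrapped in an envelope.
const envelope = { ok: true, data: { score: 0.9 } };

// Pre-fix: the schema is forwarded, so the envelope is checked against a payload schema.
const preFix = toySdkCall(scoreOutputSchema, envelope);

// Post-fix: the schema is not forwarded, so the envelope passes through unchecked.
const postFix = toySdkCall(undefined, envelope);
```

Under these assumptions `preFix` carries code -32602 and `postFix` carries the envelope intact, which is the pre-fix/post-fix split the regression test is meant to pin down against the real SDK.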
§4. Acceptance criteria
- AC-1. `git diff origin/main..HEAD -- src/server.ts` shows a single-region change in `registerColibriTool` removing the conditional outputSchema spread on the SDK `registerTool` config object (audit §2.2).
- AC-2. `git diff origin/main..HEAD -- src/__tests__/server.test.ts` shows one new `it` inside the InMemoryTransport e2e block (audit §5).
- AC-3. `npm run build` exits 0 with no new warnings.
- AC-4. `npm run lint` exits 0 with no new warnings.
- AC-5. `npm test` exits 0; suite count = 79 (no removed/skipped suites); test count ≥ 3493 (the 3492 existing tests plus the new regression test).
- AC-6. The PR body documents the bug + reproduction + fix + the live tool surface unblocked.
- AC-7. Writeback per CLAUDE.md §7: `thought_record(reflection)` followed by `task_update(status="DONE")`.
§5. Risks
- R-1. A test elsewhere in the suite might depend on the SDK enforcing `outputSchema` and would silently pass when validation is gone. Mitigation: grep `outputSchema` in the test corpus and audit each hit.
- R-2. The SDK could in future enforce `outputSchema` purely on the typed return of `registerTool`’s handler arg without inspecting `structuredContent`. That’s a wider-API question and not part of this slice.
- R-3. A downstream consumer of the `*.OutputSchema` Zod exports might be using them for runtime validation (e.g. in a sub-agent dispatcher). Grep `OutputSchema` across `src/**` to confirm only TypeScript-type use, not `parse()`/`safeParse()` consumption.
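
The R-3 check can be approximated in code as well as with grep. The sketch below is illustrative only: `findRuntimeSchemaUse`, `Hit`, and the in-memory `sources` map are hypothetical, and the regex only catches direct calls on an identifier ending in `OutputSchema`, so aliased or re-exported schemas still need a manual look.

```typescript
interface Hit {
  file: string;
  line: number; // 1-based line number within the file
  text: string; // trimmed source line
}

// Matches e.g. FooOutputSchema.parse( or FooOutputSchema.safeParseAsync(
// No global flag, so RegExp.test stays stateless across calls.
const RUNTIME_USE = /\b\w*OutputSchema\s*\.\s*(?:safeParse|parse)(?:Async)?\s*\(/;

// Scans file contents (path -> source text) for runtime use of output schemas.
function findRuntimeSchemaUse(sources: Record<string, string>): Hit[] {
  const hits: Hit[] = [];
  for (const file of Object.keys(sources)) {
    const lines = sources[file].split("\n");
    lines.forEach((text, index) => {
      if (RUNTIME_USE.test(text)) {
        hits.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return hits;
}

// Hypothetical inputs: one type-only use (fine) and one runtime use (flagged).
const sampleSources: Record<string, string> = {
  "types.ts": "type Score = z.infer<typeof RouterScoreOutputSchema>;",
  "dispatcher.ts": "const x = 1;\nconst r = VrfEvalOutputSchema.safeParse(payload);",
};

const runtimeHits = findRuntimeSchemaUse(sampleSources);
```

An empty result is consistent with TypeScript-type-only use but does not prove it; any hit is a consumer to review before the slice lands.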
Proceeding to packet.