Contract: P0.8.3 η Merkle Root Finalization Tools
Task: P0.8.3 — three new MCP tools: audit_session_start, merkle_finalize, merkle_root
Depends on: P0.8.1 (primitives) + migration 006_eta.sql (introduced here)
Consumed by: writeback protocol (CLAUDE.md §7); future P0.7.3 audit_verify_chain.
1. Scope
A thin SQLite-backed service layer that allows agents to:
- Start an audit session and obtain a stable session_id.
- Record thought records against that session (reusing thought_record with an added session_id argument; see §4.2).
- Finalize the session by computing a Merkle root over every record’s hash field (ordered by rowid ASC) and persisting that root.
- Retrieve the finalized root + record count + timestamp.
The Merkle primitives (buildMerkleTree, …) are consumed from
src/domains/proof/merkle.ts unchanged. This task is pure glue: SQLite storage, input
validation, and MCP registration.
2. Tool surface
2.1 audit_session_start
Input schema (Zod)
z.object({
  intent: z.string().min(1, 'intent must not be empty'),
  task_id: z.string().min(1).optional(),
  session_id: z.string().min(1).optional(), // escape hatch for reproducible tests
})
Output (data payload — the α middleware wraps in {ok, data, ...})
{
  session_id: string; // UUID v4 by default (or the injected one)
  intent: string;
  task_id: string | null;
  started_at: string; // ISO-8601 UTC
}
Semantics
- Generates a fresh UUID v4 via randomUUID() unless session_id is explicitly provided (tests lean on this).
- Inserts exactly one row into audit_sessions.
- Repeat calls are rejected, not idempotent: a second call with the same provided session_id triggers a PK UNIQUE violation, mapped to ERR_SESSION_EXISTS.
- task_id is opaque: no FK, no validation against the tasks table (soft-delete semantics there make FK unreliable). Policy decision: caller owns referential integrity.
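The start semantics above can be sketched roughly as follows. This is a minimal illustration, not the shipped handler: a Map stands in for the audit_sessions table, a key collision stands in for the PK UNIQUE violation, and the function signature and result type are assumptions made for the example.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory stand-in for the audit_sessions table, keyed by its PK.
type AuditSession = {
  session_id: string;
  intent: string;
  task_id: string | null;
  started_at: string;
};

type StartResult =
  | { ok: true; data: AuditSession }
  | { ok: false; error: { code: "ERR_SESSION_EXISTS"; message: string } };

function startAuditSession(
  table: Map<string, AuditSession>,
  input: { intent: string; task_id?: string; session_id?: string },
): StartResult {
  // Use the injected id when present (reproducible tests), else a fresh UUID v4.
  const session_id = input.session_id ?? randomUUID();
  // A key collision plays the role of the PK UNIQUE violation.
  if (table.has(session_id)) {
    return {
      ok: false,
      error: { code: "ERR_SESSION_EXISTS", message: `session ${session_id} already exists` },
    };
  }
  const row: AuditSession = {
    session_id,
    intent: input.intent,
    task_id: input.task_id ?? null, // opaque: never checked against the tasks table
    started_at: new Date().toISOString(), // ISO-8601 UTC
  };
  table.set(session_id, row);
  return { ok: true, data: row };
}
```

In the real tool the handler would return only the data payload and leave the envelope to the α middleware; the sketch folds both together to keep the example short.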
2.2 merkle_finalize
Input schema (Zod)
z.object({
  session_id: z.string().min(1),
})
Output (success)
{
  session_id: string;
  root: string; // 64-char lowercase hex
  record_count: number; // non-negative
  finalized_at: string; // ISO-8601 UTC
}
Semantics
- Verifies the session exists (rejects with ERR_SESSION_NOT_FOUND otherwise).
- Verifies the session has NOT already been finalized (rejects with ERR_ALREADY_FINALIZED otherwise; backed by PK UNIQUE on merkle_roots.session_id).
- Reads all thought records WHERE session_id = ? ORDER BY rowid ASC.
- Rejects (AC4) with ERR_NO_RECORDS if count == 0. The CLAUDE.md §7 ordering rule “thought_record MUST precede merkle_finalize” is enforced here.
- Builds a Merkle tree via buildMerkleTree(hashes) from P0.8.1.
- Inserts a row into merkle_roots (session_id, root, record_count, finalized_at).
- Returns the row.
All of the steps above run inside a single db.transaction so concurrent finalize races
cannot double-commit.
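As a rough sketch of the check order described above, the following uses in-memory collections in place of the three tables and takes the root builder as a parameter so that the P0.8.1 buildMerkleTree is not re-implemented here. The Store shape, the signature, and the result type are assumptions for illustration; the real function takes a Database handle, and the transaction is not modelled (a synchronous function body is the closest analogue).

```typescript
// Hypothetical in-memory stand-ins for audit_sessions, thought_records, merkle_roots.
type RootRow = { session_id: string; root: string; record_count: number; finalized_at: string };
type Store = {
  sessions: Set<string>; // audit_sessions primary keys
  records: { session_id: string; hash: string }[]; // array order stands in for rowid order
  roots: Map<string, RootRow>; // merkle_roots keyed by session_id
};
type FinalizeCode = "ERR_SESSION_NOT_FOUND" | "ERR_ALREADY_FINALIZED" | "ERR_NO_RECORDS";
type FinalizeResult = { ok: true; data: RootRow } | { ok: false; error: { code: FinalizeCode } };

function finalizeMerkleRoot(
  store: Store,
  session_id: string,
  buildRoot: (hashes: string[]) => string, // injected stand-in for buildMerkleTree
): FinalizeResult {
  if (!store.sessions.has(session_id)) {
    return { ok: false, error: { code: "ERR_SESSION_NOT_FOUND" } };
  }
  if (store.roots.has(session_id)) {
    return { ok: false, error: { code: "ERR_ALREADY_FINALIZED" } };
  }
  // Equivalent of: WHERE session_id = ? ORDER BY rowid ASC
  const hashes = store.records.filter((r) => r.session_id === session_id).map((r) => r.hash);
  if (hashes.length === 0) {
    return { ok: false, error: { code: "ERR_NO_RECORDS" } }; // AC4
  }
  const row: RootRow = {
    session_id,
    root: buildRoot(hashes),
    record_count: hashes.length,
    finalized_at: new Date().toISOString(),
  };
  store.roots.set(session_id, row);
  return { ok: true, data: row };
}
```

Injecting the builder keeps the sketch testable with a trivial stub while leaving the Merkle primitive untouched.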
2.3 merkle_root
Input schema (Zod)
z.object({
  session_id: z.string().min(1),
})
Output (success)
{
  session_id: string;
  root: string;
  record_count: number;
  finalized_at: string;
}
Semantics
- SELECT * FROM merkle_roots WHERE session_id = ?.
- If no row: return {ok:false, error:{code: 'ERR_NOT_FINALIZED', ...}}.
- Otherwise return the row as-is.
3. Error codes
| Code | Source | Meaning |
|---|---|---|
| INVALID_PARAMS | α middleware stage 2 (Zod) | Malformed input (empty string, wrong type) |
| ERR_SESSION_EXISTS | audit_session_start | Provided session_id already exists |
| ERR_SESSION_NOT_FOUND | merkle_finalize | Session never started (no audit_sessions row) |
| ERR_ALREADY_FINALIZED | merkle_finalize | Session already has a merkle_roots row |
| ERR_NO_RECORDS | merkle_finalize | Session has zero thought records (AC4) |
| ERR_NOT_FINALIZED | merkle_root | Session exists but has no merkle_roots row |
Tasks and the writeback layer already use the {ok:false, error:{code,message,...}}
envelope pattern (src/domains/tasks/repository.ts:700-710). We follow the same.
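One way to picture the envelope and the code list is as a discriminated union, shown here with a merkle_root-style lookup over a Map. This is a sketch under assumptions: the helper names ok and fail are hypothetical, only the code and message fields of the error object are modelled, and the Map replaces the merkle_roots table.

```typescript
// The six codes from the table above.
type ErrorCode =
  | "INVALID_PARAMS"
  | "ERR_SESSION_EXISTS"
  | "ERR_SESSION_NOT_FOUND"
  | "ERR_ALREADY_FINALIZED"
  | "ERR_NO_RECORDS"
  | "ERR_NOT_FINALIZED";

// Success and failure share the `ok` discriminant, so callers can narrow on it.
type Envelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: ErrorCode; message: string } };

// Hypothetical helpers for building each variant.
function ok<T>(data: T): Envelope<T> {
  return { ok: true, data };
}
function fail(code: ErrorCode, message: string): Envelope<never> {
  return { ok: false, error: { code, message } };
}

type RootRow = { session_id: string; root: string; record_count: number; finalized_at: string };

// merkle_root-style lookup: return the row as-is, or ERR_NOT_FINALIZED when absent.
function getMerkleRoot(roots: Map<string, RootRow>, session_id: string): Envelope<RootRow> {
  const row = roots.get(session_id);
  return row ? ok(row) : fail("ERR_NOT_FINALIZED", `session ${session_id} has no merkle_roots row`);
}
```

Because the union is keyed on ok, a caller that checks result.ok gets either data or error.code with full type narrowing.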
4. Invariants
4.1 Deterministic root
For the same set of thought-record hashes (any insertion order), merkle_finalize
MUST always produce the same root. This is guaranteed by the
sortLeaves: true + sortPairs: true options already baked into
buildMerkleTree. We add one test that re-inserts the same 5 hashes into a
different session in a different order and asserts equal roots.
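The order-independence argument can be illustrated with a small stand-in root builder. It is explicitly not the P0.8.1 buildMerkleTree: the SHA-256 node hash, the byte-wise sorting, and the handling of an odd node (promoted unchanged) are assumptions chosen only to show why sorting leaves and pairs removes any dependence on insertion order.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer => createHash("sha256").update(data).digest();

// Stand-in root builder (assumption, not the P0.8.1 implementation).
function sketchRoot(hexLeaves: string[]): string {
  if (hexLeaves.length === 0) throw new Error("no leaves");
  // Analogue of sortLeaves: order the leaves by byte value before building.
  let level: Buffer[] = hexLeaves.map((h) => Buffer.from(h, "hex")).sort(Buffer.compare);
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = level[i + 1];
      if (right === undefined) {
        next.push(left); // odd node promoted unchanged (assumption)
        continue;
      }
      // Analogue of sortPairs: hash each pair in sorted order.
      const pair = [left, right].sort(Buffer.compare);
      next.push(sha256(Buffer.concat(pair)));
    }
    level = next;
  }
  return level[0].toString("hex");
}
```

Since the leaves are sorted before the first level is built, any permutation of the same hash set enters the loop in the same order and therefore produces the same root.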
4.2 session_id column on thought_records
Migration 005 adds a nullable session_id TEXT column. Existing rows receive NULL.
The P0.7.2 API is unchanged — it does not require session_id and does not read it.
New consumers (including tests within this task) may insert thought records with a
session_id by using raw SQL, because createThoughtRecord does not expose the
field. This task does NOT change createThoughtRecord — extending ζ is out of scope.
4.3 session_id is NOT part of the hash
Per P0.7.1 contract §3 and the hash-subset rule at src/domains/trail/schema.ts:170-188,
only {id, type, task_id, content, timestamp, prev_hash} participate in the hash.
Adding session_id to the record does NOT modify the hash. The Merkle tree built by
merkle_finalize uses each record’s existing hash value, not a re-computation
that includes session_id.
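A minimal sketch of the hash-subset rule follows. The field list is taken from the text above; the serialization (a JSON array of key/value pairs) and the use of SHA-256 are assumptions for illustration and may differ from what src/domains/trail/schema.ts actually does.

```typescript
import { createHash } from "node:crypto";

type ThoughtRecord = {
  id: string;
  type: string;
  task_id: string | null;
  content: string;
  timestamp: string;
  prev_hash: string | null;
  session_id?: string | null; // present on the row, excluded from the hash
};

// Only these fields participate in the hash.
const HASHED_FIELDS = ["id", "type", "task_id", "content", "timestamp", "prev_hash"] as const;

// Hypothetical hash function: serialization format is an assumption.
function sketchRecordHash(record: ThoughtRecord): string {
  const subset = HASHED_FIELDS.map((key) => [key, record[key]]);
  return createHash("sha256").update(JSON.stringify(subset)).digest("hex");
}
```

Because session_id is never read when building the subset, populating it on an existing row leaves the stored hash, and therefore the ζ chain, valid.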
4.4 Transport-first / Phase 2 DB open
All three handlers lazy-resolve getDb() at call time. This matches the pattern
established in src/domains/trail/repository.ts:377. It is safe because MCP tool
calls cannot arrive until after server.connect(transport) completes, which precedes
initDb() in the Phase 0 startup contract (src/startup.ts).
4.5 ORDER BY rowid ASC, not created_at
All queries that iterate records within a session use ORDER BY rowid ASC. This is
the P0.7.2 lesson memorialized in src/domains/trail/repository.ts:220 — SQLite’s
rowid is the only ordering primitive immune to ISO-8601 millisecond ties on fast CI.
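The tie problem can be shown with two rows that share a millisecond timestamp. The row shape and the values are made up for the example; the point is only that a comparator on created_at returns 0 for such rows, so the result depends on input order, while rowid gives a strict total order.

```typescript
type Row = { rowid: number; created_at: string; hash: string };

// Comparator on the timestamp: ties compare as equal, so a stable sort keeps input order.
function orderByCreatedAt(rows: Row[]): Row[] {
  return [...rows].sort((a, b) =>
    a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0,
  );
}

// Comparator on rowid: rowids are unique, so the order is fully determined.
function orderByRowid(rows: Row[]): Row[] {
  return [...rows].sort((a, b) => a.rowid - b.rowid);
}

// Two records written within the same millisecond (illustrative values).
const firstRow: Row = { rowid: 1, created_at: "2024-01-01T00:00:00.000Z", hash: "aa" };
const secondRow: Row = { rowid: 2, created_at: "2024-01-01T00:00:00.000Z", hash: "bb" };
```

Feeding the two rows in reverse order to orderByCreatedAt leaves them reversed, whereas orderByRowid restores insertion order either way.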
4.6 Purity
The repository-style functions (startAuditSession, finalizeMerkleRoot,
getMerkleRoot) take a Database handle as first argument. No module-level DB
state. No env reads at import time. Pure enough to test against :memory: with no
singleton setup.
4.7 No side-effects at import time
Importing src/tools/merkle.ts MUST NOT open a DB, read a file, or mutate
process.*. This matches every other tool module in the repo.
5. Acceptance criteria → tests mapping
| AC | Test (in src/__tests__/tools/merkle.test.ts) |
|---|---|
| AC1: merkle_finalize builds Merkle tree of last N unfinalized records, stores root | describe('merkle_finalize — happy path') + describe('merkle_finalize — persistence') |
| AC2: merkle_root returns current root hash + record count + timestamp | describe('merkle_root — happy path') — asserts all three fields |
| AC3: audit_session_start creates audit session record, returns session_id | describe('audit_session_start') — asserts UUID v4 shape + row insertion |
| AC4: Finalization must happen AFTER final thought record (errors if no thought_record) | it('rejects session with zero records') in describe('merkle_finalize — AC4') |
| AC5: Test: finalize 5-record session → root matches manual computation | describe('merkle_finalize — 5-record manual computation') — the acceptance-headline test |
Additional invariants tested (beyond AC):
- I1 (§4.1): Deterministic root under shuffled-insert order.
- I2 (§4.2): thought_records.session_id column is added + indexed after migration 005 runs.
- I3 (§4.3): ζ hash chain remains valid after session_id is populated.
- I4: Already-finalized session rejects second finalize.
- I5: merkle_root on unfinalized session returns ERR_NOT_FINALIZED.
- I6: Tool-registration duplicate guard: second registerMerkleTools(ctx) throws.
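The duplicate guard in I6 might look roughly like the sketch below. The registry type is a hypothetical stand-in for ctx, whose real shape is not described in this contract, and the handlers are placeholders.

```typescript
type ToolHandler = (input: unknown) => unknown;
type ToolRegistry = Map<string, ToolHandler>; // hypothetical stand-in for ctx

const MERKLE_TOOLS = ["audit_session_start", "merkle_finalize", "merkle_root"] as const;

function registerMerkleTools(registry: ToolRegistry): void {
  // Check every name first so that a rejected call registers nothing.
  for (const name of MERKLE_TOOLS) {
    if (registry.has(name)) {
      throw new Error(`tool already registered: ${name}`);
    }
  }
  for (const name of MERKLE_TOOLS) {
    // Placeholder handler; the real handlers are out of scope for this sketch.
    registry.set(name, () => {
      throw new Error(`${name}: placeholder handler`);
    });
  }
}
```

Checking all names before registering any keeps the registry unchanged when the second call throws.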
6. Migration 005 (exact body)
-- 006_eta — η Proof Store tables + ζ session linkage (P0.8.3).
--
-- Introduces:
-- audit_sessions — intent/task_id/started_at, one row per audit_session_start call
-- merkle_roots — one row per finalized session; PK is session_id
--
-- Also adds:
-- thought_records.session_id (nullable TEXT) — grouping key for finalize
-- idx_trail_session on thought_records(session_id, rowid) — finalize lookup path
--
-- Does NOT foreign-key between tables. audit_sessions.task_id is opaque (β soft-delete
-- makes FK unreliable). merkle_roots.session_id is not FK'd to audit_sessions because
-- we enforce referential integrity in handler logic inside a single db.transaction.
CREATE TABLE audit_sessions (
  session_id TEXT PRIMARY KEY,
  intent TEXT NOT NULL,
  task_id TEXT,
  started_at TEXT NOT NULL
);
CREATE INDEX idx_audit_sessions_task ON audit_sessions(task_id);
CREATE TABLE merkle_roots (
  session_id TEXT PRIMARY KEY,
  root TEXT NOT NULL,
  record_count INTEGER NOT NULL,
  finalized_at TEXT NOT NULL
);
ALTER TABLE thought_records ADD COLUMN session_id TEXT;
CREATE INDEX idx_trail_session ON thought_records(session_id);
(The rowid is not listed in the CREATE INDEX column list. On an ordinary rowid
table SQLite already stores the rowid at the end of every index entry, so
entries that share a session_id are kept in rowid order. A session-scoped
ORDER BY rowid ASC query can therefore use this index for the
WHERE session_id = ? filter and read rows in rowid order without a separate
sort, which is fine for our per-session row counts.)
Migration is idempotent per initDb()’s user_version gating. Running it twice is a
no-op — the runner checks PRAGMA user_version and skips already-applied migrations.
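The gating behaviour can be sketched as follows. The actual runner inside initDb() is not shown in this contract, so the types and function here are assumptions: a plain number field stands in for PRAGMA user_version, and each migration is a callback.

```typescript
type Migration = { version: number; apply: () => void };
type VersionedDb = { userVersion: number }; // stand-in for PRAGMA user_version

// Applies pending migrations in version order and returns the versions applied.
function runMigrations(db: VersionedDb, migrations: Migration[]): number[] {
  const applied: number[] = [];
  const ordered = [...migrations].sort((a, b) => a.version - b.version);
  for (const migration of ordered) {
    if (migration.version <= db.userVersion) continue; // already applied: skip
    migration.apply();
    db.userVersion = migration.version; // record progress after each migration
    applied.push(migration.version);
  }
  return applied;
}
```

Running the same list a second time returns an empty array, because every version is already at or below the recorded user_version. This matters for the migration above: its CREATE TABLE and ALTER TABLE statements are not individually re-runnable, so the version gate is what makes a second run a no-op.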
7. Non-goals
- No audit session end tool in P0.8.3. The task-breakdown demands only audit_session_start + merkle_finalize + merkle_root. Ending a session is implicit: finalize seals it. A dedicated audit_session_end tool belongs to a later task.
- No session metadata mutation. Once started, audit_sessions.intent and started_at are read-only at this layer.
- No proof retrieval at the MCP layer yet: generateProof / verifyProof are called only inside the test suite. A future merkle_proof tool will wire them up (not in scope here).
- No multi-session finalize, no “last N unfinalized records” cross-session roll-up. One session → one root.
- No HTTP / REST surface; this is an MCP tool registration only.
- No modifications to P0.8.1 (src/domains/proof/merkle.ts) or P0.8.2 (src/domains/proof/retention.ts; not shipped, but claimed by the other agent).
8. Writeback note (CLAUDE.md §7 compliance)
The writeback protocol requires a final thought_record BEFORE merkle_finalize
for proof-grade tasks. The three tools shipped here are exactly the primitives that
make that protocol executable. The tools themselves are not proof-grade: they are
infrastructure. However, this task’s own writeback (PR body + final message)
follows the ordering rule: the summary is recorded BEFORE any merkle_finalize
call on this session.
Since no live MCP client is running during executor work, the thought_record
actually materializes as the PR body + the summary in the final report — same
as R75 Wave E tasks.