Contract: P0.8.3 η Merkle Root Finalization Tools

Task: P0.8.3, three new MCP tools: audit_session_start, merkle_finalize, merkle_root
Depends on: P0.8.1 (primitives) + migration 006_eta.sql (earned here)
Consumed by: writeback protocol (CLAUDE.md §7); future P0.7.3 audit_verify_chain.


1. Scope

A thin SQLite-backed service layer that allows agents to:

  1. Start an audit session and obtain a stable session_id.
  2. Record thought records against that session (thought_records rows carry a new nullable session_id column; see §4.2 for how it is populated in this task).
  3. Finalize the session by computing a Merkle root over every record’s hash field (ordered by rowid ASC) and persisting that root.
  4. Retrieve the finalized root + record count + timestamp.

The Merkle primitives (buildMerkleTree, …) are consumed from src/domains/proof/merkle.ts unchanged. This task is pure glue: SQLite storage, input validation, and MCP registration.


2. Tool surface

2.1 audit_session_start

Input schema (Zod)

z.object({
  intent: z.string().min(1, 'intent must not be empty'),
  task_id: z.string().min(1).optional(),
  session_id: z.string().min(1).optional(), // escape hatch for reproducible tests
})

Output (data payload — the α middleware wraps in {ok, data, ...})

{
  session_id: string;  // UUID v4 by default (or the injected one)
  intent: string;
  task_id: string | null;
  started_at: string;  // ISO-8601 UTC
}

Semantics

  • Generates a fresh UUID v4 via randomUUID() unless session_id is explicitly provided (tests lean on this).
  • Inserts exactly one row into audit_sessions.
  • Not idempotent for a provided session_id: a second call with the same session_id triggers a PRIMARY KEY uniqueness violation, which is mapped to ERR_SESSION_EXISTS.
  • task_id is opaque — no FK, no validation against tasks table (soft-delete semantics there make FK unreliable). Policy decision: caller owns referential integrity.
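
The semantics above can be sketched as a repository-style function. This is a minimal sketch under stated assumptions, not the shipped implementation: the Map stands in for the audit_sessions table, the Result type approximates the envelope that the α middleware produces, and the function signature is only inferred from §4.6.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical row shape mirroring the audit_sessions columns.
type AuditSessionRow = {
  session_id: string;
  intent: string;
  task_id: string | null;
  started_at: string;
};

// Hypothetical approximation of the {ok, data} / {ok, error} envelope.
type Result<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

// Assumption: the real function receives a SQLite Database handle, not a Map.
function startAuditSession(
  table: Map<string, AuditSessionRow>,
  input: { intent: string; task_id?: string; session_id?: string },
): Result<AuditSessionRow> {
  // Fresh UUID v4 unless the caller injected an id (the test escape hatch).
  const session_id = input.session_id ?? randomUUID();
  if (table.has(session_id)) {
    // Stands in for the PRIMARY KEY violation on audit_sessions.session_id.
    return {
      ok: false,
      error: { code: "ERR_SESSION_EXISTS", message: `session ${session_id} already exists` },
    };
  }
  const row: AuditSessionRow = {
    session_id,
    intent: input.intent,
    task_id: input.task_id ?? null, // opaque: never checked against a tasks table
    started_at: new Date().toISOString(), // ISO-8601 UTC
  };
  table.set(session_id, row); // exactly one row per successful call
  return { ok: true, data: row };
}
```

Injecting session_id keeps tests reproducible, at the cost of making the caller responsible for uniqueness.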

2.2 merkle_finalize

Input schema (Zod)

z.object({
  session_id: z.string().min(1),
})

Output (success)

{
  session_id: string;
  root: string;         // 64-char lowercase hex
  record_count: number; // non-negative
  finalized_at: string; // ISO-8601 UTC
}

Semantics

  1. Verifies the session exists (reject with ERR_SESSION_NOT_FOUND otherwise).
  2. Verifies the session has NOT already been finalized (reject with ERR_ALREADY_FINALIZED otherwise — PK UNIQUE on merkle_roots.session_id).
  3. Reads all thought records WHERE session_id = ? ORDER BY rowid ASC.
  4. Rejects (AC4) with ERR_NO_RECORDS if count == 0. The CLAUDE.md §7 ordering rule “thought_record MUST precede merkle_finalize” is enforced here.
  5. Builds a Merkle tree via buildMerkleTree(hashes) from P0.8.1.
  6. Inserts a row into merkle_roots (session_id, root, record_count, finalized_at).
  7. Returns the row.

All seven steps run inside a single db.transaction, so concurrent finalize calls cannot double-commit.
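
The check ordering can be illustrated with an in-memory stand-in. This is a sketch only: the Store type, the FinalizeError class, and the buildRoot parameter are hypothetical, and the real handler would run the same steps against SQLite inside db.transaction and call buildMerkleTree from P0.8.1.

```typescript
type ThoughtRecord = { rowid: number; session_id: string | null; hash: string };
type MerkleRootRow = {
  session_id: string;
  root: string;
  record_count: number;
  finalized_at: string;
};

// Hypothetical in-memory stand-in for the three tables finalize touches.
type Store = {
  sessions: Set<string>;
  records: ThoughtRecord[];
  roots: Map<string, MerkleRootRow>;
};

// Hypothetical error type carrying one of the codes from §3.
class FinalizeError extends Error {
  readonly code: string;
  constructor(code: string) {
    super(code);
    this.code = code;
  }
}

function finalizeMerkleRoot(
  store: Store,
  sessionId: string,
  buildRoot: (hashes: string[]) => string, // stand-in for the P0.8.1 primitive
): MerkleRootRow {
  if (!store.sessions.has(sessionId)) throw new FinalizeError("ERR_SESSION_NOT_FOUND");
  if (store.roots.has(sessionId)) throw new FinalizeError("ERR_ALREADY_FINALIZED");
  // Equivalent of WHERE session_id = ? ORDER BY rowid ASC.
  const hashes = store.records
    .filter((r) => r.session_id === sessionId)
    .sort((a, b) => a.rowid - b.rowid)
    .map((r) => r.hash);
  if (hashes.length === 0) throw new FinalizeError("ERR_NO_RECORDS");
  const row: MerkleRootRow = {
    session_id: sessionId,
    root: buildRoot(hashes),
    record_count: hashes.length,
    finalized_at: new Date().toISOString(),
  };
  store.roots.set(sessionId, row); // the persisted row is also the return value
  return row;
}
```

The existence check precedes the finalized check, which precedes the record-count check, so each error code has a single unambiguous cause.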

2.3 merkle_root

Input schema (Zod)

z.object({
  session_id: z.string().min(1),
})

Output (success)

{
  session_id: string;
  root: string;
  record_count: number;
  finalized_at: string;
}

Semantics

  • SELECT * FROM merkle_roots WHERE session_id = ?.
  • If no row: return {ok:false, error:{code: 'ERR_NOT_FINALIZED', ...}}.
  • Otherwise return the row as-is.
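
A minimal sketch of the lookup, assuming a Map in place of the merkle_roots table and a hypothetical Envelope type for the response shape. Because the lookup consults only merkle_roots, a session_id that was never started yields the same ERR_NOT_FINALIZED code in this sketch.

```typescript
type MerkleRootRow = {
  session_id: string;
  root: string;
  record_count: number;
  finalized_at: string;
};

// Hypothetical approximation of the response envelope.
type Envelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

// Assumption: the real function runs SELECT * FROM merkle_roots WHERE session_id = ?.
function getMerkleRoot(
  roots: Map<string, MerkleRootRow>,
  sessionId: string,
): Envelope<MerkleRootRow> {
  const row = roots.get(sessionId);
  if (row === undefined) {
    return {
      ok: false,
      error: { code: "ERR_NOT_FINALIZED", message: `no root for session ${sessionId}` },
    };
  }
  return { ok: true, data: row }; // returned as stored, no recomputation
}
```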

3. Error codes

| Code | Source | Meaning |
| --- | --- | --- |
| INVALID_PARAMS | α middleware stage 2 (Zod) | Malformed input (empty string, wrong type) |
| ERR_SESSION_EXISTS | audit_session_start | Provided session_id already exists |
| ERR_SESSION_NOT_FOUND | merkle_finalize | Session never started (no audit_sessions row) |
| ERR_ALREADY_FINALIZED | merkle_finalize | Session already has a merkle_roots row |
| ERR_NO_RECORDS | merkle_finalize | Session has zero thought records (AC4) |
| ERR_NOT_FINALIZED | merkle_root | Session exists but has no merkle_roots row |

Tasks and the writeback layer already use the {ok:false, error:{code,message,...}} envelope pattern (src/domains/tasks/repository.ts:700-710). We follow the same.


4. Invariants

4.1 Deterministic root

For the same set of thought-record hashes (any insertion order), merkle_finalize MUST always produce the same root. This is guaranteed by the sortLeaves: true + sortPairs: true options already baked into buildMerkleTree. We add one test that re-inserts the same 5 hashes into a different session in a different order and asserts equal roots.
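
To see why sorting removes the dependence on insertion order, consider the following stand-in root computation. It is not the P0.8.1 buildMerkleTree: the function name, the SHA-256 choice, and the rule that an unpaired node is promoted unchanged are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer => createHash("sha256").update(data).digest();

// Hypothetical stand-in: sorted leaves, sorted pairs, odd node promoted as-is.
function sortedMerkleRoot(hexLeaves: string[]): string {
  if (hexLeaves.length === 0) throw new Error("no leaves");
  // Sorting the leaves first erases the caller's insertion order.
  let level: Buffer[] = hexLeaves.map((h) => Buffer.from(h, "hex")).sort(Buffer.compare);
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      if (i + 1 === level.length) {
        next.push(level[i]); // unpaired node moves up unchanged (assumption)
        continue;
      }
      // Sorting each pair makes the parent independent of left/right position.
      const pair = [level[i], level[i + 1]].sort(Buffer.compare);
      next.push(sha256(Buffer.concat(pair)));
    }
    level = next;
  }
  return level[0].toString("hex");
}
```

Any permutation of the same leaf set collapses to the same sorted array before the first level is hashed, so the root is a function of the set alone.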

4.2 session_id column on thought_records

Migration 006 adds a nullable session_id TEXT column to thought_records. Existing rows receive NULL. The P0.7.2 API is unchanged: it does not require session_id and does not read it. New consumers (including tests within this task) may insert thought records with a session_id by using raw SQL, because createThoughtRecord does not expose the field. This task does NOT change createThoughtRecord; extending ζ is out of scope.

4.3 session_id is NOT part of the hash

Per P0.7.1 contract §3 and the hash-subset rule at src/domains/trail/schema.ts:170-188, only {id, type, task_id, content, timestamp, prev_hash} participate in the hash. Adding session_id to the record does NOT modify the hash. The Merkle tree built by merkle_finalize uses each record’s existing hash value, not a re-computation that includes session_id.
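
The following sketch illustrates the subset rule. The canonical form used here (JSON of the six fields in a fixed order, hashed with SHA-256) is an assumption for illustration; the real canonicalization lives in src/domains/trail/schema.ts and may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; only the first six fields feed the hash.
type ThoughtRecordLike = {
  id: string;
  type: string;
  task_id: string | null;
  content: string;
  timestamp: string;
  prev_hash: string | null;
  session_id?: string | null; // present on the row, excluded from the hash
};

// Assumed canonicalization: fixed-order JSON array of the hashed subset.
function recordHash(r: ThoughtRecordLike): string {
  const subset = [r.id, r.type, r.task_id, r.content, r.timestamp, r.prev_hash];
  return createHash("sha256").update(JSON.stringify(subset)).digest("hex");
}
```

Since session_id never enters the subset, populating it after the fact leaves every stored hash, and therefore the ζ chain, intact.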

4.4 Transport-first / Phase 2 DB open

All three handlers lazy-resolve getDb() at call time. This matches the pattern established in src/domains/trail/repository.ts:377. It is safe because MCP tool calls cannot arrive until after server.connect(transport) completes, which precedes initDb() in the Phase 0 startup contract (src/startup.ts).
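
A minimal sketch of the lazy-resolve pattern, with hypothetical stand-ins for getDb() and initDb() (the real ones are not shown in this contract). The point is only that the handle is looked up when the handler runs, not when it is created or registered.

```typescript
// Hypothetical stand-ins for the real DB singleton and its accessors.
type Db = { name: string };
let current: Db | null = null;

function initDb(): void {
  current = { name: ":memory:" };
}

function getDb(): Db {
  if (current === null) throw new Error("DB not initialized");
  return current;
}

// The handler captures no Db at creation time; it resolves one per call.
function makeHandler(): () => string {
  return () => getDb().name;
}
```

Creating the handler before initDb() is harmless in this sketch; only invoking it early would fail.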

4.5 ORDER BY rowid ASC, not created_at

All queries that iterate records within a session use ORDER BY rowid ASC. This is the P0.7.2 lesson memorialized in src/domains/trail/repository.ts:220 — SQLite’s rowid is the only ordering primitive immune to ISO-8601 millisecond ties on fast CI.

4.6 Purity

The repository-style functions (startAuditSession, finalizeMerkleRoot, getMerkleRoot) take a Database handle as first argument. No module-level DB state. No env reads at import time. Pure enough to test against :memory: with no singleton setup.

4.7 No side-effects at import time

Importing src/tools/merkle.ts MUST NOT open a DB, read a file, or mutate process.*. This matches every other tool module in the repo.


5. Acceptance criteria → tests mapping

| AC | Test (in src/__tests__/tools/merkle.test.ts) |
| --- | --- |
| AC1: merkle_finalize builds Merkle tree of last N unfinalized records, stores root | describe('merkle_finalize — happy path') + describe('merkle_finalize — persistence') |
| AC2: merkle_root returns current root hash + record count + timestamp | describe('merkle_root — happy path'); asserts all three fields |
| AC3: audit_session_start creates audit session record, returns session_id | describe('audit_session_start'); asserts UUID v4 shape + row insertion |
| AC4: Finalization must happen AFTER final thought record (errors if no thought_record) | it('rejects session with zero records') in describe('merkle_finalize — AC4') |
| AC5: Test: finalize 5-record session → root matches manual computation | describe('merkle_finalize — 5-record manual computation'); the acceptance-headline test |

Additional invariants tested (beyond AC):

  • I1 (§4.1): Deterministic root under shuffled-insert order.
  • I2 (§4.2): thought_records.session_id column is added + indexed after migration 006 runs.
  • I3 (§4.3): ζ hash chain remains valid after session_id is populated.
  • I4: Already-finalized session rejects second finalize.
  • I5: merkle_root on unfinalized session returns ERR_NOT_FINALIZED.
  • I6: Tool-registration duplicate guard — second registerMerkleTools(ctx) throws.
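
The duplicate guard in I6 can be sketched as follows. The ToolContext shape and the placeholder handlers are hypothetical; only the function name registerMerkleTools and the three tool names come from this contract.

```typescript
// Hypothetical registry context; the real ctx type is not shown in this contract.
type ToolContext = { tools: Map<string, () => unknown> };

const MERKLE_TOOLS = ["audit_session_start", "merkle_finalize", "merkle_root"];

function registerMerkleTools(ctx: ToolContext): void {
  // Check all names first so a failed call leaves the registry untouched.
  for (const name of MERKLE_TOOLS) {
    if (ctx.tools.has(name)) {
      throw new Error(`tool already registered: ${name}`);
    }
  }
  for (const name of MERKLE_TOOLS) {
    ctx.tools.set(name, () => undefined); // placeholder handler
  }
}
```

Keeping the guard on the context rather than in module state is consistent with the no-module-level-state rule in §4.6.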

6. Migration 006 (exact body)

-- 006_eta — η Proof Store tables + ζ session linkage (P0.8.3).
--
-- Introduces:
--   audit_sessions  — intent/task_id/started_at, one row per audit_session_start call
--   merkle_roots    — one row per finalized session; PK is session_id
--
-- Also adds:
--   thought_records.session_id (nullable TEXT) — grouping key for finalize
--   idx_trail_session on thought_records(session_id) — finalize lookup path
--
-- Does NOT foreign-key between tables. audit_sessions.task_id is opaque (β soft-delete
-- makes FK unreliable). merkle_roots.session_id is not FK'd to audit_sessions because
-- we enforce referential integrity in handler logic inside a single db.transaction.

CREATE TABLE audit_sessions (
  session_id TEXT PRIMARY KEY,
  intent     TEXT NOT NULL,
  task_id    TEXT,
  started_at TEXT NOT NULL
);

CREATE INDEX idx_audit_sessions_task ON audit_sessions(task_id);

CREATE TABLE merkle_roots (
  session_id    TEXT PRIMARY KEY,
  root          TEXT NOT NULL,
  record_count  INTEGER NOT NULL,
  finalized_at  TEXT NOT NULL
);

ALTER TABLE thought_records ADD COLUMN session_id TEXT;
CREATE INDEX idx_trail_session ON thought_records(session_id);

(The rowid is an implicit tie-breaker in SQLite and cannot appear in a CREATE INDEX column list. A session-scoped ORDER BY rowid ASC query still uses this index for the WHERE session_id = ? filter. SQLite index entries carry the rowid as their trailing key, so entries with the same session_id are already stored in rowid order, and any residual sort is negligible at our per-session row counts.)

Migration is idempotent per initDb()’s user_version gating. Running it twice is a no-op — the runner checks PRAGMA user_version and skips already-applied migrations.
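
The gating can be sketched as below. The VersionedDb and Migration types are hypothetical stand-ins: userVersion plays the role of PRAGMA user_version, and apply() stands in for executing a migration file. The real runner inside initDb() may be structured differently.

```typescript
// Hypothetical stand-ins for the DB handle and a migration file.
type VersionedDb = { userVersion: number };
type Migration = { version: number; apply: () => void };

// Applies only migrations newer than the stored version; returns what ran.
function runMigrations(db: VersionedDb, migrations: Migration[]): number[] {
  const applied: number[] = [];
  const ordered = [...migrations].sort((a, b) => a.version - b.version);
  for (const m of ordered) {
    if (m.version <= db.userVersion) continue; // already applied: skip
    m.apply();
    db.userVersion = m.version; // record progress after each success
    applied.push(m.version);
  }
  return applied;
}
```

The SQL body itself is not re-runnable (CREATE TABLE without IF NOT EXISTS would fail), so idempotence rests entirely on the version check.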


7. Non-goals

  • No audit session end tool in P0.8.3. The task-breakdown demands only audit_session_start + merkle_finalize + merkle_root. Ending a session is implicit: finalize seals it. A dedicated audit_session_end tool belongs to a later task.
  • No session metadata mutation. Once started, audit_sessions.intent and started_at are read-only at this layer.
  • No proof retrieval at the MCP layer yet — generateProof / verifyProof are called only inside the test suite. A future merkle_proof tool will wire them up (not in scope here).
  • No multi-session finalize, no “last N unfinalized records” cross-session roll-up. One session → one root.
  • No HTTP / REST surface — this is an MCP tool registration only.
  • No modifications to P0.8.1 (src/domains/proof/merkle.ts) or P0.8.2 (src/domains/proof/retention.ts — not shipped, but claimed by the other agent).

8. Writeback note (CLAUDE.md §7 compliance)

The writeback protocol requires a final thought_record BEFORE merkle_finalize for proof-grade tasks. The three tools shipped here are exactly the primitives that make that protocol executable. The tools themselves are not proof-grade: they are infrastructure. However, this task’s own writeback (PR body + final message) follows the ordering rule: the summary is recorded BEFORE any merkle_finalize call on this session.

Since no live MCP client is running during executor work, the thought_record actually materializes as the PR body + the summary in the final report — same as R75 Wave E tasks.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
