R89.A behavioral contract — session_id end-to-end binding

1. Goal statement

After R89.A, the canonical proof-grade chain documented in CLAUDE.md §7 returns a real Merkle root rather than ERR_NO_RECORDS:

audit_session_start({ intent }) → { session_id: S, … }
thought_record({ session_id: S, task_id: T, type, agent_id, content }) → { … }
…
thought_record({ session_id: S, task_id: T, type: 'reflection', … })   # final
merkle_finalize({ session_id: S }) → { session_id: S, root, record_count: N, finalized_at }

where N is the number of thought_record calls observed against session_id = S since the start of the session.

2. Definitions

  • Session id (S): opaque string ≥ 1 char, minted by audit_session_start or supplied as a deterministic test seam.
  • Bound record: a thought_records row whose session_id column equals the session id supplied at createThoughtRecord time.
  • Unbound record: a thought_records row whose session_id is NULL — the only shape createThoughtRecord produced before R89.A, and the shape rows produced before this fix retain forever.

3. Invariants preserved (must not regress)

# Invariant Source
I1 The hash subset is exactly {id, type, task_id, content, timestamp, prev_hash}. session_id is NOT included. src/domains/trail/schema.ts:172-187
I2 The pinned-hash snapshot at repository.test.ts:186-196 returns 6a2f9597f563d5515cfa69891a51806d0f93bfbe222997d3ba37c365ceee3f1a after R89.A. src/__tests__/domains/trail/repository.test.ts:186
I3 audit_verify_chain continues to ignore session_id — verification is per task_id, not per session. src/domains/trail/verifier.ts (no edit)
I4 Per-task_id prev_hash chain linkage is unchanged. session_id does NOT join into the latestPrev lookup. src/domains/trail/repository.ts:219-224
I5 Pre-existing rows (NULL session_id) are NOT migrated, backfilled, or modified. 006_eta.sql:54 ALTER is nullable.
I6 The MCP envelope shape returned by thought_record does not lose any field. The data payload before this change has 8 fields; after this change it has 9 (the new field is optional and may be null). Phase 0 MCP envelope contract.
I7 task_update writeback gate (enforceWriteback at writeback.ts:97) continues to function — it counts any thought_records row keyed by task_id, regardless of session_id. CLAUDE.md §7.

4. New invariants introduced

# Invariant Test that proves it
N1 When session_id is supplied to createThoughtRecord, the inserted row’s thought_records.session_id column equals the supplied value. repository.test.ts — new “createThoughtRecord — session_id” describe block.
N2 When session_id is omitted from createThoughtRecord, the inserted row’s thought_records.session_id column is NULL. repository.test.ts — same describe block, omitted-case.
N3 When session_id is the empty string, Zod rejects the call (no row inserted). repository.test.ts — same describe block, empty-string-case.
N4 merkle_finalize({session_id: S}) returns a real Merkle root (no ERR_NO_RECORDS) when N ≥ 1 records have been created via createThoughtRecord({session_id: S, …}). merkle.test.ts — new “end-to-end session_id binding” describe block.
N5 record_count returned by merkle_finalize equals N — the number of createThoughtRecord calls bound to that session id. merkle.test.ts — same describe block.
N6 Records bound to different session_id values do NOT cross-contaminate merkle_finalize’s collection. Calling finalize on session A returns a tree over A’s records only. merkle.test.ts — same describe block, two-sessions case.

5. Behavior matrix

Caller of createThoughtRecord supplies DB column session_id Counted by merkle_finalize({session_id: S})?
session_id: S 'S' (string) Yes — provided the filter argument is also S.
session_id: undefined or property absent NULL No (since WHERE session_id = ? does not match NULL even if ? is the empty string).
session_id: '' (rejected by Zod) N/A — no row inserted.

This matrix preserves the donor-era behaviour for any caller that does not opt in (legacy task_id-only thought_records keep working) while making the session-bound path actually session-bound.

6. API signature delta

// before R89.A
export interface CreateThoughtRecordInput {
  readonly type: ThoughtType;
  readonly task_id: string;
  readonly agent_id: string;
  readonly content: string;
}

// after R89.A
export interface CreateThoughtRecordInput {
  readonly type: ThoughtType;
  readonly task_id: string;
  readonly agent_id: string;
  readonly content: string;
  readonly session_id?: string;          // ← new, optional
}
// Zod schema — additive, all existing inputs still parse.
const CreateThoughtRecordInputSchema = z.object({
  type: z.enum(THOUGHT_TYPES),
  task_id: z.string().min(1),
  agent_id: z.string().min(1),
  content: z.string().min(1, 'thought content must be non-empty'),
  session_id: z.string().min(1).optional(),    // ← new
});
-- INSERT — 10 columns, 10 binds.
INSERT INTO thought_records
  (id, type, task_id, agent_id, content, timestamp, prev_hash, hash, created_at, session_id)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?);

Return shape: the function returns the same 8 hash-chain fields plus a session_id field whose value is the supplied string or null. The canonical ThoughtRecord type stays 8-field; the function’s return-type annotation widens.

7. Non-breaking guarantee

Every existing caller of thought_record / createThoughtRecord continues to work without modification. The new field is opt-in. Existing tests pass without edits except where the return-shape assertion uses toEqual({…strict 8 fields}) — in that case the new field is null in the return, and an additive expectation may be needed. The plan in §4 of the packet enumerates the exact test files surveyed and confirms none of them asserts strict-shape equality on the return value (they use toHaveProperty/scalar checks).

8. Acceptance criteria

R89.A is done when:

  1. A1createThoughtRecord accepts an optional session_id and writes it to the session_id column when supplied; writes NULL otherwise.
  2. A2 — All seven invariants in §3 hold (verified by existing test pass
    • the pinned-hash snapshot test).
  3. A3 — The six new invariants in §4 are proven by new tests.
  4. A4npm run build, npm run lint, npm test all pass against the feature branch.
  5. A5 — Total test count strictly increases (no test is deleted; no pre-existing test is silenced).

Back to top

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.

This site uses Just the Docs, a documentation theme for Jekyll.