R89.A behavioral contract — session_id end-to-end binding

1. Goal statement

After R89.A, the canonical proof-grade chain documented in CLAUDE.md §7 returns a real Merkle root rather than ERR_NO_RECORDS:

audit_session_start({ intent }) → { session_id: S, … }
thought_record({ session_id: S, task_id: T, type, agent_id, content }) → { … }
…
thought_record({ session_id: S, task_id: T, type: 'reflection', … })   # final
merkle_finalize({ session_id: S }) → { session_id: S, root, record_count: N, finalized_at }

where N is the number of thought_record calls observed against session_id = S since the start of the session.

2. Definitions

Session id (S): opaque string ≥ 1 char, minted by audit_session_start or supplied as a deterministic test seam.
Bound record: a thought_records row whose session_id column equals the session id supplied at createThoughtRecord time.
Unbound record: a thought_records row whose session_id is NULL — the only shape createThoughtRecord produced before R89.A, and the shape rows produced before this fix retain forever.

3. Invariants preserved (must not regress)

#	Invariant	Source
I1	The hash subset is exactly `{id, type, task_id, content, timestamp, prev_hash}`. `session_id` is NOT included.	`src/domains/trail/schema.ts:172-187`
I2	The pinned-hash snapshot at `repository.test.ts:186-196` returns `6a2f9597f563d5515cfa69891a51806d0f93bfbe222997d3ba37c365ceee3f1a` after R89.A.	`src/__tests__/domains/trail/repository.test.ts:186`
I3	`audit_verify_chain` continues to ignore `session_id` — verification is per `task_id`, not per session.	`src/domains/trail/verifier.ts` (no edit)
I4	Per-`task_id` `prev_hash` chain linkage is unchanged. `session_id` does NOT join into the `latestPrev` lookup.	`src/domains/trail/repository.ts:219-224`
I5	Pre-existing rows (NULL `session_id`) are NOT migrated, backfilled, or modified.	`006_eta.sql:54` ALTER is nullable.
I6	The MCP envelope shape returned by `thought_record` does not lose any field. The `data` payload before this change has 8 fields; after this change it has 9 (the new field is optional and may be `null`).	Phase 0 MCP envelope contract.
I7	`task_update` writeback gate (`enforceWriteback` at `writeback.ts:97`) continues to function — it counts any `thought_records` row keyed by `task_id`, regardless of `session_id`.	CLAUDE.md §7.

4. New invariants introduced

#	Invariant	Test that proves it
N1	When `session_id` is supplied to `createThoughtRecord`, the inserted row’s `thought_records.session_id` column equals the supplied value.	`repository.test.ts` — new “createThoughtRecord — session_id” describe block.
N2	When `session_id` is omitted from `createThoughtRecord`, the inserted row’s `thought_records.session_id` column is `NULL`.	`repository.test.ts` — same describe block, omitted-case.
N3	When `session_id` is the empty string, Zod rejects the call (no row inserted).	`repository.test.ts` — same describe block, empty-string-case.
N4	`merkle_finalize({session_id: S})` returns a real Merkle root (no `ERR_NO_RECORDS`) when N ≥ 1 records have been created via `createThoughtRecord({session_id: S, …})`.	`merkle.test.ts` — new “end-to-end session_id binding” describe block.
N5	`record_count` returned by `merkle_finalize` equals N — the number of `createThoughtRecord` calls bound to that session id.	`merkle.test.ts` — same describe block.
N6	Records bound to different `session_id` values do NOT cross-contaminate `merkle_finalize`’s collection. Calling finalize on session A returns a tree over A’s records only.	`merkle.test.ts` — same describe block, two-sessions case.

5. Behavior matrix

Caller of `createThoughtRecord` supplies	DB column `session_id`	Counted by `merkle_finalize({session_id: S})`?
`session_id: S`	`'S'` (string)	Yes — provided the filter argument is also `S`.
`session_id: undefined` or property absent	`NULL`	No (since `WHERE session_id = ?` does not match NULL even if `?` is the empty string).
`session_id: ''`	(rejected by Zod)	N/A — no row inserted.

This matrix preserves the donor-era behaviour for any caller that does not opt in (legacy task_id-only thought_records keep working) while making the session-bound path actually session-bound.

6. API signature delta

// before R89.A
export interface CreateThoughtRecordInput {
  readonly type: ThoughtType;
  readonly task_id: string;
  readonly agent_id: string;
  readonly content: string;
}

// after R89.A
export interface CreateThoughtRecordInput {
  readonly type: ThoughtType;
  readonly task_id: string;
  readonly agent_id: string;
  readonly content: string;
  readonly session_id?: string;          // ← new, optional
}

// Zod schema — additive, all existing inputs still parse.
const CreateThoughtRecordInputSchema = z.object({
  type: z.enum(THOUGHT_TYPES),
  task_id: z.string().min(1),
  agent_id: z.string().min(1),
  content: z.string().min(1, 'thought content must be non-empty'),
  session_id: z.string().min(1).optional(),    // ← new
});

-- INSERT — 10 columns, 10 binds.
INSERT INTO thought_records
  (id, type, task_id, agent_id, content, timestamp, prev_hash, hash, created_at, session_id)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?);

Return shape: the function returns the same 8 hash-chain fields plus a session_id field whose value is the supplied string or null. The canonical ThoughtRecord type stays 8-field; the function’s return-type annotation widens.

7. Non-breaking guarantee

Every existing caller of thought_record / createThoughtRecord continues to work without modification. The new field is opt-in. Existing tests pass without edits except where the return-shape assertion uses toEqual({…strict 8 fields}) — in that case the new field is null in the return, and an additive expectation may be needed. The plan in §4 of the packet enumerates the exact test files surveyed and confirms none of them asserts strict-shape equality on the return value (they use toHaveProperty/scalar checks).

8. Acceptance criteria

R89.A is done when:

A1 — createThoughtRecord accepts an optional session_id and writes it to the session_id column when supplied; writes NULL otherwise.
A2 — All seven invariants in §3 hold (verified by existing test pass
- the pinned-hash snapshot test).
A3 — The six new invariants in §4 are proven by new tests.
A4 — npm run build, npm run lint, npm test all pass against the feature branch.
A5 — Total test count strictly increases (no test is deleted; no pre-existing test is silenced).