P0.7.1 — Step 1 Audit

Inventory of the worktree against the task spec for P0.7.1 ζ Hash-Chained Record Schema (ζ Decision Trail task group, first ζ task, Wave C parallel dispatch). Scope: what already exists that the new schema + hasher must integrate with, what the donor extraction prescribes, and what is absent.

Baseline: worktree E:/AMS/.worktrees/claude/p0-7-1-trail-schema/ at commit 3ebbe419 (P0.2.2 SQLite init merged as PR #122, on top of P0.2.1 MCP server + P0.4.1 modes).

§1. Surface being added

Targets this task creates:

src/domains/trail/schema.ts — new module. Holds the THOUGHT_TYPES tuple, ThoughtType union, ThoughtRecordSchema (Zod), ThoughtRecord type, ZERO_HASH constant (64-zero hex), canonicalize(value) function, and computeHash(record) function. Does not yet exist.
src/__tests__/trail-schema.test.ts — new test file, co-located with other Phase-0 tests (deviation from spec’s tests/domains/trail/schema.test.ts — see §3).
src/domains/trail/ — new directory (no siblings yet under src/domains/).

A worktree scan confirms absence of the authoring targets:

ls src/domains/ → “No such file or directory”
grep -rn "trail\|thought\|chain_hash\|prev_hash\|ZERO_HASH" src/ → zero matches
grep -rn "trail-schema\|trail\.test" src/__tests__/ → zero matches

This is a greenfield module — P0.7.1 authors both source and test files in a single PR. No DB table is created by this task (P0.7.2 owns 004_zeta.sql), no MCP tool is registered (P0.7.2 owns thought_record, P0.7.3 owns audit_verify_chain). P0.7.1 ships only the pure schema + hashing primitives.

§2. Adjacent code that the new module must integrate with

2a. `src/config.ts` (95 lines — P0.1.4, commit `3bd154a7`)

Phase-0 environment wrapper. The trail schema does not consume config: it is a pure, synchronous, stateless primitive. The schema module does not read process.env, has no AMS_* guard of its own (any such guard is enforced transitively by whichever module imports it alongside config), and has no eager side effects.

The loadConfig(env) pure-factory pattern is nevertheless the load-bearing style to mirror: computeHash(record) is analogously pure — takes a record object, returns a hex string, never touches globals.

2b. `src/modes.ts` (186 lines — P0.4.1, commit `a64d7349`)

Runtime mode enum. Not consumed by the trail schema. The style is load-bearing: the RUNTIME_MODES tuple as const + derived RuntimeMode union is the pattern the THOUGHT_TYPES tuple + ThoughtType union must mirror exactly. Frozen singletons, pure factories, no eager work — all three properties carry over.
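The tuple-plus-derived-union pattern described above can be sketched as follows. This is a minimal illustration, not the contents of src/modes.ts or of the eventual schema.ts: THOUGHT_TYPES and ThoughtType are the names this audit uses, while isThoughtType is a hypothetical helper added only to show the runtime side of the pattern.

```typescript
// Sketch of the "tuple as const + derived union" pattern.
// THOUGHT_TYPES / ThoughtType are named in the audit; isThoughtType is an
// assumed helper for illustration only.
const THOUGHT_TYPES = ["plan", "analysis", "decision", "reflection"] as const;

// Derived union: "plan" | "analysis" | "decision" | "reflection".
type ThoughtType = (typeof THOUGHT_TYPES)[number];

// Runtime guard that narrows an unknown value to the union.
function isThoughtType(value: unknown): value is ThoughtType {
  return (
    typeof value === "string" &&
    (THOUGHT_TYPES as readonly string[]).includes(value)
  );
}
```

Because the union is derived from the tuple, adding or removing a type in one place updates both the compile-time type and the runtime list.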

2c. `src/server.ts` (559 lines — P0.2.1, commit `40cd679d`)

MCP server bootstrap. The server does not consume the trail schema in this task (P0.7.2 will register the thought_record tool against the server). However, the server’s AuditSink seam (around registerAuditSink, per the P0.2.1 contract) is where ζ will plug in at P0.7.2. The schema exported here MUST be compatible both with JSON-RPC wire transport (Zod-validatable, no functions, no Date objects in the validated surface) and with the AuditSink interface (every record field a JSON.stringify-able primitive).

timestamp is string (ISO 8601), NOT a Date object — JSON wire compatibility.
prev_hash and hash are string (hex), NOT Buffer — same reason.
content is string for Phase 0 — the canonicalizer, however, handles arbitrary JSON-serialisable shapes so the surface can later accept structured content without schema churn (see §4).
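The reason for string-only fields can be seen with a plain JSON round trip. The sketch below uses only built-in behaviour; the field values are made up for illustration.

```typescript
// Illustrative values only; field names follow the audit.
const wireSafe = {
  timestamp: "2025-01-01T00:00:00.000Z", // ISO 8601 string
  prev_hash: "0".repeat(64), // hex string, not a Buffer
};

// Strings survive a JSON round trip unchanged.
const roundTripped = JSON.parse(JSON.stringify(wireSafe));

// A Date is serialised through toISOString() and comes back as a plain
// string, so the receiving side never sees a Date instance.
const withDate = JSON.parse(JSON.stringify({ timestamp: new Date(0) }));
// withDate.timestamp is the string "1970-01-01T00:00:00.000Z"
```

Keeping the validated surface string-typed means the record that is hashed and the record that crosses the wire have the same shape.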

2d. `src/db/index.ts` (P0.2.2, commit `3ebbe419`)

SQLite wrapper. Not consumed by P0.7.1. P0.7.2 will author 004_zeta.sql and build a repository against this module. P0.7.1 is pure — no DB handle, no persistence, no side effects.

2e. `src/__tests__/config.test.ts` + `modes.test.ts`

Test-style precedent. Relevant patterns:

Pure-factory test style, no process.env mutation for this module (no env consumed).
Tests live in src/__tests__/ per jest.config.ts line 15 roots: ['<rootDir>/src'].
No jest.isolateModulesAsync (P0.1.4 documents the zod-v3 locale-cache bug under ts-jest ESM). The trail schema uses zod but has no eager validation — no module-load isolation required.
Coverage floor: Wave A convention is 100% stmt/func/line and ≥90% branch per jest.config.ts lines 41-46 collecting from src/**/*.ts. The packet targets 100% across the board for this tiny surface.
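For orientation, the test root and coverage floor described above would correspond to Jest options along these lines. This is a hypothetical fragment, not a copy of the repository's jest.config.ts: the option names are standard Jest configuration keys, and only the roots value, the collection glob, and the threshold numbers come from this audit.

```typescript
// Hypothetical fragment; the real layout of jest.config.ts (lines 15 and
// 41-46) is not reproduced here.
const coverageSketch = {
  roots: ["<rootDir>/src"], // tests outside src/ are not discovered
  collectCoverageFrom: ["src/**/*.ts"],
  coverageThreshold: {
    // Wave A floor: 100% statements/functions/lines, at least 90% branches.
    global: { statements: 100, functions: 100, lines: 100, branches: 90 },
  },
};
```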

2f. `package.json`

Declares zod@^3.23.8 (line 30) as a runtime dependency. The schema file will import { z } from 'zod' — no new dependencies. Node’s built-in crypto module (node:crypto) provides SHA-256 — no third-party hasher.

No additions to package.json#files (that array lists shipped assets; src/domains/trail/schema.ts is a src/ file bundled by tsc into dist/).

§3. Spec reconciliation — load-bearing deviations from the task-spec

The task-breakdown file docs/guides/implementation/task-breakdown.md §P0.7.1 has one deviation from Wave A’s conventions; the dispatch prompt confirms the override.

Deviation 1 — test file location

Spec (line 323): tests/domains/trail/schema.test.ts.
Wave A convention: src/__tests__/<name>.test.ts per jest.config.ts line 15 roots: ['<rootDir>/src']. Jest will not discover files outside src/.
Decision: src/__tests__/trail-schema.test.ts. Confirmed by observing that all existing tests (config.test.ts, modes.test.ts, server.test.ts, smoke.test.ts, db-init.test.ts) live in src/__tests__/. Matches the dispatch prompt’s override.

§4. Hash-input subset — literal spec reading

The spec line is load-bearing and nuanced:

hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))

The hash input is a subset of the record fields, explicitly:

Included: id, type, task_id, content, timestamp, prev_hash (6 fields)
Excluded: agent_id (metadata), hash (the output itself — would be circular)

This diverges from the donor AMS zeta-decision-trail-extraction.md algorithm (lines 22-34), which computes content_hash = SHA256(content) then chain_hash = SHA256(content_hash + parent_chain_hash). The Colibri Phase-0 design hashes a canonical-JSON projection of six fields in one step — a simpler, flatter algorithm suited to a single hash column. The donor algorithm used two hash columns (content_hash, chain_hash); Colibri uses one. The spec’s intent is integrity of the chain (id + prev_hash + payload), not independent content-hashing.

Rationale for excluding agent_id: agent_id is operational metadata (who authored this thought) rather than a chain-integrity input. Excluding it keeps the hash stable if a record is later rewritten with a corrected agent_id — which is desirable for a provenance correction but would otherwise cascade a chain-break through every subsequent record. The chain still proves “this id + this content at this timestamp follows from this prev_hash”; agent_id is recorded alongside but not anchored.

Rationale for excluding hash: self-reference is impossible. This is mechanical, not design.

The contract must document this subset decision explicitly; the test matrix must include a case where two records differing only in agent_id produce identical hashes (positive assertion of the exclusion).
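A minimal sketch of the subset projection and the agent_id exclusion test follows. It is an illustration under stated assumptions, not the module this task will ship: the ThoughtRecordLike interface and the flatCanonicalJson stand-in are hypothetical, and the stand-in only handles the flat, all-string Phase-0 subset rather than the full recursive canonicalizer of §5.

```typescript
import { createHash } from "node:crypto";

// Field names follow the audit; this interface is an assumed shape.
interface ThoughtRecordLike {
  id: string;
  type: string;
  task_id: string;
  agent_id: string;
  content: string;
  timestamp: string;
  prev_hash: string;
  hash?: string;
}

const ZERO_HASH = "0".repeat(64);

// Stand-in for canonicalize(): sorts the keys of a flat string map.
function flatCanonicalJson(obj: Record<string, string>): string {
  const entries = Object.entries(obj).sort(([x], [y]) =>
    x < y ? -1 : x > y ? 1 : 0
  );
  return JSON.stringify(Object.fromEntries(entries));
}

function computeHash(record: ThoughtRecordLike): string {
  // Project the six hash-input fields; agent_id and hash are left out.
  const { id, type, task_id, content, timestamp, prev_hash } = record;
  const canonical = flatCanonicalJson({
    id, type, task_id, content, timestamp, prev_hash,
  });
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}

// Two records that differ only in agent_id (values are made up).
const base = {
  id: "t-1",
  type: "plan",
  task_id: "P0.7.1",
  content: "draft the schema",
  timestamp: "2025-01-01T00:00:00.000Z",
  prev_hash: ZERO_HASH,
};
const hashA = computeHash({ ...base, agent_id: "agent-a" });
const hashB = computeHash({ ...base, agent_id: "agent-b" });
// hashA === hashB, because agent_id is not part of the hash input.
```

Destructuring the six fields by name makes the exclusion explicit: any field added to the record later stays out of the hash unless it is added to the projection deliberately.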

§5. Canonical-JSON algorithm — load-bearing correctness property

The spec requires:

Sorted keys at every nesting depth
No whitespace
Deterministic across platforms/runs

Algorithm (from dispatch prompt §”Hashing detail”):

Input: a JSON-serialisable value.
If the value is a non-null, non-array object: sort its keys, recurse into each value, and rebuild a new object with the keys inserted in sorted order.
If the value is an array: preserve insertion order, recurse into each element.
If the value is a primitive (string, number, boolean, null): return as-is.
Serialise via JSON.stringify(sortedValue) with no replacer and no space (indent) argument. Default output has no whitespace.
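The steps above can be sketched as a small recursive function. This is one possible reading of the algorithm, not the final implementation: the Json type and the sortKeysDeep helper are assumed names, and only canonicalize appears in the task spec.

```typescript
// Assumed type for JSON-serialisable input; the audit does not name one.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical helper: returns a copy with object keys sorted at every depth.
function sortKeysDeep(value: Json): Json {
  if (Array.isArray(value)) {
    // Arrays keep insertion order; only their elements are recursed into.
    return value.map((element) => sortKeysDeep(element));
  }
  if (value !== null && typeof value === "object") {
    const sorted: { [key: string]: Json } = {};
    const entries = Object.entries(value).sort(([x], [y]) =>
      x < y ? -1 : x > y ? 1 : 0
    );
    for (const [key, child] of entries) {
      sorted[key] = sortKeysDeep(child);
    }
    return sorted;
  }
  return value; // string, number, boolean, or null
}

function canonicalize(value: Json): string {
  // No replacer and no indent argument: output carries no whitespace.
  return JSON.stringify(sortKeysDeep(value));
}

const nested = canonicalize({ b: { d: 1, c: 2 }, a: 3 });
// nested === '{"a":3,"b":{"c":2,"d":1}}'
const ordered = canonicalize([3, 1, 2]);
// ordered === '[3,1,2]'
```

One caveat with the rebuild-as-object approach: JavaScript enumerates integer-like keys (for example "2" and "10") in ascending numeric order ahead of other string keys, regardless of insertion order. The output is still deterministic, but it is not strictly lexicographic for such keys, which may be worth pinning in a test if structured content is ever accepted.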

Edge cases the test matrix must cover:

Nested object: { b: { d: 1, c: 2 }, a: 3 } → must sort both outer (a, b) and inner (c, d). Even though Phase-0 content is string, future-proofing the canonicalizer against nested shapes is load-bearing.
Two records authored with keys in different insertion orders (e.g. { id, type, ... } vs { type, id, ... }) must hash identically.
Array order preservation: [3, 1, 2] must canonicalize to [3,1,2], not sorted.
null values: preserved. The canonicalizer does NOT strip null.
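undefined: not a valid JSON value; behavior is documented as identical to native JSON.stringify, which skips an object property whose value is undefined (and, for completeness, emits null for an undefined array element and returns undefined rather than a string for a top-level undefined). Tests pin this behavior.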

Determinism test: call canonicalize on two inputs that are equal in content but authored with different object-literal key orders, and assert identical output strings. Call computeHash twice on the same record and assert identical hex strings.
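The native JSON.stringify behaviours that the canonicalizer inherits can be pinned directly. The sketch below uses only the built-in and involves no project code; the variable names are illustrative.

```typescript
// Built-in behaviour only.
const inObject = JSON.stringify({ a: undefined, b: 1 }); // property is omitted
const inArray = JSON.stringify([1, undefined, 2]); // slot becomes null
const topLevel: unknown = JSON.stringify(undefined); // no string is produced
const keepsNull = JSON.stringify({ a: null }); // null is preserved
const keepsOrder = JSON.stringify([3, 1, 2]); // arrays are not sorted
```

Here inObject is '{"b":1}', inArray is '[1,null,2]', topLevel is undefined, keepsNull is '{"a":null}', and keepsOrder is '[3,1,2]'.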

§6. Acceptance criteria mapping

Spec acceptance criteria (from docs/guides/implementation/task-breakdown.md §P0.7.1 and the dispatch prompt) map to audit facts as follows:

Criterion	Source	Audit observation
Record schema: `{ id, type, task_id, agent_id, content, timestamp, prev_hash, hash }` (8 fields)	spec line 325	`ThoughtRecordSchema` Zod object with all 8 fields. `id`/`task_id`/`agent_id`/`content`/`timestamp` = `z.string().min(1)`, `prev_hash`/`hash` = `z.string().length(64)`, `type` = `z.enum(THOUGHT_TYPES)`.
4 valid types: `plan | analysis | decision | reflection`	spec line 326	`THOUGHT_TYPES = ['plan', 'analysis', 'decision', 'reflection'] as const`. Matches donor extraction lines 42-47.
`hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))`	spec line 327	`computeHash` receives the 6-field subset (not `agent_id`, not `hash`). Dispatch prompt §”Hashing detail” confirms literal reading.
Canonical JSON: sorted keys, no whitespace (deterministic)	spec line 328	Recursive key-sort, `JSON.stringify(sorted)` with no second arg. Tested cross-insertion-order and cross-nested-object.
First record: `prev_hash = "0000...0000"` (64 zeros)	spec line 329	`export const ZERO_HASH = '0'.repeat(64);` Tested `ZERO_HASH.length === 64` and matches `/^0{64}$/`.
Two records with identical inputs produce identical hashes	spec line 330	Determinism test: `computeHash(input) === computeHash(input)`; also cross-insertion-order test for canonicalizer.

All 6 acceptance criteria have direct test paths. No spec-level ambiguity remains.

§7. Donor genealogy (reference only)

docs/reference/extractions/zeta-decision-trail-extraction.md (extracted R45 from AMS src/controllers/thought.js) documents:

4 record types (lines 42-47): plan | analysis | decision | reflection. Colibri reproduces exactly. observation and hypothesis from earlier AMS drafts are NOT valid.
Two-hash algorithm (lines 22-34): content_hash = SHA256(content), chain_hash = SHA256(content_hash + parent.chain_hash). Colibri simplifies to a single hash over a canonical-JSON projection — see §4.
64-zero genesis hash (line 29): preserved as ZERO_HASH.
Record schema (lines 51-67) — donor has session_id, parent_id, content, content_hash, chain_hash, metadata{task_id, agent_id, ...}, created_at. Colibri flattens task_id and agent_id to top-level fields and renames parent.chain_hash → prev_hash, content_hash + chain_hash → hash. No session_id (P0.7.2 may re-introduce), no metadata subobject, no parent_id (linearised via prev_hash), no created_at (renamed timestamp).
Tool surface (lines 158-165): 6 donor tools (thought_record, thought_get, thought_tree, thought_trail, thought_verify, thought_search). Phase-0 ζ ships only thought_record + audit_verify_chain per ADR-004 R74.5. Neither lands in P0.7.1 (P0.7.2 and P0.7.3 respectively).
FTS5 search (lines 117-135): donor feature. Not in Phase 0.
Thought trees / branching (lines 140-154): donor feature. Phase-0 uses a strict linear chain via prev_hash.

None of the donor source lands in P0.7.1. The algorithm is a full rewrite; only the type enum and the ZERO_HASH genesis constant carry over verbatim.

§8. Parallel-wave collision check

Wave C dispatches four tasks in parallel:

Task	Owner files	Collision with P0.7.1?
P0.2.3 two-phase startup	`src/server.ts` (edits)	No — P0.7.1 does not touch server.ts.
P0.3.1 β state machine	`src/domains/tasks/*`	No — sibling under `src/domains/`, disjoint directory.
P0.6.1 ε skill schema	`src/domains/skills/*`	No — sibling under `src/domains/`, disjoint directory.
P0.7.1 (this task)	`src/domains/trail/*` + `src/__tests__/trail-schema.test.ts`	n/a

No shared edits. The src/domains/ parent directory must exist (any of P0.3.1/P0.6.1/P0.7.1 may create it — mkdir -p is idempotent; no merge conflict risk given git tracks files not directories).

Shared infra files the dispatch prompt explicitly forbids this task from touching:

src/server.ts, src/db/*, src/config.ts, src/modes.ts — confirmed not modified.
package.json, jest.config.ts, tsconfig.json — no new deps, no config changes; confirmed not modified.
src/domains/tasks/*, src/domains/skills/* — confirmed not touched.

§9. Baseline verification

$ git rev-parse HEAD
3ebbe419  (P0.2.2 SQLite init merged as PR #122)

$ git log -1 --format=%s
feat(p0-2-2): SQLite init — better-sqlite3 WAL+FK + PRAGMA user_version migrations (#122)

$ node --version
v20.x (from CI matrix — P0.1.3 established Node 20 floor)

$ cat package.json | grep -E "zod|better-sqlite3"
    "better-sqlite3": "^11.5.0",
    "zod": "^3.23.8",

All prerequisites from the dependency chain (P0.1.1 scaffolding, P0.1.2 Jest ESM, P0.1.4 config, P0.2.2 SQLite init) are present at HEAD. zod@^3.23.8 is available; node:crypto is a Node built-in (no package dependency).

§10. Exit criteria for Step 1

All target files catalogued as absent in §1.
Adjacent integration surfaces documented in §2.
Spec vs packet deviation resolved in §3 (one: test file location).
Hash-input subset rationale captured in §4.
Canonical-JSON algorithm + test edge cases captured in §5.
Acceptance criteria mapped to test paths in §6.
Donor genealogy acknowledged as reference-only in §7.
Parallel-wave collision check completed in §8.
Baseline commit + toolchain verified in §9.

Ready to proceed to Step 2 (contract).