P0.7.1 — Step 1 Audit
Inventory of the worktree against the task spec for P0.7.1 ζ Hash-Chained Record Schema (ζ Decision Trail task group, first ζ task, Wave C parallel dispatch). Scope: what already exists that the new schema + hasher must integrate with, what the donor extraction prescribes, and what is absent.
Baseline: worktree E:/AMS/.worktrees/claude/p0-7-1-trail-schema/ at commit 3ebbe419 (P0.2.2 SQLite init merged as PR #122, on top of P0.2.1 MCP server + P0.4.1 modes).
§1. Surface being added
Targets this task creates:
- src/domains/trail/schema.ts: new module. Holds the THOUGHT_TYPES tuple, ThoughtType union, ThoughtRecordSchema (Zod), ThoughtRecord type, ZERO_HASH constant (64-zero hex), canonicalize(value) function, and computeHash(record) function. Does not yet exist.
- src/__tests__/trail-schema.test.ts: new test file, co-located with other Phase-0 tests (deviation from spec’s tests/domains/trail/schema.test.ts; see §3).
- src/domains/trail/: new directory (no siblings yet under src/domains/).
A worktree scan confirms absence of the authoring targets:
- ls src/domains/ → “No such file or directory”
- grep -rn "trail\|thought\|chain_hash\|prev_hash\|ZERO_HASH" src/ → zero matches
- grep -rn "trail-schema\|trail\.test" src/__tests__/ → zero matches
This is a greenfield module — P0.7.1 authors both source and test files in a single PR. No DB table is created by this task (P0.7.2 owns 004_zeta.sql), no MCP tool is registered (P0.7.2 owns thought_record, P0.7.3 owns audit_verify_chain). P0.7.1 ships only the pure schema + hashing primitives.
§2. Adjacent code that the new module must integrate with
2a. src/config.ts (95 lines — P0.1.4, commit 3bd154a7)
Phase-0 environment wrapper. The trail schema does not consume config — it is a pure, synchronous, stateless primitive. The schema module does not read process.env, has no AMS_* guard of its own (transitively enforced by whoever imports it alongside config), and has no eager side-effects.
The loadConfig(env) pure-factory pattern is nevertheless the load-bearing style to mirror: computeHash(record) is analogously pure — takes a record object, returns a hex string, never touches globals.
2b. src/modes.ts (186 lines — P0.4.1, commit a64d7349)
Runtime mode enum. Not consumed by the trail schema. The style is load-bearing: the RUNTIME_MODES tuple as const + derived RuntimeMode union is the pattern the THOUGHT_TYPES tuple + ThoughtType union must mirror exactly. Frozen singletons, pure factories, no eager work — all three properties carry over.
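The tuple-plus-derived-union pattern described above can be sketched as follows. This is a minimal illustration, assuming the four type values listed in §6. isThoughtType is a hypothetical helper that the spec does not name, and Object.freeze is only one possible way to realise the "frozen singleton" property at runtime.

```typescript
// Tuple of allowed thought types. `as const` keeps the literal types,
// Object.freeze additionally prevents mutation at runtime.
export const THOUGHT_TYPES = Object.freeze(
  ["plan", "analysis", "decision", "reflection"] as const,
);

// Union derived from the tuple, so the two can never drift apart.
export type ThoughtType = (typeof THOUGHT_TYPES)[number];

// Hypothetical type guard (not part of the spec), shown only to
// illustrate how the tuple doubles as a runtime lookup.
export function isThoughtType(value: unknown): value is ThoughtType {
  return (
    typeof value === "string" &&
    (THOUGHT_TYPES as readonly string[]).includes(value)
  );
}
```

Because the union is derived from the tuple, adding or removing a type is a one-line change with no separate type declaration to keep in sync.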
2c. src/server.ts (559 LOC — P0.2.1, commit 40cd679d)
MCP server bootstrap. The trail schema is not consumed by this task (P0.7.2 will register the thought_record tool against the server). However, the server’s AuditSink seam (lines around registerAuditSink per the P0.2.1 contract) is where ζ will plug in at P0.7.2. The schema exported here MUST be compatible with both JSON-RPC wire transport (Zod-validatable, no functions, no Date objects in the validated surface) and with the AuditSink interface (all record fields serialisable as JSON.stringify-able primitives).
- timestamp is string (ISO 8601), NOT a Date object: JSON wire compatibility.
- prev_hash and hash are string (hex), NOT Buffer: same reason.
- content is string for Phase 0. The canonicalizer, however, handles arbitrary JSON-serialisable shapes so the surface can later accept structured content without schema churn (see §5).
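The wire-compatibility constraint can be illustrated with a plain JSON round trip. This is a sketch only: the sample values are made up for illustration, and the exported names are hypothetical. It shows that an all-string record survives JSON.stringify / JSON.parse unchanged, whereas a Date field comes back as a string.

```typescript
// Illustrative record; field names follow the audit, values are invented.
const sample = {
  id: "rec-1",
  type: "plan",
  task_id: "task-1",
  agent_id: "agent-1",
  content: "draft the schema",
  timestamp: "2025-01-01T00:00:00.000Z",
  prev_hash: "0".repeat(64),
  hash: "a".repeat(64),
};

// An all-string record is identical after a JSON round trip.
const decoded = JSON.parse(JSON.stringify(sample)) as Record<string, unknown>;
export const stringRecordSurvives = Object.entries(sample).every(
  ([key, value]) => decoded[key] === value,
);

// A Date does not survive: it is serialised through toJSON() and
// arrives on the other side as an ISO 8601 string.
const dated = { timestamp: new Date("2025-01-01T00:00:00.000Z") };
const decodedDated = JSON.parse(JSON.stringify(dated)) as { timestamp: unknown };
export const dateSurvivesAsDate = decodedDated.timestamp instanceof Date;
export const dateArrivesAs = typeof decodedDated.timestamp;
```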
2d. src/db/index.ts (P0.2.2, commit 3ebbe419)
SQLite wrapper. Not consumed by P0.7.1. P0.7.2 will author 004_zeta.sql and build a repository against this module. P0.7.1 is pure — no DB handle, no persistence, no side effects.
2e. src/__tests__/config.test.ts + modes.test.ts
Test-style precedent. Relevant patterns:
- Pure-factory test style, no process.env mutation for this module (no env consumed).
- Tests live in src/__tests__/ per jest.config.ts line 15 roots: ['<rootDir>/src'].
- No jest.isolateModulesAsync (P0.1.4 documents the zod-v3 locale-cache bug under ts-jest ESM). The trail schema uses zod but has no eager validation, so no module-load isolation is required.
- Coverage floor: Wave A convention is 100% stmt/func/line and ≥90% branch per jest.config.ts lines 41-46, collecting from src/**/*.ts. The packet targets 100% across the board for this tiny surface.
2f. package.json
Declares zod@^3.23.8 (line 30) as a runtime dependency. The schema file will import { z } from 'zod' — no new dependencies. Node’s built-in crypto module (node:crypto) provides SHA-256 — no third-party hasher.
No additions to package.json#files (that array lists shipped assets; src/domains/trail/schema.ts is a src/ file bundled by tsc into dist/).
§3. Spec reconciliation — load-bearing deviations from the task-spec
The task-breakdown file docs/guides/implementation/task-breakdown.md §P0.7.1 has one deviation from Wave A’s conventions; the dispatch prompt confirms the override.
Deviation 1 — test file location
- Spec (line 323): tests/domains/trail/schema.test.ts.
- Wave A convention: src/__tests__/<name>.test.ts per jest.config.ts line 15 roots: ['<rootDir>/src']. Jest will not discover files outside src/.
- Decision: src/__tests__/trail-schema.test.ts. Confirmed by observing that all existing tests (config.test.ts, modes.test.ts, server.test.ts, smoke.test.ts, db-init.test.ts) live in src/__tests__/. Matches the dispatch prompt’s override.
§4. Hash-input subset — literal spec reading
The spec line is load-bearing and nuanced:
hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))
The hash input is a subset of the record fields, explicitly:
- Included: id, type, task_id, content, timestamp, prev_hash (6 fields)
- Excluded: agent_id (metadata), hash (the output itself; including it would be circular)
This diverges from the donor AMS zeta-decision-trail-extraction.md algorithm (lines 22-34), which computes content_hash = SHA256(content) then chain_hash = SHA256(content_hash + parent_chain_hash). The Colibri Phase-0 design hashes a canonical-JSON projection of six fields in one step — a simpler, flatter algorithm suited to a single hash column. The donor algorithm used two hash columns (content_hash, chain_hash); Colibri uses one. The spec’s intent is integrity of the chain (id + prev_hash + payload), not independent content-hashing.
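For contrast, the donor's two-hash scheme as summarised above could be sketched like this. It is an illustration of the description in this audit, not a copy of the donor source: the function names are hypothetical, and concatenating the two hex strings before the second digest is an assumption about what "content_hash + parent_chain_hash" means.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a UTF-8 string.
const sha256Hex = (input: string): string =>
  createHash("sha256").update(input, "utf8").digest("hex");

// Donor-style hashing: one digest for the content alone, a second
// digest that links the content digest to the parent's chain hash.
// ASSUMPTION: "+" is plain concatenation of the two hex strings.
export function donorHashes(content: string, parentChainHash: string) {
  const contentHash = sha256Hex(content);
  const chainHash = sha256Hex(contentHash + parentChainHash);
  return { contentHash, chainHash };
}
```

Under this scheme the content digest is independent of chain position, which is why the donor needed two columns. The Colibri design gives that property up in exchange for a single hash column.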
Rationale for excluding agent_id: agent_id is operational metadata (who authored this thought) rather than a chain-integrity input. Excluding it keeps the hash stable if a record is later rewritten with a corrected agent_id — which is desirable for a provenance correction but would otherwise cascade a chain-break through every subsequent record. The chain still proves “this id + this content at this timestamp follows from this prev_hash”; agent_id is recorded alongside but not anchored.
Rationale for excluding hash: self-reference is impossible. This is mechanical, not design.
The contract must document this subset decision explicitly; the test matrix must include a case where two records differing only in agent_id produce identical hashes (positive assertion of the exclusion).
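A minimal sketch of the six-field projection and the agent_id exclusion check follows. It is written under stated assumptions rather than as the final implementation: RecordLike and HASHED_FIELDS are hypothetical names, and because all six hashed fields are strings in Phase 0, a flat key sort stands in here for the full recursive canonicalizer of §5.

```typescript
import { createHash } from "node:crypto";

// The six fields that feed the hash, per the spec line quoted above.
const HASHED_FIELDS = [
  "id", "type", "task_id", "content", "timestamp", "prev_hash",
] as const;

// Hypothetical input shape: the six hashed fields plus the two excluded ones.
type RecordLike = Record<(typeof HASHED_FIELDS)[number], string> & {
  agent_id: string;
  hash?: string;
};

export function computeHash(record: RecordLike): string {
  // Project onto the hashed subset with keys inserted in sorted order,
  // so JSON.stringify emits them deterministically. agent_id and hash
  // are never copied into the projection.
  const projection: Record<string, string> = {};
  for (const field of [...HASHED_FIELDS].sort()) {
    projection[field] = record[field];
  }
  return createHash("sha256")
    .update(JSON.stringify(projection), "utf8")
    .digest("hex");
}

// Two records that differ only in agent_id (values are illustrative).
const base = {
  id: "rec-1", type: "plan", task_id: "task-1", content: "draft",
  timestamp: "2025-01-01T00:00:00.000Z", prev_hash: "0".repeat(64),
};
export const sameHashAcrossAgents =
  computeHash({ ...base, agent_id: "agent-a" }) ===
  computeHash({ ...base, agent_id: "agent-b" });
```

Building the projection explicitly, instead of deleting the excluded keys, means any field added to the record later stays out of the hash unless it is deliberately listed.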
§5. Canonical-JSON algorithm — load-bearing correctness property
The spec requires:
- Sorted keys at every nesting depth
- No whitespace
- Deterministic across platforms/runs
Algorithm (from dispatch prompt §”Hashing detail”):
- Input: a JSON-serialisable value.
- If the value is an object: recursively sort keys, then rebuild as a new object in sorted order. Recurse into each value.
- If the value is an array: preserve insertion order, recurse into each element.
- If the value is a primitive (string, number, boolean, null): return as-is.
- Serialise via JSON.stringify(sortedValue): no second argument, no indent argument. Default output has no whitespace.
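The steps above can be sketched as a small recursive function. This is a minimal illustration, not the final module: the Json type and the sortKeys helper are hypothetical names, and sorting with the default Array.prototype.sort (UTF-16 code-unit order) is an assumption, since the spec only says "sorted keys".

```typescript
// JSON-serialisable value, as assumed by the algorithm's input step.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Rebuild the value with object keys in sorted order at every depth.
function sortKeys(value: Json): Json {
  if (Array.isArray(value)) {
    // Arrays keep their insertion order; only their elements are visited.
    return value.map((element) => sortKeys(element));
  }
  if (value !== null && typeof value === "object") {
    const sorted: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeys(value[key]);
    }
    return sorted;
  }
  // Primitives (string, number, boolean, null) pass through unchanged.
  return value;
}

// Serialise with no second or third argument, so no whitespace is emitted.
export function canonicalize(value: Json): string {
  return JSON.stringify(sortKeys(value));
}
```

For example, canonicalize({ b: { d: 1, c: 2 }, a: 3 }) yields {"a":3,"b":{"c":2,"d":1}}.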
Edge cases the test matrix must cover:
- Nested object: { b: { d: 1, c: 2 }, a: 3 } → must sort both outer (a, b) and inner (c, d). Even though Phase-0 content is string, future-proofing the canonicalizer against nested shapes is load-bearing.
- Two records authored with keys in different insertion orders (e.g. { id, type, ... } vs { type, id, ... }) must hash identically.
- Array order preservation: [3, 1, 2] must canonicalize to [3,1,2], not sorted.
- null values: preserved. The canonicalizer does NOT strip null.
- undefined: not a valid JSON value; behavior documented as “undefined is skipped during stringify”, same as native JSON.stringify (which drops undefined-valued object properties). Tests pin this behavior.
Determinism test: call canonicalize twice on the same input (with a different object-literal author order) and assert identical output strings. Call computeHash twice on the same record and assert identical hex strings.
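The edge cases and the determinism check could be pinned roughly as follows. This is a sketch under assumptions: the compact canonicalize here takes unknown so that an undefined property can be exercised, the checks object is a hypothetical stand-in for the eventual Jest cases, and the sample values are illustrative.

```typescript
// Compact canonicalizer, repeated so the sketch is self-contained.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((element) => sortKeys(element));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = sortKeys(source[key]);
    }
    return sorted;
  }
  return value;
}
const canonicalize = (value: unknown): string => JSON.stringify(sortKeys(value));

// Each entry is true when the corresponding edge case behaves as documented.
export const checks = {
  // Same data, different literal key order, identical output.
  insertionOrder:
    canonicalize({ id: "r1", type: "plan" }) ===
    canonicalize({ type: "plan", id: "r1" }),
  // Outer and inner keys are both sorted.
  nested: canonicalize({ b: { d: 1, c: 2 }, a: 3 }) === '{"a":3,"b":{"c":2,"d":1}}',
  // Arrays are not reordered.
  arrayOrder: canonicalize([3, 1, 2]) === "[3,1,2]",
  // null is kept as a value.
  nullKept: canonicalize({ a: null }) === '{"a":null}',
  // An undefined property is dropped, as with native JSON.stringify.
  undefinedSkipped: canonicalize({ a: undefined, b: 1 }) === '{"b":1}',
};
```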
§6. Acceptance criteria mapping
Spec acceptance criteria (from docs/guides/implementation/task-breakdown.md §P0.7.1 and the dispatch prompt) map to audit facts as follows:
| Criterion | Source | Audit observation |
|---|---|---|
| Record schema: { id, type, task_id, agent_id, content, timestamp, prev_hash, hash } (8 fields) | spec line 325 | ThoughtRecordSchema Zod object with all 8 fields. id/task_id/agent_id/content/timestamp = z.string().min(1), prev_hash/hash = z.string().length(64), type = z.enum(THOUGHT_TYPES). |
| 4 valid types: plan \| analysis \| decision \| reflection | spec line 326 | THOUGHT_TYPES = ['plan', 'analysis', 'decision', 'reflection'] as const. Matches donor extraction lines 42-47. |
| hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash})) | spec line 327 | computeHash receives the 6-field subset (not agent_id, not hash). Dispatch prompt §”Hashing detail” confirms literal reading. |
| Canonical JSON: sorted keys, no whitespace (deterministic) | spec line 328 | Recursive key-sort, JSON.stringify(sorted) with no second arg. Tested cross-insertion-order and cross-nested-object. |
| First record: prev_hash = "0000...0000" (64 zeros) | spec line 329 | export const ZERO_HASH = '0'.repeat(64); Tested ZERO_HASH.length === 64 and matches /^0{64}$/. |
| Two records with identical inputs produce identical hashes | spec line 330 | Determinism test: computeHash(input) === computeHash(input); also cross-insertion-order test for canonicalizer. |
All 6 acceptance criteria have direct test paths. No spec-level ambiguity remains.
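The schema rows of the table translate fairly directly into Zod. The following is a sketch of what the audit observations describe, assuming zod 3.x as declared in package.json. It is not the final module, and the validators are exactly those listed in the table, nothing stricter.

```typescript
import { z } from "zod";

// Four valid thought types, per the acceptance criteria.
export const THOUGHT_TYPES = ["plan", "analysis", "decision", "reflection"] as const;

// Genesis prev_hash: 64 zero characters.
export const ZERO_HASH = "0".repeat(64);

// Eight-field record schema with the validators listed in the table.
export const ThoughtRecordSchema = z.object({
  id: z.string().min(1),
  type: z.enum(THOUGHT_TYPES),
  task_id: z.string().min(1),
  agent_id: z.string().min(1),
  content: z.string().min(1),
  timestamp: z.string().min(1),
  prev_hash: z.string().length(64),
  hash: z.string().length(64),
});

// Static type derived from the schema.
export type ThoughtRecord = z.infer<typeof ThoughtRecordSchema>;
```

A first record would carry prev_hash: ZERO_HASH, and ThoughtRecordSchema.safeParse(candidate) reports validation failures without throwing.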
§7. Donor genealogy (reference only)
docs/reference/extractions/zeta-decision-trail-extraction.md (extracted R45 from AMS src/controllers/thought.js) documents:
- 4 record types (lines 42-47): plan | analysis | decision | reflection. Colibri reproduces exactly. observation and hypothesis from earlier AMS drafts are NOT valid.
- Two-hash algorithm (lines 22-34): content_hash = SHA256(content), chain_hash = SHA256(content_hash + parent.chain_hash). Colibri simplifies to a single hash over a canonical-JSON projection; see §4.
- 64-zero genesis hash (line 29): preserved as ZERO_HASH.
- Record schema (lines 51-67): donor has session_id, parent_id, content, content_hash, chain_hash, metadata{task_id, agent_id, ...}, created_at. Colibri flattens task_id and agent_id to top-level fields and renames parent.chain_hash → prev_hash, content_hash + chain_hash → hash. No session_id (P0.7.2 may re-introduce), no metadata subobject, no parent_id (linearised via prev_hash), no created_at (renamed timestamp).
- Tool surface (lines 158-165): 6 donor tools (thought_record, thought_get, thought_tree, thought_trail, thought_verify, thought_search). Phase-0 ζ ships only thought_record + audit_verify_chain per ADR-004 R74.5. Neither lands in P0.7.1 (P0.7.2 and P0.7.3 respectively).
- FTS5 search (lines 117-135): donor feature. Not in Phase 0.
- Thought trees / branching (lines 140-154): donor feature. Phase-0 uses a strict linear chain via prev_hash.
None of the donor source lands in P0.7.1. The algorithm is a full rewrite; only the type enum and the ZERO_HASH genesis constant carry over verbatim.
§8. Parallel-wave collision check
Wave C dispatches four tasks in parallel:
| Task | Owner files | Collision with P0.7.1? |
|---|---|---|
| P0.2.3 two-phase startup | src/server.ts (edits) | No. P0.7.1 does not touch server.ts. |
| P0.3.1 β state machine | src/domains/tasks/* | No. Sibling under src/domains/, disjoint directory. |
| P0.6.1 ε skill schema | src/domains/skills/* | No. Sibling under src/domains/, disjoint directory. |
| P0.7.1 (this task) | src/domains/trail/* + src/__tests__/trail-schema.test.ts | n/a |
No shared edits. The src/domains/ parent directory must exist (any of P0.3.1/P0.6.1/P0.7.1 may create it — mkdir -p is idempotent; no merge conflict risk given git tracks files not directories).
Shared infra files the dispatch prompt explicitly forbids this task from touching:
- src/server.ts, src/db/*, src/config.ts, src/modes.ts: confirmed not modified.
- package.json, jest.config.ts, tsconfig.json: no new deps, no config changes; confirmed not modified.
- src/domains/tasks/*, src/domains/skills/*: confirmed not touched.
§9. Baseline verification
$ git rev-parse HEAD
3ebbe419 (P0.2.2 SQLite init merged as PR #122)
$ git log -1 --format=%s
feat(p0-2-2): SQLite init — better-sqlite3 WAL+FK + PRAGMA user_version migrations (#122)
$ node --version
v20.x (from CI matrix — P0.1.3 established Node 20 floor)
$ cat package.json | grep -E "zod|better-sqlite3"
"better-sqlite3": "^11.5.0",
"zod": "^3.23.8",
All prerequisites from the dependency chain (P0.1.1 scaffolding, P0.1.2 Jest ESM, P0.1.4 config, P0.2.2 SQLite init) are present at HEAD. zod@^3.23.8 is available; node:crypto is a Node built-in (no package dependency).
§10. Exit criteria for Step 1
- All target files catalogued as absent in §1.
- Adjacent integration surfaces documented in §2.
- Spec vs packet deviation resolved in §3 (one: test file location).
- Hash-input subset rationale captured in §4.
- Canonical-JSON algorithm + test edge cases captured in §5.
- Acceptance criteria mapped to test paths in §6.
- Donor genealogy acknowledged as reference-only in §7.
- Parallel-wave collision check completed in §8.
- Baseline commit + toolchain verified in §9.
Ready to proceed to Step 2 (contract).