P0.7.1 — Step 2 Behavioral Contract
Binding contract for the P0.7.1 ζ Hash-Chained Record Schema: the pure schema and hashing primitives that the later P0.7.2 (`thought_record` CRUD) and P0.7.3 (`audit_verify_chain`) steps will consume. This contract is implementation-binding; deviations require an amended contract.
§1. Purity guarantees
`src/domains/trail/schema.ts` is a pure module. This is load-bearing.
- No eager side-effects at import time. Loading the module must not read `process.env`, open files, call `crypto.randomUUID()`, or construct a `Date` object. It may declare frozen constants (`THOUGHT_TYPES`, `ZERO_HASH`) and pure functions.
- No console output. Not `console.log`, not `console.error`. Errors propagate via `throw`.
- No filesystem I/O. Not `fs.*`.
- No network I/O. Not `fetch`, not `http.*`.
- No MCP registration. Does not import from `@modelcontextprotocol/sdk`. The tool that consumes this module (`thought_record`) is P0.7.2's territory.
- No DB access. Does not import from `src/db/index.ts`. Persistence is P0.7.2's territory.
- All exported functions are referentially transparent. Same input → same output, forever. `computeHash(record)` is deterministic given the exported `canonicalize`. `canonicalize(value)` is deterministic given JSON-serialisable input.
Corollary: the module has no state. There is no singleton instance, no cache, no lazy init. Every call to `computeHash` redoes the canonicalization + SHA-256 from scratch. This is acceptable because the surface is tiny and `node:crypto` is fast.
§2. Exported surface
The module exports exactly seven identifiers. Each is contracted below.
2a. THOUGHT_TYPES
export const THOUGHT_TYPES = ['plan', 'analysis', 'decision', 'reflection'] as const;
- Type: `readonly ['plan', 'analysis', 'decision', 'reflection']`.
- Frozen tuple. Declared `as const` so the union type derives cleanly.
- Order is canonical: `plan | analysis | decision | reflection`. Matches the donor extraction at `docs/reference/extractions/zeta-decision-trail-extraction.md` lines 42-47 and the task spec at `docs/guides/implementation/task-breakdown.md` line 326. Iteration order MAY be relied upon by callers (e.g. help-text output).
- Adding a new type is a breaking change: it rotates the union and invalidates every downstream Zod parse.
2b. ThoughtType
export type ThoughtType = (typeof THOUGHT_TYPES)[number];
- Equivalent to `'plan' | 'analysis' | 'decision' | 'reflection'`.
- Always derived from the tuple, never hard-coded. If the tuple changes, the type changes with it.
2c. ZERO_HASH
export const ZERO_HASH: string = '0'.repeat(64);
- Type: `string`.
- Value: exactly 64 ASCII `'0'` characters.
- Purpose: the `prev_hash` of the first record in a chain (the genesis record).
- Invariant: `ZERO_HASH.length === 64` and `/^0{64}$/.test(ZERO_HASH) === true`. Tests pin this.
- Lowercase hex. SHA-256 digests in this module are all lowercase hex; `ZERO_HASH` aligns.
- The genesis record is the only record whose `prev_hash` may equal `ZERO_HASH`. P0.7.3's chain verifier will check that exactly one record in a chain uses `ZERO_HASH` as `prev_hash` (the genesis record). P0.7.1 does not enforce this; the Zod schema accepts any 64-character string.
2d. ThoughtRecordSchema
export const ThoughtRecordSchema = z.object({
id: z.string().min(1),
type: z.enum(THOUGHT_TYPES),
task_id: z.string().min(1),
agent_id: z.string().min(1),
content: z.string(),
timestamp: z.string().min(1),
prev_hash: z.string().length(64),
hash: z.string().length(64),
});
- Validates shape only. Does NOT verify that `hash` was correctly computed over the other fields; that is P0.7.3's job (`audit_verify_chain`).
- `content: z.string()`, NOT `.min(1)`. An empty-string content is a valid (if pointless) thought; the chain algorithm must handle it. The test matrix includes empty-string content.
- `timestamp: z.string().min(1)`. Zod does not validate ISO 8601 format at this layer. Format validation is the caller's responsibility (e.g. P0.7.2's CRUD will construct timestamps via `new Date().toISOString()` and trust the round-trip). This keeps the schema decoupled from date-format choice.
- `prev_hash` and `hash` are `z.string().length(64)`. Format is NOT validated (no `.regex(/^[0-9a-f]{64}$/)`). Rationale: tightening to lowercase-hex here would duplicate logic that lives in `computeHash` (which produces lowercase hex), and would reject legitimately-parsed records if a future encoder switches to uppercase. The 64-char length is the only format invariant at the schema layer; content-hash verification is P0.7.3's territory.
2e. ThoughtRecord
export type ThoughtRecord = z.infer<typeof ThoughtRecordSchema>;
- Inferred from the Zod schema to keep type and runtime validation in lockstep.
- Equivalent to:
{
id: string;
type: ThoughtType;
task_id: string;
agent_id: string;
content: string;
timestamp: string;
prev_hash: string; // 64-char hex
hash: string; // 64-char hex
}
2f. canonicalize(value)
export function canonicalize(value: unknown): string;
- Produces the deterministic, whitespace-free JSON serialisation of `value`.
- Algorithm (recursive):
  - If `value` is `null`, return `'null'`.
  - If `value` is a primitive (boolean, number, string), return `JSON.stringify(value)`.
  - If `value` is an array, return `'[' + value.map(canonicalize).join(',') + ']'`.
  - If `value` is a plain object, sort its own enumerable keys ascending (ASCII code-point order, matching the `Array.prototype.sort()` default), then return `'{' + entries.map(([k, v]) => JSON.stringify(k) + ':' + canonicalize(v)).join(',') + '}'`.
  - `undefined` handling: the schema-level types never contain `undefined`, but for future-proofing, object keys whose value is `undefined` are skipped (matching native `JSON.stringify` behavior for object values).
- Determinism: for two calls with JSON-equal inputs, output is identical. Across Node versions, platforms, and process restarts.
- Arrays preserve insertion order (a JSON array is ordered; sorting would corrupt meaning).
- The output is valid JSON (parseable by `JSON.parse`).
- No whitespace. `JSON.stringify(obj)` with no second/third argument already produces no whitespace; the recursive path above replicates that.
- Not a wire format. Callers should not assume `canonicalize(value) === JSON.stringify(value)`; the point is deterministic sorted output, not compatibility with `stringify`.
- Throws `TypeError` if the input contains a circular reference, a `bigint`, a `symbol`, or a function value (same as native `JSON.stringify`'s failure modes). Tests cover at least the circular-reference case to pin this.
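The recursive algorithm above can be sketched as follows. This is an illustrative implementation under the contract's stated rules, not the shipped module; the `WeakSet` path-guard is one of the implementation options the error-handling section permits.

```typescript
// Illustrative sketch of the canonicalize contract (assumed implementation,
// not the shipped module). The WeakSet tracks the current traversal path so
// circular references throw TypeError without rejecting shared references.
function canonicalize(value: unknown, seen = new WeakSet<object>()): string {
  if (value === null) return 'null';
  const t = typeof value;
  if (t === 'boolean' || t === 'number' || t === 'string') return JSON.stringify(value);
  if (t === 'object') {
    const obj = value as object;
    if (seen.has(obj)) throw new TypeError('Converting circular structure to JSON');
    seen.add(obj);
    let out: string;
    if (Array.isArray(obj)) {
      out = '[' + obj.map((v) => canonicalize(v, seen)).join(',') + ']';
    } else {
      const entries = Object.entries(obj)
        // undefined / function / symbol members are skipped, matching JSON.stringify.
        .filter(([, v]) => v !== undefined && typeof v !== 'function' && typeof v !== 'symbol')
        // ASCII code-point ascending, the Array.prototype.sort() default.
        .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
      out = '{' + entries.map(([k, v]) => JSON.stringify(k) + ':' + canonicalize(v, seen)).join(',') + '}';
    }
    seen.delete(obj);
    return out;
  }
  // bigint / function / symbol at the top level: throw, per the contract above.
  throw new TypeError(`Cannot canonicalize value of type ${t}`);
}

console.log(canonicalize({ b: 1, a: 2 })); // {"a":2,"b":1}
```

Note the sketch deletes each object from `seen` after visiting it, so only genuine cycles throw, not a DAG that references the same sub-object twice.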
2g. computeHash(record)
export function computeHash(record: {
id: string;
type: ThoughtType;
task_id: string;
content: string;
timestamp: string;
prev_hash: string;
}): string;
- Input: the 6-field subset `{id, type, task_id, content, timestamp, prev_hash}`. NOT a full `ThoughtRecord`; `agent_id` and `hash` are excluded by design.
- Output: a 64-character lowercase-hex SHA-256 digest.
- Algorithm:
const input = {
id: record.id,
type: record.type,
task_id: record.task_id,
content: record.content,
timestamp: record.timestamp,
prev_hash: record.prev_hash,
};
const canonicalJson = canonicalize(input);
return createHash('sha256').update(canonicalJson, 'utf8').digest('hex');
- The function constructs the input object with explicit field selection: even if the caller passes a full `ThoughtRecord`, only the six subset fields flow into the hash. This is the critical exclusion invariant: two records differing only in `agent_id` (or only in the `hash` field itself) MUST produce identical `computeHash` outputs. Tests enforce this.
- `'utf8'` encoding is explicit (Node's default is also utf8, but we declare it for cross-platform robustness).
- `createHash` comes from `node:crypto`. No third-party hash library.
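Put together, the algorithm can be exercised end-to-end on a genesis record. This is a self-contained sketch: the record values are illustrative, and `canonicalizeFlat` is a simplified canonicalizer sufficient for the flat, string-valued 6-field subset.

```typescript
// Self-contained sketch: hashing a genesis record per the algorithm above.
// canonicalizeFlat is a simplified stand-in, adequate for flat string fields.
import { createHash } from 'node:crypto';

const ZERO_HASH = '0'.repeat(64);

function canonicalizeFlat(value: Record<string, string>): string {
  const keys = Object.keys(value).sort();
  return '{' + keys.map((k) => JSON.stringify(k) + ':' + JSON.stringify(value[k])).join(',') + '}';
}

function computeHash(record: {
  id: string;
  type: string;
  task_id: string;
  content: string;
  timestamp: string;
  prev_hash: string;
}): string {
  // Explicit field selection: extra fields on the caller's object never reach the digest.
  const input = {
    id: record.id,
    type: record.type,
    task_id: record.task_id,
    content: record.content,
    timestamp: record.timestamp,
    prev_hash: record.prev_hash,
  };
  return createHash('sha256').update(canonicalizeFlat(input), 'utf8').digest('hex');
}

const genesisHash = computeHash({
  id: 'r1',
  type: 'plan',
  task_id: 't1',
  content: 'first thought',
  timestamp: '2025-01-01T00:00:00.000Z',
  prev_hash: ZERO_HASH, // genesis record links to the zero sentinel
});
console.log(/^[0-9a-f]{64}$/.test(genesisHash)); // true
```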
§3. Hash-input subset — the agent_id exclusion
The spec:
hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))
excludes `agent_id` from the hash input. This is deliberate.
Chain integrity inputs:
- `id`: unique record identifier. Part of the chain's identity.
- `type`: classifies the thought. Part of the record's semantic payload.
- `task_id`: scopes the chain to a task. Prevents cross-task forgery.
- `content`: the actual thought text. The payload being anchored.
- `timestamp`: when the thought was recorded. Temporal witness.
- `prev_hash`: links to the previous record. The chain property.
Excluded:
- `agent_id`: metadata about the author. Not a chain-integrity input. A record rewritten with a corrected `agent_id` (e.g. fixing a wrong sub-agent attribution) should NOT cascade a chain-break through every subsequent record. The chain still proves "this id + this content at this timestamp follows from this prev_hash"; `agent_id` is recorded alongside but not anchored.
- `hash`: the output itself. Including it is impossible (self-reference).
The invariant is tested directly: two records that differ only in `agent_id` must produce identical hashes.
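That test shape can be sketched with minimal stand-ins for the contracted helpers (illustrative names and values, not the shipped module):

```typescript
// Demonstration of the agent_id exclusion invariant.
import { createHash } from 'node:crypto';

// The six hash-input fields, listed pre-sorted ASCII-ascending, so joining
// them yields the canonical JSON directly for flat string-valued records.
const HASH_FIELDS = ['content', 'id', 'prev_hash', 'task_id', 'timestamp', 'type'] as const;

function computeHash(record: Record<string, string>): string {
  const json =
    '{' + HASH_FIELDS.map((k) => JSON.stringify(k) + ':' + JSON.stringify(record[k])).join(',') + '}';
  return createHash('sha256').update(json, 'utf8').digest('hex');
}

const base = {
  id: 'r1',
  type: 'plan',
  task_id: 't1',
  content: 'c',
  timestamp: '2025-01-01T00:00:00.000Z',
  prev_hash: '0'.repeat(64),
};

// agent_id differs; the digest must not.
const h1 = computeHash({ ...base, agent_id: 'orchestrator' });
const h2 = computeHash({ ...base, agent_id: 'sub-agent' });
console.log(h1 === h2); // true: attribution fixes never cascade a chain-break
```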
§4. Canonical JSON — determinism contract
The canonicalizer enforces these invariants:
- Sorted keys at every depth. Object keys are sorted ASCII-ascending at every nesting level. Nested objects (e.g. if Phase 1+ extends `content` to be an object) get the same treatment.
- Insertion-order-agnostic. `canonicalize({b: 1, a: 2}) === canonicalize({a: 2, b: 1})`. Tested.
- No whitespace. Output string has no spaces, newlines, or tabs. Exactly matches `JSON.stringify(sortedTree)` with no indent argument.
- Array order preserved. `canonicalize([3,1,2]) === '[3,1,2]'`, NOT `'[1,2,3]'`. Arrays are ordered data structures.
- Primitives unchanged. `canonicalize(42) === '42'`, `canonicalize(true) === 'true'`, `canonicalize(null) === 'null'`, `canonicalize('hi') === '"hi"'`.
- Cross-platform determinism. Tests assert that the same canonicalized string is produced by two calls in the same process; CI (Linux) and local (Windows) runs both pass the same snapshot-style assertion (identical hash value from a fixed-input record).
4a. Cross-insertion-order test case
const a = { id: 'r1', type: 'plan', task_id: 't1', content: 'c', timestamp: 'ts', prev_hash: ZERO_HASH };
const b = { prev_hash: ZERO_HASH, timestamp: 'ts', content: 'c', task_id: 't1', type: 'plan', id: 'r1' };
expect(computeHash(a)).toBe(computeHash(b));
4b. Nested-object future-proofing test
const x = { b: { d: 1, c: 2 }, a: 3 };
const y = { a: 3, b: { c: 2, d: 1 } };
expect(canonicalize(x)).toBe(canonicalize(y));
expect(canonicalize(x)).toBe('{"a":3,"b":{"c":2,"d":1}}');
Even though Phase-0 `content` is a string, the canonicalizer is tested against nested shapes to lock the recursion invariant before P0.7.2 ships CRUD.
§5. Error handling
5a. canonicalize
- Circular reference: throws `TypeError` (inherits native `JSON.stringify` behavior). Detected either by delegating to native stringify or by an explicit traversal check. Implementation MAY use `JSON.stringify` with a sorted-key replacer function, which handles circularity natively, or a manual walk with a `WeakSet` guard.
- `bigint`: throws `TypeError` with a message including `'BigInt'`. Native `JSON.stringify` throws "Do not know how to serialize a BigInt".
- Function / symbol values in objects: skipped (same as native `JSON.stringify`). A test covers this to pin the behavior.
5b. computeHash
- Propagates any error from `canonicalize` unchanged.
- Does not validate input against `ThoughtRecordSchema`; callers should pre-validate if they want Zod's error messages. This keeps `computeHash` fast in the hot path (the CRUD layer will validate once, then hash).
5c. Zod validation
- `ThoughtRecordSchema.parse(raw)` throws `ZodError` on shape mismatch. Standard Zod behavior.
- `ThoughtRecordSchema.safeParse(raw)` returns `{success, data | error}`. Callers choose.
§6. Wire format compatibility
The ThoughtRecord shape is wire-safe:
- All fields are `string`, a JSON-RPC primitive. No `Date` objects, no `Buffer`, no `bigint`.
- Trivial `JSON.stringify(record)` round-trips via `ThoughtRecordSchema.parse(JSON.parse(...))`.
- `prev_hash` and `hash` are lowercase hex strings of length 64. No binary encoding.
- P0.7.2's `thought_record` MCP tool can use `ThoughtRecord` directly as its tool-result schema with no transformation.
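The round-trip claim can be sketched without importing the schema itself (illustrative record values; the zod re-validation step is shown only as a comment):

```typescript
// Wire-safety sketch: all fields are JSON-primitive strings, so a plain
// stringify → parse round-trip reproduces the record exactly.
const record = {
  id: 'r1',
  type: 'plan',
  task_id: 't1',
  agent_id: 'orchestrator',
  content: 'first thought',
  timestamp: '2025-01-01T00:00:00.000Z',
  prev_hash: '0'.repeat(64),
  hash: 'a'.repeat(64), // placeholder digest for illustration
};

const wire = JSON.stringify(record);
const parsed = JSON.parse(wire);
// Real consumers would now run: ThoughtRecordSchema.parse(parsed)
console.log(JSON.stringify(parsed) === wire); // true
```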
§7. Non-goals
The following are explicitly out of scope for P0.7.1 and are reserved for later ζ tasks:
- Persistence. No `INSERT INTO thought_records`. P0.7.2 owns `src/domains/trail/repository.ts` and `004_zeta.sql`.
- `thought_record` MCP tool. No registration against the server. P0.7.2.
- `audit_verify_chain` MCP tool. No chain walk. P0.7.3.
- Chain linking. No "find the latest record's hash and use it as my prev_hash" helper. Callers supply `prev_hash` explicitly. P0.7.2 will add a helper.
- ID generation. No `crypto.randomUUID()` call. Callers supply `id`. P0.7.2 will generate.
- Timestamp generation. No `new Date().toISOString()`. Callers supply `timestamp`. P0.7.2 will generate.
- `session_id` field. The donor had this (extraction line 58). Phase-0 ζ does not; P0.7.2 may re-introduce it if needed, out of scope here.
- Branching / thought trees. The donor had this (extraction lines 140-154). Phase-0 is a strict linear chain via `prev_hash`.
§8. Test matrix (binding for Step 3/Step 4)
The Step 3 packet will expand this into exact test names. The contract pins the minimum set of tested behaviors:
| # | Behavior | Assertion |
|---|---|---|
| 1 | `THOUGHT_TYPES` tuple | length 4, matches `['plan', 'analysis', 'decision', 'reflection']` |
| 2 | `ZERO_HASH` format | `length === 64`, matches `/^0{64}$/` |
| 3 | `ThoughtRecordSchema` accepts valid record | `.parse(valid)` returns the record unchanged |
| 4 | `ThoughtRecordSchema` rejects each missing field | `.safeParse()` returns `{success: false}` for each of 8 field-omission cases |
| 5 | `ThoughtRecordSchema` rejects invalid type | `.safeParse({...valid, type: 'observation'})` → failure |
| 6 | `ThoughtRecordSchema` rejects 63-char `prev_hash` / `hash` | length-64 invariant enforced |
| 7 | `ThoughtRecordSchema` rejects 65-char `prev_hash` / `hash` | length-64 invariant enforced |
| 8 | `canonicalize` primitives | string, number, boolean, null each match `JSON.stringify` output |
| 9 | `canonicalize` sorts object keys | `canonicalize({b:1,a:2}) === '{"a":2,"b":1}'` |
| 10 | `canonicalize` recurses into nested objects | `{b:{d:1,c:2},a:3}` and `{a:3,b:{c:2,d:1}}` produce identical strings |
| 11 | `canonicalize` preserves array order | `[3,1,2]` → `'[3,1,2]'` |
| 12 | `canonicalize` insertion-order-agnostic | two objects with same keys in different orders produce identical strings |
| 13 | `canonicalize` skips undefined object values | `{a:1,b:undefined,c:3}` → `'{"a":1,"c":3}'` |
| 14 | `canonicalize` throws on circular reference | `TypeError` |
| 15 | `computeHash` produces 64-char lowercase hex | `/^[0-9a-f]{64}$/.test(out)` |
| 16 | `computeHash` deterministic (same input twice) | two calls return identical strings |
| 17 | `computeHash` excludes `agent_id` from the hash input | two full records differing only in `agent_id` produce identical `computeHash` (using extra-field call) |
| 18 | `computeHash` sensitive to `id` change | changing `id` changes the hash |
| 19 | `computeHash` sensitive to `type` change | changing `type` changes the hash |
| 20 | `computeHash` sensitive to `task_id` change | changing `task_id` changes the hash |
| 21 | `computeHash` sensitive to `content` change | changing `content` changes the hash |
| 22 | `computeHash` sensitive to `timestamp` change | changing `timestamp` changes the hash |
| 23 | `computeHash` sensitive to `prev_hash` change | changing `prev_hash` changes the hash |
| 24 | `computeHash` insertion-order-agnostic | passing `{id,...}`-first vs `{prev_hash,...}`-first yields the same hash |
| 25 | `computeHash` first-record genesis | record with `prev_hash = ZERO_HASH` hashes to a stable value (fixed-input snapshot) |
25 tests minimum. The packet may add edge cases; it cannot remove any.
§9. Coverage target
- `src/domains/trail/schema.ts`: 100% statements, 100% functions, 100% lines, ≥95% branches.
- Small branch-coverage leeway (≥95% instead of 100%) is allowed for the `canonicalize` type-dispatch, specifically the function/symbol-value skip, which may be dead code under TS's strict types but is worth keeping defensively for runtime safety.
§10. Contract acceptance
- Purity guarantees explicit (§1).
- Seven exports catalogued with types + invariants (§2).
- `agent_id` exclusion rationale documented (§3).
- Canonical-JSON determinism invariants enumerated (§4).
- Error-handling contract for each function (§5).
- Wire format compatibility asserted (§6).
- Non-goals fenced (§7).
- 25-case minimum test matrix (§8).
- Coverage target set (§9).
Ready to proceed to Step 3 (packet).