P0.7.1 — Step 2 Behavioral Contract

Binding contract for P0.7.1 ζ Hash-Chained Record Schema — the pure schema + hashing primitives that later P0.7.2 (thought_record CRUD) and P0.7.3 (audit_verify_chain) will consume. This contract is implementation-binding; deviations require an amended contract.


§1. Purity guarantees

src/domains/trail/schema.ts is a pure module. This is load-bearing.

  • No eager side-effects at import time. Loading the module must not read process.env, open files, call crypto.randomUUID(), or construct a Date object. It may declare frozen constants (THOUGHT_TYPES, ZERO_HASH) and pure functions.
  • No console output. Not console.log, not console.error. Errors propagate via throw.
  • No filesystem I/O. Not fs.*.
  • No network I/O. Not fetch, not http.*.
  • No MCP registration. Does not import from @modelcontextprotocol/sdk. The tool that consumes this module (thought_record) is P0.7.2’s territory.
  • No DB access. Does not import from src/db/index.ts. Persistence is P0.7.2’s territory.
  • All exported functions are referentially transparent. Same input → same output, forever. computeHash(record) is deterministic given the exported canonicalize. canonicalize(value) is deterministic given JSON-serialisable input.

Corollary: the module has no state. There is no singleton instance, no cache, no lazy init. Every call to computeHash redoes the canonicalization + SHA-256 from scratch. This is acceptable because the surface is tiny and node:crypto is fast.


§2. Exported surface

The module exports exactly seven identifiers. Each is contracted below.

2a. THOUGHT_TYPES

export const THOUGHT_TYPES = ['plan', 'analysis', 'decision', 'reflection'] as const;
  • Type: readonly ['plan', 'analysis', 'decision', 'reflection'].
  • Frozen tuple. Declared as const so the union type derives cleanly.
  • Order is canonical: plan | analysis | decision | reflection. Matches the donor extraction at docs/reference/extractions/zeta-decision-trail-extraction.md lines 42-47 and the task spec at docs/guides/implementation/task-breakdown.md line 326. Iteration order MAY be relied upon by callers (e.g. help-text output).
  • Adding a new type is a breaking change: it widens the union, and records using the new type fail validation against any downstream Zod schema built from the old tuple.

2b. ThoughtType

export type ThoughtType = (typeof THOUGHT_TYPES)[number];
  • Equivalent to 'plan' | 'analysis' | 'decision' | 'reflection'.
  • Always derived from the tuple, never hard-coded. If the tuple changes, the type changes with it.

2c. ZERO_HASH

export const ZERO_HASH: string = '0'.repeat(64);
  • Type: string.
  • Value: exactly 64 ASCII '0' characters.
  • Purpose: the prev_hash of the first record in a chain (the genesis record).
  • Invariant: ZERO_HASH.length === 64 and /^0{64}$/.test(ZERO_HASH) === true. Tests pin this.
  • Lowercase hex. SHA-256 digests in this module are all lowercase hex; ZERO_HASH aligns.
  • This is the only string value for which prev_hash may equal ZERO_HASH. P0.7.3’s chain verifier will check that exactly one record in a chain uses ZERO_HASH as prev_hash (the genesis record). P0.7.1 does not enforce this; the Zod schema accepts any 64-char hex.

2d. ThoughtRecordSchema

export const ThoughtRecordSchema = z.object({
  id: z.string().min(1),
  type: z.enum(THOUGHT_TYPES),
  task_id: z.string().min(1),
  agent_id: z.string().min(1),
  content: z.string(),
  timestamp: z.string().min(1),
  prev_hash: z.string().length(64),
  hash: z.string().length(64),
});
  • Validates shape only. Does NOT verify that hash was correctly computed over the other fields — that is P0.7.3’s job (audit_verify_chain).
  • content: z.string() — NOT .min(1). An empty-string content is a valid (if pointless) thought; the chain algorithm must handle it. The test matrix includes empty-string content.
  • timestamp: z.string().min(1) — Zod does not validate ISO 8601 format at this layer. Format validation is the caller’s responsibility (e.g. P0.7.2’s CRUD will construct timestamps via new Date().toISOString() and trust the round-trip). This keeps the schema decoupled from date-format choice.
  • prev_hash and hash are z.string().length(64). Format is NOT validated (no .regex(/^[0-9a-f]{64}$/)). Rationale: tightening to lowercase-hex here would duplicate logic that lives in computeHash (which produces lowercase hex), and would reject legitimately-parsed records if a future encoder switches to uppercase. The 64-char length is the only format invariant at the schema layer; content-hash verification is P0.7.3’s territory.

2e. ThoughtRecord

export type ThoughtRecord = z.infer<typeof ThoughtRecordSchema>;
  • Inferred from the Zod schema to keep type and runtime validation in lockstep.
  • Equivalent to:
{
  id: string;
  type: ThoughtType;
  task_id: string;
  agent_id: string;
  content: string;
  timestamp: string;
  prev_hash: string;  // 64-char hex
  hash: string;       // 64-char hex
}

2f. canonicalize(value)

export function canonicalize(value: unknown): string;
  • Produces the deterministic, whitespace-free JSON serialisation of value.
  • Algorithm (recursive):
    1. If value is null, return 'null'.
    2. If value is a primitive (boolean, number, string), return JSON.stringify(value).
    3. If value is an array, return '[' + items.map(canonicalize).join(',') + ']' where items = value.
    4. If value is a plain object, sort its own enumerable keys ascending (ASCII code-point order, matching Array.prototype.sort() default), then return '{' + entries.map(([k, v]) => JSON.stringify(k) + ':' + canonicalize(v)).join(',') + '}'.
    5. The schema-level types never contain undefined, but for future-proofing: undefined values inside objects are skipped, matching native JSON.stringify behavior for object values. Top-level undefined is outside the contract.
  • Determinism: for two calls with JSON-equal inputs, output is identical. Across Node versions, platforms, and process restarts.
  • Arrays preserve insertion order (a JSON array is ordered; sorting would corrupt meaning).
  • The output is valid JSON (parseable by JSON.parse).
  • No whitespace. JSON.stringify(obj) with no second/third argument already produces no whitespace; the recursive path above replicates that.
  • Not a wire-format commitment. Callers should not assume canonicalize(value) === JSON.stringify(value); the point is deterministic sorted output, not compatibility with stringify.
  • Throws TypeError if the input contains a circular reference or a bigint (matching native JSON.stringify’s failure modes). Function and symbol values inside objects are skipped rather than rejected (§5a). Tests cover at least the circular-reference case to pin this.
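The recursive algorithm above can be sketched as a manual walk with a WeakSet circularity guard (one of the two implementation options named in §5a; the shipped module may differ in detail):

```typescript
// Sketch only: a manual recursive canonicalizer with a WeakSet cycle guard.
function canonicalize(value: unknown, seen = new WeakSet<object>()): string {
  if (value === null) return 'null';
  const t = typeof value;
  if (t === 'boolean' || t === 'number' || t === 'string') {
    return JSON.stringify(value); // primitives match native stringify
  }
  if (t === 'bigint') throw new TypeError('Do not know how to serialize a BigInt');
  if (t !== 'object') throw new TypeError(`Cannot canonicalize a ${t}`);
  const obj = value as Record<string, unknown>;
  if (seen.has(obj)) throw new TypeError('Converting circular structure to JSON');
  seen.add(obj);
  let out: string;
  if (Array.isArray(obj)) {
    // Arrays are ordered data: preserve insertion order, never sort.
    out = '[' + obj.map((item) => canonicalize(item, seen)).join(',') + ']';
  } else {
    // Sort own enumerable keys ASCII-ascending; skip undefined/function/symbol
    // values, matching native JSON.stringify behavior for object values.
    out = '{' + Object.keys(obj)
      .sort()
      .filter((k) => !['undefined', 'function', 'symbol'].includes(typeof obj[k]))
      .map((k) => JSON.stringify(k) + ':' + canonicalize(obj[k], seen))
      .join(',') + '}';
  }
  seen.delete(obj); // allow shared (non-circular) references elsewhere
  return out;
}
```

For example, canonicalize({ b: 1, a: 2 }) yields '{"a":2,"b":1}' while canonicalize([3, 1, 2]) yields '[3,1,2]'.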

2g. computeHash(record)

export function computeHash(record: {
  id: string;
  type: ThoughtType;
  task_id: string;
  content: string;
  timestamp: string;
  prev_hash: string;
}): string;
  • Input: the 6-field subset {id, type, task_id, content, timestamp, prev_hash}. NOT a full ThoughtRecord: agent_id and hash are excluded by design.
  • Output: a 64-character lowercase-hex SHA-256 digest.
  • Algorithm:
const input = {
  id: record.id,
  type: record.type,
  task_id: record.task_id,
  content: record.content,
  timestamp: record.timestamp,
  prev_hash: record.prev_hash,
};
const canonicalJson = canonicalize(input);
return createHash('sha256').update(canonicalJson, 'utf8').digest('hex');
  • The function constructs the input object with explicit field selection — even if the caller passes a full ThoughtRecord, only the six subset fields flow into the hash. This is the critical exclusion invariant: two records differing only in agent_id (or only in the hash field itself) MUST produce identical computeHash outputs. Tests enforce this.
  • 'utf8' encoding is explicit (Node’s default is also utf8, but we declare it for cross-platform robustness).
  • createHash comes from node:crypto. No third-party hash library.

§3. Hash-input subset — the agent_id exclusion

The spec:

hash = SHA-256(canonical_JSON({id, type, task_id, content, timestamp, prev_hash}))

excludes agent_id from the hash input. This is deliberate.

Chain integrity inputs:

  • id — unique record identifier. Part of the chain’s identity.
  • type — classifies the thought. Part of the record’s semantic payload.
  • task_id — scopes the chain to a task. Prevents cross-task forgery.
  • content — the actual thought text. The payload being anchored.
  • timestamp — when the thought was recorded. Temporal witness.
  • prev_hash — links to the previous record. The chain property.

Excluded:

  • agent_id — metadata about the author. Not a chain-integrity input. A record rewritten with a corrected agent_id (e.g. fixing a wrong sub-agent attribution) should NOT cascade a chain-break through every subsequent record. The chain still proves “this id + this content at this timestamp follows from this prev_hash”; agent_id is recorded alongside but not anchored.
  • hash — the output itself. Including it is impossible (self-reference).

The invariant is tested directly: two records that differ only in agent_id must produce identical hashes.
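The exclusion invariant can be demonstrated end-to-end. A self-contained sketch (the inline hashSubset is an illustrative stand-in for computeHash, using a sorted array replacer that is only correct for this flat, all-string shape; the real module uses the full canonicalize):

```typescript
import { createHash } from 'node:crypto';

// Illustrative stand-in for computeHash, valid for flat string fields only.
function hashSubset(r: {
  id: string; type: string; task_id: string;
  content: string; timestamp: string; prev_hash: string;
}): string {
  // Explicit field selection: agent_id and hash never reach the digest.
  const input = {
    id: r.id, type: r.type, task_id: r.task_id,
    content: r.content, timestamp: r.timestamp, prev_hash: r.prev_hash,
  };
  // A sorted array replacer yields sorted keys for this flat object.
  const canonical = JSON.stringify(input, Object.keys(input).sort());
  return createHash('sha256').update(canonical, 'utf8').digest('hex');
}

const ZERO_HASH = '0'.repeat(64);
const base = {
  id: 'r1', type: 'plan', task_id: 't1', content: 'c',
  timestamp: '2024-01-01T00:00:00.000Z', prev_hash: ZERO_HASH,
};
const byAlice = { ...base, agent_id: 'alice', hash: 'x'.repeat(64) };
const byBob = { ...base, agent_id: 'bob', hash: 'y'.repeat(64) };

// Records differing only in agent_id (and hash) digest identically.
console.log(hashSubset(byAlice) === hashSubset(byBob)); // true
```

Changing any of the six anchored fields (e.g. content) changes the digest, while agent_id and hash never participate.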


§4. Canonical JSON — determinism contract

The canonicalizer enforces these invariants:

  1. Sorted keys at every depth. Object keys are sorted ASCII-ascending at every nesting level. Nested objects (e.g. if Phase 1+ extends content to be an object) get the same treatment.
  2. Insertion-order-agnostic. canonicalize({b: 1, a: 2}) === canonicalize({a: 2, b: 1}). Tested.
  3. No whitespace. Output string has no spaces, newlines, or tabs. Exactly matches JSON.stringify(sortedTree) with no indent argument.
  4. Array order preserved. canonicalize([3,1,2]) === '[3,1,2]', NOT '[1,2,3]'. Arrays are ordered data structures.
  5. Primitives unchanged. canonicalize(42) === '42', canonicalize(true) === 'true', canonicalize(null) === 'null', canonicalize('hi') === '"hi"'.
  6. Cross-platform determinism. Tests assert that the same canonicalized string is produced by two calls in the same process; CI (Linux) and local (Windows) runs both pass the same snapshot-style assertion (identical hash value from a fixed-input record).

4a. Cross-insertion-order test case

const a = { id: 'r1', type: 'plan', task_id: 't1', content: 'c', timestamp: 'ts', prev_hash: ZERO_HASH };
const b = { prev_hash: ZERO_HASH, timestamp: 'ts', content: 'c', task_id: 't1', type: 'plan', id: 'r1' };
expect(computeHash(a)).toBe(computeHash(b));

4b. Nested-object future-proofing test

const x = { b: { d: 1, c: 2 }, a: 3 };
const y = { a: 3, b: { c: 2, d: 1 } };
expect(canonicalize(x)).toBe(canonicalize(y));
expect(canonicalize(x)).toBe('{"a":3,"b":{"c":2,"d":1}}');

Even though Phase-0 content is string, the canonicalizer is tested against nested shapes to lock the recursion invariant before P0.7.2 ships CRUD.


§5. Error handling

5a. canonicalize

  • Circular reference: throws TypeError, matching native JSON.stringify. Implementation MAY use a manual walk with a WeakSet guard, or JSON.stringify with a key-sorting replacer; note that a replacer which returns a fresh key-sorted copy of each object defeats native cycle detection (every visit hands stringify a new object), so a replacer-based implementation still needs its own seen-set.
  • bigint: throws TypeError with message including 'BigInt'. Native JSON.stringify throws "Do not know how to serialize a BigInt".
  • Function / symbol values in objects: skipped (same as native JSON.stringify). A test covers this to pin the behavior.

5b. computeHash

  • Propagates any error from canonicalize unchanged.
  • Does not validate input against ThoughtRecordSchema — callers should pre-validate if they want Zod’s error messages. This keeps computeHash fast in the hot path (CRUD layer will validate once, then hash).

5c. Zod validation

  • ThoughtRecordSchema.parse(raw) throws ZodError on shape mismatch. Standard Zod behavior.
  • ThoughtRecordSchema.safeParse(raw) returns {success, data | error}. Callers choose.

§6. Wire format compatibility

The ThoughtRecord shape is wire-safe:

  • All fields are string — JSON-RPC primitive. No Date objects, no Buffer, no bigint.
  • JSON.stringify(record) round-trips cleanly: ThoughtRecordSchema.parse(JSON.parse(JSON.stringify(record))) returns an equal record.
  • prev_hash and hash are lowercase hex strings of length 64. No binary encoding.
  • P0.7.2’s thought_record MCP tool can use ThoughtRecord directly as its tool-result schema with no transformation.

§7. Non-goals

The following are explicitly out of scope for P0.7.1 and are reserved for later ζ tasks:

  • Persistence. No INSERT INTO thought_records. P0.7.2 owns src/domains/trail/repository.ts and 004_zeta.sql.
  • thought_record MCP tool. No registration against the server. P0.7.2.
  • audit_verify_chain MCP tool. No chain walk. P0.7.3.
  • Chain linking. No “find the latest record’s hash and use it as my prev_hash” helper. Callers supply prev_hash explicitly. P0.7.2 will add a helper.
  • ID generation. No crypto.randomUUID() call. Callers supply id. P0.7.2 will generate.
  • Timestamp generation. No new Date().toISOString(). Callers supply timestamp. P0.7.2 will generate.
  • session_id field. Donor had this (extraction line 58). Phase-0 ζ does not — P0.7.2 may re-introduce if needed, out of scope here.
  • Branching / thought trees. Donor had this (extraction lines 140-154). Phase-0 is a strict linear chain via prev_hash.

§8. Test matrix (binding for Step 3/Step 4)

The Step 3 packet will expand this into exact test names. The contract pins the minimum set of tested behaviors:

  1. THOUGHT_TYPES tuple: length 4, matches ['plan', 'analysis', 'decision', 'reflection'].
  2. ZERO_HASH format: length === 64, matches /^0{64}$/.
  3. ThoughtRecordSchema accepts valid record: .parse(valid) returns the record unchanged.
  4. ThoughtRecordSchema rejects each missing field: .safeParse() returns {success: false} for each of 8 field-omission cases.
  5. ThoughtRecordSchema rejects invalid type: .safeParse({...valid, type: 'observation'}) → failure.
  6. ThoughtRecordSchema rejects 63-char prev_hash / hash: length-64 invariant enforced.
  7. ThoughtRecordSchema rejects 65-char prev_hash / hash: length-64 invariant enforced.
  8. canonicalize primitives: string, number, boolean, null each match JSON.stringify output.
  9. canonicalize sorts object keys: canonicalize({b:1,a:2}) === '{"a":2,"b":1}'.
  10. canonicalize recurses into nested objects: {b:{d:1,c:2},a:3} and {a:3,b:{c:2,d:1}} produce identical strings.
  11. canonicalize preserves array order: canonicalize([3,1,2]) === '[3,1,2]'.
  12. canonicalize insertion-order-agnostic: two objects with the same keys in different orders produce identical strings.
  13. canonicalize skips undefined object values: canonicalize({a:1,b:undefined,c:3}) === '{"a":1,"c":3}'.
  14. canonicalize throws on circular reference: TypeError.
  15. computeHash produces 64-char lowercase hex: /^[0-9a-f]{64}$/.test(out).
  16. computeHash deterministic (same input twice): two calls return identical strings.
  17. computeHash excludes agent_id from the hash input: two full records differing only in agent_id produce identical computeHash (using extra-field call).
  18. computeHash sensitive to id change: changing id changes the hash.
  19. computeHash sensitive to type change: changing type changes the hash.
  20. computeHash sensitive to task_id change: changing task_id changes the hash.
  21. computeHash sensitive to content change: changing content changes the hash.
  22. computeHash sensitive to timestamp change: changing timestamp changes the hash.
  23. computeHash sensitive to prev_hash change: changing prev_hash changes the hash.
  24. computeHash insertion-order-agnostic: passing {id, ...} first vs {prev_hash, ...} first yields the same hash.
  25. computeHash genesis record: a record with prev_hash = ZERO_HASH hashes to a stable value (fixed-input snapshot).

25 tests minimum. The packet may add edge cases; it cannot remove any.


§9. Coverage target

  • src/domains/trail/schema.ts — 100% statements, 100% functions, 100% lines, ≥95% branches.
  • Small branch-coverage leeway (≥95% instead of 100%) is allowed for the canonicalize type-dispatch: specifically the function/symbol-value skip, which may be dead code under TS’s strict types but is worth keeping defensively for runtime safety.

§10. Contract acceptance

  • Purity guarantees explicit (§1).
  • Seven exports catalogued with types + invariants (§2).
  • agent_id exclusion rationale documented (§3).
  • Canonical-JSON determinism invariants enumerated (§4).
  • Error-handling contract for each function (§5).
  • Wire format compatibility asserted (§6).
  • Non-goals fenced (§7).
  • 25-case minimum test matrix (§8).
  • Coverage target set (§9).

Ready to proceed to Step 3 (packet).


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
