Contract: P0.8.1 η Merkle Tree Construction

Task: P0.8.1 — first η (Proof Store) surface Branch: feature/p0-8-1-merkle-tree Date: 2026-04-17 Status: APPROVED (self-gated per packet gate rule)

1. Purpose

Define the behavioral contract for the three public exports of src/domains/proof/merkle.ts:

buildMerkleTree(recordHashes) — construct a deterministic Merkle tree
generateProof(tree, leafHash) — produce a membership proof for a leaf
verifyProof(root, proof, leafHash) — verify a membership proof
EMPTY_TREE_ROOT — constant for the empty-tree case

These are pure functions. No side effects. No module-level state. No DB access.

2. Type signatures

import { MerkleTree } from 'merkletreejs';

/** Exported type alias for merkletreejs proof element shape. */
export type MerkleProofNode = { position: 'left' | 'right'; data: Buffer };

/** Full proof: ordered array of sibling nodes along the path to root. */
export type MerkleProof = MerkleProofNode[];

/** Return type of buildMerkleTree. */
export interface MerkleTreeResult {
  root: string;    // 64-char lowercase hex SHA-256
  tree: MerkleTree;
}

2.1 `EMPTY_TREE_ROOT`

export const EMPTY_TREE_ROOT: string;

Value: 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855' (SHA-256 of the empty string '', lowercase hex, 64 characters)

Rationale: When a caller provides zero hashes, the tree’s root is defined as sha256('') rather than merkletreejs’s default (empty Buffer). This gives callers a deterministic, non-empty 64-char string that is consistent with the SHA-256 convention used throughout Colibri (ζ uses 64-char hex hashes). The choice of sha256('') is documented here as the canonical empty-tree sentinel.

2.2 `buildMerkleTree`

export function buildMerkleTree(recordHashes: string[]): MerkleTreeResult

Pre-conditions:

recordHashes is an array of strings. Each element SHOULD be a 64-char lowercase hex string (SHA-256 digest). The function does NOT validate individual leaf format — callers are responsible for passing valid hex. Non-hex strings will produce a tree but generateProof/verifyProof results will be meaningless.

Post-conditions:

Returns { root, tree }.
If recordHashes.length === 0: root === EMPTY_TREE_ROOT.
If recordHashes.length > 0:
- root is a 64-char lowercase hex string.
- tree is a MerkleTree instance constructed with sortPairs: true.
- Calling buildMerkleTree twice with the same set of hashes (regardless of input array order) produces identical root values — this is the determinism guarantee. Guaranteed by merkletreejs’s sortPairs: true option.
The tree object is NOT cached. Each call produces a fresh MerkleTree instance.

Throws: Does not throw under normal operation. If Buffer.from(h, 'hex') fails (malformed input), Node.js will not throw — it will produce a Buffer from whatever bytes it can parse, which may cause an incorrect but non-crashing result. Callers must validate inputs before calling.

Internal construction:

const sha256 = (data: Buffer): Buffer =>
  createHash('sha256').update(data).digest();

// Both options required for full order-independence:
// sortLeaves: true — sorts leaf Buffers before construction
// sortPairs: true  — sorts siblings before each internal hash
const TREE_OPTIONS = { sortPairs: true, sortLeaves: true };

const leaves = recordHashes.map(h => Buffer.from(h, 'hex'));
const tree = new MerkleTree(leaves, sha256, TREE_OPTIONS);
const root = recordHashes.length === 0
  ? EMPTY_TREE_ROOT
  : tree.getRoot().toString('hex');

Note on sortPairs vs sortLeaves: sortPairs: true alone sorts sibling pairs at each internal level but does NOT sort the initial leaf order — so two calls with the same hashes in different orders produce different roots unless sortLeaves: true is ALSO set. Both options are required for the determinism guarantee. The same TREE_OPTIONS must be passed to MerkleTree.verify.

2.3 `generateProof`

export function generateProof(tree: MerkleTree, leafHash: string): MerkleProof

Pre-conditions:

tree is a MerkleTree instance from buildMerkleTree (must be non-empty).
leafHash is the same 64-char hex string that was passed into buildMerkleTree.

Post-conditions:

Returns an array of { position: 'left' | 'right', data: Buffer } nodes — the standard merkletreejs proof shape returned by tree.getProof(leaf).
If leafHash is NOT a leaf of the tree, returns an empty array [].
The proof is minimal: length === Math.ceil(Math.log2(n)) for a balanced tree with n leaves (approximately).
The returned objects are direct references from merkletreejs. Do NOT mutate them.

Proof shape rationale: We use merkletreejs’s native proof shape ({ position, data }) rather than a normalized string shape. This avoids a re-encoding round-trip and keeps the proof usable directly by verifyProof and by merkletreejs’s own tree.verify. Future P0.8.3 (merkle_finalize) may serialize proofs to JSON for storage; that serialization is a P0.8.3 concern.

2.4 `verifyProof`

export function verifyProof(
  root: string,
  proof: MerkleProof,
  leafHash: string
): boolean

Pre-conditions:

root — 64-char lowercase hex string (the root returned by buildMerkleTree).
proof — array of MerkleProofNode (from generateProof).
leafHash — 64-char lowercase hex string (must match one of the original leaves).

Post-conditions:

Returns true if and only if the proof is a valid Merkle path from leafHash to root.
Returns false for:
- Empty proof array with non-matching leaf/root
- Tampered proof nodes
- Wrong leafHash / root combination
- A valid proof from a different tree (different root)
Deterministic: same inputs → same output, always.

Implementation:

const leaf = Buffer.from(leafHash, 'hex');
const rootBuf = Buffer.from(root, 'hex');
return MerkleTree.verify(proof, leaf, rootBuf, sha256, { sortPairs: true });

Using the static MerkleTree.verify (or equivalent instance method) is preferred over re-constructing a tree. Must pass the same hash function and options as used in construction to get a correct result.

3. Module-level purity guarantees

No module-level state: no cached trees, no singletons.
No side effects at import time: EMPTY_TREE_ROOT is computed once at module initialization using a single createHash call, which is a pure operation.
No DB access, no env reads, no console output.
Every call to buildMerkleTree creates a new MerkleTree instance.

4. Export manifest

// Named exports only — no default export.
export const EMPTY_TREE_ROOT: string;
export type MerkleProofNode = { position: 'left' | 'right'; data: Buffer };
export type MerkleProof = MerkleProofNode[];
export interface MerkleTreeResult { root: string; tree: MerkleTree; }
export function buildMerkleTree(recordHashes: string[]): MerkleTreeResult;
export function generateProof(tree: MerkleTree, leafHash: string): MerkleProof;
export function verifyProof(root: string, proof: MerkleProof, leafHash: string): boolean;

5. Acceptance criteria (per task spec)

#	Criterion	Contract section
1	Uses `merkletreejs` with SHA-256 leaves	§2.2 construction block
2	`buildMerkleTree(hashes)` → `{ root, tree }` — root is hex	§2.2
3	`generateProof(tree, leafHash)` → `MerkleProof` (`{position, data}[]`)	§2.3
4	`verifyProof(root, proof, leafHash)` → boolean	§2.4
5	Empty tree: `root === EMPTY_TREE_ROOT === sha256('')`	§2.1 + §2.2
6	10-leaf determinism; valid proof verifies; invalid rejected; empty-tree tested	§2.2 + §2.3 + §2.4

6. Out of scope (deferred to future tasks)

Serialization of MerkleTree instances or proofs to JSON/DB (P0.8.2/P0.8.3)
The merkle_finalize MCP tool (P0.8.3)
Retention policy for old proofs (P0.8.2)
Anchoring ζ trail records to the Merkle root (P0.7.3 / P0.8.3)

Contract: P0.8.1 η Merkle Tree Construction

1. Purpose

2. Type signatures

2.1 EMPTY_TREE_ROOT

2.2 buildMerkleTree

2.3 generateProof

2.4 verifyProof

3. Module-level purity guarantees

4. Export manifest

5. Acceptance criteria (per task spec)

6. Out of scope (deferred to future tasks)

2.1 `EMPTY_TREE_ROOT`

2.2 `buildMerkleTree`

2.3 `generateProof`

2.4 `verifyProof`