Contract: P0.8.1 η Merkle Tree Construction

Task: P0.8.1 — first η (Proof Store) surface Branch: feature/p0-8-1-merkle-tree Date: 2026-04-17 Status: APPROVED (self-gated per packet gate rule)


1. Purpose

Define the behavioral contract for the three public exports of src/domains/proof/merkle.ts:

  1. buildMerkleTree(recordHashes) — construct a deterministic Merkle tree
  2. generateProof(tree, leafHash) — produce a membership proof for a leaf
  3. verifyProof(root, proof, leafHash) — verify a membership proof
  4. EMPTY_TREE_ROOT — constant for the empty-tree case

These are pure functions. No side effects. No module-level state. No DB access.


2. Type signatures

import { MerkleTree } from 'merkletreejs';

/** Exported type alias for merkletreejs proof element shape. */
export type MerkleProofNode = { position: 'left' | 'right'; data: Buffer };

/** Full proof: ordered array of sibling nodes along the path to root. */
export type MerkleProof = MerkleProofNode[];

/** Return type of buildMerkleTree. */
export interface MerkleTreeResult {
  root: string;    // 64-char lowercase hex SHA-256
  tree: MerkleTree;
}

2.1 EMPTY_TREE_ROOT

export const EMPTY_TREE_ROOT: string;

Value: 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855' (SHA-256 of the empty string '', lowercase hex, 64 characters)

Rationale: When a caller provides zero hashes, the tree’s root is defined as sha256('') rather than merkletreejs’s default (empty Buffer). This gives callers a deterministic, non-empty 64-char string that is consistent with the SHA-256 convention used throughout Colibri (ζ uses 64-char hex hashes). The choice of sha256('') is documented here as the canonical empty-tree sentinel.


2.2 buildMerkleTree

export function buildMerkleTree(recordHashes: string[]): MerkleTreeResult

Pre-conditions:

  • recordHashes is an array of strings. Each element SHOULD be a 64-char lowercase hex string (SHA-256 digest). The function does NOT validate individual leaf format — callers are responsible for passing valid hex. Non-hex strings will produce a tree but generateProof/verifyProof results will be meaningless.

Post-conditions:

  • Returns { root, tree }.
  • If recordHashes.length === 0: root === EMPTY_TREE_ROOT.
  • If recordHashes.length > 0:
    • root is a 64-char lowercase hex string.
    • tree is a MerkleTree instance constructed with sortPairs: true.
    • Calling buildMerkleTree twice with the same set of hashes (regardless of input array order) produces identical root values — this is the determinism guarantee. Guaranteed by merkletreejs’s sortPairs: true option.
  • The tree object is NOT cached. Each call produces a fresh MerkleTree instance.

Throws: Does not throw under normal operation. If Buffer.from(h, 'hex') fails (malformed input), Node.js will not throw — it will produce a Buffer from whatever bytes it can parse, which may cause an incorrect but non-crashing result. Callers must validate inputs before calling.

Internal construction:

const sha256 = (data: Buffer): Buffer =>
  createHash('sha256').update(data).digest();

// Both options required for full order-independence:
// sortLeaves: true — sorts leaf Buffers before construction
// sortPairs: true  — sorts siblings before each internal hash
const TREE_OPTIONS = { sortPairs: true, sortLeaves: true };

const leaves = recordHashes.map(h => Buffer.from(h, 'hex'));
const tree = new MerkleTree(leaves, sha256, TREE_OPTIONS);
const root = recordHashes.length === 0
  ? EMPTY_TREE_ROOT
  : tree.getRoot().toString('hex');

Note on sortPairs vs sortLeaves: sortPairs: true alone sorts sibling pairs at each internal level but does NOT sort the initial leaf order — so two calls with the same hashes in different orders produce different roots unless sortLeaves: true is ALSO set. Both options are required for the determinism guarantee. The same TREE_OPTIONS must be passed to MerkleTree.verify.


2.3 generateProof

export function generateProof(tree: MerkleTree, leafHash: string): MerkleProof

Pre-conditions:

  • tree is a MerkleTree instance from buildMerkleTree (must be non-empty).
  • leafHash is the same 64-char hex string that was passed into buildMerkleTree.

Post-conditions:

  • Returns an array of { position: 'left' | 'right', data: Buffer } nodes — the standard merkletreejs proof shape returned by tree.getProof(leaf).
  • If leafHash is NOT a leaf of the tree, returns an empty array [].
  • The proof is minimal: length === Math.ceil(Math.log2(n)) for a balanced tree with n leaves (approximately).
  • The returned objects are direct references from merkletreejs. Do NOT mutate them.

Proof shape rationale: We use merkletreejs’s native proof shape ({ position, data }) rather than a normalized string shape. This avoids a re-encoding round-trip and keeps the proof usable directly by verifyProof and by merkletreejs’s own tree.verify. Future P0.8.3 (merkle_finalize) may serialize proofs to JSON for storage; that serialization is a P0.8.3 concern.


2.4 verifyProof

export function verifyProof(
  root: string,
  proof: MerkleProof,
  leafHash: string
): boolean

Pre-conditions:

  • root — 64-char lowercase hex string (the root returned by buildMerkleTree).
  • proof — array of MerkleProofNode (from generateProof).
  • leafHash — 64-char lowercase hex string (must match one of the original leaves).

Post-conditions:

  • Returns true if and only if the proof is a valid Merkle path from leafHash to root.
  • Returns false for:
    • Empty proof array with non-matching leaf/root
    • Tampered proof nodes
    • Wrong leafHash / root combination
    • A valid proof from a different tree (different root)
  • Deterministic: same inputs → same output, always.

Implementation:

const leaf = Buffer.from(leafHash, 'hex');
const rootBuf = Buffer.from(root, 'hex');
return MerkleTree.verify(proof, leaf, rootBuf, sha256, { sortPairs: true });

Using the static MerkleTree.verify (or equivalent instance method) is preferred over re-constructing a tree. Must pass the same hash function and options as used in construction to get a correct result.


3. Module-level purity guarantees

  • No module-level state: no cached trees, no singletons.
  • No side effects at import time: EMPTY_TREE_ROOT is computed once at module initialization using a single createHash call, which is a pure operation.
  • No DB access, no env reads, no console output.
  • Every call to buildMerkleTree creates a new MerkleTree instance.

4. Export manifest

// Named exports only — no default export.
export const EMPTY_TREE_ROOT: string;
export type MerkleProofNode = { position: 'left' | 'right'; data: Buffer };
export type MerkleProof = MerkleProofNode[];
export interface MerkleTreeResult { root: string; tree: MerkleTree; }
export function buildMerkleTree(recordHashes: string[]): MerkleTreeResult;
export function generateProof(tree: MerkleTree, leafHash: string): MerkleProof;
export function verifyProof(root: string, proof: MerkleProof, leafHash: string): boolean;

5. Acceptance criteria (per task spec)

# Criterion Contract section
1 Uses merkletreejs with SHA-256 leaves §2.2 construction block
2 buildMerkleTree(hashes){ root, tree } — root is hex §2.2
3 generateProof(tree, leafHash)MerkleProof ({position, data}[]) §2.3
4 verifyProof(root, proof, leafHash) → boolean §2.4
5 Empty tree: root === EMPTY_TREE_ROOT === sha256('') §2.1 + §2.2
6 10-leaf determinism; valid proof verifies; invalid rejected; empty-tree tested §2.2 + §2.3 + §2.4

6. Out of scope (deferred to future tasks)

  • Serialization of MerkleTree instances or proofs to JSON/DB (P0.8.2/P0.8.3)
  • The merkle_finalize MCP tool (P0.8.3)
  • Retention policy for old proofs (P0.8.2)
  • Anchoring ζ trail records to the Merkle root (P0.7.3 / P0.8.3)

Back to top

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.

This site uses Just the Docs, a documentation theme for Jekyll.