P1.5.1 — Version Hash Computation — Behavioral Contract

Step 2 of the 5-step chain (CLAUDE.md §6).

§1. Module location

src/domains/rules/versioning.ts — pure module: no I/O (other than the SHA-256 driver, which is local to node:crypto’s Hash object — no filesystem, no network, no DB), no console, no env reads, no async, no clock. Safe to import at any layer of the κ runtime.

§2. Public surface (locked)

// Constants
export const ENGINE_VERSION: string;
export const VERSION_HASH_PREFIX: 'sha256:';
export const VERSION_HASH_HEX_LENGTH: 64;
export const VERSION_HASH_TOTAL_LENGTH: 71;  // 7 ('sha256:') + 64

// Errors
export class VersionHashError extends Error;

// Core
export function computeVersionHash(
  ruleset: readonly RuleNode[],
  engine_version?: string,
): string;

// Verification
export function verifyRuleVersion(
  expected: string,
  actual: string,
): boolean;

// Internal helpers (exported only for tests)
export function stripLocations(value: unknown): unknown;
export function canonicalizeRuleset(
  ruleset: readonly RuleNode[],
): string;

engine_version defaults to ENGINE_VERSION. The optional argument exists to support migration testing (Fixture 3): a caller supplies a different version string and observes a different hash.

§3. Invariants

I1 — Determinism

computeVersionHash(rs, v) is a pure function. For any given (rs, v), every call returns the same string. No host-realm dependency, no platform dependency, no Node-version dependency (≥ Node 20 for node:crypto named exports).

I2 — Order independence

For any permutation rs' of rs (the same rules, possibly in a different declaration order), computeVersionHash(rs, v) === computeVersionHash(rs', v). Achieved by sorting a copy of the input array by rule.name (ASCII codepoint order) before serialization; the input array itself is never reordered (I8).

I3 — Location independence

The location: Location field on every AST node (parser.ts:94-99) is purely positional. Two rulesets with byte-identical rule bodies but differing source line/column numbers must produce the SAME hash. Achieved by recursively stripping the location key from every plain object in the value graph before passing to canonicalize.

I4 — Engine version sensitivity

For any ruleset rs, computeVersionHash(rs, v1) !== computeVersionHash(rs, v2) whenever v1 !== v2, up to the collision resistance of SHA-256. Achieved by feeding the engine version into the hash input AFTER the canonical ruleset bytes and the '||' separator (§7).

I5 — Output format

Returns a string matching the literal pattern ^sha256:[0-9a-f]{64}$. Hex is lowercase. Always exactly 71 characters total. The sha256: prefix is mandatory and not configurable — the caller must use string-prefix matching to detect the algorithm, not parse a configurable separator.
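
A caller-side check for the I5 format could look like the sketch below. The helper name isVersionHash is hypothetical and not part of the locked surface in §2; the constants are redeclared locally so the snippet stands alone.

```typescript
// Hypothetical helper, NOT part of the locked public surface in §2.
// The constants mirror the values declared there.
const VERSION_HASH_PREFIX = 'sha256:';
const VERSION_HASH_HEX_LENGTH = 64;
const VERSION_HASH_TOTAL_LENGTH =
  VERSION_HASH_PREFIX.length + VERSION_HASH_HEX_LENGTH; // 7 + 64 = 71

// Same literal pattern as I5: fixed prefix, then exactly 64 lowercase hex digits.
const VERSION_HASH_PATTERN = /^sha256:[0-9a-f]{64}$/;

function isVersionHash(value: string): boolean {
  return (
    value.length === VERSION_HASH_TOTAL_LENGTH &&
    VERSION_HASH_PATTERN.test(value)
  );
}
```

The length check is redundant with the pattern; it is kept only to make the 71-character rule explicit.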

I6 — Algorithm

SHA-256 only. No MD5, no SHA-1, no SHA-3, no BLAKE. Uses crypto.createHash('sha256') from node:crypto (imported as a named import to satisfy the corpus self-scan).

I7 — Constant-time compare

verifyRuleVersion(expected, actual) runs in time independent of WHERE in the strings the first byte differs. Implementation:

  1. Seed an accumulator with expected.length XOR actual.length, so any length mismatch forces a false result without an early return.
  2. Scan max(expected.length, actual.length) positions, XOR the two char codes at each position, and OR the result into the accumulator. Return accumulator === 0.

The intent: an attacker who can measure the function’s wall time learns at most that the lengths differ, and cannot recover the version byte by byte.

I8 — Pure on inputs

The function MUST NOT mutate ruleset, MUST NOT mutate any node in the array, MUST NOT mutate the optional engine_version parameter. stripLocations returns a fresh structure; the original AST is not touched.

I9 — Cycle detection in stripLocations

Parser AST never contains cycles by construction. However, a hostile input could carry a cycle. stripLocations walks the structure BEFORE canonicalize sees it; without local cycle detection a cyclic input would stack-overflow inside stripLocations before the canonicalizer’s own cycle guard could fire. So stripLocations carries an identity-based seen: Set<object> along the descent path and throws CanonicalSerializationError on cycle (re-used from P1.5.4 for type-uniformity at the catch site).
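
The walker described in I9 can be sketched as follows. This is a minimal illustration, not the module's actual code: the CanonicalSerializationError class is a local stand-in (its real constructor lives in P1.5.4), and the extra seen / path parameters and the '$'-rooted path format are assumptions.

```typescript
// Local stand-in for the P1.5.4 error class; the real signature is assumed.
class CanonicalSerializationError extends Error {
  readonly path: string;
  constructor(message: string, path: string) {
    super(message);
    this.name = 'CanonicalSerializationError';
    this.path = path;
  }
}

function isPlainObject(value: object): boolean {
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// `seen` holds only the objects on the current descent path.
function stripLocations(
  value: unknown,
  seen: Set<object> = new Set(),
  path: string = '$',
): unknown {
  // Atoms pass through unchanged.
  if (typeof value !== 'object' || value === null) return value;
  const isArray = Array.isArray(value);
  // Non-plain objects (Map, Date, ...) are returned as-is.
  if (!isArray && !isPlainObject(value)) return value;

  if (seen.has(value)) {
    throw new CanonicalSerializationError('reference cycle detected', path);
  }
  seen.add(value);
  try {
    if (isArray) {
      return (value as unknown[]).map((item, i) =>
        stripLocations(item, seen, `${path}[${i}]`),
      );
    }
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      if (key === 'location') continue; // drop positional metadata
      out[key] = stripLocations(child, seen, `${path}.${key}`);
    }
    return out;
  } finally {
    seen.delete(value); // leaving the path: shared, non-cyclic references stay legal
  }
}
```

Removing each object from seen on the way back up is what makes the guard path-local: the same node may appear twice in a DAG without tripping the check, while a true back-edge still throws.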

I10 — Empty ruleset

computeVersionHash([], v) returns a valid hash: the digest of the canonical body '[]', followed by the literal '||' separator, followed by v (§7). It does not throw.

§4. Error model

VersionHashError is thrown only for input shape violations (the first three rows below); the last two rows surface as CanonicalSerializationError instead:

| Trigger | Message |
| --- | --- |
| ruleset is not an array | ruleset must be an array of RuleNode |
| engine_version is empty string | engine_version must be a non-empty string |
| engine_version is null or a non-string non-undefined value | engine_version must be a non-empty string |
| canonicalize throws (e.g. undefined / non-plain object) | Re-thrown as CanonicalSerializationError (NOT wrapped) |
| stripLocations finds a cycle | CanonicalSerializationError('reference cycle detected', path) (raised LOCALLY in stripLocations, not by canonicalize) |

We do NOT validate that every element of ruleset is a RuleNode; that is the job of the parser and validator upstream. We trust the type signature.

engine_version === undefined is not an error — it triggers the JS default-parameter mechanism and resolves to ENGINE_VERSION. This is the same as not passing the argument.
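
The input-shape checks can be sketched as below. The helper name validateInputs and the placeholder value of ENGINE_VERSION are assumptions made for illustration; the two error messages are the ones listed in this section.

```typescript
// Placeholder value: the real constant is defined in versioning.ts.
const ENGINE_VERSION = '0.0.0-example';

class VersionHashError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'VersionHashError';
  }
}

// Hypothetical helper: validates both inputs and returns the resolved version.
function validateInputs(
  ruleset: unknown,
  engine_version: unknown = ENGINE_VERSION,
): string {
  if (!Array.isArray(ruleset)) {
    throw new VersionHashError('ruleset must be an array of RuleNode');
  }
  // `undefined` never reaches this check: the default parameter has already
  // replaced it. `null` does reach it, because defaults ignore null.
  if (typeof engine_version !== 'string' || engine_version.length === 0) {
    throw new VersionHashError('engine_version must be a non-empty string');
  }
  return engine_version;
}
```

Passing undefined explicitly and omitting the argument both resolve to ENGINE_VERSION, whereas null falls through to the type check and is rejected.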

§5. Detail: stripLocations

Recursive walker over unknown:

  • Atoms (number, bigint, boolean, string, null, undefined) → returned unchanged
  • Array → return new array, each element recursed
  • Plain object (proto === Object.prototype OR null) → return new object copying every key EXCEPT location, recursing on values
  • Non-plain object (Map, Date, etc.) → returned unchanged (canonicalize will reject downstream)

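Cycle detection IS performed here, per I9. Parser AST is tree-shaped by construction, but a cyclic graph passed by a hostile caller would otherwise overflow the stack inside this walker before canonicalize could reject it. The walker tracks the objects on the current descent path and throws CanonicalSerializationError when it revisits one.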

§6. Detail: canonicalizeRuleset

canonicalizeRuleset(rs) =
  let stripped = stripLocations(rs)               // remove `location` key
  let sorted = [...stripped].sort(byRuleName)     // immutable sort
  return canonicalize(sorted)                     // P1.5.4 byte-canonical JSON

The sort runs on the stripped copy so we don’t accidentally mutate the original. byRuleName is the same ASCII codepoint comparator used by engine.ts:480 (asciiCompareByName), reproduced inline here to avoid a cross-cutting import.
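
A comparator with the properties described above might look like this sketch. The Named interface and the sortedByName helper are hypothetical names; only byRuleName appears in the contract.

```typescript
// Minimal shape needed for sorting; the real RuleNode carries more fields.
interface Named {
  readonly name: string;
}

// Relational operators on strings compare UTF-16 code units, which matches
// ASCII codepoint order for ASCII rule names. No locale is consulted.
function byRuleName(a: Named, b: Named): number {
  if (a.name < b.name) return -1;
  if (a.name > b.name) return 1;
  return 0;
}

// Hypothetical helper: copy first, then sort, so the caller's array is untouched.
function sortedByName<T extends Named>(rules: readonly T[]): T[] {
  return [...rules].sort(byRuleName);
}
```

With this comparator 'Z' sorts before 'a' (codepoint 90 versus 97). A locale-aware comparison can order the two differently depending on the host's ICU data, which is why §10 forbids localeCompare for the rule sort.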

§7. Detail: computeVersionHash

computeVersionHash(rs, v = ENGINE_VERSION) =
  validate(rs is array, v is non-empty)
  let body = canonicalizeRuleset(rs)
  let hash = createHash('sha256')
  hash.update(body, 'utf8')
  hash.update('||', 'utf8')                       // separator marker
  hash.update(v, 'utf8')
  return 'sha256:' + hash.digest('hex')

The || separator is documented in rule-engine.md §Rule versioning (“canonical_serialization(all_rules) || engine_version”). It is the literal '||': two ASCII vertical-bar characters. Its purpose is to keep the boundary between body and version unambiguous, so that moving bytes from one side of the boundary to the other changes the hash input. Note that the separator alone would not disambiguate an engine_version that itself contains '||'; the guarantee also relies on the canonical body being a complete JSON array, which always ends in ']'.
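
The hashing steps of §7 can be sketched in isolation as follows. The function name hashCanonicalBody is hypothetical; it takes an already-canonicalized body string so that the example does not need a stand-in for canonicalize.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper covering only the digest steps of §7.
// `body` is assumed to be the output of canonicalizeRuleset.
function hashCanonicalBody(body: string, engineVersion: string): string {
  const hash = createHash('sha256');
  hash.update(body, 'utf8');          // canonical ruleset bytes
  hash.update('||', 'utf8');          // literal two-character separator
  hash.update(engineVersion, 'utf8'); // engine version last
  return 'sha256:' + hash.digest('hex'); // digest('hex') yields lowercase hex
}

// Empty ruleset (I10): the canonical body is '[]'.
const emptyHash = hashCanonicalBody('[]', 'v1');
```

Three incremental update calls produce the same digest as a single update over the concatenated string '[]||v1', so the split is purely for readability.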

§8. Detail: verifyRuleVersion

verifyRuleVersion(expected, actual) =
  let expLen = expected.length
  let actLen = actual.length
  let scanLen = max(expLen, actLen)
  let acc = (expLen ^ actLen)        // 0 only if same length
  for i in 0..scanLen:
    let e = expected.charCodeAt(i % expLen)   // safe — at least 1 char
    let a = actual.charCodeAt(i % actLen)
    acc |= (e ^ a)
  return acc === 0

For empty inputs (length 0 on either side), we explicitly short-circuit and return expected === actual, since i % 0 is NaN. Empty inputs are not realistic, but we handle them defensively.

The % length trick keeps the loop bound at max(expLen, actLen) (not the smaller of the two) — so a length mismatch still scans the longer side, leaking only the length, not the position. This satisfies I7 in the strict sense documented above.
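
A TypeScript rendering of the pseudocode above, including the empty-input short-circuit, might look like the sketch below. It avoids Math.max in line with §10. JavaScript engines give no formal constant-time guarantee, so treat this as an approximation of I7 rather than a proof of it.

```typescript
function verifyRuleVersion(expected: string, actual: string): boolean {
  const expLen = expected.length;
  const actLen = actual.length;

  // Defensive short-circuit: `i % 0` is NaN, so the modulo scan below
  // needs at least one character on each side.
  if (expLen === 0 || actLen === 0) return expected === actual;

  // Always scan the longer side (ternary instead of Math.max, see §10).
  const scanLen = expLen > actLen ? expLen : actLen;

  let acc = expLen ^ actLen; // non-zero exactly when the lengths differ
  for (let i = 0; i < scanLen; i++) {
    // Wrap the shorter string with modulo so every iteration does the same work.
    acc |= expected.charCodeAt(i % expLen) ^ actual.charCodeAt(i % actLen);
  }
  return acc === 0;
}
```

Seeding the accumulator with the length XOR matters for inputs such as 'ab' versus 'abab': the wrapped characters all match, and only the length term makes the result false.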

§9. Acceptance traceback

| ID | Statement | Source |
| --- | --- | --- |
| AC1 | computeVersionHash(ruleset, engine_version): string returns hex SHA-256 | task-breakdown.md §P1.5.1 |
| AC2 | Input fed to hash: canonical_serialization(all_rules) \|\| engine_version | task-breakdown.md §P1.5.1 |
| AC3 | Output format: sha256:<hex> | task-prompt §P1.5.1 |
| AC4 | Two logically-equivalent but differently-ordered rulesets produce identical hash | task-prompt §P1.5.1 fixture 1 |
| AC5 | Adding one character to a rule body changes the hash | task-prompt §P1.5.1 fixture 2 |
| AC6 | engine_version change with same ruleset → different hash | task-prompt §P1.5.1 fixture 3 |
| AC7 | verifyRuleVersion constant-time | task-prompt §P1.5.1 fixture 4 |
| AC8 | SHA-256 only; no MD5 / SHA-1 | task-prompt forbiddens |
| AC9 | Pass corpus self-scan (no crypto.* token after stripping comments) | determinism.test.ts §Group 12 |
| AC10 | npm run build && npm run lint && npm test all green | CLAUDE.md §5 |

Fixture 5 from the task prompt (“patched registry.computeVersionHash returns the SHA-256 form”) is OUT OF SCOPE for this slice. The registry is the P1.2.4 sibling; it imports our computeVersionHash directly.

§10. Forbidden actions

  • ✗ Use Math.*, Date.*, timers, network, filesystem
  • ✗ Reference crypto.<anything> in source (use named import { createHash })
  • ✗ Use JSON.stringify for canonical encoding (use canonicalize from P1.5.4)
  • ✗ Use localeCompare / Intl.Collator for the rule sort
  • ✗ Edit canonical.ts, registry.ts, or any file outside src/domains/rules/versioning.ts and its tests
  • ✗ Skip the || separator in the hash input
  • ✗ Drop the sha256: prefix
  • ✗ Use any hash other than SHA-256
  • ✗ Add cycle detection beyond the path-local guard that I9 requires in stripLocations (canonicalize handles the rest)
  • ✗ Pre-validate that every element is RuleNode (validator’s job)

§11. Test plan summary (full plan in packet §3)

10 test groups planned:

  • G1 — output format (length, prefix, hex)
  • G2 — order independence (Fixture 1)
  • G3 — content sensitivity (Fixture 2)
  • G4 — engine version sensitivity (Fixture 3)
  • G5 — location independence
  • G6 — empty ruleset handling
  • G7 — constant-time compare correctness (Fixture 4)
  • G8 — verifyRuleVersion length-mismatch / equal / unequal cases
  • G9 — error model (input shape, canonicalize rethrow)
  • G10 — stripLocations / canonicalizeRuleset helpers

Coverage target: ≥ 95% lines, ≥ 90% branches on versioning.ts.




Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
