P1.5.1 — Version Hash Computation — Behavioral Contract
Step 2 of the 5-step chain (CLAUDE.md §6).
§1. Module location
src/domains/rules/versioning.ts — pure module: no I/O (other than the SHA-256 driver, which is local to node:crypto’s Hash object — no filesystem, no network, no DB), no console, no env reads, no async, no clock. Safe to import at any layer of the κ runtime.
§2. Public surface (locked)
// Constants
export const ENGINE_VERSION: string;
export const VERSION_HASH_PREFIX: 'sha256:';
export const VERSION_HASH_HEX_LENGTH: 64;
export const VERSION_HASH_TOTAL_LENGTH: 71; // 7 ('sha256:') + 64
// Errors
export class VersionHashError extends Error;
// Core
export function computeVersionHash(
ruleset: readonly RuleNode[],
engine_version?: string,
): string;
// Verification
export function verifyRuleVersion(
expected: string,
actual: string,
): boolean;
// Internal helpers (exported only for tests)
export function stripLocations(value: unknown): unknown;
export function canonicalizeRuleset(
ruleset: readonly RuleNode[],
): string;
engine_version defaults to ENGINE_VERSION. The optional argument exists to support migration testing (Fixture 3): caller supplies a different version string and observes a different hash.
§3. Invariants
I1 — Determinism
computeVersionHash(rs, v) is a pure function. For any given (rs, v), every call returns the same string. The result does not depend on the host realm, the platform, or the Node version within the supported range (≥ Node 20 for node:crypto named exports).
I2 — Order independence
For any permutation rs' of rs (where rs' and rs contain the same rule set, possibly in different declaration order), computeVersionHash(rs, v) === computeVersionHash(rs', v). Achieved by sorting a copy of the input array by rule.name (ASCII codepoint) before serialization; the caller's array is never reordered (see I8).
I3 — Location independence
The location: Location field on every AST node (parser.ts:94-99) is purely positional. Two rulesets with byte-identical rule bodies but differing source line/column numbers must produce the SAME hash. Achieved by recursively stripping the location key from every plain object in the value graph before passing to canonicalize.
I4 — Engine version sensitivity
For any ruleset rs, computeVersionHash(rs, v1) !== computeVersionHash(rs, v2) whenever v1 !== v2 (up to SHA-256 collision resistance). Achieved by feeding the engine version into the hash input AFTER the canonical ruleset bytes and the '||' separator.
I5 — Output format
Returns a string matching the literal pattern ^sha256:[0-9a-f]{64}$. Hex is lowercase. Always exactly 71 characters total. The sha256: prefix is mandatory and not configurable — the caller must use string-prefix matching to detect the algorithm, not parse a configurable separator.
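The format rule in I5 can be checked mechanically. The following is a minimal sketch; isVersionHash and VERSION_HASH_PATTERN are hypothetical names used for illustration and are not part of the locked surface in §2.

```typescript
// Hypothetical helper, not part of the locked public surface.
const VERSION_HASH_PATTERN = /^sha256:[0-9a-f]{64}$/;

function isVersionHash(value: string): boolean {
  // 7 prefix characters ('sha256:') plus 64 lowercase hex characters = 71.
  return value.length === 71 && VERSION_HASH_PATTERN.test(value);
}
```

A caller that only needs to detect the algorithm can use value.startsWith('sha256:'), which is the string-prefix matching that I5 asks for.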
I6 — Algorithm
SHA-256 only. No MD5, no SHA-1, no SHA-3, no BLAKE. Uses crypto.createHash('sha256') from node:crypto (imported as a named import to satisfy the corpus self-scan).
I7 — Constant-time compare
verifyRuleVersion(expected, actual) runs in time independent of WHERE in the strings the first byte differs. Implementation:
- If expected.length !== actual.length, scan the longer string AND compare to a zero baseline (so timing leaks at most the existence of a length mismatch, not the position of byte divergence). Return false.
- Else, fold-XOR every byte and OR-accumulate into a single number. Return accumulator === 0.
This guarantees: an attacker who can measure the function’s wall time cannot learn the version byte-by-byte.
I8 — Pure on inputs
The function MUST NOT mutate ruleset, MUST NOT mutate any node in the array, MUST NOT mutate the optional engine_version parameter. stripLocations returns a fresh structure; the original AST is not touched.
I9 — Cycle detection in stripLocations
Parser AST never contains cycles by construction. However, a hostile input could carry a cycle. stripLocations walks the structure BEFORE canonicalize sees it; without local cycle detection a cyclic input would stack-overflow inside stripLocations before the canonicalizer’s own cycle guard could fire. So stripLocations carries an identity-based seen: Set<object> along the descent path and throws CanonicalSerializationError on cycle (re-used from P1.5.4 for type-uniformity at the catch site).
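The descent-path guard described in I9 can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code: the CanonicalSerializationError class below is a local stand-in for the P1.5.4 class (its real constructor may differ), isPlainObject is a hypothetical helper, and the extra seen and path parameters are one possible way to thread state through the recursion.

```typescript
// Stand-in for the P1.5.4 error class; the real constructor may differ.
class CanonicalSerializationError extends Error {
  readonly path: string;
  constructor(message: string, path: string) {
    super(message);
    this.name = 'CanonicalSerializationError';
    this.path = path;
  }
}

// Hypothetical helper: plain means prototype is Object.prototype or null.
function isPlainObject(value: object): boolean {
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

function stripLocations(
  value: unknown,
  seen: Set<object> = new Set(),
  path: string = '$',
): unknown {
  // Atoms pass through unchanged.
  if (typeof value !== 'object' || value === null) return value;
  // Non-plain objects (Map, Date, ...) pass through; canonicalize rejects them later.
  if (!Array.isArray(value) && !isPlainObject(value)) return value;
  if (seen.has(value)) {
    throw new CanonicalSerializationError('reference cycle detected', path);
  }
  seen.add(value); // tracked only while this node is on the descent path
  try {
    if (Array.isArray(value)) {
      return value.map((item, i) => stripLocations(item, seen, `${path}[${i}]`));
    }
    // Null-prototype target so that no key can hit an inherited setter.
    const out: Record<string, unknown> = Object.create(null);
    for (const [key, child] of Object.entries(value)) {
      if (key === 'location') continue; // positional data never reaches the hash
      out[key] = stripLocations(child, seen, `${path}.${key}`);
    }
    return out;
  } finally {
    seen.delete(value); // a node referenced twice without a cycle stays legal
  }
}
```

Removing each node from seen on the way back up is what makes the guard path-based: a node that appears twice in the tree is accepted, while a node that contains itself is rejected. The input is only read, so the sketch is also consistent with I8.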
I10 — Empty ruleset
computeVersionHash([], v) returns a valid hash: the digest of the canonical body '[]', followed by the literal '||' separator, followed by v (see §7). It does not throw.
§4. Error model
VersionHashError is thrown only for input shape violations:
| Trigger | Message |
|---|---|
| ruleset is not an array | ruleset must be an array of RuleNode |
| engine_version is empty string | engine_version must be a non-empty string |
| engine_version is null or a non-string non-undefined value | engine_version must be a non-empty string |
| canonicalize throws (e.g. undefined / non-plain object) | Re-thrown as CanonicalSerializationError (NOT wrapped) |
| stripLocations finds a cycle | CanonicalSerializationError('reference cycle detected', path) (raised LOCALLY in stripLocations, not by canonicalize) |
We do NOT validate that every element of ruleset is a RuleNode — that’s the parser’s + validator’s job upstream. We trust the type signature.
engine_version === undefined is not an error — it triggers the JS default-parameter mechanism and resolves to ENGINE_VERSION. This is the same as not passing the argument.
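The input-shape checks and the default-parameter behaviour can be sketched in isolation. In this sketch resolveEngineVersion is a hypothetical helper, the ENGINE_VERSION value is a placeholder (the contract does not state the real one), and the VersionHashError body is an assumption beyond "extends Error".

```typescript
class VersionHashError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'VersionHashError';
  }
}

// Placeholder value; the real constant is not given in this contract.
const ENGINE_VERSION = '0.0.0-example';

// Hypothetical helper showing only the shape checks from the table above.
function resolveEngineVersion(
  ruleset: unknown,
  engine_version: unknown = ENGINE_VERSION, // undefined triggers the default
): string {
  if (!Array.isArray(ruleset)) {
    throw new VersionHashError('ruleset must be an array of RuleNode');
  }
  // null does NOT trigger the default, so it lands here and is rejected.
  if (typeof engine_version !== 'string' || engine_version.length === 0) {
    throw new VersionHashError('engine_version must be a non-empty string');
  }
  return engine_version;
}
```

Passing undefined explicitly and omitting the argument both resolve to ENGINE_VERSION, while null, numbers, and the empty string are rejected with the same message.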
§5. Detail: stripLocations
Recursive walker over unknown:
- Atoms (number, bigint, boolean, string, null, undefined) → returned unchanged
- Array → return new array, each element recursed
- Plain object (proto === Object.prototype OR null) → return new object copying every key EXCEPT location, recursing on values
- Non-plain object (Map, Date, etc.) → returned unchanged (canonicalize will reject downstream)
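Cycle detection IS needed here (see I9). Parser AST is tree-shaped by construction, but stripLocations runs before canonicalize, so a cyclic input would overflow the stack in this walker before canonicalize's own cycle guard could reject it. On a cycle, stripLocations throws CanonicalSerializationError('reference cycle detected', path), as listed in §4.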
§6. Detail: canonicalizeRuleset
canonicalizeRuleset(rs) =
let stripped = stripLocations(rs) // remove `location` key
let sorted = [...stripped].sort(byRuleName) // immutable sort
return canonicalize(sorted) // P1.5.4 byte-canonical JSON
The sort runs on the stripped copy so we don’t accidentally mutate the original. byRuleName is the same ASCII codepoint comparator used by engine.ts:480 (asciiCompareByName), reproduced inline here to avoid a cross-cutting import.
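The comparator and the non-mutating sort can be sketched as below. NamedRule and sortedCopy are hypothetical names for illustration, and the comparator is an assumption about what asciiCompareByName does, based only on the description "ASCII codepoint comparator".

```typescript
// Hypothetical minimal shape: only the field the sort needs.
interface NamedRule {
  readonly name: string;
}

// Relational operators on strings compare UTF-16 code units, which matches
// ASCII codepoint order for ASCII names. No locale is involved.
function byRuleName(a: NamedRule, b: NamedRule): number {
  if (a.name < b.name) return -1;
  if (a.name > b.name) return 1;
  return 0;
}

// Sorts a shallow copy, so the caller's array keeps its declaration order.
function sortedCopy<T extends NamedRule>(rules: readonly T[]): T[] {
  return [...rules].sort(byRuleName);
}
```

With this ordering uppercase names sort before lowercase ones ('B' before 'a'), which is one way codepoint order differs from localeCompare. Any permutation of the same rules yields the same sorted sequence, which is what I2 relies on.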
§7. Detail: computeVersionHash
computeVersionHash(rs, v = ENGINE_VERSION) =
validate(rs is array, v is non-empty)
let body = canonicalizeRuleset(rs)
let hash = createHash('sha256')
hash.update(body, 'utf8')
hash.update('||', 'utf8') // separator marker
hash.update(v, 'utf8')
return 'sha256:' + hash.digest('hex')
The || separator is documented in rule-engine.md §Rule versioning (“canonical_serialization(all_rules) || engine_version”). It’s literal '||': two ASCII vertical-bar characters. The separator is always present at the boundary between body and version, and the canonical body is a single JSON array that always ends in ']', so the point where the body stops and the version begins is unambiguous. A crafted ruleset therefore cannot shift bytes across the boundary to make body + '||' + v1 feed the same bytes to the hash as a different body + '||' + v2.
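The hashing step can be sketched on its own. To stay self-contained, this sketch takes the already-canonical body as a string instead of calling canonicalizeRuleset; hashCanonicalBody is a hypothetical name, not part of the public surface.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: in the contract, `body` is the output of
// canonicalizeRuleset and `engineVersion` has already been validated.
function hashCanonicalBody(body: string, engineVersion: string): string {
  const hash = createHash('sha256');
  hash.update(body, 'utf8');
  hash.update('||', 'utf8'); // literal two-character separator
  hash.update(engineVersion, 'utf8');
  return 'sha256:' + hash.digest('hex'); // digest('hex') is lowercase
}
```

For an empty ruleset the body is '[]', so hashCanonicalBody('[]', v) corresponds to the I10 case. Successive update calls are equivalent to hashing the concatenated input, so the three calls hash the same bytes as a single update with body + '||' + engineVersion.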
§8. Detail: verifyRuleVersion
verifyRuleVersion(expected, actual) =
let expLen = expected.length
let actLen = actual.length
if expLen === 0 or actLen === 0: return expected === actual // i % 0 is NaN
let scanLen = max(expLen, actLen)
let acc = (expLen ^ actLen) // 0 only if same length
for i in 0..scanLen: // upper bound exclusive
  let e = expected.charCodeAt(i % expLen) // safe: at least 1 char
  let a = actual.charCodeAt(i % actLen)
  acc |= (e ^ a)
return acc === 0
For empty inputs (length 0), we explicitly short-circuit expected === actual since i % 0 is NaN. Empty inputs are not realistic but we handle them defensively.
The % length trick keeps the loop bound at max(expLen, actLen) (not the smaller of the two) — so a length mismatch still scans the longer side, leaking only the length, not the position. This satisfies I7 in the strict sense documented above.
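The pseudocode above translates to TypeScript roughly as follows. This is a sketch rather than the shipped implementation; it uses a ternary in place of max so that it stays within the §10 restriction on Math.*.

```typescript
function verifyRuleVersion(expected: string, actual: string): boolean {
  const expLen = expected.length;
  const actLen = actual.length;
  // i % 0 is NaN, so empty inputs are settled before the loop.
  if (expLen === 0 || actLen === 0) return expected === actual;
  const scanLen = expLen > actLen ? expLen : actLen;
  // Seeding with the length XOR makes any length mismatch fail, even when
  // the modulo wrap-around happens to line up identical characters.
  let acc = expLen ^ actLen;
  for (let i = 0; i < scanLen; i++) {
    // No early exit: every position is visited regardless of where a
    // difference occurs.
    acc |= expected.charCodeAt(i % expLen) ^ actual.charCodeAt(i % actLen);
  }
  return acc === 0;
}
```

The length seed matters for inputs such as 'abab' versus 'ab': the modulo indexing compares equal characters at every position, and only the non-zero expLen ^ actLen keeps the result false.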
§9. Acceptance traceback
| ID | Statement | Source |
|---|---|---|
| AC1 | computeVersionHash(ruleset, engine_version): string returns hex SHA-256 | task-breakdown.md §P1.5.1 |
| AC2 | Input fed to hash: canonical_serialization(all_rules) || engine_version | task-breakdown.md §P1.5.1 |
| AC3 | Output format: sha256:<hex> | task-prompt §P1.5.1 |
| AC4 | Two logically-equivalent but differently-ordered rulesets produce identical hash | task-prompt §P1.5.1 fixture 1 |
| AC5 | Adding one character to a rule body changes the hash | task-prompt §P1.5.1 fixture 2 |
| AC6 | engine_version change with same ruleset → different hash | task-prompt §P1.5.1 fixture 3 |
| AC7 | verifyRuleVersion constant-time | task-prompt §P1.5.1 fixture 4 |
| AC8 | SHA-256 only — no MD5 / SHA-1 | task-prompt forbiddens |
| AC9 | Pass corpus self-scan (no crypto.* token after stripping comments) | determinism.test.ts §Group 12 |
| AC10 | npm run build && npm run lint && npm test all green | CLAUDE.md §5 |
Fixture 5 from the task prompt (“patched registry.computeVersionHash returns the SHA-256 form”) is OUT OF SCOPE for this slice. Registry is P1.2.4 sibling — it imports our computeVersionHash directly.
§10. Forbidden actions
- ✗ Use Math.*, Date.*, timers, network, filesystem
- ✗ Reference crypto.<anything> in source (use named import { createHash })
- ✗ Use JSON.stringify for canonical encoding (use canonicalize from P1.5.4)
- ✗ Use localeCompare / Intl.Collator for the rule sort
- ✗ Edit canonical.ts, registry.ts, or any file outside src/domains/rules/versioning.ts and its tests
- ✗ Skip the || separator in the hash input
- ✗ Drop the sha256: prefix
- ✗ Use any hash other than SHA-256
- ✗ Add cycle detection beyond the descent-path guard in stripLocations that I9 requires (canonicalize handles cycles in its own input)
- ✗ Pre-validate that every element is RuleNode (validator’s job)
§11. Test plan summary (full plan in packet §3)
10 test groups planned:
- G1 — output format (length, prefix, hex)
- G2 — order independence (Fixture 1)
- G3 — content sensitivity (Fixture 2)
- G4 — engine version sensitivity (Fixture 3)
- G5 — location independence
- G6 — empty ruleset handling
- G7 — constant-time compare correctness (Fixture 4)
- G8 — verifyRuleVersion length-mismatch / equal / unequal cases
- G9 — error model (input shape, canonicalize rethrow)
- G10 — stripLocations / canonicalizeRuleset helpers
Coverage target: ≥ 95% lines, ≥ 90% branches on versioning.ts.