P1.5.1 — Version Hash Computation — Audit

Step 1 of the 5-step chain (CLAUDE.md §6).

§1. Surface inventory at base SHA d766db59

| Path | Exists? | Role |
| --- | --- | --- |
| src/domains/rules/versioning.ts | No (to create) | The version-hash module (this task) |
| src/__tests__/domains/rules/versioning.test.ts | No (to create) | Test suite |
| src/domains/rules/canonical.ts | Yes (P1.5.4 #206) | Source of canonicalize(value): string, the required input to the SHA-256 hash |
| src/domains/rules/parser.ts | Yes (P1.2.2 #205) | Source of RuleNode + 11 AST node types |
| src/domains/rules/registry.ts | No (sibling P1.2.4 in flight) | This task does NOT touch it; P1.2.4 will import from versioning.ts |
| src/domains/rules/engine.ts | Yes (P1.3.1 #208) | Reference pattern for asciiCompareByName (we mirror this exact comparator semantically) |
| src/domains/rules/determinism.ts | Yes (R83.A #190) | Defines forbidden-op manifest + self-scan test |
| src/__tests__/domains/rules/determinism.test.ts §Group 12 | Yes | Self-scan that scans every *.ts in src/domains/rules/ (except determinism.ts); versioning.ts MUST pass |

§2. Spec sources

  • docs/3-world/physics/laws/rule-engine.md §Rule versioning — defines rule_version_hash = SHA-256 over canonical(rule bodies) || engine_version and its load-bearing roles in θ consensus signing + ι fork ids.
  • docs/reference/extractions/kappa-rule-engine-extraction.md §1 (EBNF) + §2 (11-node AST) — the AST shape the canonical serializer encodes.
  • docs/guides/implementation/task-prompts/p1.1-kappa-rule-engine.md §P1.5.1 — task prompt, lists 5 test fixtures.
  • docs/guides/implementation/task-breakdown.md §P1.5.1 — acceptance criteria.
  • docs/contracts/p1-5-4-canonical-contract.md — the immediately upstream contract; I rely on its determinism + sort guarantees.

§3. Determinism corpus self-scan constraints

The corpus self-scan at src/__tests__/domains/rules/determinism.test.ts:833-889 will scan versioning.ts post-strip-comments against this fixed manifest:

\bMath\.[A-Za-z_]\w*           — Math.*
\bDate\.[A-Za-z_]\w*           — Date.*
\bnew\s+Date\b                 — new Date
\b(?:setTimeout|...|setImmediate)\b — timer
\b(?:fetch|XMLHttpRequest)\b   — network
\brequire\s*\(\s*['"](?:fs|node:fs)['"]  — require(fs)
\bfrom\s+['"](?:fs|node:fs)['"]          — import fs
\bcrypto\.[A-Za-z_]\w*         — crypto.*    ← LOAD-BEARING for this task
\bprocess\.(?:hrtime|nextTick)\b
\bawait\b
\basync\s+(?:function|\()
(?<![0-9n])\b\d+\.\d+\b        — float literal

3.1 The crypto.* trap

The acceptance criteria require SHA-256, which MUST come from node:crypto. The regex \bcrypto\.[A-Za-z_]\w* is keyed on a literal crypto. token followed by an identifier. The escape hatch: import createHash as a NAMED IMPORT and never write crypto.<anything> in the source.

import { createHash } from 'node:crypto';   // ✓ no crypto.<x> token
const hash = createHash('sha256');           // ✓ no crypto.<x> token
hash.update(...).digest('hex');              // ✓ no crypto.<x> token

The import fs regex is keyed on fs|node:fs, NOT crypto|node:crypto, so the import statement passes. Comments are stripped before scanning, so JSDoc may reference crypto.createHash freely.
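
The difference between the two import styles can be checked directly against the manifest patterns. The sketch below copies the crypto.* and import-fs regexes from the manifest above; the helper name tripsScan and the two sample sources are hypothetical, and the real self-scan applies the full manifest after stripping comments.

```typescript
// Two patterns copied from the manifest; the real scan applies all of them.
const CRYPTO_MEMBER = /\bcrypto\.[A-Za-z_]\w*/;
const FS_IMPORT = /\bfrom\s+['"](?:fs|node:fs)['"]/;

// Hypothetical source text using the named import (the approach chosen here).
const namedImportSource = [
  "import { createHash } from 'node:crypto';",
  "const digest = createHash('sha256').update('abc').digest('hex');",
].join('\n');

// Hypothetical source text using a namespace import (the approach to avoid).
const namespaceSource = [
  "import * as crypto from 'node:crypto';",
  "const digest = crypto.createHash('sha256').update('abc').digest('hex');",
].join('\n');

// Hypothetical helper: true when either pattern matches the source text.
function tripsScan(source: string): boolean {
  return CRYPTO_MEMBER.test(source) || FS_IMPORT.test(source);
}

console.log(tripsScan(namedImportSource)); // named import: no crypto.<x> token
console.log(tripsScan(namespaceSource));   // namespace import: crypto.createHash matches
```

The string 'node:crypto' does not match because the pattern requires a dot immediately after crypto, and inside the import specifier the next character is the closing quote.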

3.2 Other tokens

  • No timers, no float literals (we don’t multiply or divide; no need for floats anywhere).
  • No await / async (the SHA-256 driver is a sync method on Hash).
  • No Math.* / Date.*.

§4. Type shape we encode

A RuleNode (from parser.ts:102-108) has type, location, name, guards: GuardClause[], effects: EffectCall[]. The recursive expression tree under guards/effects can contain IntLiteral.value: bigint — this is the load-bearing case that justifies importing canonicalize (which handles bigint).

Sub-node Location (parser.ts:94-99) is { startLine, startColumn, endLine, endColumn } — purely positional, NOT semantic. Two textually identical rules at different file positions must hash the same. We strip location recursively before passing to canonicalize.
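
A minimal sketch of the recursive strip, assuming every positional field is stored under a key literally named location and that no semantic field shares that name. The function name stripLocation and the sample nodes are illustrative, not taken from parser.ts.

```typescript
// Hypothetical helper: returns a deep copy of `value` with every `location`
// key removed. Primitives (including bigint) pass through unchanged.
function stripLocation(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(stripLocation);
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      if (key !== 'location') {
        out[key] = stripLocation(child);
      }
    }
    return out;
  }
  return value;
}

// Two illustrative nodes that differ only in where they appeared in the file.
const atTop = {
  type: 'Rule',
  name: 'cap',
  location: { startLine: 1, startColumn: 1, endLine: 3, endColumn: 2 },
  guards: [
    {
      type: 'IntLiteral',
      value: 5n,
      location: { startLine: 2, startColumn: 9, endLine: 2, endColumn: 10 },
    },
  ],
  effects: [],
};

const atBottom = {
  type: 'Rule',
  name: 'cap',
  location: { startLine: 40, startColumn: 1, endLine: 42, endColumn: 2 },
  guards: [
    {
      type: 'IntLiteral',
      value: 5n,
      location: { startLine: 41, startColumn: 9, endLine: 41, endColumn: 10 },
    },
  ],
  effects: [],
};

const strippedTop = stripLocation(atTop);
const strippedBottom = stripLocation(atBottom);
```

After stripping, both trees contain only type, name, guards, and effects, so a deterministic serializer sees identical input for both.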

§5. Order-independence requirement

The acceptance test (Fixture 1 in the task prompt): two rulesets with the same rules in different declaration order must produce IDENTICAL hashes. engine.ts §5 uses asciiCompareByName to sort rules within a category. We apply the same comparator to the full ruleset, with no category grouping at the hash layer: categories are a runtime sort, not a hash-level concern.

5.1 Specificity vs. name sort

The spec at rule-engine.md §Specificity ordering says runtime sort is “(a) guard term count desc, (b) declaration order asc”. But declaration order is unstable across input variants — exactly what we want the hash to NORMALIZE. So at the hash layer we sort strictly by rule.name (ASCII codepoint). Rule names are required-unique by P1.2.4 registry validation; declaration order then becomes irrelevant for the hash input.
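
The name sort can be sketched as follows. The comparator here is a stand-in written for illustration, not the engine.ts implementation; it relies on the fact that JavaScript's < and > on strings compare UTF-16 code units, which coincides with codepoint order for ASCII names. The rule names are made up.

```typescript
interface NamedRule {
  name: string;
}

// Hypothetical stand-in for the engine.ts comparator: locale-independent,
// compares code units directly rather than using localeCompare.
function asciiCompareByName(a: NamedRule, b: NamedRule): number {
  if (a.name < b.name) return -1;
  if (a.name > b.name) return 1;
  return 0;
}

// Copies before sorting so the caller's declaration order is left untouched.
function sortForHash<T extends NamedRule>(rules: readonly T[]): T[] {
  return [...rules].sort(asciiCompareByName);
}

// Same three rules, two declaration orders.
const orderA: NamedRule[] = [{ name: 'transfer' }, { name: 'Mint' }, { name: 'burn' }];
const orderB: NamedRule[] = [{ name: 'burn' }, { name: 'transfer' }, { name: 'Mint' }];

const sortedA = sortForHash(orderA).map((r) => r.name).join(',');
const sortedB = sortForHash(orderB).map((r) => r.name).join(',');
console.log(sortedA); // Mint,burn,transfer (uppercase sorts before lowercase in ASCII)
console.log(sortedB); // Mint,burn,transfer
```

Because names are unique, the comparator never has to break ties, so the result does not depend on the stability of the sort.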

§6. Engine version constant

ENGINE_VERSION = 'kappa-engine/1.0.0' — the initial Phase 1 release. Must bump on any semantic change to the engine binary (not the rules themselves). Listed in the task prompt’s “Common gotchas” §1.

§7. Output format

'sha256:<hex64>' — 71 characters total (7-char prefix + 64 hex). Matches the spec’s load-bearing requirement (the prefix lets future hash algorithm migrations coexist without ambiguity, per task prompt’s “Common gotchas” §2).
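
As an illustration of the length arithmetic, the sketch below formats and validates a value of this shape. The names formatVersionHash and isVersionHash are hypothetical, and the pattern assumes lowercase hex, which is what Node's digest('hex') emits.

```typescript
const HASH_PREFIX = 'sha256:'; // 7 characters
const VERSION_HASH_PATTERN = /^sha256:[0-9a-f]{64}$/;

// Hypothetical helper: prepends the algorithm prefix to a hex digest.
function formatVersionHash(hexDigest: string): string {
  return HASH_PREFIX + hexDigest;
}

// Hypothetical helper: checks prefix, hex alphabet, and digest length at once.
function isVersionHash(candidate: string): boolean {
  return VERSION_HASH_PATTERN.test(candidate);
}

// 'ab' repeated 32 times is a placeholder 64-character hex string, not a real digest.
const sample = formatVersionHash('ab'.repeat(32));
console.log(sample.length);          // 71
console.log(isVersionHash(sample));  // true
console.log(isVersionHash('ab'.repeat(32))); // false: prefix missing
```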

§8. Constant-time comparison

verifyRuleVersion(expected, actual) does a constant-time compare so a timing oracle cannot leak the version byte-by-byte. Implementation: XOR every byte of expected and actual, OR-accumulate, then compare to 0. Length mismatch returns false but still scans the longer string to keep timing constant relative to the longer input.

The ECMA-262 spec does not guarantee constant-time string comparison, so === is unsafe here.
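
A minimal sketch of the XOR-and-accumulate loop described above, under two assumptions: both inputs are ASCII strings of the 'sha256:<hex64>' form, so UTF-16 code units stand in for bytes, and the length difference is folded into the accumulator so a mismatch is never reported as equal. A JIT-compiled runtime gives no hard timing guarantee; the loop only removes the early exit that === may take.

```typescript
// Sketch only: compares two strings without returning early on the first
// differing position.
function verifyRuleVersion(expected: string, actual: string): boolean {
  // Scan the longer input so timing tracks the longer length.
  const longest = expected.length > actual.length ? expected.length : actual.length;

  // Seed with the length difference so unequal lengths can never compare equal.
  let diff = expected.length ^ actual.length;

  for (let i = 0; i < longest; i += 1) {
    // charCodeAt returns NaN past the end of a string; the bitwise XOR
    // coerces NaN to 0, so out-of-range reads are safe here.
    diff |= expected.charCodeAt(i) ^ actual.charCodeAt(i);
  }
  return diff === 0;
}

const good = 'sha256:' + 'ab'.repeat(32);
const lastCharDiffers = 'sha256:' + 'ab'.repeat(31) + 'ac';

console.log(verifyRuleVersion(good, good));            // true
console.log(verifyRuleVersion(good, lastCharDiffers)); // false
console.log(verifyRuleVersion(good, good + '0'));      // false: lengths differ
```

The longest length is computed with a conditional rather than Math.max so the sketch itself stays clear of the Math.* entry in the manifest.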

§9. Failure modes to handle

  • Empty ruleset → still produce a deterministic hash (digest of '[]' || engine_version).
  • Non-RuleNode passed in → caller’s responsibility; we type-narrow but don’t pre-validate (validator is P1.2.3’s job).
  • Engine version with non-ASCII → canonicalize-via-encodeString WOULD work, but we keep the engine version as a literal string passthrough since it’s an internal constant we control.
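
The empty-ruleset case can be sketched with the named import alone. This assumes that canonicalize yields '[]' for an empty ruleset (as the first bullet states) and that the concatenation is plain string append with no separator; the spec's exact byte layout is not restated here, so treat hashCanonical as a hypothetical helper.

```typescript
import { createHash } from 'node:crypto';

const ENGINE_VERSION = 'kappa-engine/1.0.0';

// Hypothetical helper: hashes the canonical rule text followed by the engine
// version. Assumption: no delimiter is inserted between the two inputs.
function hashCanonical(canonicalRules: string, engineVersion: string): string {
  const hex = createHash('sha256')
    .update(canonicalRules, 'utf8')
    .update(engineVersion, 'utf8')
    .digest('hex');
  return 'sha256:' + hex;
}

// An empty ruleset still hashes to a well-formed, repeatable value.
const emptyHash = hashCanonical('[]', ENGINE_VERSION);
const emptyHashAgain = hashCanonical('[]', ENGINE_VERSION);

// Changing only the engine version changes the hash.
const bumpedHash = hashCanonical('[]', 'kappa-engine/1.0.1');

console.log(emptyHash === emptyHashAgain); // true
console.log(emptyHash === bumpedHash);     // false
```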

§10. Forbidden surface (per task prompt + CLAUDE.md §3)

  • ✗ MD5, SHA-1, or any non-SHA-256 hash
  • ✗ Skipping the engine_version concatenation
  • ✗ Returning raw bytes (must be hex with sha256: prefix)
  • ✗ Editing canonical.ts or registry.ts (P1.5.4 / P1.2.4 sibling files)
  • ✗ Editing the main checkout
  • ✗ localeCompare / Intl.Collator for the rule sort

§11. Risk register

| Risk | Mitigation |
| --- | --- |
| crypto.* regex tripping the corpus self-scan | Named import { createHash }, never a crypto.<x> token in source |
| location field included in hash → identical rules at different positions hash differently | Strip location recursively before canonicalize |
| Declaration order changes hash → fixture 1 fails | Sort ruleset by rule.name (ASCII codepoint) before hashing |
| bigint IntLiteral values not representable | canonicalize handles bigint per P1.5.4 contract I4 |
| Constant-time compare not guaranteed by === | Explicit byte-XOR loop; length-mismatch path scans longer side |
| noUncheckedIndexedAccess breaking string indexing | Use charCodeAt (always returns a number, NaN when the index is out of range, never undefined) |



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
