P1.3.1 — κ Core Evaluation Loop — Behavioral Contract

Step 2 of the 5-step executor chain. Builds on docs/audits/p1-3-1-engine-audit.md. Defines the public surface, semantics, and invariants for src/domains/rules/engine.ts.

§1. Module identity

  • Path: src/domains/rules/engine.ts
  • Axis: κ — Rule Engine (Phase 1 Wave 4)
  • Kind: pure synchronous module; no I/O, no DB access, no network, no env reads, no console output.
  • Internal dependencies:
    • ./parser.js: type-only re-imports of AST node interfaces (RuleNode, Expression, etc.).
    • ./integer-math.js: safe_mul, safe_div, OverflowError, DivisionByZeroError.
  • No imports from src/db/*, src/middleware/*, src/server.ts, or other domain folders. No Node built-ins.

§2. Public API

The module exports the following named entities:

§2.1. Constants

export const MAX_INTEGER_OPS = 10_000;  // total ops per rule (guard + effects)
export const MAX_CALL_DEPTH = 16;       // nested FuncCall frames
export const MAX_ARG_COUNT = 8;         // arity of any single FuncCall

These match the concept doc docs/3-world/physics/laws/rule-engine.md §Default budget constants.

§2.2. Category enum

export type Category =
  | 'Admission'
  | 'StateTransition'
  | 'Consequence'
  | 'Promotion';

export const CATEGORY_ORDER: readonly Category[] = [
  'Admission',
  'StateTransition',
  'Consequence',
  'Promotion',
] as const;

CATEGORY_ORDER is the deterministic execution order — load-bearing for θ consensus.

§2.3. Mutation

export interface Mutation {
  kind: 'set' | 'emit' | 'apply';
  target: string;          // dotted path or external sink name
  field: string;           // field name on the target
  old_value?: unknown;     // prior value if known (P1.4.1 fills this in)
  new_value: unknown;      // value after the mutation lands
}

A Mutation is a description, not an action. P1.4.1 (state-application layer) consumes the array and writes through to ζ / β state.

§2.4. Context + BudgetTracker

export interface BudgetTracker {
  integer_ops: number;
  call_depth: number;
  current_arg_count: number;
}

export interface Context {
  readonly event: Readonly<Record<string, unknown>>;
  readonly state: Readonly<Record<string, unknown>>;
  readonly rule_version: string;
  readonly epoch: bigint;
  readonly bindings: ReadonlyMap<string, bigint | string | boolean>;
  readonly budget: BudgetTracker;  // mutable counter; everything else readonly
}

The Context interface is read-only at the type level for event, state, rule_version, epoch, and bindings — only budget is mutated during walking (counter increments). The contract requires that the evaluator never writes to event, state, or bindings; binding extensions (e.g. let semantics in future phases) would return a new Context with a new bindings map.
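The copy-on-extend rule can be illustrated with a minimal sketch. The helper name withBinding is hypothetical and not part of the contract, and the body of freshBudget is an assumption (the contract only names it as an internal helper).

```typescript
// Types repeated from §2.4 so the sketch is self-contained.
interface BudgetTracker {
  integer_ops: number;
  call_depth: number;
  current_arg_count: number;
}

type Value = bigint | string | boolean;

interface Context {
  readonly event: Readonly<Record<string, unknown>>;
  readonly state: Readonly<Record<string, unknown>>;
  readonly rule_version: string;
  readonly epoch: bigint;
  readonly bindings: ReadonlyMap<string, Value>;
  readonly budget: BudgetTracker;
}

// Assumed body for the internal freshBudget helper: all counters start at zero.
function freshBudget(): BudgetTracker {
  return { integer_ops: 0, call_depth: 0, current_arg_count: 0 };
}

// Hypothetical helper: extend bindings by copying, never by writing in place.
function withBinding(ctx: Context, name: string, value: Value): Context {
  const bindings = new Map(ctx.bindings);
  bindings.set(name, value);
  // The budget object is shared on purpose, so counters keep accumulating.
  return { ...ctx, bindings };
}

const base: Context = {
  event: { amount: 5n },
  state: {},
  rule_version: 'v1',
  epoch: 1n,
  bindings: new Map(),
  budget: freshBudget(),
};
const extended = withBinding(base, 'x', 42n);
```

In this sketch base.bindings stays empty after the call, while extended sees the new binding and still shares the same budget counter.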

§2.5. Result types

export type RuleResult =
  | { status: 'admitted'; mutations: Mutation[] }
  | { status: 'rejected'; reason: string };

export interface TransitionResult {
  all_mutations: Mutation[];
  per_category_results: Map<Category, RuleResult[]>;
}

reason strings used by the engine are:

  • 'NO_MATCH': no guard clause matched (and no else clause).
  • 'budget:integer_ops' / 'budget:call_depth' / 'budget:arg_count': budget exceeded.
  • 'overflow:<details>': integer-math overflow during arithmetic.
  • 'div_by_zero:<details>': explicit divide-by-zero.
  • 'undefined_function:<name>': FuncCall with no registered builtin (P1.3.2 will register; pre-P1.3.2, every FuncCall rejects).
  • 'undefined_variable:<path>': VarRef resolves to nothing.
  • 'type_mismatch:<details>': operator + operand type mismatch (e.g. + on a string).
  • Any reason set by an explicit reject "..." guard clause is passed through verbatim from GuardClause.reason.

Caller code that depends on reason strings should treat the format as stable for the prefix before the colon; the post-colon detail is informational.
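A caller that wants to branch on the stable part of a reason could split at the first colon. The helper below is a hypothetical sketch, not part of the engine's API. Note that reasons from explicit reject clauses are passed through verbatim, so they may contain colons of their own.

```typescript
// Hypothetical helper: returns the part of a reason string before the first colon,
// or the whole string when there is no colon (e.g. 'NO_MATCH').
function reasonPrefix(reason: string): string {
  const colon = reason.indexOf(':');
  return colon === -1 ? reason : reason.slice(0, colon);
}

const overflowPrefix = reasonPrefix('overflow:safe_mul'); // 'overflow'
const noMatchPrefix = reasonPrefix('NO_MATCH');           // 'NO_MATCH'
```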

§2.6. Typed errors

export class RuleBudgetExceeded extends Error {
  override readonly name = 'RuleBudgetExceeded';
  readonly which: 'integer_ops' | 'call_depth' | 'arg_count';
  readonly limit: number;
  readonly observed: number;
  constructor(
    which: 'integer_ops' | 'call_depth' | 'arg_count',
    limit: number,
    observed: number,
  );
}

The engine throws RuleBudgetExceeded from the deep walker. executeRuleset catches it at the rule boundary and converts it to a 'rejected' result with reason = 'budget:<which>'.

The engine does not export an “EngineError” union. It throws RuleBudgetExceeded for budget overruns and lets OverflowError / DivisionByZeroError from integer-math.js propagate through evaluate(...) unchanged; executeRuleset catches them at the rule boundary and returns a 'rejected' result (see §3.1 and §7). Tests can still assert on the converted reason strings.
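As a sketch of how the typed error and a budget counter might fit together: the constructor body, the message format, and the body of bumpIntegerOps below are assumptions, since the contract fixes only the class fields and the helper's name. The sketch also assumes that exactly MAX_INTEGER_OPS visits are allowed and the next one throws.

```typescript
const MAX_INTEGER_OPS = 10_000;

type BudgetKind = 'integer_ops' | 'call_depth' | 'arg_count';

class RuleBudgetExceeded extends Error {
  override readonly name = 'RuleBudgetExceeded';
  readonly which: BudgetKind;
  readonly limit: number;
  readonly observed: number;
  constructor(which: BudgetKind, limit: number, observed: number) {
    // Message format is an assumption; the contract only pins the fields.
    super(`budget:${which} limit=${limit} observed=${observed}`);
    this.which = which;
    this.limit = limit;
    this.observed = observed;
  }
}

interface BudgetTracker {
  integer_ops: number;
  call_depth: number;
  current_arg_count: number;
}

// Assumed body for the internal bumpIntegerOps helper: increment, then check the cap.
function bumpIntegerOps(budget: BudgetTracker): void {
  budget.integer_ops += 1;
  if (budget.integer_ops > MAX_INTEGER_OPS) {
    throw new RuleBudgetExceeded('integer_ops', MAX_INTEGER_OPS, budget.integer_ops);
  }
}
```

Carrying limit and observed on the error lets a test assert on the exact cap that fired without parsing the message.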

§2.7. RuleRegistry interface (consumed; not implemented in this task)

export interface CategorizedRule {
  rule: RuleNode;            // from ./parser.js
  category: Category;
}

export interface RuleRegistry {
  getAll(): readonly CategorizedRule[];
}

P1.2.4 (in-flight as a sibling slice) will implement RuleRegistry. P1.3.1 only consumes the getAll() method; any additional methods (getByName, getByTransitionType) remain optional and outside this contract’s scope.

The interface is declared by the engine, implemented by the registry — same direction as a function signature. The engine has no knowledge of how categories are derived (annotation, naming convention, explicit kind keyword); it only sees the result.
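Until P1.2.4 lands, engine tests need some RuleRegistry to pass in. One possible test double is a fixed list behind getAll(). The function name staticRegistry and the one-field RuleNode stand-in are hypothetical; the real RuleNode comes from ./parser.js.

```typescript
type Category = 'Admission' | 'StateTransition' | 'Consequence' | 'Promotion';

// Stand-in for the parser's RuleNode; only `name` matters for this sketch.
interface RuleNode {
  name: string;
}

interface CategorizedRule {
  rule: RuleNode;
  category: Category;
}

interface RuleRegistry {
  getAll(): readonly CategorizedRule[];
}

// Hypothetical test double: copies and freezes the list once, then always returns it.
function staticRegistry(rules: readonly CategorizedRule[]): RuleRegistry {
  const snapshot = Object.freeze([...rules]);
  return { getAll: () => snapshot };
}

const registry = staticRegistry([
  { rule: { name: 'check_balance' }, category: 'Admission' },
  { rule: { name: 'promote_tier' }, category: 'Promotion' },
]);
```

Because the engine depends only on getAll(), the double needs no knowledge of how categories are derived.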

§2.8. Functions

export function evaluate(rule: RuleNode, context: Context): RuleResult;

export function evaluateExpr(
  expr: Expression,
  context: Context,
): bigint | string | boolean;

export function executeRuleset(
  registry: RuleRegistry,
  event: Readonly<Record<string, unknown>>,
  state: Readonly<Record<string, unknown>>,
  rule_version: string,
  epoch: bigint,
): TransitionResult;

Plus internal helpers (not exported): evaluateGuard, collectEffectMutation, resolveVarRef, compareValues, applyBinaryArithmetic, applyBinaryComparison, bumpIntegerOps, bumpCallDepth, freshBudget.

§3. Semantics

§3.1. evaluate(rule, context) — per-rule evaluator

Algorithm (matches extraction §5 pseudocode):

1. For each guard in rule.guards (in declaration order):
     a. bumpIntegerOps(context.budget) — counts the guard clause itself.
     b. If guard.condition === null (else clause): match = true.
        Else: match = evaluateExpr(guard.condition, context) (must be boolean).
     c. If match:
          if guard.action === 'reject': return { status: 'rejected', reason: guard.reason ?? '' }
          else (admit): break — proceed to effects.
2. If no guard matched and we did not break out via admit:
     return { status: 'rejected', reason: 'NO_MATCH' }
3. For each effect in rule.effects:
     a. bumpIntegerOps(context.budget) — counts the effect call itself.
     b. mutation = collectEffectMutation(effect, context)
     c. mutations.push(mutation)
4. return { status: 'admitted', mutations }

Errors thrown during steps 1–3:

  • RuleBudgetExceeded from bumpIntegerOps / bumpCallDepth / arg-count check: evaluate does not catch — propagates up to executeRuleset.
  • OverflowError / DivisionByZeroError from integer-math.js: evaluate does not catch either — same propagation. (executeRuleset is the boundary that converts these to 'rejected' results so that one rule’s overflow doesn’t blow up the whole ruleset.)
  • Error for 'undefined_function', 'undefined_variable', 'type_mismatch': thrown by helpers; converted at the same boundary.

evaluate is pure: it does not write to rule, context.event, context.state, or context.bindings. It only mutates context.budget (counter increments). This is the single permitted side effect, and it is contained in a short-lived object: executeRuleset constructs a fresh BudgetTracker for each rule (see §3.3).
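The first-match-wins control flow of the algorithm above can be sketched as follows. The GuardClause and Rule shapes are assumptions modelled on the field names used in the pseudocode, condition evaluation and effect collection are injected as callbacks, and budget bumps are left out to keep the sketch short.

```typescript
interface Mutation {
  kind: 'set' | 'emit' | 'apply';
  target: string;
  field: string;
  new_value: unknown;
}

type RuleResult =
  | { status: 'admitted'; mutations: Mutation[] }
  | { status: 'rejected'; reason: string };

// Assumed AST shapes; the real ones come from ./parser.js.
interface GuardClause<E> {
  condition: E | null; // null marks the else clause
  action: 'admit' | 'reject';
  reason?: string;
}

interface Rule<E, F> {
  name: string;
  guards: GuardClause<E>[];
  effects: F[];
}

function evaluateSketch<E, F>(
  rule: Rule<E, F>,
  evalCond: (expr: E) => boolean,
  collect: (effect: F) => Mutation,
): RuleResult {
  let admitted = false;
  for (const guard of rule.guards) {
    const match = guard.condition === null ? true : evalCond(guard.condition);
    if (!match) continue;
    if (guard.action === 'reject') {
      return { status: 'rejected', reason: guard.reason ?? '' };
    }
    admitted = true; // first matching admit wins; later guards are never consulted
    break;
  }
  if (!admitted) return { status: 'rejected', reason: 'NO_MATCH' };
  // Effects are described as mutations, never applied here.
  return { status: 'admitted', mutations: rule.effects.map(collect) };
}

// Conditions are plain booleans in this demo, so evalCond is the identity.
const demoRule: Rule<boolean, string> = {
  name: 'demo',
  guards: [
    { condition: false, action: 'reject', reason: 'too_small' },
    { condition: true, action: 'admit' },
    { condition: null, action: 'reject', reason: 'never_reached' },
  ],
  effects: ['balance'],
};
const demoResult = evaluateSketch(
  demoRule,
  (c) => c,
  (field) => ({ kind: 'set', target: 'account', field, new_value: 1 }),
);
```

Here the second guard matches with admit, so the trailing else clause is never reached and the single effect yields one mutation.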

§3.2. evaluateExpr(expr, context) — recursive walker

Returns bigint | string | boolean. The result type is determined by the AST node:

| AST node | Return type | Notes |
| --- | --- | --- |
| IntLiteral | bigint | node.value directly. |
| BoolLiteral | boolean | node.value directly. |
| StringLiteral | string | node.value directly. Strings are leaves only; arithmetic / comparison helpers reject string operands with type_mismatch. |
| VarRef | bigint \| string \| boolean | Resolved against context.bindings, then context.event, then context.state (in that order). Throws if absent. |
| UnaryOp (-) | bigint | Operand must be bigint; throws type_mismatch otherwise. |
| BinaryOp (arithmetic +, -, *, /, %) | bigint | Both operands must be bigint; uses safe_mul / safe_div and native + / - (with overflow check via safe_mul for products). |
| BinaryOp (comparison ==, !=) | boolean | Operands must be same primitive type; == is value equality, never reference. |
| BinaryOp (comparison <, >, <=, >=) | boolean | Operands must both be bigint; throws type_mismatch for non-bigint operands. (Strings have no ordering semantics in κ; booleans neither.) |
| LogicalOp (and, or) | boolean | Both operands must be boolean after evaluation. Short-circuit evaluation is permitted: if the left operand of and is false, or the left operand of or is true, the right operand is not evaluated. Determinism is preserved because the budget for the skipped subtree is not consumed (matching what real arbiters would do). |
| LogicalOp (not) | boolean | Single operand; must evaluate to boolean. |
| FuncCall | bigint \| string \| boolean | Throws 'undefined_function:<name>' in P1.3.1 (no built-ins registered until P1.3.2). Args evaluated left-to-right; args.length checked against MAX_ARG_COUNT before any arg is evaluated. bumpCallDepth increments before recursion into args; decrements after. |

Every entry into evaluateExpr increments context.budget.integer_ops (then checks the cap). This is the per-node visit count — independent of arithmetic-vs-other distinction. (The “integer ops” name is a heritage label; in practice it counts AST node visits, which is what the upstream pseudocode node_budget counts.)
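A reduced walker makes the per-node counting and the short-circuit rule concrete. The node shapes and field names below (kind, op, left, right) are illustrative assumptions, not the parser's actual interfaces. The sketch covers only literals, +, <, and and/or, and it leaves out the cap check and the int64 overflow handling that the contract requires.

```typescript
type Value = bigint | string | boolean;

// Assumed node shapes for illustration only.
type Expr =
  | { kind: 'IntLiteral'; value: bigint }
  | { kind: 'BoolLiteral'; value: boolean }
  | { kind: 'BinaryOp'; op: '+' | '<'; left: Expr; right: Expr }
  | { kind: 'LogicalOp'; op: 'and' | 'or'; left: Expr; right: Expr };

interface Budget {
  integer_ops: number;
}

function asBigint(v: Value): bigint {
  if (typeof v !== 'bigint') throw new Error(`type_mismatch:expected bigint, got ${typeof v}`);
  return v;
}

function asBoolean(v: Value): boolean {
  if (typeof v !== 'boolean') throw new Error(`type_mismatch:expected boolean, got ${typeof v}`);
  return v;
}

function walk(expr: Expr, budget: Budget): Value {
  budget.integer_ops += 1; // one count per node visit (cap check omitted here)
  switch (expr.kind) {
    case 'IntLiteral':
    case 'BoolLiteral':
      return expr.value;
    case 'BinaryOp': {
      const l = asBigint(walk(expr.left, budget));
      const r = asBigint(walk(expr.right, budget));
      return expr.op === '+' ? l + r : l < r;
    }
    case 'LogicalOp': {
      const l = asBoolean(walk(expr.left, budget));
      // Short-circuit: the right subtree is neither evaluated nor counted.
      if (expr.op === 'and' && !l) return false;
      if (expr.op === 'or' && l) return true;
      return asBoolean(walk(expr.right, budget));
    }
  }
}

const oneLessThanTwo: Expr = {
  kind: 'BinaryOp',
  op: '<',
  left: { kind: 'IntLiteral', value: 1n },
  right: { kind: 'IntLiteral', value: 2n },
};

const skipped: Budget = { integer_ops: 0 };
const skippedResult = walk(
  { kind: 'LogicalOp', op: 'and', left: { kind: 'BoolLiteral', value: false }, right: oneLessThanTwo },
  skipped,
); // false, after 2 node visits

const full: Budget = { integer_ops: 0 };
const fullResult = walk(
  { kind: 'LogicalOp', op: 'and', left: { kind: 'BoolLiteral', value: true }, right: oneLessThanTwo },
  full,
); // true, after 5 node visits
```

The two runs differ only in the left literal, yet consume 2 and 5 visits respectively, which is the behavior the short-circuit row of the table describes.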

§3.3. executeRuleset(registry, event, state, rule_version, epoch) — orchestrator

1. budget0 = freshBudget()  — fresh tracker for the entire executeRuleset call.
2. allRules = registry.getAll()
3. groups: Map<Category, RuleNode[]> = group by category
4. For each category in CATEGORY_ORDER (Admission → StateTransition → Consequence → Promotion):
     a. rules = groups.get(category) ?? []
     b. Sort rules by rule.name with locale-independent ASCII compare.
     c. For each rule in sorted rules:
          i.   ctx = Context with fresh BudgetTracker (per-rule reset, matching extraction §5)
          ii.  try { result = evaluate(rule, ctx) }
               catch (RuleBudgetExceeded e) { result = { status: 'rejected', reason: 'budget:' + e.which } }
               catch (OverflowError e) { result = { status: 'rejected', reason: 'overflow:' + e.message } }
               catch (DivisionByZeroError e) { result = { status: 'rejected', reason: 'div_by_zero:' + e.message } }
               catch (Error e) { result = { status: 'rejected', reason: e.message } }
          iii. per_category_results.get(category).push(result)
          iv.  if (result.status === 'admitted') all_mutations.push(...result.mutations)
5. return { all_mutations, per_category_results }

Per-rule budget reset matches extraction §5’s node_budget = 0 line at the start of execute_rule. The contract chooses this over a per-executeRuleset shared budget because (a) extraction §5 says so, (b) it gives each rule a fair shot regardless of how many sibling rules ran before it, and (c) it matches the concept doc’s “per rule” wording in §Default budget constants.

The budget0 from step 1 is intentionally unused: it is a placeholder anchor for a ruleset-wide budget that Phase 2+ may add. (Removing it now and re-introducing it later would be a contract change.) Update during implementation: if budget0 remains unused, prefer not to allocate it, to keep the engine surface minimal. The packet will document the final decision.
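The grouping, ordering, and catch-boundary parts of the orchestrator can be sketched independently of the evaluator. In the sketch below the per-rule evaluator is injected as a callback, the function name runInOrder is hypothetical, RuleNode is reduced to a name, and every thrown Error is converted using its message (the typed-error prefixes from step 4.c.ii are omitted).

```typescript
type Category = 'Admission' | 'StateTransition' | 'Consequence' | 'Promotion';

const CATEGORY_ORDER: readonly Category[] = [
  'Admission',
  'StateTransition',
  'Consequence',
  'Promotion',
];

interface RuleNode {
  name: string; // stand-in for the parser's RuleNode
}

interface CategorizedRule {
  rule: RuleNode;
  category: Category;
}

type RuleResult<M> =
  | { status: 'admitted'; mutations: M[] }
  | { status: 'rejected'; reason: string };

function runInOrder<M>(
  rules: readonly CategorizedRule[],
  evaluateOne: (rule: RuleNode) => RuleResult<M>,
): { all_mutations: M[]; per_category_results: Map<Category, RuleResult<M>[]> } {
  // Group rules by category.
  const groups = new Map<Category, RuleNode[]>();
  for (const { rule, category } of rules) {
    const bucket = groups.get(category) ?? [];
    bucket.push(rule);
    groups.set(category, bucket);
  }

  const all_mutations: M[] = [];
  const per_category_results = new Map<Category, RuleResult<M>[]>();
  for (const category of CATEGORY_ORDER) {
    // Copy before sorting; compare names by code unit, not by locale.
    const sorted = [...(groups.get(category) ?? [])].sort((a, b) =>
      a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
    );
    const results: RuleResult<M>[] = [];
    for (const rule of sorted) {
      let result: RuleResult<M>;
      try {
        result = evaluateOne(rule);
      } catch (e) {
        // Rule boundary: one failing rule must not abort the whole ruleset.
        result = { status: 'rejected', reason: e instanceof Error ? e.message : String(e) };
      }
      results.push(result);
      if (result.status === 'admitted') all_mutations.push(...result.mutations);
    }
    per_category_results.set(category, results);
  }
  return { all_mutations, per_category_results };
}

const demo = runInOrder<string>(
  [
    { rule: { name: 'p' }, category: 'Promotion' },
    { rule: { name: 'b' }, category: 'Admission' },
    { rule: { name: 'A' }, category: 'Admission' },
  ],
  (rule) => {
    if (rule.name === 'b') throw new Error('type_mismatch:demo');
    return { status: 'admitted', mutations: [rule.name] };
  },
);
```

In the demo, rule b throws and is recorded as rejected, while A and p still contribute their mutations in category order.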

§3.4. Determinism contract (load-bearing)

Two arbiters with the same (rule_version, registry, event, state, epoch) must produce bit-identical TransitionResult.all_mutations:

  1. CATEGORY_ORDER is a constant.
  2. Within each category, rules are sorted by rule.name with the comparator a < b ? -1 : a > b ? 1 : 0. This compares UTF-16 code units, which is plain ASCII order for ASCII rule names and is stable across JS engines. String.prototype.localeCompare (e.g. with 'en', { sensitivity: 'variant' }) is locale-dependent and must not be used.
  3. The evaluate walker visits nodes in the order they appear in the AST (which itself is deterministic from the parser).
  4. Map/Set iteration order is insertion order, as guaranteed by the ECMA-262 spec (not only by V8); the engine must not rely on Object.keys over a record-style object for ordering.
  5. Short-circuit eval for and/or is deterministic given (a) deterministic operand order, (b) deterministic operand value: both hold by construction.

The inspectFunctionForbidden self-scan from determinism.ts will be applied to evaluate in the test suite; zero hits are required.
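The comparator from point 2 is small enough to show in full. The function name compareNames is hypothetical. JavaScript's < and > on two strings compare UTF-16 code units, so uppercase ASCII letters sort before lowercase ones regardless of the host locale.

```typescript
// Locale-independent comparator: relies only on code-unit ordering of strings.
function compareNames(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

const sortedNames = ['a', 'B', 'c'].sort(compareNames); // ['B', 'a', 'c']
```

A locale-aware sort would typically place a before B here, which is exactly the divergence the contract rules out.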

§4. Invariants

| # | Statement | Enforcement |
| --- | --- | --- |
| I1 | Pure module: no I/O, no DB, no network, no env reads, no console writes | Code review + determinism harness self-scan in tests. |
| I2 | Pure functions: no writes to rule, context.event, context.state, context.bindings | Frozen-input test (F5): pass Object.freeze'd state and event; if the engine writes, JS throws TypeError in strict mode (TS modules are strict by default). |
| I3 | Determinism: same inputs ⇒ bit-identical outputs across N runs | Exposed via assertDeterministic(evaluate, [rule, ctx], { iterations: 10 }) in tests. |
| I4 | All bigint arithmetic uses integer-math.js for products; never bare * on bigints when overflow is plausible | Code review; tests assert OverflowError propagates correctly. |
| I5 | Budget caps fire as RuleBudgetExceeded with the right which | Per-cap test; which value asserted. |
| I6 | First-match-wins guard order | F2 fixture asserts. |
| I7 | Category execution order = CATEGORY_ORDER | F3 fixture asserts. |
| I8 | Alpha sort within category is locale-independent | Test with names a, B, c ⇒ code-unit sort puts B first (66 < 97 < 99); a locale-aware sort would order them a, B, c. |
| I9 | Mutations collected, never applied | F5 fixture; engine test never re-reads state after evaluate. |
| I10 | evaluate(rule, ctx) mutates only ctx.budget | Direct test: snapshot ctx properties before, deep-equal after. |
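The frozen-input check behind I2 can be sketched without the engine itself. Here buggyEvaluate is a hypothetical stand-in for an evaluator that violates the purity invariant, and deepFreeze is an assumed test helper, not something the contract names. The function-level 'use strict' directive only makes the sketch independent of how the file is compiled.

```typescript
// Assumed test helper: recursively freezes plain objects and arrays in place.
function deepFreeze(value: unknown): void {
  if (typeof value !== 'object' || value === null || Object.isFrozen(value)) return;
  Object.freeze(value);
  for (const child of Object.values(value)) deepFreeze(child);
}

// Hypothetical stand-in for an engine bug: tries to write to state.
function buggyEvaluate(state: Readonly<Record<string, unknown>>): void {
  'use strict';
  (state as Record<string, unknown>)['balance'] = 0;
}

const frozenState = { balance: 10, nested: { level: 1 } };
deepFreeze(frozenState);

let wroteToFrozen = true;
try {
  buggyEvaluate(frozenState);
} catch (e) {
  // In strict mode, assigning to a frozen property throws TypeError.
  wroteToFrozen = !(e instanceof TypeError);
}
```

A real test would pass the frozen objects to evaluate and expect no exception at all; the stand-in only shows that a stray write would be caught.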

§5. Forbiddens (axiomatic)

| # | Forbidden | Why |
| --- | --- | --- |
| F1 | Math.* | Determinism (Math.random); float semantics. |
| F2 | Date.*, new Date() | Clock reads break consensus. |
| F3 | setTimeout / setInterval / setImmediate | Async timers; non-deterministic ordering. |
| F4 | await / async function | Engine is sync; async breaks budget tracking and determinism. |
| F5 | crypto.*, process.hrtime, process.nextTick | Same as F2. |
| F6 | Float literals (e.g. 3.14) | Integer-only. The determinism scanner enforces this. |
| F7 | JSON.parse on user input | Engine doesn't parse text; the parser does. |
| F8 | Object.assign(state, ...) / state.foo = bar etc. | Purity invariant. |
| F9 | Bare * on two bigints (where product may exceed int64) | Use safe_mul. |
| F10 | Throw plain Error for budget overruns | Use RuleBudgetExceeded. |
| F11 | Mutate the AST passed in | Tree is shared across many evaluations. |
| F12 | Sort with localeCompare | Locale-dependent; use ASCII compare. |

§6. Out-of-scope (deferred)

| Item | Owner |
| --- | --- |
| Built-in function bodies (min, max, isqrt, bps_mul, etc.) | P1.3.2 |
| Effect application (writing mutations through to state) | P1.4.1 |
| Conflict detection across mutations | P1.4.1 |
| Merkle proof generation over (state_root, new_state_root, mutations) | P1.4.1 / η |
| Rule classification (deciding which rule is Admission vs Promotion) | P1.2.4 |
| Specificity-based ordering (extraction §6 says alpha; concept doc says guard-term-count) | Engine uses alpha within category per extraction §6; specificity is a registry-level concern (P1.2.4). |
| Validation that RuleNode is well-formed (StringLiteral not in arithmetic position, VarRef paths exist) | P1.2.3 (validator) |
| audit_session_start / thought_record integration | Out of κ scope. |

§7. Failure & error model

| Source | Behavior |
| --- | --- |
| Budget cap exceeded inside evaluate | Throws RuleBudgetExceeded (typed). executeRuleset catches and converts to { rejected, reason: 'budget:<which>' }. |
| Integer-math overflow during evaluation | Throws OverflowError from safe_mul / safe_div. executeRuleset catches and converts to { rejected, reason: 'overflow:...' }. |
| Divide-by-zero | Throws DivisionByZeroError. executeRuleset catches → 'div_by_zero:...'. |
| VarRef resolves to undefined | Throws plain Error with message 'undefined_variable:<path>'. executeRuleset catches → reason is the message. |
| FuncCall with unknown name | Throws plain Error with message 'undefined_function:<name>'. Same conversion. |
| Type mismatch on operator | Throws plain Error with message 'type_mismatch:<details>'. Same conversion. |
| Explicit reject "..." guard | Returns { rejected, reason: <verbatim from rule> } directly from evaluate; no exception. |

evaluate itself does not catch errors — it lets typed errors propagate; the catch boundary is executeRuleset. This makes evaluate testable in isolation: tests can assert expect(() => evaluate(rule, ctx)).toThrow(RuleBudgetExceeded) for budget cases, and expect(evaluate(...)).toEqual(...) for non-throw cases.
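The conversion rows of the table could be collected into one function at the executeRuleset boundary. The sketch below uses local stand-in classes for the three typed errors (the real ones live in engine.ts and integer-math.js), and the function name toRejected is hypothetical. The order of the checks matters: the typed errors must be tested before the generic Error case.

```typescript
// Stand-ins for the real error classes, reduced to what the mapping needs.
class OverflowError extends Error {}
class DivisionByZeroError extends Error {}
class RuleBudgetExceeded extends Error {
  readonly which: 'integer_ops' | 'call_depth' | 'arg_count';
  constructor(which: 'integer_ops' | 'call_depth' | 'arg_count') {
    super(which);
    this.which = which;
  }
}

type Rejected = { status: 'rejected'; reason: string };

// Hypothetical helper: maps a thrown value to the reason strings listed in §2.5.
function toRejected(err: unknown): Rejected {
  if (err instanceof RuleBudgetExceeded) {
    return { status: 'rejected', reason: `budget:${err.which}` };
  }
  if (err instanceof OverflowError) {
    return { status: 'rejected', reason: `overflow:${err.message}` };
  }
  if (err instanceof DivisionByZeroError) {
    return { status: 'rejected', reason: `div_by_zero:${err.message}` };
  }
  if (err instanceof Error) {
    // Helpers already put the full reason ('undefined_variable:...', etc.) in the message.
    return { status: 'rejected', reason: err.message };
  }
  throw err; // non-Error throwables are outside the contract; rethrow in this sketch
}

const budgetReason = toRejected(new RuleBudgetExceeded('call_depth')).reason;
const overflowReason = toRejected(new OverflowError('safe_mul')).reason;
const plainReason = toRejected(new Error('undefined_variable:event.amount')).reason;
```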

§8. Acceptance criteria → contract clauses

| Acceptance criterion (dispatch packet) | Contract clause |
| --- | --- |
| Recursive AST walker with immutable context | §2.4 (Context readonly types), §3.2 (walker returns), I2 |
| Execution order: Admission → StateTransition → Consequence → Promotion | §2.2 (CATEGORY_ORDER), §3.3 step 4, I7 |
| Within each category: alphabetical by rule name (stable) | §3.3 step 4b, §3.4 point 2, I8 |
| First-match-wins guard evaluation | §3.1 step 1, I6 |
| Mutations collected, not applied during evaluate | §3.1 step 3, I9 |
| MAX_INTEGER_OPS=10000 → RuleBudgetExceeded("integer_ops") | §2.1, §2.6, §3.2 walk-bumps, §3.3 catch boundary |
| MAX_CALL_DEPTH=16 → RuleBudgetExceeded("call_depth") | Same as above; check fires before recursion into FuncCall |
| MAX_ARG_COUNT=8 → RuleBudgetExceeded("arg_count") | Same; check fires before evaluating any FuncCall arg |
| evaluate is pure | I2 + frozen-input test fixture F5 |
| npm run build && npm run lint && npm test: all three green | Verification doc |
| No regressions on 1467-test baseline | Verification doc baseline check |

Ready for Step 3 (Packet). Signed-off contract for engine.ts. Imports declared, types pinned, semantics specified, error model decided, determinism load-bearing.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
