P1.3.4 — κ Policy Gating / Pre-guards — Execution Packet
Step 3 of the 5-step executor chain. Builds on
docs/audits/p1-3-4-policy-gate-audit.mdanddocs/contracts/p1-3-4-policy-gate-contract.md. Sequences the implementation in Step 4. The packet gates implementation — Step 4 must not begin until Step 3 lands per CLAUDE.md §6.
§P0. Plan summary
Two new files, ~250 LOC + ~250 LOC of tests. Strict-additive: no existing source touched. Five-pass implementation:
- P1: Module skeleton — types, enum, helpers.
- P2: POLICIES table + module-init parser pass.
- P3:
check_policycore. - P4:
check_all_policiesshort-circuit loop. - P5: Test suite (10 fixtures, ~50 cases).
§P1. Module skeleton
src/domains/rules/policy-gate.ts:
/**
* Colibri — Phase 1 κ Rule Engine — Policy Gating / Pre-guards (P1.3.4).
*
* P1–P13 constitutional pre-guards that run BEFORE named rule evaluation in
* the admission flow (P1.4.1 next wave wires this in). A policy-gate denial
* means the action never enters rule evaluation, never produces mutations,
* and never lands in the audit trail as "considered". Policies are the
* cheapest filter (early exit) AND legitimacy-bearing (constitutional, not
* contingent).
*
* Pure module — no I/O, no DB access, no network, no env reads, no console
* output, no clock, no RNG, no async. Reuses the P1.2.2 parser + P1.3.1
* evaluator end-to-end (no separate evaluator).
*
* Public surface (per docs/contracts/p1-3-4-policy-gate-contract.md §2):
* - PolicyId enum (P1..P13, string-valued)
* - Policy interface
* - PolicyResult discriminated union
* - POLICIES table (frozen Record<PolicyId, Policy>)
* - check_policy(id, actor, context): PolicyResult
* - check_all_policies(action_name, actor, context): PolicyResult
*
* Canonical references:
* - docs/reference/extractions/kappa-rule-engine-extraction.md §9
* - docs/audits/p1-3-4-policy-gate-audit.md
* - docs/contracts/p1-3-4-policy-gate-contract.md
* - docs/packets/p1-3-4-policy-gate-packet.md (this)
*/
import { parse } from './parser.js';
import type { Expression } from './parser.js';
import { evaluateExpr } from './engine.js';
import type { Context } from './engine.js';
// §1. PolicyId enum
export enum PolicyId { P1 = 'P1', /* ... */ P13 = 'P13' }
// §2. Policy interface
export interface Policy { /* per contract §2.2 */ }
// §3. PolicyResult union
export type PolicyResult = { /* per contract §2.3 */ };
// §4. internal helpers (parsePredicate, makeContext, …)
// §5. POLICIES — module-init seed
export const POLICIES: Readonly<Record<PolicyId, Policy>> = /* ... */;
// §6. POLICY_ORDER — declaration-order array for deterministic iteration
const POLICY_ORDER: readonly PolicyId[] = [PolicyId.P1, ..., PolicyId.P13] as const;
// §7. check_policy
export function check_policy(...): PolicyResult { /* ... */ }
// §8. check_all_policies
export function check_all_policies(...): PolicyResult { /* ... */ }
Acceptance: file compiles standalone (no test references); all types exported.
§P2. POLICIES seed implementation
The 13 stubs use "true" predicate source. Module-init parses each via parse(...) and extracts the first rule’s first guard’s condition — but parse produces RuleNode[], and a bare "true" is not a valid rule (it expects rule NAME { guards { … } effects { … } }).
Resolution: wrap predicate sources in a synthetic rule template at parse time:
function parsePredicate(predicate_source: string): Expression {
// The κ parser entry point parses full rules. To extract a bare expression
// we wrap the source in a one-rule, one-guard, no-effect template, then
// pluck the guard's condition. This synthesizes a valid RuleNode every time
// the predicate text is itself a valid expression.
const wrapped = `rule __policy_synthetic { guards { ${predicate_source} -> admit } effects {} }`;
const parsed = parse(wrapped);
if (parsed.errors.length > 0) {
const messages = parsed.errors.map(e => `${e.kind}:${e.message}`).join('; ');
throw new Error(
`policy-gate module-init: predicate ${JSON.stringify(predicate_source)} failed to parse: ${messages}`,
);
}
if (parsed.ast.length !== 1) {
throw new Error(`policy-gate module-init: expected 1 rule, got ${parsed.ast.length}`);
}
const rule = parsed.ast[0]!;
if (rule.guards.length !== 1) {
throw new Error(`policy-gate module-init: expected 1 guard, got ${rule.guards.length}`);
}
const guard = rule.guards[0]!;
if (guard.condition === null) {
throw new Error(`policy-gate module-init: predicate produced an else-clause; expected condition`);
}
return guard.condition;
}
This synthesizes a valid κ DSL ruleset around the bare predicate, then re-extracts. It costs one parse per policy at module load — 13 parses total, all ≤ 100 chars each, well below MAX_AST_NODES_PER_RULE. The cost is paid once.
Acceptance: all 13 entries in POLICIES have valid predicate_ast; module imports without error.
§P3. check_policy core
const EMPTY_BINDINGS: ReadonlyMap<string, never> = new Map();
function makeContext(
actor: Readonly<Record<string, unknown>>,
context: Readonly<Record<string, unknown>>,
): Context {
return {
event: { actor },
state: context,
rule_version: 'policy-gate',
epoch: 0n,
bindings: EMPTY_BINDINGS,
budget: { integer_ops: 0, call_depth: 0, current_arg_count: 0 },
};
}
export function check_policy(
id: PolicyId,
actor: Readonly<Record<string, unknown>>,
context: Readonly<Record<string, unknown>>,
): PolicyResult {
const policy = POLICIES[id];
const ctx = makeContext(actor, context);
let value: unknown;
try {
value = evaluateExpr(policy.predicate_ast, ctx);
} catch {
return { admitted: false, reason: 'POLICY_EVAL_ERROR' };
}
if (value === true) {
return { admitted: true };
}
if (value === false) {
return { admitted: false, reason: policy.rejection_reason };
}
return { admitted: false, reason: 'POLICY_TYPE_MISMATCH' };
}
Acceptance:
check_policy(PolicyId.P1, {}, {})→{admitted: true}(predicate is"true").- A test policy with predicate
"false"→{admitted: false, reason: <its reason>}. - A test policy with predicate that throws (e.g.
"$undefined.x > 0") →{admitted: false, reason: 'POLICY_EVAL_ERROR'}. - A test policy with predicate that evaluates to a bigint (e.g.
"42") →{admitted: false, reason: 'POLICY_TYPE_MISMATCH'}.
§P4. check_all_policies short-circuit loop
const POLICY_ORDER: readonly PolicyId[] = [
PolicyId.P1, PolicyId.P2, PolicyId.P3, PolicyId.P4,
PolicyId.P5, PolicyId.P6, PolicyId.P7, PolicyId.P8,
PolicyId.P9, PolicyId.P10, PolicyId.P11, PolicyId.P12,
PolicyId.P13,
] as const;
export function check_all_policies(
action_name: string,
actor: Readonly<Record<string, unknown>>,
context: Readonly<Record<string, unknown>>,
): PolicyResult {
for (const id of POLICY_ORDER) {
const policy = POLICIES[id];
const applies =
policy.applicable_actions.indexOf('*') !== -1 ||
policy.applicable_actions.indexOf(action_name) !== -1;
if (!applies) {
continue;
}
const result = check_policy(id, actor, context);
if (!result.admitted) {
return result;
}
}
return { admitted: true };
}
Acceptance:
- Iteration is in P1..P13 order (verified via spy test).
- First failure short-circuits — fixture chains a failing P3 with admitting P4..P13; only the first 3 are evaluated.
- Empty applicable set returns
{admitted: true}(action_name not in any policy’s list AND no'*').
§P5. Test suite
src/__tests__/domains/rules/policy-gate.test.ts. ~50 cases organized as 10 fixtures (matrix from contract §3):
F1 — Module-init invariants (5 cases)
- F1.1: All 13 PolicyIds present in POLICIES.
- F1.2: Every Policy.predicate_ast is non-null and is an Expression node (has
.typefield matching one of the 8 Expression node types). - F1.3: Every Policy.rejection_reason is non-empty string.
- F1.4: All 13 rejection_reason strings are distinct.
- F1.5: POLICIES is frozen (Object.isFrozen returns true).
F2 — check_policy totality (4 cases)
- F2.1: Each of the 13 stubs returns
{admitted: true}for empty actor + context. - F2.2: check_policy never throws on any of {empty, populated, missing-key, deeply-nested} actors.
- F2.3: check_policy with each PolicyId returns the exact same shape (admitted: true).
- F2.4: check_policy is deterministic — 100 calls with same inputs return same result (sample test).
F3 — Short-circuit semantics (5 cases)
- F3.1: Build a synthetic POLICIES override where P3 fails, P4..P13 succeed. Use a spy on evaluateExpr (via fixture) — assert P4..P13’s predicate ASTs are NOT visited.
- Spy mechanism: since check_policy is a closed-over function reference, we test short-circuit via a wrapper that builds a synthetic policy table replacement at the test level:
check_all_policiesagainst an ad-hoc registry. But the contract uses module-levelPOLICIES— we cannot swap it. - Refined approach: add a test-only injection point — NO. The contract forbids dynamic policy tables.
- Final approach: test short-circuit indirectly. Use fixtures with predicate sources that count their evaluations via context bindings.
- Predicate text:
"$counter.value > 0"would force engine evaluation of VarRef. - But context is read-only. We can’t increment a counter from inside a pure evaluator.
- Predicate text:
- Actual approach: test short-circuit via the ordering of rejection reasons. Build a second seed table inside the test by importing a test-only export:
parsePredicate(helper exported under a__test__namespace? No — that violates the contract’s clean public surface). - Real solution: the test fixture sets up a separate test module that builds its own table using the same primitives (
parse,evaluateExpr). It then writes an analogue ofcheck_all_policiesand runs it. This proves the algorithm short-circuits, not the module’s state. We supplement this with a test against the real module showing that two predicates that would evaluate non-trivially to false (P_failer with predicate"false", P_admitter with predicate"$count.x") — but stub policies all have"true". So we cannot test the real POLICIES short-circuit directly today (every stub admits). - Final final approach: export a test helper
__internal_check_all_policies_with_table(table, action, actor, context)that takes a table parameter. The publiccheck_all_policiescalls it withPOLICIESas the table. This makes short-circuit testable against ad-hoc tables. The naming__internal_signals private; downstream callers should not use it.
This is the canonical approach used by other κ slices (e.g. engine.ts has
evaluateandexecuteRulesetthat take all dependencies as parameters;evaluateExpris similarly composable). Adopt it.Updated public surface:
// Public — uses the module's POLICIES table. export function check_all_policies(action_name, actor, context): PolicyResult; // Public — table-parameterized variant for testing AND for future runtime // table-replacement (admin tooling, governance change). Signature: export function check_all_policies_with_table( table: Readonly<Record<PolicyId, Policy>>, order: readonly PolicyId[], action_name: string, actor: Readonly<Record<string, unknown>>, context: Readonly<Record<string, unknown>>, ): PolicyResult;The original
check_all_policies(...)is then a one-line delegate:return check_all_policies_with_table(POLICIES, POLICY_ORDER, action_name, actor, context);.Same pattern for check_policy:
export function check_policy_with_table( table: Readonly<Record<PolicyId, Policy>>, id: PolicyId, actor: Readonly<Record<string, unknown>>, context: Readonly<Record<string, unknown>>, ): PolicyResult;with
check_policy(id, actor, context) = check_policy_with_table(POLICIES, id, actor, context).This doubles the public surface but cleanly enables short-circuit testing without exposing a private
__internal_marker. The_with_tablevariant is also forward-useful for governance-driven policy replacement (deferred). - Spy mechanism: since check_policy is a closed-over function reference, we test short-circuit via a wrapper that builds a synthetic policy table replacement at the test level:
- F3.1 (revised): Build a synthetic table where P_a admits, P_b rejects, P_c admits. Call
check_all_policies_with_tablewith order[a, b, c]. Assert result is P_b’s rejection. Then construct a separate spying mechanism: replace P_c.predicate_ast with a node that throws on construction-then-evaluation — wait, we can’t. Instead: replace P_c’spredicate_astwith one that, if evaluated, would producePOLICY_TYPE_MISMATCH(e.g. an IntLiteral whose value is 42 — evaluates to bigint, not boolean, → reason'POLICY_TYPE_MISMATCH'). If the test result is P_b’s reason (not'POLICY_TYPE_MISMATCH'), we have proven P_c was not visited. - F3.2: Three-policy table where all admit → final result is
{admitted: true}. - F3.3: Three-policy table where the first policy rejects → result is the first’s reason; never enters the loop’s body for indices 1, 2.
- F3.4: Three-policy table where only the last rejects → result is the last’s reason; loop runs to completion.
- F3.5: Empty table (zero entries, empty order array) →
{admitted: true}.
F4 — applicable_actions filtering (4 cases)
- F4.1: Synthetic policy with
applicable_actions: ['Yield']rejects when called with action_name'Yield'; admits when called with'AcceptCommitment'. - F4.2: Policy with
applicable_actions: ['*']rejects for all actions (the standard stub case shape). - F4.3: Policy with
applicable_actions: []is never invoked — its predicate’s failure mode would not affect the result. - F4.4: Mixed table with one
['*']policy and one['Yield']policy — invoked count for'AcceptCommitment'is 1, for'Yield'is 2.
F5 — Iteration order (3 cases)
- F5.1: POLICY_ORDER literal equals
['P1', 'P2', ..., 'P13'](string-comparison). - F5.2: For a synthetic table whose every policy rejects with a unique numeric reason, calling
check_all_policies_with_table([P10, P2, P1], ...)returns P10’s reason (the first-listed-and-applicable). - F5.3: The default
check_all_policiesproduces'P1_NOT_AUTHORIZED'-equivalent — wait, all stubs admit. Different test: assert that POLICY_ORDER is exported (visibility check) and equals[P1..P13].
F6 — Stub permissive baseline (2 cases)
- F6.1:
check_all_policies('Yield', {}, {})→{admitted: true}— confirms admission flow tests upstream don’t break at boot. - F6.2:
check_all_policies('AcceptCommitment', { reputation: { execution: 0n } }, { epoch: 100n })→{admitted: true}.
F7 — Rejection-reason fidelity (4 cases)
- F7.1: Synthetic policy with
predicate_source: "false",rejection_reason: "TEST_REASON_X"→check_policy_with_table(...)returns{admitted: false, reason: 'TEST_REASON_X'}. - F7.2: Same with
rejection_reason: ""(empty) — implementation note: contract I3 forbids empty reasons in module init for the real POLICIES, but synthetic test tables can have any string. Empty is preserved verbatim in the result. - F7.3: Synthetic table with two failing policies — only the first’s reason surfaces (short-circuit confirmation).
- F7.4: rejection_reason contains special chars (
"REASON: with quotes 'and' newlines\n") — preserved verbatim.
F8 — Type-mismatch translation (3 cases)
- F8.1: predicate
"42"(parses as IntLiteral, evaluates to42n) →{admitted: false, reason: 'POLICY_TYPE_MISMATCH'}. - F8.2: predicate
"\"hello\""(StringLiteral, evaluates to"hello") — wait, top-level expression cannot be a bare string in our DSL guard syntax. Verify via parsing test:parsePredicate('"hello"')succeeds because the parser’seffectArgrule allows StringLiteral, but does the expression rule? Looking at parser.ts:expression → orExpr → andExpr → … → primary, and primary doesn’t have a StringLiteral alternative. SoparsePredicate('"hello"')would fail at parse time. Skip this case — the test for non-bool non-bigint values is unreachable via the DSL surface. Update F8 to be 1 case (just the IntLiteral test). - F8.2 (replacement): a more complex arithmetic predicate
"1 + 2"→ bigint result → POLICY_TYPE_MISMATCH.
F9 — Eval-error translation (5 cases)
- F9.1: predicate
"$undefined.var > 0"— undefined VarRef → engine throwsError('undefined_variable:undefined.var')→{admitted: false, reason: 'POLICY_EVAL_ERROR'}. - F9.2: predicate
"unknown_func(1)"— undefined FuncCall → engine throws →'POLICY_EVAL_ERROR'. - F9.3: predicate
"1 / 0"— DivisionByZeroError →'POLICY_EVAL_ERROR'. - F9.4: predicate that overflows multiplication, e.g.
"9223372036854775807 * 2"(max int64 * 2) — OverflowError →'POLICY_EVAL_ERROR'. - F9.5: predicate that exhausts integer_ops budget — synthesize a deeply nested
"$x and $x and $x and …"chain with > 10000 nodes via test-time policy_ast construction (without going through parser, since parser caps at 10000). Use a hand-built AST.- Easier alternative: craft a synthetic Policy whose
predicate_astis a hand-built deeply-nested LogicalOp tree with> 10000AST nodes; check_policy_with_table catches the RuleBudgetExceeded →'POLICY_EVAL_ERROR'.
- Easier alternative: craft a synthetic Policy whose
F10 — Determinism / no-leak self-scan (3 cases)
- F10.1:
inspectFunctionForbidden(check_policy)returns[]. - F10.2:
inspectFunctionForbidden(check_all_policies)returns[]. - F10.3: Read
policy-gate.tssource viafs.readFileSyncin the test (engine test files do this) and assert it importsevaluateExprfrom./engine.js. This satisfies I15 (reuse, not re-implement).- Pattern check: verify with regex
/from\s+'\.\/engine\.js'/— must match. - Negative check: assert no local re-implementation by scanning for substrings characteristic of an evaluator (
switch (expr.type),case 'BinaryOp', etc.).
- Pattern check: verify with regex
Optional: F11 — Module-init failure mode (1 case)
- F11.1:
parsePredicate("rule X")(intentionally invalid DSL) throws Error with descriptive message. Confirms the throw-at-init contract behavior. Caveat: the function is not exported (per contract §6); accessing it from tests requires either exporting it under__test_only__or restructuring. Decision: exportparsePredicatefor testability. Adds it to public surface in §2 of the contract — minor amendment to the contract, documented inline in policy-gate.ts.
§P6. Implementation order
- Write skeleton with PolicyId enum + Policy interface + PolicyResult union.
- Write
parsePredicatehelper (export it for testing per §P5 F11). - Build POLICIES seed using parsePredicate.
- Write
check_policy_with_table+check_policy. - Write
check_all_policies_with_table+check_all_policies. - Run
npm run build— should compile clean. - Run
npm run lint— fix any TypeScript ESLint complaints. - Write the 10 test fixtures, run incrementally.
- Run
npm test— all green. git commitper Step 4 (single commit with both source + test).
§P7. Test count budget
10 fixtures × ~5 cases avg = ~50 cases. Add ~10 cases for edge-case coverage (overflow, division-by-zero literal, frozen-table assertions). Final budget: ~60 test cases. Test count delta from 1658 → ~1718. Memory note pattern: “+~60 tests” in writeback.
§P8. Verification gate (Step 4 → Step 5 transition)
Step 4 is complete when ALL of:
npm run buildexits 0 with no TypeScript errors.npm run lintexits 0 with no ESLint errors.npm testexits 0 with all 60+ new tests passing AND zero regressions in pre-existing 1658.inspectFunctionForbidden(check_policy)returns[](asserted in test).inspectFunctionForbidden(check_all_policies)returns[].
Step 5 records test evidence in the verification doc. Writeback follows.
§P9. Risk register
| Risk | Likelihood | Mitigation |
|---|---|---|
Parser surprise: parsePredicate('"true"') fails because expressions can’t be bare strings. |
Confirmed — bare strings ARE NOT primary AST entries (per parser.ts §5 — Literals.StringLiteral is an alt only inside effectArg). The synthetic-rule wrapper sidesteps this — "true" is parsed as a guard condition (where it’s a BoolLiteral keyword via Keywords.True). |
Already addressed in §P2 design. |
Engine error sub-typing leaks into POLICY_EVAL_ERROR reason. |
Low — the catch is a bare catch {}, so no error data leaks. |
Verified in §P3 implementation block. |
Iteration order via Object.values(POLICIES) returns insertion order in modern V8 — but enum-string keys may be reordered if the runtime treats them as integer-like ("P1" is non-numeric so safe; "P10" is also non-numeric). |
Low — strict use of explicit POLICY_ORDER array sidesteps this entirely. | Already addressed: never iterate Object.values(POLICIES); always iterate POLICY_ORDER. |
evaluateExpr requires non-empty bindings Map for some paths. |
None — bindings are optional and resolveVarRef checks bindings.has(head) only when path.length === 1; multi-segment paths go through event/state. |
Verified by reading engine.ts §2 resolveVarRef. |
| Rebase pain when state-access.ts merges first. | None — disjoint files; no symbol overlap. | Confirmed. |
Test hand-built AST for budget exhaustion — evaluateExpr cap is 10_000. Building a 10_001-node nested LogicalOp at test time may be slow. |
Low — building once, evaluating once. ~50 ms expected. | Use a loop to build nested LogicalOp{op: 'and', operands: [accumulator, BoolLit_true]}. |
Step 3 of 5 complete. Step 4 (implement) begins now.