P1.2.2 — κ DSL Parser — Execution Packet
Step 3 of the 5-step executor chain. Builds on audit + contract. Packet gates implementation: once it lands, Step 4 is a mechanical translation of this plan.
§P1. Plan at a glance
Deliver, in a single implementation commit (Step 4):
src/domains/rules/parser.ts— one TypeScript file, ~450–500 lines including JSDoc, AST interfaces, Chevrotain parser class, AST-cap walker, and theparse()entry point.src/__tests__/domains/rules/parser.test.ts— one Jest file, ~480–520 lines. Target 35–50 cases across 7 fixture groups.
No new dependencies (chevrotain@11.0.3 already pinned by P1.2.1). No edits to any file outside the two listed above. No barrel index.ts for src/domains/rules/.
§P2. File layout (final)
src/domains/rules/
parser.ts # NEW — only file in this task
src/__tests__/domains/rules/
parser.test.ts # NEW — colocated per repo convention
Existing files in src/domains/rules/ are untouched: bps-constants.ts, determinism.ts, integer-math.ts, lexer.ts.
§P3. parser.ts — structural sketch
The source file is organised into the following sections, in order:
§P3.1. Header JSDoc (~40 lines)
- File role + Greek letter κ + Phase 1 Wave 3.
- Links to audit, contract, packet, the concept doc, the extraction §1+§2, and the s11/s12 specs.
- Note on the still-missing ADR-006-dsl-grammar (one-line pointer to audit §3).
- “Pure module” declaration — no I/O, no side effects, never throws.
- Cite R83.C lexer custom-pattern caveat (no parser regression).
§P3.2. Imports (~3 lines)
import { EmbeddedActionsParser, EOF } from 'chevrotain';
import type { IToken, ILexingResult, IRecognitionException } from 'chevrotain';
import { tokenize, Keywords, Operators, Delimiters, Literals } from './lexer.js';
§P3.3. Constants (~6 lines)
export const MAX_AST_NODES_PER_RULE = 10_000;
export const MAX_PARSE_ERRORS = 5;
§P3.4. Location helper (~15 lines)
A pure helper to build a Location from start + end tokens. Hardens against tokens with missing endLine/endColumn (Chevrotain guarantees them with positionTracking: 'full', but the lexer’s tokenize wraps the result, so we still null-coalesce defensively for the noUncheckedIndexedAccess strict-mode discipline).
function locOf(start: IToken, end: IToken = start): Location {
return {
startLine: start.startLine ?? 1,
startColumn: start.startColumn ?? 1,
endLine: end.endLine ?? end.startLine ?? 1,
endColumn: end.endColumn ?? end.startColumn ?? 1,
};
}
§P3.5. AST node interfaces (~110 lines)
11 interfaces + 2 union types + 2 result types, exactly as specified in docs/contracts/p1-2-2-parser-contract.md §2.1–§2.2. No methods. Plain data.
§P3.6. String escape decoder (~25 lines)
The lexer’s StringLiteral retains escapes verbatim in image (e.g. "a\\nb" has image === '"a\\nb"', length 7). The parser decodes:
function decodeString(image: string): string {
// image includes outer quotes; strip them.
const inner = image.slice(1, -1);
let out = '';
let i = 0;
while (i < inner.length) {
const ch = inner[i]!;
if (ch === '\\' && i + 1 < inner.length) {
const next = inner[i + 1]!;
switch (next) {
case 'n': out += '\n'; break;
case 't': out += '\t'; break;
case 'r': out += '\r'; break;
case '\\': out += '\\'; break;
case '"': out += '"'; break;
default: out += next; // fallback — preserve as-is
}
i += 2;
} else {
out += ch;
i += 1;
}
}
return out;
}
§P3.7. The KappaParser class (~250 lines)
A single Chevrotain EmbeddedActionsParser subclass. Each public/private method maps 1:1 to an EBNF production. this.RULE(...) returns the AST node directly (or undefined during error recovery — handled at parent level).
Class skeleton (RULE bodies elided here; see §P3.7.1 for the precedence spine):
class KappaParser extends EmbeddedActionsParser {
constructor() {
super(allTokensRefs, {
recoveryEnabled: true,
maxLookahead: 3, // sufficient for the grammar
});
this.performSelfAnalysis();
}
// top-level
public ruleset = this.RULE('ruleset', (): RuleNode[] => { ... });
// structural
private rule = this.RULE('rule', (): RuleNode | undefined => { ... });
private guardBlock = this.RULE('guardBlock', (): GuardClause[] => { ... });
private guardClause = this.RULE('guardClause', (): GuardClause | undefined => { ... });
private action = this.RULE('action', (): { kind: 'admit' | 'reject'; reason: string | null } => { ... });
private effectBlock = this.RULE('effectBlock', (): EffectCall[] => { ... });
private effectCall = this.RULE('effectCall', (): EffectCall | undefined => { ... });
// expression spine (precedence stratified)
private expression = this.RULE('expression', (): Expression | undefined => this.SUBRULE(this.orExpr));
private orExpr = this.RULE('orExpr', (): Expression | undefined => { ... });
private andExpr = this.RULE('andExpr', (): Expression | undefined => { ... });
private notExpr = this.RULE('notExpr', (): Expression | undefined => { ... });
private comparison = this.RULE('comparison', (): Expression | undefined => { ... });
private additive = this.RULE('additive', (): Expression | undefined => { ... });
private multiplicative = this.RULE('multiplicative', (): Expression | undefined => { ... });
private unary = this.RULE('unary', (): Expression | undefined => { ... });
private primary = this.RULE('primary', (): Expression | undefined => { ... });
// leaves
private variable = this.RULE('variable', (): VarRef | undefined => { ... });
private funcCall = this.RULE('funcCall', (): FuncCall | undefined => { ... });
private intLiteral = this.RULE('intLiteral', (): IntLiteral => { ... });
private boolLiteral = this.RULE('boolLiteral', (): BoolLiteral => { ... });
private stringArg = this.RULE('stringArg', (): StringLiteral => { ... });
}
Rationale for the public/private split: only ruleset is invoked from outside the class (by the parse entry point). Every other method is an internal sub-rule. Marking them private keeps the surface tight.
§P3.7.1. Precedence spine — the load-bearing detail
Each level builds a left-associative chain by iteration (MANY):
private orExpr = this.RULE('orExpr', () => {
let left = this.SUBRULE(this.andExpr) as Expression;
this.MANY(() => {
const op = this.CONSUME(Keywords.Or);
const right = this.SUBRULE2(this.andExpr) as Expression;
left = {
type: 'LogicalOp',
op: 'or',
operands: [left, right],
location: spanLoc(left.location, right.location),
} as LogicalOp;
});
return left;
});
private andExpr = this.RULE('andExpr', () => { /* same shape with Keywords.And */ });
private notExpr = this.RULE('notExpr', () => {
let notTok: IToken | undefined;
this.OPTION(() => { notTok = this.CONSUME(Keywords.Not); });
const inner = this.SUBRULE(this.comparison) as Expression;
if (notTok) {
return {
type: 'LogicalOp', op: 'not', operands: [inner],
location: spanLoc(locOf(notTok), inner.location),
} as LogicalOp;
}
return inner;
});
private comparison = this.RULE('comparison', () => {
const left = this.SUBRULE(this.additive) as Expression;
let op: BinaryOp['op'] | null = null;
let opTok: IToken | null = null;
this.OPTION(() => {
opTok = this.OR([
{ ALT: () => { op = '=='; return this.CONSUME(Operators.Eq); } },
{ ALT: () => { op = '!='; return this.CONSUME(Operators.NotEq); } },
{ ALT: () => { op = '<='; return this.CONSUME(Operators.Lte); } },
{ ALT: () => { op = '>='; return this.CONSUME(Operators.Gte); } },
{ ALT: () => { op = '<'; return this.CONSUME(Operators.Lt); } },
{ ALT: () => { op = '>'; return this.CONSUME(Operators.Gt); } },
]);
});
if (op === null) return left;
const right = this.SUBRULE(this.additive) as Expression;
return {
type: 'BinaryOp', op, left, right,
location: spanLoc(left.location, right.location),
} as BinaryOp;
});
// additive, multiplicative — same iterative shape, op set restricted accordingly.
private unary = this.RULE('unary', () => {
let minusTok: IToken | undefined;
this.OPTION(() => { minusTok = this.CONSUME(Operators.Minus); });
const inner = this.SUBRULE(this.primary) as Expression;
if (minusTok) {
return {
type: 'UnaryOp', op: '-', operand: inner,
location: spanLoc(locOf(minusTok), inner.location),
} as UnaryOp;
}
return inner;
});
§P3.7.2. Primary
Five alternatives. Variable / FuncCall disambiguated by LA(1) (Variable starts with Variable token; FuncCall starts with Identifier and the parser commits when the next token is LParen).
private primary = this.RULE('primary', () => {
return this.OR([
{ ALT: () => this.SUBRULE(this.intLiteral) as Expression },
{ ALT: () => this.SUBRULE(this.boolLiteral) as Expression },
{ ALT: () => this.SUBRULE(this.variable) as Expression },
{
// funcCall — gated on LA(2) === LParen so a bare identifier (which is invalid at this position) parses as funcCall and surfaces a clean error
ALT: () => this.SUBRULE(this.funcCall) as Expression,
},
{
// parenthesised expression
ALT: () => {
this.CONSUME(Delimiters.LParen);
const inner = this.SUBRULE(this.expression) as Expression;
this.CONSUME(Delimiters.RParen);
return inner;
},
},
]);
});
If multiple ALTs start with the same token type, Chevrotain raises a self-analysis error at construction time. Variable / FuncCall do not collide (different start tokens). IntLiteral / BoolLiteral / Variable / FuncCall / ( all have distinct start tokens (IntegerLiteral, True/False, Variable, Identifier, LParen). No GATE needed.
§P3.7.3. funcCall
private funcCall = this.RULE('funcCall', (): FuncCall | undefined => {
const nameTok = this.CONSUME(Literals.Identifier);
this.CONSUME(Delimiters.LParen);
const args: Expression[] = [];
this.OPTION(() => {
const first = this.SUBRULE(this.expression);
if (first) args.push(first);
this.MANY(() => {
this.CONSUME(Delimiters.Comma);
const next = this.SUBRULE2(this.expression);
if (next) args.push(next);
});
});
const rparen = this.CONSUME(Delimiters.RParen);
return {
type: 'FuncCall',
name: nameTok.image,
args,
location: locOf(nameTok, rparen),
};
});
Note the difference from effectCall: effectCall accepts STRING as a top-level argument (per extraction §1 Arg = Expression | STRING); funcCall inside an expression does not. This separation keeps the AST clean — string literals only appear via EffectCall.args and GuardClause.reason.
§P3.7.4. ruleset top-level
public ruleset = this.RULE('ruleset', (): RuleNode[] => {
const rules: RuleNode[] = [];
this.MANY(() => {
const r = this.SUBRULE(this.rule);
if (r) rules.push(r);
});
return rules;
});
§P3.8. AST node counter (~25 lines)
Recursive walker. Pure function over the AST union.
function countNodes(node: AnyNode): number {
let n = 1;
switch (node.type) {
case 'RuleNode':
for (const g of node.guards) n += countNodes(g);
for (const e of node.effects) n += countNodes(e);
break;
case 'GuardClause':
if (node.condition) n += countNodes(node.condition);
break;
case 'EffectCall':
case 'FuncCall':
for (const a of node.args) n += countNodes(a);
break;
case 'BinaryOp':
n += countNodes(node.left) + countNodes(node.right);
break;
case 'UnaryOp':
n += countNodes(node.operand);
break;
case 'LogicalOp':
for (const o of node.operands) n += countNodes(o);
break;
case 'IntLiteral':
case 'BoolLiteral':
case 'StringLiteral':
case 'VarRef':
// leaves
break;
}
return n;
}
§P3.9. The parse entry point (~50 lines)
const parserInstance = new KappaParser();
export function parse(input: string): ParseResult {
const errors: ParseError[] = [];
// 1. Tokenize
const lex: ILexingResult = tokenize(input);
for (const e of lex.errors) {
errors.push({
kind: 'lex',
message: e.message,
location: {
startLine: e.line ?? 1,
startColumn: e.column ?? 1,
endLine: e.line ?? 1,
endColumn: (e.column ?? 1) + Math.max(0, e.length - 1),
},
});
}
// 2. Parse
parserInstance.input = lex.tokens;
const cstOrAst = parserInstance.ruleset();
const rules: RuleNode[] = (cstOrAst ?? []) as RuleNode[];
// 3. Convert parser errors (truncate to first MAX_PARSE_ERRORS)
let parseErrCount = 0;
for (const e of parserInstance.errors) {
if (parseErrCount >= MAX_PARSE_ERRORS) break;
parseErrCount += 1;
errors.push({
kind: 'parse',
message: e.message,
location: e.token && e.token.tokenType !== EOF
? locOf(e.token)
: null,
});
}
// 4. AST-cap pass
const accepted: RuleNode[] = [];
for (const rule of rules) {
const count = countNodes(rule);
if (count > MAX_AST_NODES_PER_RULE) {
errors.push({
kind: 'ast-cap',
message: `Rule '${rule.name}' exceeds maximum AST node count (${count} > ${MAX_AST_NODES_PER_RULE})`,
location: rule.location,
});
} else {
accepted.push(rule);
}
}
return { ast: accepted, errors };
}
The parser instance is module-level (new KappaParser() runs once on first import — the lexer follows the same pattern). Reuse is safe because parserInstance.input = ... resets all internal Chevrotain state.
§P4. Test matrix (parser.test.ts)
Seven fixture groups (F1–F7) plus boundary cases. Each group’s tests are numbered for traceability.
F1 — AcceptCommitment golden fixture (~5 cases)
Translate the AcceptCommitment rule from docs/3-world/physics/laws/rule-engine.md §Worked rule into the extraction §1 block syntax. The rule:
rule AcceptCommitment {
guards {
event.type == "COMMITMENT_REQUEST"
and event.status == "PENDING"
and stake_available($event.actor) >= $event.amount
and reputation_score($event.actor, "commissioning") >= 100
-> admit
else -> reject "no_match"
}
effects {
state_transition($event.id, "PENDING", "ACCEPTED")
stake_freeze($event.actor, $event.amount)
obligation_assign($event.actor, $event.id, $event.deadline)
}
}
Note: the concept-doc rule uses event.type, stake.available(), stake.freeze() — dot-prefixed identifiers. The κ DSL grammar (extraction §1) uses $variable.path for property access and bare identifiers for function calls. The fixture above translates to that style. Documenting this translation in the test docstring.
- F1.1: parse the rule, assert
errors === []andast.length === 1. - F1.2: assert
ast[0].name === 'AcceptCommitment'. - F1.3: assert
ast[0].guards.length === 2(one main guard + oneelse). - F1.4: assert
ast[0].guards[1].condition === null(theelseclause). - F1.5: assert
ast[0].effects.length === 3.
F2 — Operator precedence (~10 cases)
Per task spec: $a + $b * $c == 10 and not $d → And(Eq(Add($a, Mul($b, $c)), 10), Not(Var(d))).
- F2.1: precedence fixture from spec — assert top-level is
LogicalOp{op:'and'}with two operands. - F2.2: assert left operand of and is
BinaryOp{op:'==', left: BinaryOp{op:'+', left: VarRef[a], right: BinaryOp{op:'*', left: VarRef[b], right: VarRef[c]}}, right: IntLiteral 10n}. - F2.3: assert right operand of and is
LogicalOp{op:'not', operands: [VarRef[d]]}. - F2.4:
1 + 2 + 3is left-associative →Add(Add(1,2),3). - F2.5:
1 + 2 * 3→Add(1, Mul(2,3)). - F2.6:
(1 + 2) * 3→Mul(Add(1,2), 3). - F2.7:
not $a or not $b→LogicalOp{op:'or', operands: [Not($a), Not($b)]}. - F2.8:
$a < $b == trueis a parse error (chained comparison). - F2.9: unary
-5→UnaryOp{op:'-', operand: IntLiteral 5n}. - F2.10:
not not $a→LogicalOp{op:'not', operands: [LogicalOp{op:'not', operands: [VarRef[a]]}]}.
F3 — Malformed input recovery (~7 cases)
- F3.1:
rule X { guards { -> admit } effects { } }— guard clause missing expression. Expect ≥ 1 parse error, no throw. - F3.2:
rule { guards {} effects {} }— missing rule name. ≥ 1 parse error. - F3.3:
rule X { guards {} effects {} extra }— trailing token. ≥ 1 parse error. - F3.4:
(((((((— unbalanced parens (not a rule, but exercises lexer + parser errors). - F3.5: many-error input — synthesize >5 parse errors, assert
errors.filter(e => e.kind === 'parse').length === MAX_PARSE_ERRORS(= 5). - F3.6: lex error (
3.14) inside otherwise-valid rule — assert onelexerror inerrorsand the rule still parses minus the bad token. - F3.7: empty input →
{ ast: [], errors: [] }.
F4 — AST cap (~3 cases)
- F4.1: a rule with 9999 guard nodes parses cleanly (just under the cap).
- F4.2: a rule with > 10000 nodes (synthesized via deeply-nested
((...))expression or many guard clauses) is omitted fromastand surfaces anast-caperror. - F4.3:
MAX_AST_NODES_PER_RULE === 10000constant exposed correctly.
Synthesizer: generate a parens-nested expression like
((( ... 1 + 1 ... )))to control node count exactly. Each+inside parens contributes 1 BinaryOp + 2 IntLiteral = 3 nodes; nesting wraps without adding nodes. Synthesizer documented in test assynthDeepRule(n).
F5 — Round-trip stability (~3 cases)
Per audit §10 / contract §I17 — P1.5.4 canonicalize is unwritten. Use parse(s) twice and assert deep equality.
- F5.1: parse the AcceptCommitment fixture twice; assert
JSON.stringifyofastis identical (after BigInt serialization helper). - F5.2: parse a precedence fixture twice; same.
- F5.3: explicit
// TODO(P1.5.4): replace with parse(canonicalize(parse(s))).ast === parse(s).astcomment in the test — left in code for the next agent.
F6 — Lexer-error pass-through (~3 cases)
- F6.1: input
3.14→ onelexerror (float rejected), no parse errors, empty ast. - F6.2: input
1_000→ onelexerror (underscore int), no parse errors, empty ast. - F6.3: input
@→ onelexerror (unrecognised char), no parse errors, empty ast.
F7 — Boundary / edge cases (~7 cases)
- F7.1:
IntLiteral.valueisbigint—typeof === 'bigint'. - F7.2: variable with deep dot path:
$a.b.c.d.e→VarRef{path: ['a','b','c','d','e']}. - F7.3:
else -> admit(no reason) →GuardClause{condition: null, action: 'admit', reason: null}. - F7.4:
else -> reject "blocked"→GuardClause{condition: null, action: 'reject', reason: 'blocked'}. - F7.5: multiple rules in one input →
ast.length === 2and names match. - F7.6: rule with empty
guards {}and emptyeffects {}blocks parses cleanly (validator’s job to flag, not parser’s). - F7.7: rule named
admissionRule(keyword prefix) —RuleNode.name === 'admissionRule'.
F8 — Pure-function / referential transparency (~2 cases)
- F8.1:
parse(s)called twice with equals→ deep-equal results. - F8.2:
parse('valid')does not mutate any module-level state observable from outside (lightweight check — second call returns equivalent ast even after a separateparse(invalid)call interleaved).
Total: ~40 cases.
§P5. Implementation order (Step 4)
- Header JSDoc + imports + constants — ~5 minutes.
- Location helper + escape decoder — ~10 minutes.
- AST node interfaces + union types +
ParseResult— ~15 minutes. KappaParserskeleton (constructor,performSelfAnalysis) — ~5 minutes.- Precedence spine —
expressiondown throughprimary— ~30 minutes. - Leaves —
intLiteral,boolLiteral,variable,funcCall,stringArg— ~20 minutes. - Structural —
rule,guardBlock,guardClause,action,effectBlock,effectCall,ruleset— ~30 minutes. - AST node counter — ~10 minutes.
parse()entry point — ~15 minutes.parser.test.ts— fixtures F1–F8 — ~90 minutes.- Run
npm run build && npm run lint && npm test; iterate until green — ~30 minutes.
Estimated total wall time: 4 hours.
§P6. Step-4 commit message (template)
feat(p1-2-2-parser): chevrotain parser + 11-node AST
Step 4 of 5 — implements P1.2.2 per the contract. Adds:
- src/domains/rules/parser.ts — Chevrotain EmbeddedActionsParser, 11
AST node types (RuleNode, GuardClause, EffectCall, BinaryOp, UnaryOp,
LogicalOp, IntLiteral, BoolLiteral, StringLiteral, VarRef, FuncCall),
parse() entry point, AST-cap walker.
- src/__tests__/domains/rules/parser.test.ts — ~40 Jest cases across 8
fixture groups (AcceptCommitment golden, precedence, malformed input
recovery, AST cap, round-trip stability, lexer error pass-through,
boundary cases, referential transparency).
Grammar matches kappa-rule-engine-extraction.md §1 EBNF (the
authoritative superset). Operator precedence stratified across
orExpr/andExpr/notExpr/comparison/additive/multiplicative/unary/primary
productions. recoveryEnabled: true; first 5 parse errors reported.
AST cap = 10000 nodes per rule, enforced at parse time via post-parse
recursive walker. parse() never throws; all errors flow through
ParseResult.errors with kind: 'lex' | 'parse' | 'ast-cap'.
Unblocks P1.2.3 (validator), P1.2.4 (registry), P1.3.1 (evaluator),
P1.5.4 (canonical serializer).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
§P7. Test gate (CLAUDE.md §5 — ALL THREE green)
cd .worktrees/claude/p1-2-2-parser
npm run build && npm run lint && npm test
Pre-existing test count on main: ~1409. Expected post-merge: ~1449 (+40 parser cases). Zero existing-test regressions.
Known-flake from memory (startup subprocess smoke) — rerun once if it fails; document if it fails twice.
§P8. Step-5 (verify) plan
docs/verification/p1-2-2-parser-verification.md will record:
- The exact build/lint/test command output (test counts, suite names).
- The 11 AST types verified at runtime via the type discriminant matrix.
- The precedence fixture (F2.1) AST in pretty-print form for visual inspection.
- The AST-cap test result with the synthesized rule node count.
- The drift items handled (ADR-006-dsl-grammar still missing — re-noted).
- PR / branch / commit metadata for the writeback record.
§P9. Risks (from audit §11) — mitigations baked into this packet
- EmbeddedActions self-analysis warnings: the Chevrotain parser’s
performSelfAnalysis()is called explicitly in the constructor; any ambiguity surfaces at module load time, not at parse time. Test F8.1 (referential transparency) doubles as a smoke check. noUncheckedIndexedAccessstrictness: everytokens[i]access in this packet uses[i]!after an explicit length check orOPTIONguard. The AST-walker usesfor…ofover arrays, never indexed access.- BigInt overflow at parse time:
BigInt(text)accepts arbitrary digit sequences and produces abigint; overflow vs.MAX_INT64is a downstream (P1.2.3 / P1.3.1) concern. Documented in test F7.1 narrative. - Variable token decoding: the lexer’s
Variablematches\$[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*. The parser splits on.and drops the leading$. Test F7.2 covers a 5-segment path.
§P10. Sign-off
Audit + contract reviewed; this packet is internally consistent. Step 4 (implement) may begin once this packet is committed.