P1.2.2 — κ DSL Parser — Audit

Step 1 of the 5-step executor chain (audit → contract → packet → implement → verify). Builds on the P1.2.1 lexer (src/domains/rules/lexer.ts, R83.C 686ede6b). Greenfield parser plus AST surface.

§1. Surface inventory

§1.1. Target files (greenfield for this task)

Path	Exists at base?	Purpose
`src/domains/rules/parser.ts`	No	Chevrotain `EmbeddedActionsParser` + AST node types
`src/__tests__/domains/rules/parser.test.ts`	No	Jest parser tests (see §1.3 layout reconciliation)

§1.2. Touched but not owned

Path	Delta	Purpose
`src/domains/rules/`	already exists with `bps-constants.ts`, `determinism.ts`, `integer-math.ts`, `lexer.ts` (4 files)	Adding `parser.ts` as a peer — no edits to existing files.
`package.json`	none	`chevrotain@11.0.3` already pinned in `dependencies` from P1.2.1 (`E:\AMS\package.json`) — no version change.
`package-lock.json`	none	No new dependencies.

§1.3. Test-file layout reconciliation

The task prompt places the test at src/domains/rules/__tests__/parser.test.ts. The shipped Phase 0 + Phase 1 convention is that tests live under src/__tests__/domains/<name>/, as confirmed by inspection at base SHA 6345ba7a:

src/__tests__/domains/rules/{bps-constants,determinism,integer-math,lexer}.test.ts (P1.1.1, P1.1.2, P1.1.3, P1.2.1)
src/__tests__/domains/{router,skills,tasks,proof,trail}/... (Phase 0 axes)

Jest testMatch in jest.config.ts picks both layouts up. To stay consistent with the in-repo κ tests already shipped (the lexer test was placed under src/__tests__/domains/rules/lexer.test.ts for the same reason — see docs/audits/r81-b-p1-2-1-lexer-audit.md §1.3), the parser test will live at:

src/__tests__/domains/rules/parser.test.ts

This is a convention reconciliation, not a spec deviation. The verification doc will re-cite.

§2. Authoritative grammar sources

The task prompt lists six pre-flight reads. One of them (docs/architecture/decisions/ADR-006-dsl-grammar.md) does not exist — see §3 drift finding. For authoritative grammar this parser relies on:

Source	Path	Weight
Heritage extraction, full EBNF	`docs/reference/extractions/kappa-rule-engine-extraction.md` §1	Authoritative superset (per prompt)
Heritage extraction, AST shape	`docs/reference/extractions/kappa-rule-engine-extraction.md` §2	Authoritative AST node list (11 types)
Concept doc, EBNF fragment	`docs/3-world/physics/laws/rule-engine.md` §DSL grammar	Narrower phrasing — concept uses `guard:` / `effects:` prefix syntax; extraction uses `guards { }` / `effects { }` block syntax. Extraction wins (per prompt).
Concept doc, worked rule	`docs/3-world/physics/laws/rule-engine.md` §Worked rule (`AcceptCommitment`)	Realistic fixture for parser tests. The body uses `guard:` style; the test will translate to `guards {}` block style to match the extraction grammar.
DSL spec	`docs/spec/s12-dsl.md`	Load-bearing, high-level.
Rule engine spec	`docs/spec/s11-rule-engine.md`	Load-bearing, semantic level.
Lexer source	`src/domains/rules/lexer.ts`	The token surface this parser binds to.

§3. Drift finding — ADR-006-dsl-grammar still missing

The task prompt asks the agent to read docs/architecture/decisions/ADR-006-dsl-grammar.md for Chevrotain ratification (it is also referenced from the concept doc at docs/3-world/physics/laws/rule-engine.md line 206). This ADR is not in the repo at base 6345ba7a.

Actual ADR-006 in repo: docs/architecture/decisions/ADR-006-executable-meaning.md — different subject.
Other ADRs present at base: ADR-001..009 (no dsl-grammar slot).
The R81.B audit (docs/audits/r81-b-p1-2-1-lexer-audit.md §3) raised this drift; the lexer was implemented using extraction §1 + s11/s12 as the authoritative grammar triad. Same approach taken here.

Scope of this task: note the drift again, do not write the ADR. The follow-up to ratify Chevrotain/grammar in an ADR remains a docs round candidate.

§4. Lexer / parser interface — what the parser binds to

Inspecting src/domains/rules/lexer.ts at base:

Module exports:
- tokenize(input: string): ILexingResult — never throws.
- allTokens: TokenType[] — the priority-ordered registry.
- Bundles: Keywords, Operators, Delimiters, Literals, RejectedLiterals.
- Re-exports: IToken, TokenType, ILexingResult, ILexingError.
Tokens the parser will reference (29 of 39 — non-error, non-whitespace):
- Keywords (11 of 18 used by the parser; the other 7 are reserved for future κ use and do not appear in the extraction §1 grammar):
  - Used: Rule, Guards, Effects, Else, And, Or, Not, True, False, Admit, Reject.
  - Reserved/unused at this task: When, Then, If, Admission, Transition, Consequence, Promotion. (See §6 — these will be used by the rule classifier in P1.2.4 / P1.3.1.)
- Operators (12): Eq, NotEq, Lte, Gte, Lt, Gt, Plus, Minus, Mul, Div, Mod, Arrow.
- Delimiters (5 of 7): LBrace, RBrace, LParen, RParen, Comma. (Colon, Dot are not used at the rule-level grammar but Dot is internal to Variable regex.)
- Literals (4): Identifier, Variable, IntegerLiteral, StringLiteral.
Lexer caveats relevant to parser correctness:
1. The Identifier token relies on the R83.C custom-pattern-function escape hatch, because regexp-to-ast in Chevrotain 11.0.3 does NOT support the Unicode u flag. The parser must not bypass this: it consumes the IToken[] the lexer already produced, so no regex re-engagement is needed.
2. The lexer rejects float literals and underscore-separated integers via positioned errors (FLOAT_REJECTED_MESSAGE, UNDERSCORE_INT_REJECTED_MESSAGE). The parser sees only well-typed tokens; it does NOT need to re-detect these.
3. The lexer handles whitespace (Lexer.SKIPPED); the parser sees no whitespace tokens.
4. The Variable token's image is the full $dot.path string; the parser strips the leading $ and splits the remainder on . to populate VarRef.path: string[].
5. IntegerLiteral is unsigned; sign is parser-level via Unary rule.
6. Each IToken carries startLine, startColumn, endLine, endColumn, startOffset, endOffset (lexer constructs with positionTracking: 'full'). The parser uses these to set location on AST nodes.
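Caveats 4 and 6 can be sketched together as follows. This is a minimal illustration, not the shipped code: `TokenLike` is an assumed stand-in for the position fields of Chevrotain's `IToken` (the real type may mark some of these fields optional, in which case a guard is needed), and `locationOf` and `varRefFromToken` are hypothetical helper names.

```typescript
// Assumed subset of the token fields the parser reads (stand-in for IToken).
interface TokenLike {
  image: string;
  startLine: number;
  startColumn: number;
  endLine: number;
  endColumn: number;
}

interface Location {
  startLine: number;
  startColumn: number;
  endLine: number;
  endColumn: number;
}

interface VarRef {
  type: 'VarRef';
  location: Location;
  path: string[];
}

// Hypothetical helper: span from the first token of a node to its last token.
// Offsets are dropped because the AST location keeps line/column only.
function locationOf(first: TokenLike, last: TokenLike = first): Location {
  return {
    startLine: first.startLine,
    startColumn: first.startColumn,
    endLine: last.endLine,
    endColumn: last.endColumn,
  };
}

// Hypothetical helper: "$actor.reputation" becomes ['actor', 'reputation'].
function varRefFromToken(tok: TokenLike): VarRef {
  const body = tok.image.startsWith('$') ? tok.image.slice(1) : tok.image;
  return { type: 'VarRef', location: locationOf(tok), path: body.split('.') };
}

const exampleVarRef = varRefFromToken({
  image: '$actor.reputation',
  startLine: 1,
  startColumn: 5,
  endLine: 1,
  endColumn: 21,
});
```

Multi-token nodes such as a binary chain would pass both their first and last token to `locationOf`; single-token nodes use the default.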

§5. AST node taxonomy (per extraction §2)

11 node types. Every node carries {type: string discriminant, location: {startLine, startColumn, endLine, endColumn}} plus type-specific fields. Plain data — no classes with behavior (forbidden per task §FORBIDDENS).

#	Node type	Fields (beyond `type` + `location`)	Notes
1	`RuleNode`	`name: string`, `guards: GuardClause[]`, `effects: EffectCall[]`	Top-level rule declaration.
2	`GuardClause`	`condition: Expression | null` (null = `else`), `action: 'admit' | 'reject'`, `reason: string | null` (only set when `action === 'reject'`)	First-match-wins evaluation.
3	`EffectCall`	`function: string`, `args: Expression[]`	Side-effect invocation; semantics live downstream (P1.3.x).
4	`BinaryOp`	`op: '+' | '-' | '*' | '/' | '%' | '==' | '!=' | '<' | '>' | '<=' | '>='`, `left: Expression`, `right: Expression`	Arithmetic + comparison.
5	`UnaryOp`	`op: '-'`, `operand: Expression`	Numeric negation only — `not` is `LogicalOp`.
6	`LogicalOp`	`op: 'and' | 'or' | 'not'`, `operands: Expression[]` (length 2 for and/or, 1 for not)	Boolean logic.
7	`IntLiteral`	`value: bigint`	Integer constant. bigint to match P1.1.1 `integer-math.ts` and P1.1.3 `bps-constants.ts` invariants — the engine is bigint throughout for κ determinism. (Extraction §2 says `int64`; bigint is the JS-side carrier with the engine enforcing the int64 envelope at evaluation time.)
8	`BoolLiteral`	`value: boolean`	`true` or `false`.
9	`StringLiteral`	`value: string`	Decoded string (escapes resolved); `image` retained on the parent token only.
10	`VarRef`	`path: string[]`	E.g. `$actor.reputation` → `path: ['actor', 'reputation']`.
11	`FuncCall`	`name: string`, `args: Expression[]`	Built-in function invocation; semantics live in P1.3.1 evaluator.

Expression is a union of: BinaryOp | UnaryOp | LogicalOp | IntLiteral | BoolLiteral | StringLiteral | VarRef | FuncCall. (StringLiteral is in the union because the extraction’s Arg = Expression | STRING permits string args; it is not valid as a top-level expression in arithmetic / boolean position. The AST permits the type but the validator (P1.2.3) and evaluator (P1.3.1) reject misplaced strings.)

§6. Rule classification — Admission / StateTransition / Consequence / Promotion

The task prompt asks the parser to “parse 4 rule types: Admission, StateTransition, Consequence, Promotion”. The extraction §1 grammar’s Rule = "rule" IDENTIFIER "{" GuardBlock EffectBlock "}" does not carry type information at the syntax level; classification is a downstream concern. From rule-engine.md §Rule Execution Order, the four kinds are categories used by the registry / executor, not grammatical productions.

Decision for this task: the parser produces RuleNode instances; it does not classify them at parse time. Classification by name convention (e.g. prefix Admit*, State*, etc.) or by an explicit attribute (e.g. a kind keyword) is a P1.2.4 (registry) / P1.3.1 (engine) concern. The PR will document this explicitly so reviewers do not flag a missing classifier.

This is consistent with the lexer reserving Admission, Transition, Consequence, Promotion as keywords — they exist in the token stream for future use but the extraction §1 grammar does not consume them yet. They tokenize today; they bind to grammar productions later.

§7. AST cap (10,000 nodes per rule)

The task prompt requires rejection of any single rule with > 10,000 AST nodes at parse time. Two implementation choices:

Choice A — count during parsing, threading state through Chevrotain’s parser DSL. The task prompt §Common Gotchas explicitly cautions against this (“threading state through Chevrotain’s parser DSL mid-parse is brittle”).

Choice B — count after parsing with a recursive walker. The task prompt §Common Gotchas explicitly recommends this (“Walk the final tree with a simple recursive counter”).

Decision: Choice B — post-parse recursive walker countNodes(node: AnyNode): number, called by the public entry point after parseRuleset returns. Rules exceeding the cap produce a synthetic parse-error entry rather than throwing. The cap is exposed as an exported constant MAX_AST_NODES_PER_RULE = 10000 for tests + future ADR.
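A minimal sketch of the Choice B walker, assuming the node shapes from §5 with `location` omitted for brevity. The `sum` helper and the sample rule are illustrative only; the contract may shape the real `AnyNode` union differently.

```typescript
// Node shapes follow the §5 taxonomy; `location` is left out to keep the sketch short.
type Expression =
  | { type: 'BinaryOp'; op: string; left: Expression; right: Expression }
  | { type: 'UnaryOp'; op: '-'; operand: Expression }
  | { type: 'LogicalOp'; op: 'and' | 'or' | 'not'; operands: Expression[] }
  | { type: 'IntLiteral'; value: bigint }
  | { type: 'BoolLiteral'; value: boolean }
  | { type: 'StringLiteral'; value: string }
  | { type: 'VarRef'; path: string[] }
  | { type: 'FuncCall'; name: string; args: Expression[] };

interface GuardClause {
  type: 'GuardClause';
  condition: Expression | null;
  action: 'admit' | 'reject';
  reason: string | null;
}
interface EffectCall { type: 'EffectCall'; function: string; args: Expression[] }
interface RuleNode { type: 'RuleNode'; name: string; guards: GuardClause[]; effects: EffectCall[] }
type AnyNode = RuleNode | GuardClause | EffectCall | Expression;

const MAX_AST_NODES_PER_RULE = 10000;

// Hypothetical helper: total node count over a list of children.
function sum(nodes: readonly AnyNode[]): number {
  let total = 0;
  for (const child of nodes) total += countNodes(child);
  return total;
}

// Each node counts as 1 plus the count of its children.
function countNodes(node: AnyNode): number {
  switch (node.type) {
    case 'RuleNode':
      return 1 + sum(node.guards) + sum(node.effects);
    case 'GuardClause':
      return 1 + (node.condition === null ? 0 : countNodes(node.condition));
    case 'EffectCall':
    case 'FuncCall':
      return 1 + sum(node.args);
    case 'BinaryOp':
      return 1 + countNodes(node.left) + countNodes(node.right);
    case 'UnaryOp':
      return 1 + countNodes(node.operand);
    case 'LogicalOp':
      return 1 + sum(node.operands);
    case 'IntLiteral':
    case 'BoolLiteral':
    case 'StringLiteral':
    case 'VarRef':
      return 1;
  }
}

// Sample: one conditional guard ($a > 1), one else guard, one effect f($a).
const sampleRule: RuleNode = {
  type: 'RuleNode',
  name: 'Sample',
  guards: [
    {
      type: 'GuardClause',
      condition: {
        type: 'BinaryOp',
        op: '>',
        left: { type: 'VarRef', path: ['a'] },
        right: { type: 'IntLiteral', value: BigInt(1) },
      },
      action: 'admit',
      reason: null,
    },
    { type: 'GuardClause', condition: null, action: 'reject', reason: 'too low' },
  ],
  effects: [{ type: 'EffectCall', function: 'f', args: [{ type: 'VarRef', path: ['a'] }] }],
};

const sampleCount = countNodes(sampleRule);
const overCap = sampleCount > MAX_AST_NODES_PER_RULE;
```

The sample counts 8 nodes: the rule, two guards, the comparison with its two operands, and the effect with its one argument. Because the walker recurses, a very deeply nested tree could exhaust the call stack before the count is compared with the cap; an explicit-stack variant is one way to avoid that if it proves to be a problem.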

§8. Error recovery (5-error cap)

Chevrotain's recoveryEnabled: true is the documented switch for continuing past a parse error instead of stopping at the first one. The task prompt requires that the first 5 errors are reported and that malformed input does not crash the parser. Chevrotain accumulates every encountered error in the parser instance's errors array; the parse() entry point truncates that list to the first 5.

Decision: the parse() entry point returns { ast: RuleNode[], errors: ParseError[] } where errors is the union of:

Lexer errors (passed through from tokenize).
Chevrotain parse errors (truncated to first 5).
AST-cap errors (one per offending rule).

If errors is non-empty, ast may be partial (rules that parsed cleanly still appear; rules that failed contribute nothing). This matches the spec’s “doesn’t crash on malformed input” requirement.
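The assembly of the errors list can be sketched as below. This assumes the `ParseError` shape from §9 and that only the Chevrotain parse errors are subject to the 5-error cap, as the list above states; `collectErrors` is a hypothetical helper name.

```typescript
interface Location {
  startLine: number;
  startColumn: number;
  endLine: number;
  endColumn: number;
}

interface ParseError {
  kind: 'lex' | 'parse' | 'ast-cap';
  message: string;
  location: Location | null;
}

const MAX_PARSE_ERRORS = 5;

// Hypothetical helper: lexer errors pass through, parse errors are truncated,
// and AST-cap errors (one per offending rule) are appended last.
function collectErrors(
  lexErrors: readonly ParseError[],
  parseErrors: readonly ParseError[],
  capErrors: readonly ParseError[],
): ParseError[] {
  const out: ParseError[] = [];
  return out.concat(lexErrors, parseErrors.slice(0, MAX_PARSE_ERRORS), capErrors);
}

// Illustrative input: 2 lexer errors, 7 parse errors, 1 cap error.
const mk = (kind: ParseError['kind'], n: number): ParseError[] => {
  const list: ParseError[] = [];
  for (let i = 0; i < n; i++) list.push({ kind, message: kind + ' #' + i, location: null });
  return list;
};

const merged = collectErrors(mk('lex', 2), mk('parse', 7), mk('ast-cap', 1));
const parseKept = merged.filter((e) => e.kind === 'parse').length;
```

With this ordering a reader sees lexical problems first, then syntactic ones, then size violations. Whether the contract keeps that order or sorts by position is left open here.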

§9. Public API surface (committed by §contract)

The parser module exports — provisional, locked in docs/contracts/p1-2-2-parser-contract.md:

// AST union types — discriminated by `type`
export type Expression =
  | BinaryOp | UnaryOp | LogicalOp
  | IntLiteral | BoolLiteral | StringLiteral
  | VarRef | FuncCall;

export interface Location { startLine: number; startColumn: number; endLine: number; endColumn: number; }
export interface RuleNode { type: 'RuleNode'; location: Location; name: string; guards: GuardClause[]; effects: EffectCall[]; }
// ... 10 more interfaces (one per AST node)

export interface ParseError {
  kind: 'lex' | 'parse' | 'ast-cap';
  message: string;
  location: Location | null;     // null only for non-positioned errors
}

export interface ParseResult {
  ast: RuleNode[];
  errors: ParseError[];
}

export const MAX_AST_NODES_PER_RULE: number;
export const MAX_PARSE_ERRORS: number;            // 5

export function parse(input: string): ParseResult;

No classes. Interfaces only. Plain data. Pure function.

§10. Non-goals

This task explicitly excludes:

AST validator — semantic checks (forbidden ops in expressions, type coherence, function arity). That is P1.2.3.
Rule registry / loader — keying rules by name, looking up by registry id. That is P1.2.4.
Evaluator / interpreter — executing the AST against a context. That is P1.3.1.
Canonical serialization — pretty-printing AST back to DSL text. That is P1.5.4. The round-trip test (Fixture F5) leaves a TODO(P1.5.4) comment where the canonical-serialize call would go; the test asserts parse(s) is structurally stable when re-parsed (i.e. parse twice and assert equal — a weaker but locally testable property).
Rule classification by kind (Admission / StateTransition / Consequence / Promotion) — see §6.
A new ADR — see §3.
Mutating any existing file outside src/domains/rules/parser.ts, src/__tests__/domains/rules/parser.test.ts, and the four docs (audit, contract, packet, verification).
Performance SLOs — none gated; informational only.
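For the weaker round-trip property described in the canonical serialization item above, one practical detail is that `IntLiteral.value` is a bigint and `JSON.stringify` throws a `TypeError` on bigint values. The sketch below shows one way to compare two parse results structurally; `structuralKey` is a hypothetical helper, and the test may equally rely on the test framework's own deep-equality matcher.

```typescript
// Serialize a plain-data AST to a comparable string, tagging bigint values
// so that JSON.stringify does not throw on them.
function structuralKey(value: unknown): string {
  return JSON.stringify(value, (_key, v) =>
    typeof v === 'bigint' ? v.toString() + 'n' : v,
  );
}

// Two independently built nodes with the same content, and one that differs.
const firstParse = { type: 'IntLiteral', value: BigInt('42') };
const secondParse = { type: 'IntLiteral', value: BigInt('42') };
const differentParse = { type: 'IntLiteral', value: BigInt('43') };

const stable = structuralKey(firstParse) === structuralKey(secondParse);
const distinguishes = structuralKey(firstParse) !== structuralKey(differentParse);
```

The comparison depends on object keys being emitted in the same order on both sides, which holds when both trees come from the same parser code path.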

§11. Risk register

Risk	Mitigation
Chevrotain LL(k) left-recursion gotcha	EBNF in extraction §1 already iterative (`{ ... }` repetition for binary chains); maps to `MANY` rules in Chevrotain. No left recursion in design.
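`EmbeddedActionsParser` vs `CstParser` choice	Pick `EmbeddedActionsParser` per task spec. Embedded actions also run during Chevrotain's grammar-recording pass inside `performSelfAnalysis()`, so `RULE` bodies must tolerate placeholder tokens (wrap value-dependent or side-effecting code in `this.ACTION`). Keep `recoveryEnabled: true` and have each `RULE` return its AST node directly.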
Operator precedence collapse	Stratified grammar productions (`OrExpr → AndExpr → NotExpr → Comparison → Additive → Multiplicative → Unary → Primary`) per extraction §1; no precedence-table hack.
AST cap counting brittleness	Post-parse walker (Choice B in §7).
BigInt overflow in `IntLiteral` parsing	`BigInt(text)` is arbitrary-precision, so the conversion itself never overflows. Literals outside the int64 range (for example above `MAX_INT64`) are stored as-is by the parser; the range check is deferred to the P1.2.3 validator.
Round-trip property without P1.5.4 canonicalize	Fixture F5 uses `parse(s)` twice and asserts structural equality (a weaker invariant). Comment marks the upgrade target as P1.5.4.
Cross-worktree leak (memory mentions persistent issue at Wave C)	Strict scope discipline: only the files this task owns (listed in §10) are edited. `git status` checked at every commit.
Lexer keywords `Admission`/`Transition`/`Consequence`/`Promotion` unused	Documented as reserved per §6. Tests exercise some of them via prefix-of-identifier (`admissionRule` is a valid Identifier — Chevrotain’s `longer_alt`).
`noUncheckedIndexedAccess` in tsconfig	Care needed when accessing `tokens[i]!` — every parser-internal access is `[i]!` or guarded; AST-walker must check children for `undefined` before recursing.
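The bigint row of the register can be illustrated with a small sketch of the deferred range check. The names `INT64_MIN`, `INT64_MAX`, `parseIntLiteral`, and `fitsInt64` are hypothetical; the actual check belongs to the P1.2.3 validator and may look different.

```typescript
// int64 envelope, built from strings to avoid any precision loss.
const INT64_MAX = BigInt('9223372036854775807');
const INT64_MIN = BigInt('-9223372036854775808');

// Parser side: convert the unsigned IntegerLiteral image as-is.
// BigInt is arbitrary-precision, so this does not overflow.
function parseIntLiteral(image: string): bigint {
  return BigInt(image);
}

// Validator side (deferred): does the value fit the int64 envelope?
function fitsInt64(value: bigint): boolean {
  return value >= INT64_MIN && value <= INT64_MAX;
}

const atMax = parseIntLiteral('9223372036854775807');
const pastMax = parseIntLiteral('9223372036854775808');

const atMaxFits = fitsInt64(atMax);         // within the envelope
const pastMaxFits = fitsInt64(pastMax);     // one past the maximum
const negatedFits = fitsInt64(-pastMax);    // equals INT64_MIN after negation
```

Because IntegerLiteral is unsigned and the sign comes from `UnaryOp`, the literal `9223372036854775808` is out of range on its own but in range under a negation. A per-literal check at parse time would reject that case, which is one argument for leaving the check to a later pass that sees the whole expression.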

§12. Estimated implementation

Step	Lines (rough)
`parser.ts` JSDoc + types	~150
`parser.ts` Chevrotain parser class	~200
`parser.ts` AST cap walker + helpers	~50
`parser.ts` `parse()` entry point	~50
`parser.ts` total	~450
`parser.test.ts` AST assertion helpers	~80
`parser.test.ts` 5 fixture groups (F1–F5) + boundary cases	~400
`parser.test.ts` total	~480

Test count target: 35–50 cases (slightly larger than lexer’s 22 because the AST surface is wider).

§13. Pre-flight verification

✅ Worktree created at .worktrees/claude/p1-2-2-parser off origin/main 6345ba7aec8d2507337fa5161928c13d4a3b4d3e.
✅ Branch feature/p1-2-2-parser set up to track origin/main.
✅ chevrotain@11.0.3 already in dependencies (P1.2.1 lockfile inherited).
✅ Lexer module readable; surface mapped (§4).
✅ EBNF read and codified (§5–6).
✅ ADR-006-dsl-grammar drift re-noted (§3).

Next step: contract (Step 2 of 5).