R83.C / P1.2.1 — Lexer retry — Verification

Closes R81.B’s deferred κ Rule Engine lexer task. Gates Wave 3 of R83 (P1.2.2 Parser, P1.2.3 AST Validator, P1.2.4 Registry, P1.3.1 Core Evaluation Loop, P1.5.4 Canonical Serialization).

§1. Outcome

All three gates (build, lint, test) are green. The 21 previously failing tests now pass, the lexer suite is complete at 84/84, and the full suite passes at 1169/1169.

§2. Test evidence

§2.1. Before the fix (repro on feature/r81-b-p1-2-1-lexer @ 1eec2b7c)

$ npm test -- --testPathPattern=rules/lexer
...
Test Suites: 1 failed, 1 total
Tests:       21 failed, 61 passed, 82 total

Failing tests: A3, A4, C1, C2, C3, C5, F1, F2, F3, F4, G({), G(}), I1, I2, I3, J1, J3, K1, K2, K4, L2. Full root-cause analysis in docs/audits/r83-c-lexer-retry-debug-audit.md.

§2.2. After the fix (feature/r81-b-p1-2-1-lexer @ 26beb3f2)

$ npm test -- --testPathPattern=rules/lexer
...
Test Suites: 1 passed, 1 total
Tests:       84 passed, 84 total

Delta: +21 tests passing (all previously failing cases now green) plus +2 new regression tests (C6 long-ASCII, I4 long-CJK) that pin the fix against future Chevrotain upgrades or pattern refactors.

§2.3. Lexer coverage

File        | % Stmts | % Branch | % Funcs | % Lines
lexer.ts    |    100  |   83.33  |   100   |   100

The one uncovered branch is the ?? tok.startOffset fallback on toRejectionError (line 452): unreachable with positionTracking: 'full' because Chevrotain always emits endOffset. Documented in the fix packet §FP3.2 as acceptable.
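To make the uncovered branch concrete, here is a minimal sketch of the kind of expression involved. The `TokenLike` shape and the `rejectionEndOffset` helper are hypothetical names used only for illustration; the real `toRejectionError` in lexer.ts may be structured differently.

```typescript
// Hypothetical minimal token shape; Chevrotain's real token type has more fields.
interface TokenLike {
  image: string;
  startOffset: number;
  endOffset?: number; // optional in the type, even if always set at runtime
}

// Falls back to startOffset only when endOffset is null or undefined.
// If the lexer always emits endOffset, the right-hand side never runs,
// which is what leaves one branch uncovered.
function rejectionEndOffset(tok: TokenLike): number {
  return tok.endOffset ?? tok.startOffset;
}

console.log(rejectionEndOffset({ image: "foo", startOffset: 3, endOffset: 5 }));
console.log(rejectionEndOffset({ image: "?", startOffset: 9 }));
```

Because `??` checks only for null and undefined, a legitimate `endOffset` of 0 is preserved instead of being replaced by the fallback.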

§3. Three-gate results

§3.1. npm run build — green

$ npm run build
> colibri@0.0.1 build
> tsc

Exit 0 with no compiler output. TypeScript strict-mode compilation is clean.

§3.2. npm run lint — green

$ npm run lint
> colibri@0.0.1 lint
> eslint src

Exit 0 with no ESLint output. The new custom pattern function uses explicit parameter types (text: string, startOffset: number) and returns [string] | null, with no any and no eslint-disable.

§3.3. npm test — green (post-retry)

$ npm test
...
Test Suites: 27 passed, 27 total
Tests:       1169 passed, 1169 total
Snapshots:   0 total
Time:        ~31 s

Pre-existing subprocess-smoke flake observation: under full-suite load, src/__tests__/startup.test.ts sporadically emits empty stderr from its tsx src/server.ts subprocess, failing the regex match on [colibri] starting. The exact same flake is documented in MEMORY.md (“Pre-existing startup — subprocess smoke flakiness under full-suite load — predates Wave H; all 4 R77 executors hit it once, always green on rerun”). Isolated run confirms non-regression:

$ npm test -- --testPathPattern=startup
Test Suites: 1 passed, 1 total
Tests:       40 passed, 40 total

The flake is not a regression introduced by this task; it has been present across R75 H, R75 I, R76, R77, and R81, and is called out as a known blocker in CLAUDE.md and the project memory.

§4. Test count ledger

Milestone | Total tests | Delta
Pre-R81 baseline (main @ 657d4ef4 excluding integer-math) | 1085 |
R81.B feat commit 1eec2b7c adds lexer.test.ts | 1167 | +82 lexer tests, 21 failing
R83.C fix 26beb3f2 flips red→green + 2 regressions | 1169 | +2 regression tests, 0 failing

Note: main @ 657d4ef4 actually reports 1123 tests (post-R81.A integer-math landing, +38 tests). This branch predates that merge because R81.B branched before R81.A landed. On PR merge the two add together cleanly (no directory conflict; integer-math.ts and lexer.ts live side-by-side in src/domains/rules/). Post-merge full-suite target: 1207 tests (1085 + 82 lexer + 2 regression + 38 integer-math).
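The ledger arithmetic can be checked mechanically. The sketch below uses only the counts stated in this section; the variable names are illustrative.

```typescript
// Counts copied from the ledger and the note above.
const baseline = 1085;       // pre-R81 baseline, excluding integer-math
const lexerTests = 82;       // added by R81.B (1eec2b7c)
const regressionTests = 2;   // C6 and I4, added by R83.C (26beb3f2)
const integerMathTests = 38; // landed on main via R81.A

const afterR81B = baseline + lexerTests;        // this branch after R81.B
const afterR83C = afterR81B + regressionTests;  // this branch after R83.C
const mainNow = baseline + integerMathTests;    // what main reports today
const postMerge = afterR83C + integerMathTests; // target after the PR merges

console.log({ afterR81B, afterR83C, mainNow, postMerge });
```

This yields 1167, 1169, 1123, and 1207 respectively, matching the figures quoted above.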

§5. Files changed

File | Status | Why
docs/audits/r83-c-lexer-retry-debug-audit.md | NEW | Root-cause analysis of 21 failing tests (Step 1 continuation).
docs/packets/r83-c-lexer-retry-fix-packet.md | NEW | Execution plan for the Identifier-pattern swap (Step 3 continuation).
src/domains/rules/lexer.ts | MOD | Identifier pattern swapped from regex literal to custom function; added IDENTIFIER_REGEX module constant; added line_breaks: false on Identifier.
src/__tests__/domains/rules/lexer.test.ts | MOD | Added regression tests C6 and I4.
docs/verification/r83-c-lexer-retry-verification.md | NEW | This document (Step 5).

Zero touches to: package.json, package-lock.json, integer-math.ts, any other source file, any other test file, any other domain.

§6. Preservation of the R81.B work

The task explicitly required keeping the R81.B 5-step chain intact. Verified:

$ git log --oneline origin/main..HEAD
26beb3f2 fix(r83-c-lexer-retry): swap Identifier to custom pattern fn (Chevrotain 11.0.3 u-flag workaround)
b548e74a packet(r83-c-lexer-retry): Unicode regex fix plan
d87b5033 audit(r83-c-lexer-retry): root-cause 22 Chevrotain regex failures
1eec2b7c feat(r81-b-p1-2-1-lexer): Chevrotain 11.0.3 κ DSL lexer (18 keywords, 12 ops, 7 token categories)   ← R81.B
c7e723aa packet(r81-b-p1-2-1-lexer): execution plan + token matrix                                           ← R81.B
6519f43f contract(r81-b-p1-2-1-lexer): DSL lexer contract                                                    ← R81.B
c8050a48 audit(r81-b-p1-2-1-lexer): inventory lexer surface + drift flag                                     ← R81.B

The four R81.B commits are untouched; the three R83.C commits on top (four once this document is committed) form the debug/fix/verify layer.

§7. Drift status

§7.1. Resolved by this task

  • 21 failing lexer tests → all passing (+2 new regressions).
  • R81 memory drift: “22 failing tests; Chevrotain Unicode regex” → root-caused and fixed; memory should be updated to mark this item resolved.

§7.2. Still open (unchanged by this task)

  • Missing ADR-007-dsl-grammar.md — first flagged in R81.B audit §3. The current task-prompt reference docs/architecture/decisions/ADR-006-dsl-grammar.md still dangles; the concept doc docs/3-world/physics/laws/rule-engine.md line 206 still has the broken link. This task does not write the ADR; it is a follow-up per the R81.B audit. Candidate round: as part of Wave 3 when P1.2.2 Parser ratifies the full grammar surface.
  • Pre-existing subprocess-smoke flake in startup.test.ts under full-suite load. Flake-isolation round still open.

§7.3. Known Chevrotain 11.0.3 quirk (now documented)

The Unicode-u-flag issue is now documented with a concrete workaround in both the R83.C audit (full root-cause) and the Identifier JSDoc in lexer.ts itself. Future pattern authors in this codebase have the fix pattern as reference. If P1.2.2 Parser needs Unicode-aware patterns, the same custom-function idiom applies.
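For reference, a minimal sketch of the custom-function idiom follows. The identifier grammar used here (a letter or underscore followed by letters, digits, or underscores) is an assumption for illustration; the actual IDENTIFIER_REGEX in lexer.ts may differ. The function name `matchIdentifier` is likewise hypothetical.

```typescript
// Hypothetical identifier grammar; the real IDENTIFIER_REGEX may differ.
// Built with the RegExp constructor: "u" enables \p{...} Unicode classes,
// "y" (sticky) forces the match to begin exactly at lastIndex.
const IDENTIFIER_REGEX = new RegExp("[\\p{L}_][\\p{L}\\p{N}_]*", "uy");

// Custom pattern function with the signature described in this document:
// returns a one-element array holding the matched text, or null on no match.
function matchIdentifier(text: string, startOffset: number): [string] | null {
  IDENTIFIER_REGEX.lastIndex = startOffset; // anchor the sticky match
  const match = IDENTIFIER_REGEX.exec(text);
  return match === null ? null : [match[0]];
}

console.log(matchIdentifier("foo bar", 0));
console.log(matchIdentifier("foo bar", 3));
console.log(matchIdentifier("x 规则1", 2));
```

In a Chevrotain token definition, a function like this would be passed as the `pattern` option of `createToken`, alongside an explicit `line_breaks: false`, since the lexer cannot inspect a function to determine whether it can match line terminators. Because the regex runs inside the function, Chevrotain never has to analyse it, which is the point of the workaround.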

§8. Acceptance-criteria checklist (from packet §FP7)

  • npm run build green.
  • npm run lint green — no any, no eslint-disable.
  • npm test -- --testPathPattern=rules/lexer → 84 passed, 0 failed.
  • Full npm test → 1169 passed (pre-existing subprocess-smoke flake acknowledged, isolated-run green).
  • Identifier token’s pattern is a function that delegates to the IDENTIFIER_REGEX sticky constant.
  • line_breaks: false is set on Identifier.
  • No file other than src/domains/rules/lexer.ts and src/__tests__/domains/rules/lexer.test.ts is touched in the fix commit.
  • Public API of the lexer module is unchanged.

§9. Unblocks

With P1.2.1 closed, R83 Wave 3 is free to dispatch:

  • P1.2.2 — Parser (consumes allTokens + tokenize from lexer.ts).
  • P1.2.3 — AST Validator.
  • P1.2.4 — Registry.
  • P1.3.1 — Core Evaluation Loop.
  • P1.5.4 — Canonical Serialization.

See docs/guides/implementation/task-prompts/p1.1-kappa-rule-engine.md for the next-wave dispatch prompts.

§10. Summary

The R81.B deferred κ lexer work is complete. Chevrotain 11.0.3’s Unicode-u-flag optimiser bug was root-caused and worked around via a one-token change to Identifier (regex-pattern → custom-function-pattern swap). 21 red → 21 green + 2 new regression tests; lexer suite 84/84; full suite 1169/1169; three gates clean.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
