Verification — P4.2.2 Coercion Trap Detector (option-set)
§1. Summary
P4.2.2 ships a pure, deterministic option-set coercion-trap detector at
src/domains/integrity/detectors/coercion.ts + 37 unit tests at
src/domains/integrity/detectors/__tests__/coercion.test.ts. All three
acceptance gates pass:
npm run build— clean (tsc + postbuild copy-migrations).npm run lint— clean (eslint oversrc/).npm test— 37/37 new tests pass; total corpus 3590 / 81 suites.
The 3 failing suites at full-parallel run (server.test.ts,
consensus/parity-harness.test.ts, reputation/tools.test.ts,
scripts/script-invocation-skills.test.ts — the set varies run-to-run) are
pre-existing CI-load flakes documented in memory:
reputation/tools.test.tsparallel-migration race (996_alpha.sqlvs996_bravo.sql)consensus/parity-harness G7.15000ms perf borderline (received 6302ms)server.test.tsparallel bootstrap timing (51/51 pass in isolation)
When run isolated (jest src/__tests__/server.test.ts), each suite passes.
When run with parallel-execution filters excluded
(--testPathIgnorePatterns="reputation/tools" --testPathIgnorePatterns="consensus/parity-harness" --testPathIgnorePatterns="server.test" --testPathIgnorePatterns="script-invocation-skills")
the remaining corpus is 3467/3467 pass (77 suites).
The new coercion suite is NOT a contributor to these flakes — it runs in ~5.5s alone and is fully deterministic.
§2. Build evidence
$ npm run build
> colibri@0.0.1 build
> tsc
> colibri@0.0.1 postbuild
> node scripts/copy-migrations.mjs
copy-migrations: copied 9 migration(s)
E:\AMS\.worktrees\claude\p4-2-2-coercion-detector\src\db\migrations
-> E:\AMS\.worktrees\claude\p4-2-2-coercion-detector\dist\db\migrations
Exit code 0. No diagnostics.
§3. Lint evidence
$ npm run lint
> colibri@0.0.1 lint
> eslint src
Exit code 0. No warnings, no errors.
(One iteration of curly errors at coercion.test.ts L237-238 was caught
during initial implementation — single-line if (action === 'A') return ...
forms violated curly: ["error", "all"]. Fixed in the same commit by
adding braces; final lint is clean.)
§4. Test evidence — new suite
$ jest src/domains/integrity/detectors/__tests__/coercion.test.ts
PASS src/domains/integrity/detectors/__tests__/coercion.test.ts (5.487 s)
detectCoercion — Group 1: empty trigger
√ G1.1 — empty available + empty presented → one advisory (6 ms)
√ G1.2 — empty available + non-empty presented → one advisory; evidence captures the mismatch
detectCoercion — Group 2: all-negative trigger
√ G2.1 — single-option negative → one advisory
√ G2.2 — multiple options all negative → one advisory
√ G2.3 — all negative, some also obligates → ONE advisory with both clauses
detectCoercion — Group 3: all-obligates trigger
√ G3.1 — single-option obligate → one advisory
√ G3.2 — multiple options all obligates → one advisory
√ G3.3 — all obligates, none negative → recommendation mentions obligation only
detectCoercion — Group 4: mixed outcome, no advisory
√ G4.1 — at least one positive, at least one negative → []
√ G4.2 — at least one positive, some obligates → []
√ G4.3 — single positive option → []
√ G4.4 — presented ≠ available; available has a positive option → []
√ G4.5 — presented ≠ available; available filtered to a negative-only set → 1 advisory
detectCoercion — Group 5: combined triggers
√ G5.1 — all negative AND all obligates → one advisory with both clauses
√ G5.2 — empty alone → one advisory; only empty clause
detectCoercion — Group 6: P4.1.1 envelope parse
√ G6.1 — returned advisory parses cleanly through AdvisorySchema
√ G6.2 — decision_hash is exactly 64 lowercase hex characters
√ G6.3 — timestamp_logical can be 0n
detectCoercion — Group 7: evidence content
√ G7.1 — evidence contains three entries: presented, available, outcomes
√ G7.2 — outcomes entries deep-equal the simulated Outcomes
√ G7.3 — evidence outcomes is a defensive copy
detectCoercion — Group 8: determinism
√ G8.1 — 100 runs with identical inputs produce deep-equal Advisory[]
√ G8.2 — decision_hash is stable across 100 runs
√ G8.3 — decision_hash is invariant under timestamp_logical change (dedup invariant)
√ G8.4 — different available sets produce different decision_hashes (dedup discrimination)
detectCoercion — Group 9: purity / adapter contract
√ G9.1 — adapter is invoked exactly once per action; no extra calls
√ G9.2 — propagates exception from deps.admission
√ G9.3 — propagates exception from deps.engine
√ G9.4 — does not invoke scoreCompute (default flow uses engine.reputation_delta directly)
coercion.ts source — Group 10: no clock / RNG / async
√ G10.1 — contains no Date.* / Math.random / performance.now / setTimeout / setInterval / fetch / async / await / crypto.randomBytes
√ G10.2 — contains no float literals
coercion.ts source — Group 11: no direct κ/λ imports
√ G11.1 — does not import directly from rules/admission, rules/engine, reputation/compute
√ G11.2 — imports computeDecisionHash + Advisory from ../schema.js (REUSE)
detectCoercion — Group 12: bigint discipline
√ G12.1 — reputation_delta === 0n is NOT negative; single zero-delta option → no advisory
√ G12.2 — reputation_delta === -1n IS negative
√ G12.3 — mixed zero + positive → no advisory
√ G12.4 — all-zero (delta = 0n) → NO advisory; zero is not negative
Test Suites: 1 passed, 1 total
Tests: 37 passed, 37 total
§5. Test evidence — full corpus
Full npm test (single run, full parallel):
Test Suites: 3 failed, 78 passed, 81 total
Tests: 5 failed, 3585 passed, 3590 total
Time: 80.512 s
After excluding the 4 known load-flake suites:
Test Suites: 77 passed, 77 total
Tests: 3467 passed, 3467 total
Time: 108.784 s
The 5 failures across 3 suites are pre-existing CI-load flakes (§1).
Base SHA 49560518: 3553 tests / 80 suites.
HEAD SHA (post-implement): 3590 tests / 81 suites.
Delta: +37 tests / +1 suite (the new coercion suite).
§6. Acceptance-criteria mapping
| AC# from contract §5 | Verification location | Pass |
|---|---|---|
| AC#1 Three triggers (empty / all-negative / all-obligates) | G1.1, G2.1, G3.1 | ✓ |
| AC#2 Advisory shape matches P4.1.1 envelope | G6.1 — AdvisorySchema.parse(advisory) does not throw |
✓ |
| AC#3 Mixed outcome (positive + negative) → 0 advisories | G4.1 | ✓ |
| AC#4 Single positive option → 0 advisories | G4.3 | ✓ |
| AC#5 Empty available set → 1 advisory; evidence captures both sets | G1.1, G1.2 | ✓ |
AC#6 presented ≠ available; advisory only on degenerate filtered set |
G4.4 (no advisory), G4.5 (advisory) | ✓ |
| AC#7 Determinism ×100 | G8.1, G8.2 | ✓ |
| AC#8 Pure function — mock adapters intercept all access | G9.1 (exact call counts), G9.4 (scoreCompute uninvoked) | ✓ |
| AC#9 Static scanner — no clock / RNG / async / float | G10.1, G10.2 | ✓ |
| AC#10 Static scanner — no direct κ/λ imports | G11.1, G11.2 | ✓ |
| AC#11 Never throws on degenerate input | G1.1, G2.1, G3.1 (no expect(...).toThrow); G9.2/G9.3 only throw on caller-side adapter exception |
✓ |
| AC#12 build && lint && test all pass | §2, §3, §5 | ✓ |
§7. Forbidden patterns audit (final pass)
| Pattern | Result |
|---|---|
Date.now() / Date.UTC / new Date() |
Absent (G10.1) |
Math.random() / crypto.randomBytes |
Absent (G10.1) |
performance.now / process.hrtime |
Absent (G10.1) |
setTimeout / setInterval / setImmediate |
Absent (G10.1) |
fetch( / async function / await |
Absent (G10.1) |
| Float literals | Absent (G10.2) |
Direct import from ../rules/admission.js |
Absent (G11.1) |
Direct import from ../rules/engine.js |
Absent (G11.1) |
Direct import from ../reputation/compute.js |
Absent (G11.1) |
throw statements in detector body |
Absent (visual inspection — only the implicit caller-throw passthrough in deps.admission / deps.engine; this is documented in contract §2.2 and tested in G9.2/G9.3) |
§8. Surprises and follow-ups
§8.1 κ admission shape divergence (HOLD for P4.4.1 escalation)
The task prompt’s mental model — admission: (actor, context) => Action[]
— does NOT map directly to the live κ evaluateAdmission(req, registry)
signature. The latter:
- Takes a SINGLE
AdmissionRequest(caller + tool + mode + rep_snapshot + rule_version), - Returns
AdmissionResult(admit/deny for that ONE tool call), - Does NOT enumerate all admissible tools for an actor.
To produce the available: Action[] set the detector needs, a future wiring
slice will:
- Iterate over κ’s
RuleRegistry.getAll()to find candidate tools. - For each candidate, build an
AdmissionRequest(caller=actor, tool=t, mode=req.mode, ...). - Filter to those with
admitted: true.
This enumeration is expensive (O(n_tools × n_rules)) and should ideally be cached per (actor, context) tuple within a μ scan. P4.4.1 escalation will need to decide whether to compute this lazily (only when a detector asks) or eagerly (during decision-record persistence).
Recommended for P4.4.1 author: add available: readonly Action[] to
the decision-record persistence shape so the detector consumes a stored
projection rather than re-running κ admission on every audit pass.
§8.2 κ engine Outcome shape divergence (HOLD for P4.4.1 escalation)
The κ engine returns TransitionResult { all_mutations: Mutation[], per_category_results }.
The detector’s Outcome { reputation_delta: bigint; obligation_beyond_capacity: boolean }
is a μ-level projection. A future wiring slice will need to:
- Run
executeRuleset(rule, eventRecord, stateRecord, ...)for each action. - Scan
all_mutationsfor reputation-delta mutations (target ='reputations.score'per λ P2.1.1). - Sum them as bigint (BPS).
- Cross-check the resulting (delta, post-state) against λ P2.4.1 obligation tier to set
obligation_beyond_capacity.
This projection lives at the κ→μ adapter layer — NOT in coercion.ts. The
detector’s adapter contract is correct as shipped.
§8.3 λ compute_score signature is asymmetric to scoreCompute
The task prompt’s scoreCompute: (delta, baseline) => bigint does not
match λ’s actual compute_score(node_id, domain, events, ack_lookup, scar_lookup): bigint.
The detector does not call scoreCompute (the engine adapter already
surfaces the delta directly per §3 of the contract — Outcome includes
reputation_delta). The scoreCompute adapter slot is retained for
forward-compatibility but uninvoked. G9.4 verifies this.
Recommended for P4.4.1 author: decide whether to keep scoreCompute
as a no-op symmetry marker or drop it from CoercionDeps when the live
wiring slice lands. If kept, document the use case (likely: a separate
detector for “would this action push the actor below their reputation
floor” which would compose engine.reputation_delta + scoreCompute(delta, baseline)).
§8.4 Type-parameter shape
detectCoercion<TAction, TContext> is fully generic. Tests use
detectCoercion<string, object>(...) since string actions + empty contexts
are the simplest fixture. Real callers will likely use:
TAction = string(tool name) ORTAction = { name: string }(named-with-args)TContext = AdmissionContext(some μ-tier projection of κ’s request shape)
The Map<TAction, Outcome> implementation works for any TAction — JS Maps use SameValueZero for primitive keys and reference equality for objects. The adapter contract requires distinct values for distinct actions, which both shapes satisfy.
§8.5 Recommendation text is in evidence, not in hash preimage
Per P4.1.1 contract §I1, recommendation is metadata orthogonal to
decision_hash. The detector emits a stable decision_hash for a given
option-space regardless of whether the recommendation text changes between
slices (e.g. localization).
This means: a future slice can refine the recommendation wording without
breaking the dedup invariant for already-persisted advisories. The
decision_hash is anchored to (presented_count, available_count,
presented_signatures, available_signatures), nothing else.
§9. Step 5 of 5 complete
Five commits on feature/p4-2-2-coercion-detector (chronological):
audit(p4-2-2-coercion-detector): inventory surface—845684afcontract(p4-2-2-coercion-detector): behavioral contract—ff6a7d0fpacket(p4-2-2-coercion-detector): execution plan—073f3936feat(p4-2-2-coercion-detector): option-set coercion detector—dc02e9dcverify(p4-2-2-coercion-detector): test evidence— TBA (this commit)
Ready for push + PR.