Audit — P4.2.2 Coercion Trap Detector (option-set)

§1. Scope

This audit inventories the existing surface that the option-set coercion-trap detector will consume — the κ admission evaluator (P1.4.1), the κ rule engine (P1.3.1), the λ score-compute (P2.1.2), and the μ advisory envelope (P4.1.1).

The detector itself lives at src/domains/integrity/detectors/coercion.ts and its test suite at src/domains/integrity/detectors/__tests__/coercion.test.ts. Both are NEW files — no prior detector ships under src/domains/integrity/ at base SHA 49560518 (verified by ls src/domains/integrity/ returning only schema.ts).

§2. Upstream surfaces — what we depend on

§2.1 P4.1.1 advisory envelope — src/domains/integrity/schema.ts

Public exports we will use:

| Symbol | Kind | Purpose |
| --- | --- | --- |
| Advisory | TS interface | 8-field envelope (z.infer of AdvisorySchema) |
| AdvisoryRole | TS union | 'Translator' \| 'Sentinel' \| 'Guide' |
| AdvisoryCheck | TS union | includes 'coercion_trap' |
| AdvisoryResult | TS union | includes 'WARN' |
| AdvisorySeverity | TS union | includes 'HIGH' |
| computeDecisionHash(role, check, input, result) | function | SHA-256 over canonical preimage → 64-char hex |

Key invariants from P4.1.1 contract §I1, §I2, §I7:

  • decision_hash preimage is role || '||' || check || '||' || canonical(input) || '||' || result. No engine_version mixin. No severity / evidence / timestamp in preimage (dedup invariant).
  • Closed enums are TS-enforced; passing 'COERCION_TRAP' (uppercase) would fail Zod AdvisorySchema.parse.
  • timestamp_logical is bigint, non-negative, caller-supplied (no clock reads in detector source).
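The preimage layout in the first invariant can be illustrated with a small sketch. It builds only the preimage string and skips the SHA-256 step. The canonical() helper here is a hypothetical stand-in (sorted-key JSON, bigint encoded as a decimal string with an "n" suffix); P4.1.1's real canonicalization may differ.

```typescript
// Hypothetical stand-in for P4.1.1's canonical(): sorted-key JSON.
// The bigint encoding ("2n" as a string) is an assumption for illustration.
function canonical(value: unknown): string {
  if (value === undefined) return "null";
  if (typeof value === "bigint") return JSON.stringify(`${value}n`);
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonical(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// role || '||' || check || '||' || canonical(input) || '||' || result
function buildPreimage(
  role: string,
  check: string,
  input: unknown,
  result: string,
): string {
  return [role, check, canonical(input), result].join("||");
}

// Key order in the input does not matter, and severity / evidence /
// timestamp never enter the preimage, so both calls agree.
const preimageA = buildPreimage("Sentinel", "coercion_trap", { b: 1, a: [2n] }, "WARN");
const preimageB = buildPreimage("Sentinel", "coercion_trap", { a: [2n], b: 1 }, "WARN");
```

Because the two preimages are identical strings, any hash computed over them would be identical too, which is what the dedup invariant relies on.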

Selected role for coercion-trap advisories: 'Sentinel' (μ’s classic detector role per integrity.md §Three advisory roles L117-127 — “patterns that human operators or π governance should respond to”). The Translator role is for cross-surface translation; Guide is for narrative explanations of axiom drift. Sentinel is the right fit for a detector that watches the option-space.

§2.2 κ admission — src/domains/rules/admission.ts (P1.4.1)

The function the detector needs to enumerate available actions: none directly.

evaluateAdmission(req: AdmissionRequest, registry: RuleRegistry): AdmissionResult only returns admit/deny for ONE caller+tool+mode. It does NOT enumerate all admissible actions across the tool-space. This is the critical adapter-injection finding: the detector cannot import a live “enumerate-all-admissible-actions-for-actor-in-context” function from κ because no such function exists today.

Resolution (per task prompt + contract §4 isolation):

The detector accepts an admission: (actor, context) => Action[] adapter via CoercionDeps. In a future μ wiring slice (post-P4.2.2), the adapter implementation would iterate the κ registry, call evaluateAdmission for each (caller=actor, tool=t, mode=mode), and return the admitted tools. P4.2.2 ships only the detector function — wiring is out of scope (mirrors integrity.md L67’s enumerate_available_actions pseudocode that has no concrete impl).
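A minimal sketch of what such an adapter might look like in the later wiring slice. The request fields caller / tool / mode come from the description above; the RuleRegistry shape, the admitted field, and evaluateAdmissionStub are hypothetical stand-ins, not the real κ surface.

```typescript
// Hypothetical shapes; the real types live in src/domains/rules/admission.ts.
type AdmissionRequest = { caller: string; tool: string; mode: string };
type AdmissionResult = { admitted: boolean };
type RuleRegistry = { tools: readonly string[]; denied: ReadonlySet<string> };

// Stub standing in for κ's evaluateAdmission: one caller + tool + mode per call.
function evaluateAdmissionStub(
  req: AdmissionRequest,
  registry: RuleRegistry,
): AdmissionResult {
  return { admitted: !registry.denied.has(`${req.caller}:${req.tool}`) };
}

// The adapter signature the detector consumes via CoercionDeps.
type AdmissionAdapter<TAction, TContext> = (
  actor: string,
  context: TContext,
) => TAction[];

// Iterate the registry and keep the tools that κ would admit.
function makeAdmissionAdapter(
  registry: RuleRegistry,
): AdmissionAdapter<string, { mode: string }> {
  return (actor, context) =>
    registry.tools.filter(
      (tool) =>
        evaluateAdmissionStub({ caller: actor, tool, mode: context.mode }, registry)
          .admitted,
    );
}

const sampleRegistry: RuleRegistry = {
  tools: ["read", "write", "delete"],
  denied: new Set(["alice:delete"]),
};
const admittedForAlice = makeAdmissionAdapter(sampleRegistry)("alice", { mode: "live" });
```

The detector never sees the registry or the stub; it only receives the resulting function, which keeps the κ import out of the detector source.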

Selected Action shape (detector-local): opaque string OR opaque { name: string }; we’ll keep it as a generic type parameter so the test suite can use whatever shape it wants. The detector treats actions as opaque and stores them in a Map<Action, Outcome>, whose keys compare by SameValueZero: strings match by value, objects by reference.

§2.3 κ engine — src/domains/rules/engine.ts (P1.3.1)

The function the detector needs to simulate outcomes: none. No κ engine function directly returns { reputation_delta, obligation_beyond_capacity }.

executeRuleset returns a TransitionResult with all_mutations: Mutation[] and per_category_results. The shape is “what mutations would be applied if this rule body ran” — NOT “would the actor’s reputation go down” or “would this exceed their obligation capacity”. The bridge from κ Mutation to μ Outcome is a μ-level concern.

Resolution: the detector accepts engine: (action, context) => Outcome adapter via CoercionDeps. The adapter is responsible for:

  1. Running executeRuleset(...) for the action,
  2. Summing the resulting reputation_delta mutations via λ compute_score (per λ P2.1.2 — bigint BPS),
  3. Computing obligation_beyond_capacity: boolean from the obligation-checking λ slice (P2.4.1).
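The three responsibilities above could be sketched as a pure projection from mutations to an Outcome. The Mutation shape and the capacity parameter are hypothetical, and the plain bigint sum is a simplification standing in for λ's compute_score.

```typescript
// Outcome fields are named in this audit; everything else is illustrative.
interface Outcome {
  readonly reputation_delta: bigint;
  readonly obligation_beyond_capacity: boolean;
}

// Hypothetical mutation shape; the real κ Mutation type may look different.
type Mutation =
  | { kind: "reputation_delta"; bps: bigint }
  | { kind: "obligation"; amount: bigint };

// Fold mutations into an Outcome. Integer-only: every comparison and
// accumulation uses bigint, never number.
function toOutcome(mutations: readonly Mutation[], capacity: bigint): Outcome {
  let delta = 0n;
  let obligated = 0n;
  for (const m of mutations) {
    if (m.kind === "reputation_delta") {
      delta += m.bps;
    } else {
      obligated += m.amount;
    }
  }
  return {
    reputation_delta: delta,
    obligation_beyond_capacity: obligated > capacity,
  };
}

const sampleOutcome = toOutcome(
  [
    { kind: "reputation_delta", bps: -150n },
    { kind: "reputation_delta", bps: 50n },
    { kind: "obligation", amount: 700n },
  ],
  500n,
);
```

In a real adapter the mutation list would come from executeRuleset; here it is passed in directly so the projection can be tested in isolation.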

P4.2.2 ships the detector + a sample adapter test fixture. Real κ→λ wiring is a later slice (estimated post-P4.4.1 escalation FSM).

§2.4 λ compute — src/domains/reputation/compute.ts (P2.1.2)

Surface we depend on:

  • compute_score(node_id, domain, events, ack_lookup, scar_lookup): bigint — bigint folder over history.
  • BPS_100_PERCENT = 10_000n (re-exported via the reputation domain).

Type carryover: reputation_delta IS bigint (the prompt-mandated type — matches λ’s integer-only invariant I1). The detector compares with < 0n, never < 0.

The prompt’s scoreCompute: (delta, baseline) => bigint adapter signature is a simplified projection of compute_score. It lets tests inject deterministic score curves without setting up full ack/scar lookups. P4.2.2 ships the adapter signature; downstream slices will wire it to live λ.
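One way a test might inject a deterministic score curve through that signature is sketched below. The clamping rule is an assumption made for illustration; the audit only fixes the (delta, baseline) => bigint signature and the BPS_100_PERCENT constant.

```typescript
// Mirrors the λ constant named above; redeclared locally so the sketch is
// self-contained.
const BPS_100_PERCENT = 10_000n;

// Adapter signature from the prompt.
type ScoreCompute = (delta: bigint, baseline: bigint) => bigint;

// Hypothetical curve: add the delta to the baseline and clamp the result
// to [0, 100%] in basis points. All arithmetic stays in bigint.
const clampedScore: ScoreCompute = (delta, baseline) => {
  const raw = baseline + delta;
  if (raw < 0n) return 0n;
  if (raw > BPS_100_PERCENT) return BPS_100_PERCENT;
  return raw;
};
```

For example, clampedScore(-100n, 5_000n) yields 4_900n, while a delta that would push the score below zero or above 10_000n is pinned to the boundary.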

§2.5 P4.1.1 envelope decisionRecord shape — what is it?

The prompt says the detector receives a decisionRecord with .options, .actor, .context. This is NOT a κ type; it’s NOT a μ type; it’s a NEW abstraction for the option-set-aware advisory site.

Proposed local type (lives in coercion.ts):

interface DecisionRecord<TAction, TContext> {
  readonly actor: string;
  readonly context: TContext;
  readonly options: readonly TAction[];  // what the agent saw
}

Generic over TAction (opaque to the detector) + TContext (opaque to the detector, just passed through to the adapters). The Map<Action, Outcome> works for any TAction: strings and other primitives are keyed by value, objects and symbols by identity.
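The difference between string keys and object keys in a Map is easy to see in a short sketch. The action names are made up for illustration.

```typescript
// String keys: two equal strings are the same key, however they were built.
const byString = new Map<string, number>();
byString.set("transfer", 1);
byString.set("trans" + "fer", 2); // same key, overwrites the first entry

// Object keys: two structurally equal objects are different keys.
type NamedAction = { name: string };
const byObject = new Map<NamedAction, number>();
const firstAction: NamedAction = { name: "transfer" };
byObject.set(firstAction, 1);
byObject.set({ name: "transfer" }, 2); // new reference, second entry

const stringKeyCount = byString.size; // one entry
const objectKeyCount = byObject.size; // two entries
```

This is why an adapter that returns object-shaped actions has to hand back the same reference wherever it means the same action, and distinct references for distinct actions.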

Persistence note: the detector returns advisories; it does not persist decisionRecord directly. A future P4.5.1 schema slice may persist a serialized form, but that’s downstream.

§3. Downstream consumers — what depends on us

  • P4.4.1 escalation FSM — consumes coercion-trap advisories (severity HIGH, result WARN). Maps to HARD BLOCK at α tool-lock per integrity.md L155 “coercion-in-admission rejection”. P4.4.1 reads advisory.severity + advisory.check + advisory.evidence; it does NOT mutate the advisory.
  • P4.5.1 advisory persistence — INSERT-only mcp_advisories table keyed on decision_hash. Identical (role + check + canonical(input) + result) advisories are collapsed to one row. The dedup invariant matters here: a coercion-trap detector that runs twice on the same (presented, available, outcomes) MUST emit the same decision_hash — guaranteed by canonicalization of the input.
  • P4.6.1 MCP tool surface: integrity_list_advisories(filter) will surface coercion-trap rows. The tool reads what we persist; it does not touch the detector.
  • P4.7.1 parity harness: runs four mock detectors (one per check value) and asserts byte-identical canonical encodings across runs. We provide coercion_trap; parity depends on our pure-function determinism.
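The collapse-to-one-row behaviour described for P4.5.1 can be sketched with an in-memory store. The class and field names below are hypothetical; the real persistence layer is the mcp_advisories table, which this audit does not specify further.

```typescript
// Hypothetical row shape: only the key and the check are modelled.
interface StoredAdvisory {
  readonly decision_hash: string;
  readonly check: string;
}

// Insert-only store keyed on decision_hash. A second insert with the same
// hash is ignored, never an update.
class AdvisoryStore {
  private readonly rows = new Map<string, StoredAdvisory>();

  // Returns true when a new row was written, false when it was a duplicate.
  insert(advisory: StoredAdvisory): boolean {
    if (this.rows.has(advisory.decision_hash)) return false;
    this.rows.set(advisory.decision_hash, advisory);
    return true;
  }

  get size(): number {
    return this.rows.size;
  }
}

const store = new AdvisoryStore();
const firstInsert = store.insert({ decision_hash: "ab".repeat(32), check: "coercion_trap" });
const repeatInsert = store.insert({ decision_hash: "ab".repeat(32), check: "coercion_trap" });
```

If the detector produced a different hash for the same input on a second run, the store would grow by a row each time, which is the failure the dedup invariant is meant to rule out.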

§4. Forbidden patterns audit (what NOT to do)

Per the task FORBIDDENS + the determinism guardrails inherited from P4.1.1 + κ:

| Pattern | Why forbidden | Source |
| --- | --- | --- |
| Date.now() / Date.UTC() / new Date() | Wall-clock read → non-deterministic dedup hash if we ever include time in preimage; out-of-band signal to caller | Task FORBIDDENS; κ determinism.ts §FORBIDDEN_PATTERNS |
| Math.random() / crypto.randomBytes() | RNG read → non-deterministic advisory shape; breaks dedup invariant | Task FORBIDDENS |
| performance.now() / process.hrtime() | Wall-clock; same reason as Date.now() | κ determinism.ts |
| setTimeout / setInterval / async / await | Async control flow → not pure; would break adapter-injection composability | κ determinism.ts |
| Float literals (< 0 against bigint) | Mixed bigint / number comparison is a TS error AND a logic error per λ P2.1.2 I1 | λ P2.1.2 contract |
| Direct import of ../rules/admission.js / ../rules/engine.js | Circular module dependency potential; couples μ detector to κ runtime; breaks adapter-injection contract | Task spec; integrity.md detector-pattern |
| Throwing on degenerate input | Detector is advisory-ONLY; never blocks. available.length === 0 returns advisory[], not throw | Task FORBIDDENS + integrity.md L85 |
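A static scanner over the detector source is one way the test suite might enforce several of these rows. The sketch below is an assumption about how that could look; the rule names and regular expressions are illustrative and are not taken from κ's determinism.ts.

```typescript
// Each rule pairs a label with a pattern. No "g" flag, so RegExp.test is
// stateless across calls.
const FORBIDDEN_RULES: readonly { name: string; pattern: RegExp }[] = [
  { name: "wall-clock", pattern: /\bDate\.(now|UTC)\s*\(|\bnew\s+Date\b/ },
  { name: "hi-res-clock", pattern: /\bperformance\.now\s*\(|\bprocess\.hrtime\b/ },
  { name: "rng", pattern: /\bMath\.random\s*\(|\brandomBytes\s*\(/ },
  { name: "timers", pattern: /\bset(Timeout|Interval)\s*\(/ },
  { name: "async", pattern: /\b(async|await)\b/ },
  {
    name: "kappa-import",
    pattern: /from\s+["'][^"']*\/rules\/(admission|engine)(\.js)?["']/,
  },
];

// Returns the names of every rule the source text violates.
function scanSource(source: string): string[] {
  return FORBIDDEN_RULES.filter((rule) => rule.pattern.test(source)).map(
    (rule) => rule.name,
  );
}

const cleanHits = scanSource("export const isNegative = (d: bigint) => d < 0n;");
const dirtyHits = scanSource(
  'import { evaluateAdmission } from "../rules/admission.js";\nconst t = Date.now();',
);
```

A regex scan like this also matches forbidden tokens inside comments and string literals, so a real suite would either accept that strictness or strip comments first.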

§5. Files we will create

| Path | Lines (est) | Purpose |
| --- | --- | --- |
| src/domains/integrity/detectors/coercion.ts | ~250–350 | Pure function detectCoercion + types |
| src/domains/integrity/detectors/__tests__/coercion.test.ts | ~400–600 | Empty / all-negative / all-obligates / mixed / determinism / static-scanner |
| docs/audits/p4-2-2-coercion-detector-audit.md | this file | Step 1 |
| docs/contracts/p4-2-2-coercion-detector-contract.md | Step 2 | Step 2 |
| docs/packets/p4-2-2-coercion-detector-packet.md | Step 3 | Step 3 |
| docs/verification/p4-2-2-coercion-detector-verification.md | Step 5 | Step 5 |

§6. Files we will NOT touch

  • src/domains/integrity/schema.ts (P4.1.1 — frozen surface; we IMPORT but do not MODIFY)
  • src/domains/rules/** (κ — NO direct import; adapter injection only)
  • src/domains/reputation/** (λ — NO direct import; adapter signature only)
  • src/domains/integrity/detectors/circular.ts (P4.2.1 — another agent owns)
  • src/domains/integrity/detectors/drift.ts (P4.2.3 — another agent owns)
  • src/server.ts (no MCP tool registration in this slice; P4.6.1 owns)

§7. Open questions resolved before contract drafting

  • Q: How does Map<Action, Outcome> work if Action is an object literal vs. a string? A: Map uses referential identity for object keys. Strings → value identity. The detector’s adapter contract requires that admission(actor, context) returns DISTINCT JS values for distinct actions (no aliasing). This is documented in the contract §I3.

  • Q: What if presented ⊊ available (κ admission allows MORE than was shown to the agent)? A: The detector flags only on the AVAILABLE set being degenerate. presented is captured in evidence for operator review per integrity.md L73 “presented (what the agent saw) vs available (what κ would have admitted)”. The contract §3.5 documents this.

  • Q: What if presented ⊋ available (the record claims more options than κ would admit)? A: Same — flag on the AVAILABLE set’s degeneracy. The discrepancy is in evidence for the operator. The detector does not double-flag this case as a separate “presented-but-not-actually-available” trap; that belongs to a future P4.2.4-style audit-trail-consistency detector.

  • Q: Is severity: 'HIGH' always correct, or does it vary by trigger? A: Per task spec — always 'HIGH'. integrity.md §2 L80 explicitly says severity=HIGH for coercion-trap. The three trigger conditions (empty / all-negative / all-obligates) share the same severity; only recommendation text varies.

  • Q: Does 'WARN' result block downstream? A: No. Per P4.1.1 schema + integrity.md L85, WARN is “operator console signal”. Only 'BLOCK' is governance-actionable. P4.4.1 escalation FSM decides whether to map a WARN coercion-trap into a HARD BLOCK at α tool-lock per integrity.md L155.
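Taken together, the answers above suggest a detector core along the following lines. This is a simplified sketch, not the contract: CoercionFinding is a hypothetical reduced shape rather than the 8-field Advisory, the trigger names are illustrative, and hashing, timestamps, and recommendation text are left out.

```typescript
interface Outcome {
  readonly reputation_delta: bigint;
  readonly obligation_beyond_capacity: boolean;
}
interface DecisionRecord<TAction, TContext> {
  readonly actor: string;
  readonly context: TContext;
  readonly options: readonly TAction[]; // what the agent saw
}
interface CoercionDeps<TAction, TContext> {
  admission: (actor: string, context: TContext) => TAction[];
  engine: (action: TAction, context: TContext) => Outcome;
}
type Trigger = "empty" | "all_negative" | "all_obligates";

// Hypothetical reduced finding; the real detector would emit Advisory values.
interface CoercionFinding<TAction> {
  check: "coercion_trap";
  result: "WARN";
  severity: "HIGH";
  trigger: Trigger;
  presented: readonly TAction[];
  available: readonly TAction[];
}

function detectCoercion<TAction, TContext>(
  record: DecisionRecord<TAction, TContext>,
  deps: CoercionDeps<TAction, TContext>,
): CoercionFinding<TAction>[] {
  const available = deps.admission(record.actor, record.context);
  const outcomes = new Map<TAction, Outcome>();
  for (const action of available) {
    outcomes.set(action, deps.engine(action, record.context));
  }
  const all = Array.from(outcomes.values());

  // Flag only on degeneracy of the AVAILABLE set; never throw.
  const triggers: Trigger[] = [];
  if (available.length === 0) {
    triggers.push("empty");
  } else {
    if (all.every((o) => o.reputation_delta < 0n)) triggers.push("all_negative");
    if (all.every((o) => o.obligation_beyond_capacity)) triggers.push("all_obligates");
  }
  return triggers.map(
    (trigger): CoercionFinding<TAction> => ({
      check: "coercion_trap",
      result: "WARN",
      severity: "HIGH",
      trigger,
      presented: record.options, // kept as evidence for operator review
      available,
    }),
  );
}
```

Note that presented is carried along as evidence only. It plays no part in the trigger logic, matching the answers to the two presented-versus-available questions.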

§8. Reuse vs. divergence — what we copy from P4.1.1, what we don’t

Copy:

  • The 8-field envelope shape (Advisory) — verbatim consumption.
  • The decision-hash discipline (computeDecisionHash over a stable input projection).
  • Determinism guardrails (no clock, no RNG, no async, no float literals).
  • The decisionRecord-as-input pattern (mirrors P4.1.1 “input is unknown” pattern at the hash boundary).

Diverge:

  • We do NOT define new enums — we consume AdvisoryRoleSchema etc. from P4.1.1.
  • We do NOT mint Lamport clocks — lamportNow: bigint is a caller-supplied parameter (mirrors θ P3.1.1’s nextLogical() separation).
  • We do NOT canonicalize-then-hash ourselves — computeDecisionHash from P4.1.1 does it.

Step 1 of 5 complete. Step 2: contract.

