P3.3.2 — Gossip Bloom Filter Dedup — Packet

Step 3 of the 5-step chain. Implementation plan gating Step 4.

§1. File plan

src/domains/consensus/bloom-dedup.ts (~200 LOC)

Structural outline:

1. Header JSDoc — module identity, ADR-003 Option C note, Math.* carve-out note
2. Imports — node:crypto NAMED import (createHash)
3. sizeFilter(n, p) — the Number-vs-bigint exception, isolated
4. class BloomFilter:
   - constructor({n, p}) — calls sizeFilter, allocs Uint8Array(ceil(m/8))
   - private indexFor(i, event_id): number — hash construction
   - insert(event_id): void
   - mightContain(event_id): boolean
   - reset(): void
   - stats(): {m, k, set_bits, occupancy}
5. (no re-exports — gossip-wire integration is via consumer call sites)

src/__tests__/domains/consensus/bloom-dedup.test.ts (~280 LOC)

Mirror the messages.test.ts grouping pattern. 10 describe-groups, ~17 it-blocks total.

Group 1  — sizeFilter math (3 tests)
Group 2  — Constructor allocates correctly (1 test)
Group 3  — No false negatives (1 test, 1000 inserts)
Group 4  — Empirical FPR primary seed (1 test, 1000 + 10000)
Group 5  — Empirical FPR secondary seed (1 test, 1000 + 10000)
Group 6  — reset() clears all bits (1 test)
Group 7  — Hash determinism (2 tests: same instances, order-invariance)
Group 8  — Static scanner — Math.* confined to sizeFilter (1 test)
Group 9  — Banned token scan (1 test)
Group 10 — stats().occupancy correctness (3 tests)

§2. Implementation specifics

2.1 Header JSDoc

The header must include:

  • Path + Phase + Wave designation
  • “Pure module” claim w/ enumerated exclusions
  • ADR-003 Option C in-process spike note (same as gossip-wire.ts)
  • Math.* carve-out explicitly named — confined to sizeFilter
  • Canonical references list:
    • docs/audits/p3-3-2-bloom-dedup-audit.md
    • docs/contracts/p3-3-2-bloom-dedup-contract.md
    • docs/packets/p3-3-2-bloom-dedup-packet.md
    • docs/spec/s08-gossip.md §Bloom filter algorithm + §Deduplication
    • src/domains/consensus/gossip-wire.ts (P3.3.1 — integration seam)
  • Forbidden-token self-scan statement (no Date.*, no float literals OUTSIDE sizeFilter, no Math.random, no async)

2.2 sizeFilter

export function sizeFilter(n: number, p: number): { m: number; k: number } {
  // ONE documented exception: ln() has no bigint stdlib equivalent. This
  // function is the ONLY place in this module where Math.* and floats are
  // permitted. Output integers flow into BloomFilter and never re-enter
  // float arithmetic.
  const m = Math.ceil((-n * Math.log(p)) / Math.log(2) ** 2);
  const k = Math.round((m / n) * Math.log(2));
  return { m, k };
}

Decimal exponent ** 2 is integer exponent of a float and is allowed inside this carve-out. No other Math operations are used.

2.3 BloomFilter constructor

export class BloomFilter {
  private readonly m: number;
  private readonly k: number;
  private readonly bits: Uint8Array;

  constructor({ n, p }: { n: number; p: number }) {
    const sized = sizeFilter(n, p);
    this.m = sized.m;
    this.k = sized.k;
    // ceil(m/8) bytes; >>> 3 is unsigned integer division by 8.
    // For m not a multiple of 8, the extra bits in the last byte are unused.
    const byteLen = (this.m + 7) >>> 3;
    this.bits = new Uint8Array(byteLen);
  }
  // ...
}

(m + 7) >>> 3 is the standard “ceiling-divide by 8” integer trick. No Math.ceil here — keeps Math.* confined to sizeFilter.

2.4 indexFor

private indexFor(i: number, event_id: Buffer): number {
  const seed = Buffer.alloc(4);
  seed.writeUInt32BE(i, 0);
  const h = createHash('sha256').update(seed).update(event_id).digest();
  const u32 = h.readUInt32BE(0);
  return u32 % this.m;
}
  • Buffer.alloc(4) zeroes 4 bytes.
  • writeUInt32BE(i, 0) writes i in big-endian (range [0, 2^32)).
  • createHash('sha256').update(seed).update(event_id).digest() produces a 32-byte Buffer.
  • readUInt32BE(0) reads the first 4 bytes as a uint32 in [0, 2^32).
  • u32 % this.m reduces to [0, m).

No Math used. All integer arithmetic on number (within 32-bit unsigned range, which JS handles exactly).

2.5 insert + mightContain bit ops

insert(event_id: Buffer): void {
  for (let i = 0; i < this.k; i++) {
    const bit = this.indexFor(i, event_id);
    const byteIdx = bit >>> 3;
    const mask = 1 << (bit & 7);
    this.bits[byteIdx] |= mask;
  }
}

mightContain(event_id: Buffer): boolean {
  for (let i = 0; i < this.k; i++) {
    const bit = this.indexFor(i, event_id);
    const byteIdx = bit >>> 3;
    const mask = 1 << (bit & 7);
    if ((this.bits[byteIdx] & mask) === 0) {
      return false;
    }
  }
  return true;
}
  • bit >>> 3 is unsigned integer division by 8 (byte offset within bits).
  • bit & 7 selects bit position within byte (0..7).
  • 1 << (bit & 7) is the bit mask. 1 << 0 = 1, …, 1 << 7 = 128. Safe — no overflow.
  • |= mask sets the bit; & mask reads it.
  • mightContain short-circuits on the first zero bit.

No Math, no floats, all integer ops on number.

2.6 reset + stats

reset(): void {
  this.bits.fill(0);
}

stats(): { m: number; k: number; set_bits: number; occupancy: number } {
  let set_bits = 0;
  for (let i = 0; i < this.bits.length; i++) {
    // Brian Kernighan's bit count
    let b = this.bits[i];
    while (b !== 0) {
      b &= b - 1;
      set_bits++;
    }
  }
  return { m: this.m, k: this.k, set_bits, occupancy: set_bits / this.m };
}

The popcount loop is Brian Kernighan’s classic bit-count: b & (b - 1) clears the lowest set bit. Integer-only.

occupancy: set_bits / this.m is float division — this is acceptable because stats() is a diagnostic function and its output is consumed only by tests and future ξ observability. The float does NOT feed back into the Bloom main path. The / operator does not invoke Math.*.

2.7 Test file plan

import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import path from 'node:path';

import { sizeFilter, BloomFilter } from '../../../domains/consensus/bloom-dedup.js';

// Deterministic event_id generator. Counter -> 32-byte SHA-256 Buffer.
function makeEventId(counter: number): Buffer {
  const seed = Buffer.alloc(8);
  seed.writeBigUInt64BE(BigInt(counter), 0);
  return createHash('sha256').update(seed).digest();
}

describe('bloom-dedup', () => {
  describe('Group 1: sizeFilter math', () => {
    test('default n=1000 p=0.01 → m=9586, k=7', () => {
      expect(sizeFilter(1000, 0.01)).toEqual({ m: 9586, k: 7 });
    });
    test('n=100 p=0.01 → m=959, k=7', () => {
      expect(sizeFilter(100, 0.01)).toEqual({ m: 959, k: 7 });
    });
    test('n=10000 p=0.001 returns positive integers', () => {
      const { m, k } = sizeFilter(10000, 0.001);
      expect(m).toBeGreaterThan(0);
      expect(k).toBeGreaterThan(0);
      expect(Number.isInteger(m)).toBe(true);
      expect(Number.isInteger(k)).toBe(true);
    });
  });

  // ... Groups 2–10 ...
});

Group 8 implementation (Math.* confinement scanner):

describe('Group 8: Math.* confined to sizeFilter', () => {
  test('no Math.* outside sizeFilter body', () => {
    const here = fileURLToPath(import.meta.url);
    const srcPath = path.resolve(
      path.dirname(here),
      '../../../domains/consensus/bloom-dedup.ts'
    );
    const source = readFileSync(srcPath, 'utf8');
    // Strip JSDoc /** ... */ blocks and // line comments so token scan
    // doesn't trip on doc references.
    const stripped = source
      .replace(/\/\*[\s\S]*?\*\//g, '')
      .replace(/\/\/.*$/gm, '');
    // Find the sizeFilter function body span.
    const sizeFilterStart = stripped.indexOf('function sizeFilter');
    expect(sizeFilterStart).toBeGreaterThan(0);
    // Walk forward, track brace depth from the first { at-or-after start.
    let i = stripped.indexOf('{', sizeFilterStart);
    let depth = 0;
    let sizeFilterEnd = -1;
    while (i < stripped.length) {
      const ch = stripped[i];
      if (ch === '{') depth++;
      else if (ch === '}') {
        depth--;
        if (depth === 0) {
          sizeFilterEnd = i;
          break;
        }
      }
      i++;
    }
    expect(sizeFilterEnd).toBeGreaterThan(sizeFilterStart);
    const outside =
      stripped.slice(0, sizeFilterStart) + stripped.slice(sizeFilterEnd + 1);
    // Math.* in body outside sizeFilter is banned.
    expect(outside).not.toMatch(/\bMath\./);
  });
});

Group 9 implementation (banned tokens):

describe('Group 9: banned tokens in source body', () => {
  test('no Date / setTimeout / fetch / await / async / Math.random', () => {
    const here = fileURLToPath(import.meta.url);
    const srcPath = path.resolve(
      path.dirname(here),
      '../../../domains/consensus/bloom-dedup.ts'
    );
    const source = readFileSync(srcPath, 'utf8');
    const stripped = source
      .replace(/\/\*[\s\S]*?\*\//g, '')
      .replace(/\/\/.*$/gm, '');
    for (const pattern of [
      /\bDate\./,
      /\bnew Date\b/,
      /\bsetTimeout\b/,
      /\bsetInterval\b/,
      /\bsetImmediate\b/,
      /\bfetch\b/,
      /\bXMLHttpRequest\b/,
      /\bprocess\.hrtime\b/,
      /\bprocess\.nextTick\b/,
      /\basync function\b/,
      /\bawait\b/,
      /\bMath\.random\b/,
    ]) {
      expect(stripped).not.toMatch(pattern);
    }
  });
});

Group 4 implementation (empirical FPR):

describe('Group 4: empirical FPR (primary seed)', () => {
  test('1000 inserts, 10000 disjoint queries → FPR < 1.5%', () => {
    const filter = new BloomFilter({ n: 1000, p: 0.01 });
    for (let i = 0; i < 1000; i++) {
      filter.insert(makeEventId(i));
    }
    let falsePositives = 0;
    for (let i = 1000; i < 11000; i++) {
      if (filter.mightContain(makeEventId(i))) {
        falsePositives++;
      }
    }
    const fpr = falsePositives / 10000;
    expect(fpr).toBeLessThan(0.015);
  });
});

§3. Implementation order

  1. Write src/domains/consensus/bloom-dedup.ts (full file in one pass).
  2. Write src/__tests__/domains/consensus/bloom-dedup.test.ts (full file in one pass).
  3. Run npm run build — TypeScript check.
  4. Run npm test -- --testPathPattern=bloom-dedup — focused test run.
  5. Run npm run lint.
  6. Run full npm run build && npm run lint && npm test — full gate.
  7. Commit Step 4 (feat).
  8. Write docs/verification/p3-3-2-bloom-dedup-verification.md capturing test counts + FPR observed.
  9. Commit Step 5 (verify).

§4. Risk + mitigation review

(Recap from audit §7 plus packet-level mitigations.)

Risk Mitigation in packet
FPR test stochastic flake Deterministic counter-keyed SHA-256 PRNG (no Math.random, no Date.now). Two seeds. 1.5% > 1.0% theoretical leaves wide margin.
Math.* token leak (m + 7) >>> 3 instead of Math.ceil(m/8) in constructor; bit-shifts in insert/mightContain; popcount via Brian Kernighan. Static scanner test enforces.
Modular bias on digest % m Audit §7 documents bound ~10^-6 for m=9586. Below FPR target by 4 orders of magnitude. No mitigation needed.
1 << 7 = 128 overflow Safe — mask range is 1..128, fits in signed int32 trivially.
Source-path resolution in Group 8/9 scanner Use fileURLToPath(import.meta.url) + path.resolve — same pattern as messages.test.ts Group 9 uses (assumes same import.meta.url support, which Jest ESM provides).

§5. Acceptance gate

  • File outlines specified for both source and test
  • sizeFilter signature + body specified verbatim
  • BloomFilter constructor + all methods specified verbatim
  • Math.* confinement strategy specified (bit-shifts + popcount)
  • Hash construction frozen (4-byte BE i prefix, SHA-256, first 4 bytes BE → uint32, mod m)
  • Test groups 1–10 enumerated with sample implementations for the non-trivial ones
  • Implementation order (steps 1–9) listed
  • Risk + mitigation review
  • Step 4 implements
  • Step 5 verifies

Step 3 complete. Proceeding to Step 4 (implement).


Back to top

Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.

This site uses Just the Docs, a documentation theme for Jekyll.