P3.3.2 — Gossip Bloom Filter Dedup — Packet
Step 3 of the 5-step chain. Implementation plan gating Step 4.
§1. File plan
src/domains/consensus/bloom-dedup.ts (~200 LOC)
Structural outline:
1. Header JSDoc — module identity, ADR-003 Option C note, Math.* carve-out note
2. Imports — node:crypto NAMED import (createHash)
3. sizeFilter(n, p) — the Number-vs-bigint exception, isolated
4. class BloomFilter:
- constructor({n, p}) — calls sizeFilter, allocs Uint8Array(ceil(m/8))
- private indexFor(i, event_id): number — hash construction
- insert(event_id): void
- mightContain(event_id): boolean
- reset(): void
- stats(): {m, k, set_bits, occupancy}
5. (no re-exports — gossip-wire integration is via consumer call sites)
src/__tests__/domains/consensus/bloom-dedup.test.ts (~280 LOC)
Mirror the messages.test.ts grouping pattern. 10 describe-groups, ~17 it-blocks total.
Group 1 — sizeFilter math (3 tests)
Group 2 — Constructor allocates correctly (1 test)
Group 3 — No false negatives (1 test, 1000 inserts)
Group 4 — Empirical FPR primary seed (1 test, 1000 + 10000)
Group 5 — Empirical FPR secondary seed (1 test, 1000 + 10000)
Group 6 — reset() clears all bits (1 test)
Group 7 — Hash determinism (2 tests: same instances, order-invariance)
Group 8 — Static scanner — Math.* confined to sizeFilter (1 test)
Group 9 — Banned token scan (1 test)
Group 10 — stats().occupancy correctness (3 tests)
§2. Implementation specifics
2.1 Header JSDoc
The header must include:
- Path + Phase + Wave designation
- “Pure module” claim w/ enumerated exclusions
- ADR-003 Option C in-process spike note (same as gossip-wire.ts)
- Math.* carve-out explicitly named — confined to
sizeFilter - Canonical references list:
docs/audits/p3-3-2-bloom-dedup-audit.mddocs/contracts/p3-3-2-bloom-dedup-contract.mddocs/packets/p3-3-2-bloom-dedup-packet.mddocs/spec/s08-gossip.md §Bloom filter algorithm + §Deduplicationsrc/domains/consensus/gossip-wire.ts(P3.3.1 — integration seam)
- Forbidden-token self-scan statement (no Date.*, no float literals OUTSIDE sizeFilter, no Math.random, no async)
2.2 sizeFilter
export function sizeFilter(n: number, p: number): { m: number; k: number } {
// ONE documented exception: ln() has no bigint stdlib equivalent. This
// function is the ONLY place in this module where Math.* and floats are
// permitted. Output integers flow into BloomFilter and never re-enter
// float arithmetic.
const m = Math.ceil((-n * Math.log(p)) / Math.log(2) ** 2);
const k = Math.round((m / n) * Math.log(2));
return { m, k };
}
Decimal exponent ** 2 is integer exponent of a float and is allowed inside this carve-out. No other Math operations are used.
2.3 BloomFilter constructor
export class BloomFilter {
private readonly m: number;
private readonly k: number;
private readonly bits: Uint8Array;
constructor({ n, p }: { n: number; p: number }) {
const sized = sizeFilter(n, p);
this.m = sized.m;
this.k = sized.k;
// ceil(m/8) bytes; >>> 3 is unsigned integer division by 8.
// For m not a multiple of 8, the extra bits in the last byte are unused.
const byteLen = (this.m + 7) >>> 3;
this.bits = new Uint8Array(byteLen);
}
// ...
}
(m + 7) >>> 3 is the standard “ceiling-divide by 8” integer trick. No Math.ceil here — keeps Math.* confined to sizeFilter.
2.4 indexFor
private indexFor(i: number, event_id: Buffer): number {
const seed = Buffer.alloc(4);
seed.writeUInt32BE(i, 0);
const h = createHash('sha256').update(seed).update(event_id).digest();
const u32 = h.readUInt32BE(0);
return u32 % this.m;
}
Buffer.alloc(4)zeroes 4 bytes.writeUInt32BE(i, 0)writesiin big-endian (range[0, 2^32)).createHash('sha256').update(seed).update(event_id).digest()produces a 32-byte Buffer.readUInt32BE(0)reads the first 4 bytes as a uint32 in[0, 2^32).u32 % this.mreduces to[0, m).
No Math used. All integer arithmetic on number (within 32-bit unsigned range, which JS handles exactly).
2.5 insert + mightContain bit ops
insert(event_id: Buffer): void {
for (let i = 0; i < this.k; i++) {
const bit = this.indexFor(i, event_id);
const byteIdx = bit >>> 3;
const mask = 1 << (bit & 7);
this.bits[byteIdx] |= mask;
}
}
mightContain(event_id: Buffer): boolean {
for (let i = 0; i < this.k; i++) {
const bit = this.indexFor(i, event_id);
const byteIdx = bit >>> 3;
const mask = 1 << (bit & 7);
if ((this.bits[byteIdx] & mask) === 0) {
return false;
}
}
return true;
}
bit >>> 3is unsigned integer division by 8 (byte offset withinbits).bit & 7selects bit position within byte (0..7).1 << (bit & 7)is the bit mask.1 << 0 = 1, …,1 << 7 = 128. Safe — no overflow.|= masksets the bit;& maskreads it.mightContainshort-circuits on the first zero bit.
No Math, no floats, all integer ops on number.
2.6 reset + stats
reset(): void {
this.bits.fill(0);
}
stats(): { m: number; k: number; set_bits: number; occupancy: number } {
let set_bits = 0;
for (let i = 0; i < this.bits.length; i++) {
// Brian Kernighan's bit count
let b = this.bits[i];
while (b !== 0) {
b &= b - 1;
set_bits++;
}
}
return { m: this.m, k: this.k, set_bits, occupancy: set_bits / this.m };
}
The popcount loop is Brian Kernighan’s classic bit-count: b & (b - 1) clears the lowest set bit. Integer-only.
occupancy: set_bits / this.m is float division — this is acceptable because stats() is a diagnostic function and its output is consumed only by tests and future ξ observability. The float does NOT feed back into the Bloom main path. The / operator does not invoke Math.*.
2.7 Test file plan
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import path from 'node:path';
import { sizeFilter, BloomFilter } from '../../../domains/consensus/bloom-dedup.js';
// Deterministic event_id generator. Counter -> 32-byte SHA-256 Buffer.
function makeEventId(counter: number): Buffer {
const seed = Buffer.alloc(8);
seed.writeBigUInt64BE(BigInt(counter), 0);
return createHash('sha256').update(seed).digest();
}
describe('bloom-dedup', () => {
describe('Group 1: sizeFilter math', () => {
test('default n=1000 p=0.01 → m=9586, k=7', () => {
expect(sizeFilter(1000, 0.01)).toEqual({ m: 9586, k: 7 });
});
test('n=100 p=0.01 → m=959, k=7', () => {
expect(sizeFilter(100, 0.01)).toEqual({ m: 959, k: 7 });
});
test('n=10000 p=0.001 returns positive integers', () => {
const { m, k } = sizeFilter(10000, 0.001);
expect(m).toBeGreaterThan(0);
expect(k).toBeGreaterThan(0);
expect(Number.isInteger(m)).toBe(true);
expect(Number.isInteger(k)).toBe(true);
});
});
// ... Groups 2–10 ...
});
Group 8 implementation (Math.* confinement scanner):
describe('Group 8: Math.* confined to sizeFilter', () => {
test('no Math.* outside sizeFilter body', () => {
const here = fileURLToPath(import.meta.url);
const srcPath = path.resolve(
path.dirname(here),
'../../../domains/consensus/bloom-dedup.ts'
);
const source = readFileSync(srcPath, 'utf8');
// Strip JSDoc /** ... */ blocks and // line comments so token scan
// doesn't trip on doc references.
const stripped = source
.replace(/\/\*[\s\S]*?\*\//g, '')
.replace(/\/\/.*$/gm, '');
// Find the sizeFilter function body span.
const sizeFilterStart = stripped.indexOf('function sizeFilter');
expect(sizeFilterStart).toBeGreaterThan(0);
// Walk forward, track brace depth from the first { at-or-after start.
let i = stripped.indexOf('{', sizeFilterStart);
let depth = 0;
let sizeFilterEnd = -1;
while (i < stripped.length) {
const ch = stripped[i];
if (ch === '{') depth++;
else if (ch === '}') {
depth--;
if (depth === 0) {
sizeFilterEnd = i;
break;
}
}
i++;
}
expect(sizeFilterEnd).toBeGreaterThan(sizeFilterStart);
const outside =
stripped.slice(0, sizeFilterStart) + stripped.slice(sizeFilterEnd + 1);
// Math.* in body outside sizeFilter is banned.
expect(outside).not.toMatch(/\bMath\./);
});
});
Group 9 implementation (banned tokens):
describe('Group 9: banned tokens in source body', () => {
test('no Date / setTimeout / fetch / await / async / Math.random', () => {
const here = fileURLToPath(import.meta.url);
const srcPath = path.resolve(
path.dirname(here),
'../../../domains/consensus/bloom-dedup.ts'
);
const source = readFileSync(srcPath, 'utf8');
const stripped = source
.replace(/\/\*[\s\S]*?\*\//g, '')
.replace(/\/\/.*$/gm, '');
for (const pattern of [
/\bDate\./,
/\bnew Date\b/,
/\bsetTimeout\b/,
/\bsetInterval\b/,
/\bsetImmediate\b/,
/\bfetch\b/,
/\bXMLHttpRequest\b/,
/\bprocess\.hrtime\b/,
/\bprocess\.nextTick\b/,
/\basync function\b/,
/\bawait\b/,
/\bMath\.random\b/,
]) {
expect(stripped).not.toMatch(pattern);
}
});
});
Group 4 implementation (empirical FPR):
describe('Group 4: empirical FPR (primary seed)', () => {
test('1000 inserts, 10000 disjoint queries → FPR < 1.5%', () => {
const filter = new BloomFilter({ n: 1000, p: 0.01 });
for (let i = 0; i < 1000; i++) {
filter.insert(makeEventId(i));
}
let falsePositives = 0;
for (let i = 1000; i < 11000; i++) {
if (filter.mightContain(makeEventId(i))) {
falsePositives++;
}
}
const fpr = falsePositives / 10000;
expect(fpr).toBeLessThan(0.015);
});
});
§3. Implementation order
- Write
src/domains/consensus/bloom-dedup.ts(full file in one pass). - Write
src/__tests__/domains/consensus/bloom-dedup.test.ts(full file in one pass). - Run
npm run build— TypeScript check. - Run
npm test -- --testPathPattern=bloom-dedup— focused test run. - Run
npm run lint. - Run full
npm run build && npm run lint && npm test— full gate. - Commit Step 4 (feat).
- Write
docs/verification/p3-3-2-bloom-dedup-verification.mdcapturing test counts + FPR observed. - Commit Step 5 (verify).
§4. Risk + mitigation review
(Recap from audit §7 plus packet-level mitigations.)
| Risk | Mitigation in packet |
|---|---|
| FPR test stochastic flake | Deterministic counter-keyed SHA-256 PRNG (no Math.random, no Date.now). Two seeds. 1.5% > 1.0% theoretical leaves wide margin. |
| Math.* token leak | (m + 7) >>> 3 instead of Math.ceil(m/8) in constructor; bit-shifts in insert/mightContain; popcount via Brian Kernighan. Static scanner test enforces. |
Modular bias on digest % m |
Audit §7 documents bound ~10^-6 for m=9586. Below FPR target by 4 orders of magnitude. No mitigation needed. |
1 << 7 = 128 overflow |
Safe — mask range is 1..128, fits in signed int32 trivially. |
| Source-path resolution in Group 8/9 scanner | Use fileURLToPath(import.meta.url) + path.resolve — same pattern as messages.test.ts Group 9 uses (assumes same import.meta.url support, which Jest ESM provides). |
§5. Acceptance gate
- File outlines specified for both source and test
- sizeFilter signature + body specified verbatim
- BloomFilter constructor + all methods specified verbatim
- Math.* confinement strategy specified (bit-shifts + popcount)
- Hash construction frozen (4-byte BE i prefix, SHA-256, first 4 bytes BE → uint32, mod m)
- Test groups 1–10 enumerated with sample implementations for the non-trivial ones
- Implementation order (steps 1–9) listed
- Risk + mitigation review
- Step 4 implements
- Step 5 verifies
Step 3 complete. Proceeding to Step 4 (implement).