P3.3.2 — Gossip Bloom Filter Dedup — Contract
Step 2 of the 5-step chain. Behavioral contract for src/domains/consensus/bloom-dedup.ts.
§1. Module identity
- Path:
src/domains/consensus/bloom-dedup.ts - Domain: θ Consensus (Phase 3, R89 Wave 4)
- Depends on:
node:crypto(createHash) — NAMED import. No other module imports. - Pure module: no I/O, no DB, no network, no async, no wall-clock. No randomness from the module body — caller supplies event_id Buffers and (in tests) the seed for synthetic event_ids.
- State: the
BloomFilterclass holds mutable per-instance state (bits: Uint8Array). Each instance is per-round and discarded at end-of-round. No module-level mutable state.
§2. Function exports
2.1 sizeFilter(n: number, p: number): { m: number, k: number }
Returns the integer bit count m and integer hash count k for a Bloom filter sized to hold n expected items at false-positive rate p.
Math:
m = Math.ceil(-n * Math.log(p) / (Math.log(2) ** 2))
k = Math.round((m / n) * Math.log(2))
Contract:
nMUST be a positive integer≥ 1. Behavior forn ≤ 0or non-integernis unspecified (no validation throw — caller’s contract).pMUST be in(0, 1).p ≥ 1producesm ≤ 0which then clamps toMath.ceilof a non-positive value — caller’s contract.- The return values are always integers (
Math.ceil/Math.roundenforce this).
Number-vs-bigint exception: the entire θ phase enforces bigint-only arithmetic outside of explicit carve-outs. sizeFilter is the ONE documented carve-out in this module because Math.log has no bigint stdlib equivalent and adding a polyfill for one helper is worse than documenting the carve-out. The returned m and k are integers and flow into the BloomFilter constructor as integer numbers.
Pure function — same inputs always produce the same output. No state.
Test:
sizeFilter(1000, 0.01)returns{ m: 9586, k: 7 }(Math.ceil(9585.058…) = 9586). The audit cited “m ≈ 9585” from the spec; the actual ceiled value rounds up to 9586. The test asserts the precise computed values, not the spec’s approximation.
2.2 class BloomFilter
Constructor
new BloomFilter({ n: number, p: number })
Builds a fresh filter sized via sizeFilter(n, p). Allocates a zero-initialized Uint8Array of length Math.ceil(m / 8) bytes.
Defaults: the constructor does NOT supply defaults. The caller passes n and p explicitly. The default of n=1000, p=0.01 is a doc-level convention, not an in-code default.
Properties (private, no public getters except via stats()):
private readonly m: number— bit count (fromsizeFilter)private readonly k: number— hash count (fromsizeFilter)private readonly bits: Uint8Array—ceil(m/8)bytes
insert(event_id: Buffer): void
For each i in [0, k), compute bit_index = indexFor(i, event_id) and set bit bit_index in bits. Idempotent: inserting the same event_id twice has no observable effect beyond the first insert.
Pure side-effect: mutates this.bits only. No return value. No exceptions for valid Buffer input.
mightContain(event_id: Buffer): boolean
For each i in [0, k), compute bit_index = indexFor(i, event_id) and check bit bit_index in bits. Returns true iff all k bits are set.
No false negatives: if insert(x) was called, mightContain(x) returns true. (This is Bloom’s correctness guarantee.)
False positives bounded by p: when fewer than n items have been inserted, mightContain(y) for y ≠ any inserted x returns true with probability ≤ p. Empirically verified by test §5.4.
Pure function w.r.t. external state — reads this.bits only.
reset(): void
Zeroes the bit array. After reset(), mightContain(x) returns false for all x (until subsequent inserts).
The class’s m and k are unchanged — the filter remains sized for the same (n, p) config. Use reset() between gossip rounds rather than constructing a new filter; same allocation reused.
Side-effect: mutates this.bits (zeros all bytes).
stats(): { m: number, k: number, set_bits: number, occupancy: number }
Returns introspection data:
m— bit count (same as constructor)k— hash countset_bits— number of bits currently set inbitsoccupancy—set_bits / m, in[0, 1]
Used by tests and by future ξ observability surfaces. occupancy is a float (Number) — this is acceptable in a return value for diagnostic purposes; it does not feed back into the filter’s main path.
Pure function w.r.t. external state.
§3. Internal helper
3.1 indexFor(i: number, event_id: Buffer): number
Returns a bit index in [0, m) for hash function i and item event_id.
Algorithm:
const seed = Buffer.alloc(4);
seed.writeUInt32BE(i, 0);
const h = createHash('sha256').update(seed).update(event_id).digest();
const u32 = h.readUInt32BE(0);
return u32 % m;
- 4-byte big-endian encoding of
i(range[0, k)wherek ≤ 255in practice; 4 bytes leaves headroom). - SHA-256 over
seed || event_id— deterministic, no randomness. - First 4 bytes of the 32-byte digest read as big-endian uint32.
- Reduction
mod m— modular bias is bounded by(2^32 mod m) / 2^32≈10^-6form = 9586, far below the FPR target.
indexFor is a private method on BloomFilter. It reads this.m and is NOT exported.
§4. Type exports
The module exports:
sizeFilter— named function exportBloomFilter— named class export
No other types or values are exported. The constructor parameter type { n: number, p: number } is inline (no named interface).
§5. Test contract
The test file src/__tests__/domains/consensus/bloom-dedup.test.ts MUST cover:
5.1 sizeFilter math
sizeFilter(1000, 0.01)returnsm = 9586andk = 7sizeFilter(100, 0.01)returnsm = 959andk = 7sizeFilter(10000, 0.001)returns positive integermandk(sanity)
5.2 Constructor allocates correct size
new BloomFilter({n: 1000, p: 0.01}).stats().m === 9586new BloomFilter({n: 1000, p: 0.01}).stats().k === 7
5.3 No false negatives
- Insert 1000 distinct Buffers
- For each, assert
mightContain(x) === true
5.4 Empirical FPR < 1.5%
- Insert 1000 distinct Buffers (deterministic seed: counters 0–999)
- Query 10000 distinct Buffers NOT in the insert set (counters 1000–10999)
- Count
mightContaintrue-returns - Assert observed FPR < 0.015
5.5 FPR robustness with second seed
- Repeat §5.4 with a different seed offset (counters 100000–100999 inserts, 101000–110999 queries)
- Assert observed FPR < 0.015
5.6 reset() clears all bits
- Insert 1000 Buffers
- Assert
stats().set_bits > 0 - Call
reset() - Assert
stats().set_bits === 0 - For all 1000 originals, assert
mightContain(x) === false
5.7 Hash determinism
- Two
BloomFilterinstances with same(n, p)and same insert sequence MUST produce identicalbitsarrays. - Insertion order MUST NOT matter — inserting
[a, b, c]and[c, a, b]produces identicalbits.
5.8 Static scanner — Math.* confined to sizeFilter
- Read
bloom-dedup.tssource - Locate the
sizeFilterfunction boundary - Assert all
Math.*occurrences are inside thesizeFilterbody - Assert no
Math.random,Math.floor,Math.abs, etc. outsidesizeFilter
5.9 Banned token scan
Mirrors messages.test.ts Group 9. Scans the source body (with JSDoc stripped) for:
Date./new DatesetTimeout/setInterval/setImmediatefetch/XMLHttpRequestprocess.hrtime/process.nextTickawait/async functionMath.random
All MUST be absent.
5.10 stats().occupancy correctness
- Fresh filter:
occupancy === 0 - After 100 inserts at n=1000: occupancy roughly
k * 100 / m ≈ 7 * 100 / 9586 ≈ 0.073(test loose bound 0.05 < occupancy < 0.1 to allow for hash collisions) - After 1000 inserts:
occupancy < 1.0(always — Bloom doesn’t fill completely)
§6. Reuse + integration
- Reuse:
node:crypto.createHash('sha256')— same import style as messages.ts. - Integration: gossip-wire.ts’s
BuildIWANTOptions.seen?is the consumer seam. No modification to gossip-wire.ts required in this slice.
Example call site (NOT in this slice — illustrative for the reviewer):
const filter = new BloomFilter({ n: 1000, p: 0.01 });
// at start of each gossip round; receiver inserts event_ids it has already seen
for (const id of recentlySeen) filter.insert(id);
// when receiving an IHAVE:
const iwant = buildIWANT(ihave, localStorage.has, {
seen: (id) => filter.mightContain(id),
sender_id: 'arbiter-N',
timestamp_logical: nextLogical(lamport),
});
// at end of round:
filter.reset(); // or discard the filter
§7. Non-goals
This slice is deliberately scoped narrow:
- ❌ No persistence — filters are in-memory only
- ❌ No serialization wire format — filters are local to one arbiter’s round state
- ❌ No multi-round filter sharing — fresh filter per round per s08 §67
- ❌ No counting Bloom (delete support) — standard Bloom has no remove; reset() is the only clear
- ❌ No size-doubling on overflow — filter is fixed at construction
- ❌ No gossip-wire.ts modification — the consumer wiring is downstream
§8. Acceptance gate
- Module identity declared (path, deps, purity)
sizeFiltersignature + math + Number-vs-bigint carve-outBloomFilterclass surface (ctor, insert, mightContain, reset, stats)indexForprivate helper algorithm- Test contract enumerates 10 test groups
- Integration seam to gossip-wire.ts documented
- Non-goals listed
- Step 3 packet sequences the implementation
- Step 4 implements both files
- Step 5 verifies all 10 test groups pass + build + lint
Step 2 complete. Proceeding to Step 3 (packet).