Audit: P0.8.2 η Three-Zone Retention
Task: P0.8.2 — second η (Proof Store) surface in Colibri (retention / archival policy)
Branch: feature/p0-8-2-retention
Worktree: .worktrees/claude/p0-8-2-retention
Date: 2026-04-17
Auditor: T3 Executor (Claude Opus 4.7)
Base commit: dc660381 (origin/main)
1. Surface inventory
1.1 η surface state (before this task)
Exactly one source file in src/domains/proof/:
| File | Task | Exports |
|---|---|---|
| `src/domains/proof/merkle.ts` | P0.8.1 (#136) | EMPTY_TREE_ROOT, buildMerkleTree, generateProof, verifyProof, types MerkleProof, MerkleProofNode, MerkleTreeResult |
No retention code. No retention.ts. No src/__tests__/domains/proof/retention.test.ts. This task is a greenfield sibling to merkle.ts inside the same domain.
1.2 Related prior art — ζ records that will be archived
The records this task manages belong to the ζ Decision Trail surface, shipped by P0.7.2:
| File | Relevant surface |
|---|---|
| `src/domains/trail/schema.ts` | ThoughtRecordSchema, ThoughtRecord, computeHash, canonicalize, ZERO_HASH, THOUGHT_TYPES |
| `src/domains/trail/repository.ts` | createThoughtRecord, getThoughtRecord, listThoughtRecords: CRUD used to populate thought_records |
| `src/db/migrations/003_thought_records.sql` | Defines thought_records table: 9 columns, UNIQUE(hash), indexes on (task_id, created_at) and (prev_hash) |
The retention module MUST preserve chain integrity — in particular hash, prev_hash, id, type, task_id, timestamp, agent_id, created_at must never be destroyed, because P0.7.3 (audit_verify_chain, not yet shipped) will need to re-hash the 6-field subset to validate the chain. Only content + derived content_compressed are mutable.
1.3 Position-vs-age interpretation
Task spec (task-breakdown.md §P0.8.2):
- Hot: last 100 records — full content in DB
- Warm: records 101–1000 — content compressed (JSON → gzip → base64)
- Cold: records 1001+ — content hash only (full content deleted)
Donor extraction (docs/reference/extractions/eta-proof-store-extraction.md §“Retention Zones”):
- Hot: 7 days, Warm: 30 days, Cold: 365 days — TTL-based (time-based zones).
Mismatch resolved in favour of the task spec. The task-breakdown is authoritative; the donor extraction is HERITAGE (quarantine-tagged). Colibri P0.8.2 uses position-based zones, ranked by rowid within each task_id chain:
- Position 1..100 (newest first by `rowid DESC`) → Hot
- Position 101..1000 → Warm
- Position 1001+ → Cold
Ordering uses rowid (not created_at), per the P0.7.2 lesson: millisecond-precision timestamps collide on fast CI. Within one task_id, ORDER BY rowid DESC LIMIT 100 OFFSET 0 selects Hot, LIMIT 900 OFFSET 100 selects Warm, and LIMIT -1 OFFSET 1000 selects Cold. We compute a record’s zone from its rowid rank within the same task_id chain.
This interpretation is documented in the contract for future clarity.
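The position-to-zone mapping above can be sketched in a few lines. This is a minimal illustration, assuming the boundaries stated in the task spec (100 and 1000) and a newest-first, 1-indexed position; `positionInChain`, `RowRef`, and the constant names are hypothetical and work over an in-memory array rather than the real thought_records table.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the shipped API.
type Zone = "hot" | "warm" | "cold";

const HOT_MAX_POSITION = 100; // positions 1..100
const WARM_MAX_POSITION = 1000; // positions 101..1000

// Map a 1-indexed position (1 = newest record in the task chain) to a zone.
function computeZone(position: number): Zone {
  if (!Number.isInteger(position) || position < 1) {
    throw new RangeError(`position must be a positive integer, got ${position}`);
  }
  if (position <= HOT_MAX_POSITION) return "hot";
  if (position <= WARM_MAX_POSITION) return "warm";
  return "cold";
}

// Minimal row shape for illustration; the real table has more columns.
interface RowRef {
  rowid: number;
  taskId: string;
}

// Rank a row within its own task chain, newest first (highest rowid = position 1).
// Returns null when the rowid is not present.
function positionInChain(rows: RowRef[], rowid: number): number | null {
  const target = rows.find((r) => r.rowid === rowid);
  if (target === undefined) return null;
  const newer = rows.filter(
    (r) => r.taskId === target.taskId && r.rowid > target.rowid,
  ).length;
  return newer + 1;
}
```

Counting the newer rows in the same chain is equivalent to ranking by rowid DESC, and it never consults created_at, so timestamp collisions cannot reorder records.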
1.4 Test infrastructure state
src/__tests__/domains/proof/merkle.test.ts exists (P0.8.1). Pure unit test — no DB, no MCP. This task’s tests DO need a DB (we archive SQLite rows), so the template follows src/__tests__/domains/trail/repository.test.ts:
- `import Database from 'better-sqlite3'`: in-memory
- Migration SQL loaded once at module scope, exec’d into fresh DB per test via `beforeEach`
- `afterEach` closes handle
No test-path correction needed beyond the one in the task prompt (src/__tests__/domains/proof/retention.test.ts, NOT tests/domains/proof/...).
1.5 Existing audit/contract/packet docs for P0.8.2
None — confirmed via ls docs/audits/. Only P0.8.1 artifacts exist:
p0-8-1-merkle-tree-audit.md
p0-8-1-merkle-tree-contract.md
p0-8-1-merkle-tree-packet.md
This document is the first P0.8.2 artefact.
1.6 Migration number
Next available migration number: 005. Confirmed via ls src/db/migrations/:
001_init.sql
002_tasks.sql
003_thought_records.sql
004_skills.sql
This task ships 005_retention.sql.
1.7 No collision with parallel tasks
- P0.8.3 (`src/tools/merkle.ts`): different file, disjoint concern.
- P0.9.x (`src/domains/integrations/`): different directory.
- P0.8.1 (`src/domains/proof/merkle.ts`): CONSUMED, not modified. No edit to `merkle.ts`.
2. Files to create
| Path | Purpose |
|---|---|
| `docs/audits/p0-8-2-retention-audit.md` | THIS file |
| `docs/contracts/p0-8-2-retention-contract.md` | Behavioral contract (Zod schemas, invariants, acceptance map) |
| `docs/packets/p0-8-2-retention-packet.md` | Implementation plan |
| `docs/verification/p0-8-2-retention-verification.md` | Test + lint evidence |
| `src/db/migrations/005_retention.sql` | New columns on thought_records: zone TEXT, content_compressed TEXT, content_hash TEXT |
| `src/domains/proof/retention.ts` | archiveRecord, retrieveRecord, computeZone, types, zod schemas |
| `src/__tests__/domains/proof/retention.test.ts` | Acceptance-criteria-aligned tests |
3. Files to modify
None. No edits to existing source files. The migration adds new columns but does not rewrite 003_thought_records.sql or any existing .ts file.
4. Schema changes
Add three nullable columns to thought_records via 005_retention.sql:
| Column | Type | Default | Meaning |
|---|---|---|---|
| `zone` | TEXT | 'hot' | 'hot' \| 'warm' \| 'cold'. NULL is treated as 'hot' in reads (grace period for legacy rows pre-migration). Fresh writes default to 'hot'. |
| `content_compressed` | TEXT | NULL | Base64-encoded gzip of JSON(record). Populated only in Warm. NULL otherwise. |
| `content_hash` | TEXT | NULL | SHA-256 (lowercase hex, 64 chars) of the original content string. Populated when transitioning to Warm or Cold; preserved in Cold even after content + content_compressed become NULL. |
Rationale for three columns:
- `zone`: primary identifier of retention state, indexable for any future “list by zone” operation.
- `content_compressed`: compressed payload only used in Warm; nullable because Hot + Cold don’t use it.
- `content_hash`: sha256 of the original content; preserves content-level provability after the row’s `content` is deleted. (Note: distinct from the record’s `hash` column, which hashes the 6-field subset including content, so the chain hash already commits to content. `content_hash` is a convenience for cold-zone consumers and matches the task spec “content hash only”.)
No index is added on zone in this migration — archival operations key by id or by task_id, both already indexed. A future task may add idx_trail_zone if retention-pass queries warrant it.
The existing CHECK-less type column pattern is followed here: zone is validated at the Zod/repository layer, not at the DB.
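Because the database does not constrain `zone`, the read path has to do two things: map NULL to 'hot' and reject anything outside the three known values. The sketch below shows that logic with a plain type guard so it runs without dependencies; the real module is described as using Zod, and `isZone` and `readZone` are hypothetical names.

```typescript
// Hypothetical sketch; stands in for the Zod/repository-layer validation.
const ZONES = ["hot", "warm", "cold"] as const;
type Zone = (typeof ZONES)[number];

// Type guard: true only for the three known zone strings.
function isZone(value: unknown): value is Zone {
  return typeof value === "string" && ZONES.some((z) => z === value);
}

// Read-side normalization: NULL (legacy row) is treated as 'hot';
// anything else must be one of the three known zones.
function readZone(raw: string | null): Zone {
  if (raw === null) return "hot";
  if (!isZone(raw)) throw new Error(`unknown retention zone: ${raw}`);
  return raw;
}
```

Throwing on an unknown value is one possible policy; it surfaces corrupt rows early instead of silently treating them as Hot.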
5. Known hazards (tracked for packet / verify)
- Cross-worktree leak: `git status` at start returned clean tree. Re-verify before each commit.
- SQLite rowid ordering: use `ORDER BY rowid ASC/DESC`; NEVER `created_at`. Position is 1-indexed among records of the same `task_id`, newest = position 1 (so Hot = positions 1..100).
- Jest + zod: do NOT use `jest.isolateModulesAsync`. Tests use in-memory DB with real migration SQL (the P0.7.2 pattern).
- Idempotency: archiveRecord called twice on a record already in the target zone must no-op.
- Hash preservation: never NULL out `hash`, `prev_hash`, or the 6 subset-hash fields. Only `content` + `content_compressed` are touched.
- Migration: adding nullable columns via `ALTER TABLE` is safe; no data rewrite needed. Existing rows get zone=NULL, content_compressed=NULL, content_hash=NULL, which the repository treats as “hot / not yet archived”.
- `better-sqlite3` `.exec('')` throws on empty SQL. The 005 migration has real SQL (the 3 ALTERs), so the `stripSqlComments` empty-path in `src/db/index.ts` doesn’t apply.
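The idempotency and hash-preservation hazards can be checked against a pure transition function before any SQL is involved. The following is a sketch under stated assumptions, not the module's implementation: `RetentionRow` and `transition` are hypothetical, a Warm row is assumed to have its `content` cleared once the compressed copy exists, and a backward request is assumed to be a no-op rather than an error.

```typescript
import { createHash } from "node:crypto";
import { gzipSync } from "node:zlib";

type Zone = "hot" | "warm" | "cold";

// Only the columns this sketch touches, plus two chain fields to show they survive.
interface RetentionRow {
  id: string;
  hash: string;
  prev_hash: string;
  zone: Zone | null; // NULL = legacy row, read as 'hot'
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

const ZONE_RANK: Record<Zone, number> = { hot: 0, warm: 1, cold: 2 };

// Reuse a stored content hash, otherwise derive it from the live content.
function contentHashFor(row: RetentionRow): string {
  if (row.content_hash !== null) return row.content_hash;
  if (row.content === null) throw new Error(`row ${row.id} has no content to hash`);
  return createHash("sha256").update(row.content, "utf8").digest("hex");
}

// Return the row as it should look after moving to `target`.
// Same-zone requests (and, by assumption, backward ones) return the input unchanged.
function transition(row: RetentionRow, target: Zone): RetentionRow {
  const current: Zone = row.zone ?? "hot";
  if (ZONE_RANK[target] <= ZONE_RANK[current]) return row;

  const contentHash = contentHashFor(row);
  if (target === "warm") {
    if (row.content === null) throw new Error(`row ${row.id} has no content to compress`);
    const compressed = gzipSync(Buffer.from(row.content, "utf8")).toString("base64");
    // Assumption: content is cleared once the compressed copy is written.
    return { ...row, zone: "warm", content: null, content_compressed: compressed, content_hash: contentHash };
  }
  // target === "cold": drop both payload columns, keep only the content hash.
  return { ...row, zone: "cold", content: null, content_compressed: null, content_hash: contentHash };
}
```

The spread copies `id`, `hash`, and `prev_hash` untouched in every branch, so a test can assert chain-field equality before and after each transition.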
6. Consumption plan
archiveRecord(db, id) and retrieveRecord(db, id) are exported from src/domains/proof/retention.ts. No MCP tool is registered by this task — it’s a library surface. A future P0.8.x task or the writeback machinery may wrap these into MCP tools, but that is out-of-scope here (task spec registers no tool).
Pure functions beside the two primitives:
- `computeZone(position: number): 'hot' | 'warm' | 'cold'` decides the target zone given a 1-indexed position in the per-task chain.
- `getRecordPosition(db, id): number | null` returns the record’s position within its task chain, or `null` if the id is unknown.
- `gzipContent(content: string): string` / `gunzipContent(compressed: string): string` are pure compression helpers.
- `hashContent(content: string): string` returns the SHA-256 hex of the content string (distinct from the chain hash).
These helpers are exported so tests can exercise the pieces independently without needing 1000+ record fixtures.
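As an illustration of how small the compression and hashing helpers can be, here is a sketch built on Node's built-in `node:zlib` and `node:crypto`. The function names and signatures follow the list above; the bodies are an assumption about one reasonable implementation, not a copy of the module.

```typescript
import { createHash } from "node:crypto";
import { gunzipSync, gzipSync } from "node:zlib";

// content string -> gzip -> base64 text suitable for a TEXT column.
function gzipContent(content: string): string {
  return gzipSync(Buffer.from(content, "utf8")).toString("base64");
}

// Inverse of gzipContent: base64 text -> gunzip -> original string.
function gunzipContent(compressed: string): string {
  return gunzipSync(Buffer.from(compressed, "base64")).toString("utf8");
}

// SHA-256 of the content string as 64 lowercase hex characters.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}
```

A round-trip check such as `gunzipContent(gzipContent(s)) === s`, combined with comparing `hashContent` before and after, exercises the Warm path without building a 1000-record fixture.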
7. Out of scope
- No MCP tool registration (that’s a later P0.8.x task, not in task-breakdown for P0.8.2).
- No scheduled retention pass / cron: `archiveRecord` is called explicitly by a caller that iterates records.
- No index on `zone` column (defer to a later task if needed).
- No `audit_verify_chain` modifications to handle Cold records (P0.7.3 will handle the missing-content case when it lands).
- No unarchive / unzone operation. Forward-only state machine per contract §5.
8. Final checklist (audit gate → proceed to contract)
- Surface inventory complete
- Position-vs-age interpretation documented + chosen
- Files to create + modify listed
- Migration number confirmed (005)
- Schema changes enumerated (3 new columns on `thought_records`)
- Hazards tracked
- Out-of-scope fenced