Contract: P0.8.2 η Three-Zone Retention
Task: P0.8.2
Branch: feature/p0-8-2-retention
Date: 2026-04-17
Depends on: P0.8.1 (#136), P0.2.2 (Wave B)
1. Purpose
Ship a forward-only three-zone retention policy over ζ thought_records:
- Hot: newest 100 records per task chain, full content in DB.
- Warm: positions 101–1000, content gzip+base64-compressed into content_compressed; plain content NULLed.
- Cold: positions 1001+, content_hash preserved, both content and content_compressed NULLed.
The module provides two operations on individual records (archiveRecord, retrieveRecord) plus pure helpers. Chain-integrity columns (id, type, task_id, timestamp, prev_hash, hash) are never modified; only the mutable content payload is rearranged per zone.
2. Public surface (src/domains/proof/retention.ts)
2.1 Types
export type RetentionZone = 'hot' | 'warm' | 'cold';
/**
 * Payload returned by `retrieveRecord`. Exactly one of the three shapes
 * is returned based on the record's current zone.
 */
export type RetrieveResult =
  | {
      id: string;
      zone: 'hot';
      content: string;
      content_available: true;
    }
  | {
      id: string;
      zone: 'warm';
      content: string; // decompressed from content_compressed
      content_available: true;
    }
  | {
      id: string;
      zone: 'cold';
      hash_only: string; // 64-char sha256 hex of original content
      content_available: false;
    };
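Because RetrieveResult is a discriminated union on zone, a caller can narrow it with a switch before touching content or hash_only. The following is a minimal caller-side sketch: describeResult is a hypothetical helper that is not part of the contract, and the type is repeated in compact form only so the snippet compiles on its own.

```typescript
type RetrieveResult =
  | { id: string; zone: 'hot'; content: string; content_available: true }
  | { id: string; zone: 'warm'; content: string; content_available: true }
  | { id: string; zone: 'cold'; hash_only: string; content_available: false };

// Hypothetical helper (not in retention.ts): summarises a lookup result.
function describeResult(result: RetrieveResult | null): string {
  if (result === null) return 'unknown record';
  switch (result.zone) {
    case 'hot':
    case 'warm':
      // Narrowed to a variant that carries `content`.
      return `${result.zone}: ${result.content.length} chars`;
    case 'cold':
      // Narrowed to the hash-only stub; `content` does not exist here.
      return `cold: sha256 ${result.hash_only.slice(0, 8)}`;
  }
}
```

Switching on zone rather than on content_available keeps the compiler's exhaustiveness check tied to the same tuple that RETENTION_ZONES enumerates.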
2.2 Zod schemas
export const RETENTION_ZONES = ['hot', 'warm', 'cold'] as const;
export const RetentionZoneSchema = z.enum(RETENTION_ZONES);
// Hot: [1..100] per task_id (newest = position 1)
// Warm: [101..1000]
// Cold: [1001..)
export const HOT_MAX_POSITION = 100;
export const WARM_MAX_POSITION = 1000;
export const ArchiveRecordInputSchema = z.object({
  id: z.string().min(1),
});
export const RetrieveRecordInputSchema = z.object({
  id: z.string().min(1),
});
2.3 Exports
| Name | Signature | Purpose |
|---|---|---|
| RETENTION_ZONES | readonly ['hot','warm','cold'] | Canonical zone tuple. |
| HOT_MAX_POSITION | 100 | Boundary constant. |
| WARM_MAX_POSITION | 1000 | Boundary constant. |
| RetentionZone | type | 'hot' \| 'warm' \| 'cold'. |
| RetrieveResult | type | Discriminated union on zone. |
| RetentionZoneSchema | z.enum(...) | Zod validator for zone strings. |
| computeZone(position) | (position: number) => RetentionZone | Pure: 1..100 → hot; 101..1000 → warm; 1001+ → cold; throws on position < 1. |
| getRecordPosition(db, id) | (db, id: string) => number \| null | Position (1-indexed, newest = 1) within same task_id ordered by rowid DESC. null if record id unknown. |
| hashContent(content) | (content: string) => string | SHA-256 lowercase hex of content bytes. 64 chars. |
| gzipContent(content) | (content: string) => string | gzipSync then base64-encode. |
| gunzipContent(compressed) | (compressed: string) => string | base64-decode then gunzipSync. |
| archiveRecord(db, id) | (db, id: string) => { id, from_zone, to_zone, changed: boolean } | Transition record to its computed zone. Idempotent. |
| retrieveRecord(db, id) | (db, id: string) => RetrieveResult \| null | Zone-aware read. null if id unknown. |
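The computeZone row above fully determines the boundaries, so the function can be sketched directly. This is one possible shape, not the shipped implementation: the use of RangeError and the rejection of non-integer positions are assumptions, since the contract only says the function throws on position < 1.

```typescript
type RetentionZone = 'hot' | 'warm' | 'cold';

const HOT_MAX_POSITION = 100;
const WARM_MAX_POSITION = 1000;

function computeZone(position: number): RetentionZone {
  // Assumption: non-integers are rejected too; the contract only names position < 1.
  if (!Number.isInteger(position) || position < 1) {
    throw new RangeError(`position must be an integer >= 1, got ${position}`);
  }
  if (position <= HOT_MAX_POSITION) return 'hot';   // 1..100
  if (position <= WARM_MAX_POSITION) return 'warm'; // 101..1000
  return 'cold';                                    // 1001+
}
```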
Each db parameter is a Database.Database from better-sqlite3; functions take it explicitly (dependency injection) so tests can use :memory: databases. This mirrors the P0.7.2 repository style.
3. Storage contract
3.1 Schema delta (005_retention.sql)
ALTER TABLE thought_records ADD COLUMN zone TEXT;
ALTER TABLE thought_records ADD COLUMN content_compressed TEXT;
ALTER TABLE thought_records ADD COLUMN content_hash TEXT;
All three are NULL-able. Existing rows get NULL in all three columns — the repository treats zone IS NULL as “hot / not yet archived”.
No index added on zone in this migration (audit §4 note).
3.2 Per-zone row invariants
| State | zone | content | content_compressed | content_hash |
|---|---|---|---|---|
| Fresh write (P0.7.2) | NULL | original string | NULL | NULL |
| After archiveRecord → hot | 'hot' | original string | NULL | NULL |
| After archiveRecord → warm | 'warm' | NULL | base64(gzip(original)) | sha256(original) |
| After archiveRecord → cold | 'cold' | NULL | NULL | sha256(original) |
Once set, content_hash is immutable. It is written when the record first leaves Hot and is retained unchanged in Cold.
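The per-zone row states can be read as two pure rewrites of the four mutable columns. The sketch below works on plain objects rather than on a database row; PayloadColumns, toWarm, toCold, and sha256Hex are hypothetical names introduced for illustration, and the thrown Error messages are assumptions.

```typescript
import { Buffer } from 'node:buffer';
import { createHash } from 'node:crypto';
import { gzipSync } from 'node:zlib';

// The four columns this module is allowed to rewrite.
interface PayloadColumns {
  zone: 'hot' | 'warm' | 'cold' | null;
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

const sha256Hex = (s: string): string =>
  createHash('sha256').update(s, 'utf8').digest('hex');

// Hot (or fresh) -> warm: compress the plaintext, record its hash, NULL the plaintext.
function toWarm(row: PayloadColumns): PayloadColumns {
  if (row.content === null) throw new Error('expected plaintext content on a hot row');
  return {
    zone: 'warm',
    content: null,
    content_compressed: gzipSync(Buffer.from(row.content, 'utf8')).toString('base64'),
    content_hash: sha256Hex(row.content),
  };
}

// Hot or warm -> cold: keep only the hash. A stored hash is reused, never recomputed.
function toCold(row: PayloadColumns): PayloadColumns {
  const hash = row.content_hash ?? (row.content !== null ? sha256Hex(row.content) : null);
  if (hash === null) throw new Error('no stored hash and no plaintext to derive one from');
  return { zone: 'cold', content: null, content_compressed: null, content_hash: hash };
}
```

Going hot → warm → cold and going hot → cold directly end in the same row state, which is what the table's last row requires.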
3.3 Read-path behaviour
retrieveRecord(db, id):
- If zone IS NULL or zone = 'hot': return { zone: 'hot', content, content_available: true }.
- If zone = 'warm': gunzip content_compressed, return { zone: 'warm', content: decompressed, content_available: true }. If content_compressed IS NULL while zone = 'warm', throw RetentionIntegrityError because the table is inconsistent.
- If zone = 'cold': return { zone: 'cold', hash_only: content_hash, content_available: false }. If content_hash IS NULL while zone = 'cold', throw RetentionIntegrityError.
- If record id unknown: return null.
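The read-path rules above amount to a dispatch on the zone column. The following sketch applies them to an already-fetched row object instead of a better-sqlite3 handle; readRow and StoredRow are hypothetical names, and treating a hot row with NULL content as an integrity error is an assumption that the contract does not state.

```typescript
import { Buffer } from 'node:buffer';
import { gunzipSync, gzipSync } from 'node:zlib';

class RetentionIntegrityError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'RetentionIntegrityError';
  }
}

interface StoredRow {
  id: string;
  zone: 'hot' | 'warm' | 'cold' | null;
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

type RetrieveResult =
  | { id: string; zone: 'hot'; content: string; content_available: true }
  | { id: string; zone: 'warm'; content: string; content_available: true }
  | { id: string; zone: 'cold'; hash_only: string; content_available: false };

// Hypothetical stand-in for the zone dispatch inside retrieveRecord.
function readRow(row: StoredRow | undefined): RetrieveResult | null {
  if (row === undefined) return null; // unknown id
  if (row.zone === 'warm') {
    if (row.content_compressed === null) {
      throw new RetentionIntegrityError(`warm row ${row.id} has no content_compressed`);
    }
    const content = gunzipSync(Buffer.from(row.content_compressed, 'base64')).toString('utf8');
    return { id: row.id, zone: 'warm', content, content_available: true };
  }
  if (row.zone === 'cold') {
    if (row.content_hash === null) {
      throw new RetentionIntegrityError(`cold row ${row.id} has no content_hash`);
    }
    return { id: row.id, zone: 'cold', hash_only: row.content_hash, content_available: false };
  }
  // zone IS NULL or 'hot'. Assumption: missing plaintext here is also an integrity error.
  if (row.content === null) {
    throw new RetentionIntegrityError(`hot row ${row.id} has no content`);
  }
  return { id: row.id, zone: 'hot', content: row.content, content_available: true };
}

// Usage: a warm row built by hand for illustration.
const warmRow: StoredRow = {
  id: 'r1',
  zone: 'warm',
  content: null,
  content_compressed: gzipSync(Buffer.from('hello', 'utf8')).toString('base64'),
  content_hash: '2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824',
};
const warmResult = readRow(warmRow); // zone 'warm', content 'hello'
```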
4. Error classes
| Error | When |
|---|---|
| ZodError | Invalid input (empty id). |
| RetentionIntegrityError (new class exported) | Zone column is set but required companion column (content_compressed for warm, content_hash for cold) is NULL. |
| Error("Record not found: <id>") | archiveRecord called with unknown id. |
retrieveRecord on an unknown id returns null (not throw). archiveRecord throws, because archiving a non-existent row is a caller bug.
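The asymmetry between the two operations can be illustrated with an in-memory table. This is only a sketch of the unknown-id behaviour; the Map, RowStub, lookupForRetrieve, and lookupForArchive are hypothetical stand-ins for the real thought_records lookup.

```typescript
interface RowStub {
  id: string;
}

// Hypothetical in-memory table keyed by record id.
const table = new Map<string, RowStub>([['r1', { id: 'r1' }]]);

// Read path: an unknown id is an expected outcome, so it is reported as null.
function lookupForRetrieve(id: string): RowStub | null {
  return table.get(id) ?? null;
}

// Write path: an unknown id is a caller bug, so it throws with the contract's message.
function lookupForArchive(id: string): RowStub {
  const row = table.get(id);
  if (row === undefined) throw new Error(`Record not found: ${id}`);
  return row;
}
```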
5. Invariants
I1 — Forward-only transitions
A record may move hot → warm, hot → cold, or warm → cold. It MUST NOT move backwards. archiveRecord enforces this by recomputing the target zone from the record’s current position:
- Hot row whose position is now > 100 → warm or cold (depending on position).
- Warm row whose position is now > 1000 → cold.
- Cold row stays cold.
- Hot row still at position ≤ 100 → no-op (already hot).
- Warm row still at position 101–1000 → no-op.
If the computed target equals the current zone, archiveRecord returns { changed: false, from_zone, to_zone: from_zone } without a DB write.
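One way to express the forward-only rule is to rank the zones and refuse any target whose rank is lower than the current one. The sketch below is an illustration under stated assumptions: planTransition and ZONE_RANK are hypothetical names, a NULL zone is read as hot, and the NULL-to-'hot' stamp is treated as a no-op because the text above does not say whether it counts as a change.

```typescript
type RetentionZone = 'hot' | 'warm' | 'cold';

const ZONE_RANK: Record<RetentionZone, number> = { hot: 0, warm: 1, cold: 2 };

interface TransitionPlan {
  from_zone: RetentionZone;
  to_zone: RetentionZone;
  changed: boolean;
}

// Decide the transition from the stored zone and the zone computed from position.
function planTransition(current: RetentionZone | null, computed: RetentionZone): TransitionPlan {
  const from: RetentionZone = current ?? 'hot'; // zone IS NULL is read as hot
  // Defensive guard: a computed zone behind the current one never moves the record back.
  const to = ZONE_RANK[computed] > ZONE_RANK[from] ? computed : from;
  return { from_zone: from, to_zone: to, changed: to !== from };
}
```

Feeding the resulting to_zone back in as current yields changed: false, which is the idempotent second call that I2 describes.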
I2 — Idempotency
Two consecutive calls to archiveRecord(id) produce the same end-state. The second call observes changed: false.
I3 — Chain-integrity preservation
Archival MUST NOT modify:
id, type, task_id, agent_id, timestamp, prev_hash, hash, created_at.
These columns are written once at createThoughtRecord time and are read-only after. Only content, content_compressed, content_hash, and zone are mutable by this module.
I4 — Row preservation
Archival MUST NOT DELETE rows. Content removal is NULL assignment, not row deletion, so subsequent chain verification (P0.7.3, not yet shipped) can still rebuild the hash chain from every id.
I5 — Position stability within a transaction
getRecordPosition(db, id) is consistent within a single SQLite transaction: positions come from ordering rows with the same task_id by rowid DESC, and rowid grows with each append because rows are never deleted (I4). As new records are appended, the positions of older records only increase; a record never moves back toward position 1.
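The position rule can be modelled without a database: a record's position is one plus the number of rows in the same task with a larger rowid. The sketch below uses an array in place of the table; positionOf and PositionRow are hypothetical names, and the real getRecordPosition runs against better-sqlite3 instead.

```typescript
interface PositionRow {
  rowid: number;
  id: string;
  task_id: string;
}

// In-memory model of "1-indexed, newest = 1, ordered by rowid DESC within task_id".
function positionOf(rows: readonly PositionRow[], id: string): number | null {
  const target = rows.find((r) => r.id === id);
  if (target === undefined) return null; // unknown record id
  const newer = rows.filter(
    (r) => r.task_id === target.task_id && r.rowid > target.rowid,
  ).length;
  return newer + 1;
}

const rows: PositionRow[] = [
  { rowid: 1, id: 'a', task_id: 't1' },
  { rowid: 2, id: 'b', task_id: 't1' },
  { rowid: 3, id: 'x', task_id: 't2' },
  { rowid: 4, id: 'c', task_id: 't1' },
];
const before = positionOf(rows, 'a'); // 3: 'b' and 'c' are newer in t1
rows.push({ rowid: 5, id: 'd', task_id: 't1' });
const after = positionOf(rows, 'a'); // 4: the append pushed 'a' one step older
```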
I6 — Compression round-trip
For any UTF-8 string s, gunzipContent(gzipContent(s)) === s. Verified by a round-trip test on a 64 KiB payload of mixed text.
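A minimal sketch of the two helpers and the round-trip check, using Node's zlib. The signatures follow the exports table; the function bodies and the sample payload are assumptions made for illustration.

```typescript
import { Buffer } from 'node:buffer';
import { gunzipSync, gzipSync } from 'node:zlib';

// gzip the UTF-8 bytes, then base64-encode so the result fits a TEXT column.
function gzipContent(content: string): string {
  return gzipSync(Buffer.from(content, 'utf8')).toString('base64');
}

// base64-decode, then gunzip back to the original UTF-8 string.
function gunzipContent(compressed: string): string {
  return gunzipSync(Buffer.from(compressed, 'base64')).toString('utf8');
}

// 16 UTF-8 bytes per repetition x 4096 repetitions = 64 KiB of mixed ASCII and Greek.
const sample = 'retention η ζ '.repeat(4096);
const roundTripped = gunzipContent(gzipContent(sample));
```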
I7 — content_hash stability
Once set, content_hash equals sha256(original_content) and MUST NOT be recomputed from a decompressed Warm value. (The Warm path stores the hash at transition-time so Cold can preserve it without needing the plaintext.)
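A sketch of hashContent using Node's crypto module, matching the "SHA-256 lowercase hex, 64 chars" description in the exports table. The body is an assumption; only the signature comes from the contract.

```typescript
import { createHash } from 'node:crypto';

// SHA-256 over the UTF-8 bytes of the content, rendered as lowercase hex.
function hashContent(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex');
}

const digest = hashContent('hello');
// digest === '2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824'
```

Because the function is deterministic, a hash stored at the warm transition can later be compared against a fresh hash of the decompressed value as a consistency check, without ever overwriting the stored column.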
6. Acceptance criteria → test mapping
| AC (task-breakdown §P0.8.2) | Test name in retention.test.ts |
|---|---|
| Hot zone: last 100 records — full content in DB | computeZone returns 'hot' for positions 1..100; archiveRecord on position-1 record leaves zone='hot', content unchanged |
| Warm zone: records 101–1000 — content compressed | computeZone returns 'warm' for positions 101..1000; archiveRecord transitions hot→warm: content NULL, content_compressed non-NULL, content_hash non-NULL |
| Cold zone: records 1001+ — content hash only, full content deleted | computeZone returns 'cold' for positions >= 1001; archiveRecord transitions to cold: content + content_compressed both NULL, content_hash non-NULL |
| archiveRecord(id) moves record to next zone | archiveRecord hot→warm + archiveRecord warm→cold + archiveRecord hot→cold direct |
| retrieveRecord(id) decompresses if Warm, hash stub if Cold | retrieveRecord returns content for hot; retrieveRecord decompresses warm; retrieveRecord returns hash_only for cold |
| Test: hot → warm → cold transitions; content availability per zone | hot → warm → cold transition sequence preserves hash and flips content availability flags |
Additional non-AC tests that pin invariants I1–I7:
- Idempotency (I2): archiveRecord on already-warm record returns changed:false and does not re-compress.
- Integrity (I3): archival preserves hash, prev_hash, id, task_id, agent_id, timestamp (snapshot-and-compare).
- Row preservation (I4): archive to cold does not DELETE row; SELECT COUNT(*) unchanged.
- Compression round-trip (I6): gunzipContent(gzipContent(s)) === s for sample payloads.
- Hash stability (I7): hashContent is deterministic; content_hash set at warm transition matches sha256 of original.
- Error path: retrieveRecord returns null for unknown id.
- Error path: archiveRecord throws for unknown id.
- Error path: RetentionIntegrityError thrown when zone='warm' but content_compressed IS NULL (tampered row).
- Zone computation: computeZone(0) throws; computeZone(1) is hot; computeZone(100) is hot; computeZone(101) is warm; computeZone(1000) is warm; computeZone(1001) is cold.
7. Non-contract (explicitly deferred)
- No MCP tool registration. retention.ts is a library surface consumed by other Phase 0 code (future).
- No bulk / batch archival. Callers iterate record ids themselves.
- No “promote on read” (Cold → Warm on retrieveRecord); transitions are forward-only and controlled by archiveRecord alone.
- No metrics / counters. Observability is out of P0.8.2 scope (donor extraction mentions them; task spec does not).
- No index on zone. Can be added by a later migration if a bulk query pattern emerges.
8. Contract gate
- Public surface enumerated with types + signatures
- Storage contract (columns, defaults, per-zone invariants) specified
- Error classes named
- I1–I7 invariants stated
- Acceptance criteria mapped 1:1 to named tests
- Non-contract fenced
Proceed to packet.