Contract: P0.8.2 η Three-Zone Retention

Task: P0.8.2
Branch: feature/p0-8-2-retention
Date: 2026-04-17
Depends on: P0.8.1 (#136), P0.2.2 (Wave B)


1. Purpose

Ship a forward-only three-zone retention policy over ζ thought_records:

  • Hot — newest 100 records per task chain, full content in DB.
  • Warm — positions 101–1000, content gzip+base64-compressed into content_compressed; plain content NULLed.
  • Cold — positions 1001+, content_hash preserved, both content and content_compressed NULLed.

The module provides two operations on individual records (archiveRecord, retrieveRecord) plus pure helpers. Chain-integrity columns (id, type, task_id, agent_id, timestamp, prev_hash, hash, created_at) are never modified; only the mutable content payload is rearranged per zone.


2. Public surface (src/domains/proof/retention.ts)

2.1 Types

export type RetentionZone = 'hot' | 'warm' | 'cold';

/**
 * Payload returned by `retrieveRecord`. Exactly one of the three shapes
 * is returned based on the record's current zone.
 */
export type RetrieveResult =
  | {
      id: string;
      zone: 'hot';
      content: string;
      content_available: true;
    }
  | {
      id: string;
      zone: 'warm';
      content: string;             // decompressed from content_compressed
      content_available: true;
    }
  | {
      id: string;
      zone: 'cold';
      hash_only: string;           // 64-char sha256 hex of original content
      content_available: false;
    };
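
Callers can narrow the union without inspecting zone strings directly, because content_available is a literal true/false in every member. The sketch below repeats the type so that it compiles on its own; summarize is a hypothetical caller-side helper, not part of the contract.

```typescript
// RetrieveResult repeated from section 2.1 so the sketch compiles on its own.
type RetrieveResult =
  | { id: string; zone: 'hot'; content: string; content_available: true }
  | { id: string; zone: 'warm'; content: string; content_available: true }
  | { id: string; zone: 'cold'; hash_only: string; content_available: false };

// Hypothetical caller-side helper (not part of the contract).
function summarize(result: RetrieveResult | null): string {
  if (result === null) {
    return 'unknown id';
  }
  if (result.content_available) {
    // Narrowed to the hot or warm shape: `content` is a string here.
    return `${result.zone}: ${result.content.length} chars`;
  }
  // Narrowed to the cold shape: only the hash is available.
  return `cold: sha256 ${result.hash_only.slice(0, 8)}...`;
}
```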

2.2 Zod schemas

import { z } from 'zod';

export const RETENTION_ZONES = ['hot', 'warm', 'cold'] as const;

export const RetentionZoneSchema = z.enum(RETENTION_ZONES);

// Hot: [1..100] per task_id (newest = position 1)
// Warm: [101..1000]
// Cold: [1001..)
export const HOT_MAX_POSITION = 100;
export const WARM_MAX_POSITION = 1000;

export const ArchiveRecordInputSchema = z.object({
  id: z.string().min(1),
});

export const RetrieveRecordInputSchema = z.object({
  id: z.string().min(1),
});

2.3 Exports

| Name | Signature | Purpose |
| --- | --- | --- |
| RETENTION_ZONES | readonly ['hot','warm','cold'] | Canonical zone tuple. |
| HOT_MAX_POSITION | 100 | Boundary constant. |
| WARM_MAX_POSITION | 1000 | Boundary constant. |
| RetentionZone | type | 'hot' \| 'warm' \| 'cold'. |
| RetrieveResult | type | Discriminated union on zone. |
| RetentionZoneSchema | z.enum(...) | Zod validator for zone strings. |
| computeZone(position) | (position: number) => RetentionZone | Pure: 1..100 → hot; 101..1000 → warm; 1001+ → cold; throws on position < 1. |
| getRecordPosition(db, id) | (db, id: string) => number \| null | Position (1-indexed, newest = 1) within same task_id ordered by rowid DESC. null if record id unknown. |
| hashContent(content) | (content: string) => string | SHA-256 lowercase hex of content bytes. 64 chars. |
| gzipContent(content) | (content: string) => string | gzipSync then base64-encode. |
| gunzipContent(compressed) | (compressed: string) => string | base64-decode then gunzipSync. |
| archiveRecord(db, id) | (db, id: string) => { id, from_zone, to_zone, changed: boolean } | Transition record to its computed zone. Idempotent. |
| retrieveRecord(db, id) | (db, id: string) => RetrieveResult \| null | Zone-aware read. null if id unknown. |

Each db parameter is a Database.Database from better-sqlite3; functions take it explicitly (dependency injection) so tests can use :memory: databases. This mirrors the P0.7.2 repository style.
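
A minimal sketch of computeZone, assuming the boundaries listed in the exports table. The contract only says the function throws on position < 1; the RangeError type and the rejection of non-integer positions are assumptions of this sketch.

```typescript
type RetentionZone = 'hot' | 'warm' | 'cold';

const HOT_MAX_POSITION = 100;
const WARM_MAX_POSITION = 1000;

function computeZone(position: number): RetentionZone {
  // Assumption: non-integer positions are rejected along with position < 1.
  if (!Number.isInteger(position) || position < 1) {
    throw new RangeError(`position must be an integer >= 1, got ${position}`);
  }
  if (position <= HOT_MAX_POSITION) return 'hot';
  if (position <= WARM_MAX_POSITION) return 'warm';
  return 'cold';
}
```

Because the function is pure, the boundary values (1, 100, 101, 1000, 1001) can be tested without a database.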


3. Storage contract

3.1 Schema delta (005_retention.sql)

ALTER TABLE thought_records ADD COLUMN zone TEXT;
ALTER TABLE thought_records ADD COLUMN content_compressed TEXT;
ALTER TABLE thought_records ADD COLUMN content_hash TEXT;

All three are NULL-able. Existing rows get NULL in all three columns — the repository treats zone IS NULL as “hot / not yet archived”.

No index added on zone in this migration (audit §4 note).

3.2 Per-zone row invariants

| State | zone | content | content_compressed | content_hash |
| --- | --- | --- | --- | --- |
| Fresh write (P0.7.2) | NULL | original string | NULL | NULL |
| After archiveRecord → hot | 'hot' | original string | NULL | NULL |
| After archiveRecord → warm | 'warm' | NULL | base64(gzip(original)) | sha256(original) |
| After archiveRecord → cold | 'cold' | NULL | NULL | sha256(original) |

Once set, content_hash is immutable. It is written when the record first transitions away from Hot and is retained unchanged through Cold.
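
The per-zone invariants can be expressed as a small lookup table. The sketch below is one way a test or audit script might check a row; checkRowInvariants and REQUIRED_SET are hypothetical names, and treating zone NULL as hot follows section 3.1.

```typescript
type RetentionZone = 'hot' | 'warm' | 'cold';

interface RetentionColumns {
  zone: RetentionZone | null;
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

type PayloadColumn = 'content' | 'content_compressed' | 'content_hash';

// Which payload columns must be non-NULL in each zone (from the table in 3.2).
const REQUIRED_SET: Record<RetentionZone, Record<PayloadColumn, boolean>> = {
  hot: { content: true, content_compressed: false, content_hash: false },
  warm: { content: false, content_compressed: true, content_hash: true },
  cold: { content: false, content_compressed: false, content_hash: true },
};

const PAYLOAD_COLUMNS: PayloadColumn[] = ['content', 'content_compressed', 'content_hash'];

// Hypothetical checker: returns the list of violations (empty means consistent).
function checkRowInvariants(row: RetentionColumns): string[] {
  const expected = REQUIRED_SET[row.zone ?? 'hot']; // zone NULL is treated as hot
  const problems: string[] = [];
  for (const column of PAYLOAD_COLUMNS) {
    const isSet = row[column] !== null;
    if (isSet !== expected[column]) {
      problems.push(`${column} should be ${expected[column] ? 'non-NULL' : 'NULL'}`);
    }
  }
  if (row.content_hash !== null && !/^[0-9a-f]{64}$/.test(row.content_hash)) {
    problems.push('content_hash should be 64 lowercase hex characters');
  }
  return problems;
}
```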

3.3 Read-path behaviour

retrieveRecord(db, id):

  • If zone IS NULL or zone = 'hot': return { zone: 'hot', content, content_available: true }.
  • If zone = 'warm': gunzip content_compressed, return { zone: 'warm', content: decompressed, content_available: true }. If content_compressed IS NULL while zone='warm', throw RetentionIntegrityError — the table is inconsistent.
  • If zone = 'cold': return { zone: 'cold', hash_only: content_hash, content_available: false }. If content_hash IS NULL while zone='cold', throw RetentionIntegrityError.
  • If record id unknown: return null.
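
The branches above can be sketched as a pure mapping from an already-fetched row to a RetrieveResult. This leaves out the better-sqlite3 query; toRetrieveResult and StoredRow are hypothetical names, and throwing on a hot row with NULL content is an assumption, since the contract does not cover that case.

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';

class RetentionIntegrityError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'RetentionIntegrityError';
    // Keeps `instanceof` working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Only the columns the read path needs; the real row has more.
interface StoredRow {
  id: string;
  zone: 'hot' | 'warm' | 'cold' | null;
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

type RetrieveResult =
  | { id: string; zone: 'hot'; content: string; content_available: true }
  | { id: string; zone: 'warm'; content: string; content_available: true }
  | { id: string; zone: 'cold'; hash_only: string; content_available: false };

function toRetrieveResult(row: StoredRow | undefined): RetrieveResult | null {
  if (row === undefined) return null; // unknown id
  if (row.zone === 'cold') {
    if (row.content_hash === null) {
      throw new RetentionIntegrityError(`cold record ${row.id} has no content_hash`);
    }
    return { id: row.id, zone: 'cold', hash_only: row.content_hash, content_available: false };
  }
  if (row.zone === 'warm') {
    if (row.content_compressed === null) {
      throw new RetentionIntegrityError(`warm record ${row.id} has no content_compressed`);
    }
    const content = gunzipSync(Buffer.from(row.content_compressed, 'base64')).toString('utf8');
    return { id: row.id, zone: 'warm', content, content_available: true };
  }
  // zone IS NULL or 'hot'.
  if (row.content === null) {
    // Assumption: not covered by the contract; the sketch treats it as corruption.
    throw new RetentionIntegrityError(`hot record ${row.id} has no content`);
  }
  return { id: row.id, zone: 'hot', content: row.content, content_available: true };
}

// Usage with a hand-built warm row (content_hash is a placeholder, not a real digest).
const warmRow: StoredRow = {
  id: 'r1',
  zone: 'warm',
  content: null,
  content_compressed: gzipSync(Buffer.from('hello', 'utf8')).toString('base64'),
  content_hash: '0'.repeat(64),
};
const warmResult = toRetrieveResult(warmRow);
```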

4. Error classes

| Error | When |
| --- | --- |
| ZodError | Invalid input (empty id). |
| RetentionIntegrityError (new class exported) | Zone column is set but required companion column (content_compressed for warm, content_hash for cold) is NULL. |
| Error("Record not found: <id>") | archiveRecord called with unknown id. |

retrieveRecord on an unknown id returns null (not throw). archiveRecord throws, because archiving a non-existent row is a caller bug.
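
A caller that wants to tell these failures apart might do something like the following. This is a sketch: the class body is one plausible shape for RetentionIntegrityError, and classifyFailure is a hypothetical helper that relies on the "Record not found: " message prefix from the table above.

```typescript
class RetentionIntegrityError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'RetentionIntegrityError';
    // Keeps `instanceof` working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type FailureKind = 'integrity' | 'not_found' | 'other';

// Hypothetical helper: maps a caught value to a coarse failure category.
function classifyFailure(err: unknown): FailureKind {
  if (err instanceof RetentionIntegrityError) return 'integrity';
  if (err instanceof Error && err.message.startsWith('Record not found: ')) return 'not_found';
  return 'other'; // e.g. a ZodError from input validation
}
```

Matching on a message prefix is brittle; if callers need this distinction often, a dedicated error class for the not-found case would be a more robust design.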


5. Invariants

I1 — Forward-only transitions

A record may move hot → warm, hot → cold, or warm → cold. It MUST NOT move backwards. archiveRecord enforces this by recomputing the target zone from the record’s current position:

  • Hot row whose position is now > 100 → warm or cold (depending on position).
  • Warm row whose position is now > 1000 → cold.
  • Cold row stays cold.
  • Hot row still at position ≤ 100 → no-op (already hot).
  • Warm row still at position 101–1000 → no-op.

If the computed target equals the current zone, archiveRecord returns { changed: false, from_zone, to_zone: from_zone } without a DB write.
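
The decision half of archiveRecord can be sketched as a pure function over the stored zone and the current position. planTransition and ZONE_RANK are hypothetical names. Reporting a NULL zone as from_zone 'hot' is an assumption, and the sketch does not address whether such a row is then stamped with zone = 'hot'.

```typescript
type RetentionZone = 'hot' | 'warm' | 'cold';

const ZONE_RANK: Record<RetentionZone, number> = { hot: 0, warm: 1, cold: 2 };

function computeZone(position: number): RetentionZone {
  if (position < 1) throw new RangeError(`position must be >= 1, got ${position}`);
  if (position <= 100) return 'hot';
  if (position <= 1000) return 'warm';
  return 'cold';
}

interface TransitionPlan {
  from_zone: RetentionZone;
  to_zone: RetentionZone;
  changed: boolean;
}

function planTransition(storedZone: RetentionZone | null, position: number): TransitionPlan {
  const from_zone: RetentionZone = storedZone ?? 'hot'; // zone NULL is treated as hot
  const target = computeZone(position);
  // Forward-only guard: a target that ranks below the current zone is ignored.
  const to_zone = ZONE_RANK[target] > ZONE_RANK[from_zone] ? target : from_zone;
  return { from_zone, to_zone, changed: to_zone !== from_zone };
}
```

Since positions only grow as records are appended, the rank guard should never fire in normal operation; it is there so that a stale or tampered position cannot move a record backwards.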

I2 — Idempotency

Two consecutive calls to archiveRecord(id) produce the same end-state. The second call observes changed: false.
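
Idempotency can be illustrated on a plain in-memory row, without a database. applyArchive below is a hypothetical analogue of the write half of archiveRecord, not the module's implementation; the target zone is passed in rather than computed from a position.

```typescript
import { createHash } from 'node:crypto';
import { gzipSync } from 'node:zlib';

type RetentionZone = 'hot' | 'warm' | 'cold';

interface Row {
  zone: RetentionZone | null;
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

// Hypothetical in-memory analogue of the write half of archiveRecord.
function applyArchive(row: Row, target: RetentionZone): { changed: boolean } {
  const current: RetentionZone = row.zone ?? 'hot';
  if (current === target) {
    return { changed: false }; // I2: nothing is rewritten or re-compressed
  }
  if (current === 'hot') {
    if (row.content === null) throw new Error('hot row has no content to archive');
    row.content_hash = createHash('sha256').update(row.content, 'utf8').digest('hex');
    row.content_compressed =
      target === 'warm' ? gzipSync(Buffer.from(row.content, 'utf8')).toString('base64') : null;
    row.content = null;
  } else if (current === 'warm' && target === 'cold') {
    row.content_compressed = null; // content_hash is kept as-is
  } else {
    throw new Error(`backward transition ${current} -> ${target} is not allowed`);
  }
  row.zone = target;
  return { changed: true };
}

// Archiving the same row twice: the second call is a no-op.
const demoRow: Row = { zone: null, content: 'x'.repeat(500), content_compressed: null, content_hash: null };
const firstCall = applyArchive(demoRow, 'warm');
const compressedAfterFirst = demoRow.content_compressed;
const secondCall = applyArchive(demoRow, 'warm');
```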

I3 — Chain-integrity preservation

Archival MUST NOT modify:

  • id, type, task_id, agent_id, timestamp, prev_hash, hash, created_at.

These columns are written once at createThoughtRecord time and are read-only after. Only content, content_compressed, content_hash, and zone are mutable by this module.
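
A snapshot-and-compare check for these columns might look like the sketch below. snapshotChainColumns and changedChainColumns are hypothetical test helpers; the column list is the one stated above.

```typescript
const CHAIN_COLUMNS = [
  'id', 'type', 'task_id', 'agent_id', 'timestamp', 'prev_hash', 'hash', 'created_at',
] as const;

type ChainColumn = (typeof CHAIN_COLUMNS)[number];
type ChainSnapshot = Record<ChainColumn, unknown>;

// Copies only the chain-integrity columns out of a row.
function snapshotChainColumns(row: Record<string, unknown>): ChainSnapshot {
  const snapshot = {} as ChainSnapshot;
  for (const column of CHAIN_COLUMNS) {
    snapshot[column] = row[column];
  }
  return snapshot;
}

// Returns the chain-integrity columns whose values differ between two snapshots.
function changedChainColumns(before: ChainSnapshot, after: ChainSnapshot): ChainColumn[] {
  return CHAIN_COLUMNS.filter((column) => before[column] !== after[column]);
}
```

A test would take one snapshot before archiveRecord and one after, then expect changedChainColumns to return an empty array.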

I4 — Row preservation

Archival MUST NOT DELETE rows. Content removal is NULL assignment, not row deletion, so subsequent chain verification (P0.7.3, not yet shipped) can still rebuild the hash chain from every id.

I5 — Position stability within a transaction

getRecordPosition(db, id) is consistent within a single SQLite transaction: SQLite’s rowid is monotonic, and positions are computed by ordering rows with the same task_id by rowid DESC. As new records are appended, the positions of older records only increase; a record never moves back toward position 1.
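
The position rule can be illustrated over an in-memory array instead of a SQLite table. positionOf and RowRef are hypothetical names; the real function runs against the database.

```typescript
interface RowRef {
  rowid: number;
  id: string;
  task_id: string;
}

// In-memory analogue of getRecordPosition: 1-indexed, newest = 1, per task_id.
function positionOf(rows: RowRef[], id: string): number | null {
  const target = rows.find((r) => r.id === id);
  if (target === undefined) return null; // unknown id
  const { task_id, rowid } = target;
  // Position = number of rows in the same chain that are at least as new.
  return rows.filter((r) => r.task_id === task_id && r.rowid >= rowid).length;
}

const chainRows: RowRef[] = [
  { rowid: 1, id: 'a1', task_id: 'A' },
  { rowid: 2, id: 'a2', task_id: 'A' },
  { rowid: 3, id: 'b1', task_id: 'B' },
  { rowid: 4, id: 'a3', task_id: 'A' },
];
```

With these rows, 'a3' is the newest record of chain A and 'a1' the oldest; appending another chain-A row pushes every existing chain-A position up by one and leaves chain B untouched.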

I6 — Compression round-trip

For any UTF-8 string s, gunzipContent(gzipContent(s)) === s. Verified by a round-trip test on a 64 KiB payload of mixed text.
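
A minimal sketch of the two helpers using Node's zlib, following the "gzipSync then base64-encode" description in section 2.3. One caveat: a JavaScript string containing a lone surrogate is not well-formed and will not round-trip, because encoding to UTF-8 replaces it with U+FFFD.

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';

// gzip the UTF-8 bytes, then base64-encode so the result fits a TEXT column.
function gzipContent(content: string): string {
  return gzipSync(Buffer.from(content, 'utf8')).toString('base64');
}

// Reverse of gzipContent: base64-decode, gunzip, decode as UTF-8.
function gunzipContent(compressed: string): string {
  return gunzipSync(Buffer.from(compressed, 'base64')).toString('utf8');
}

// Round-trip on a payload that mixes ASCII and non-ASCII text.
const sample = 'ζ thought → η retention ✓ '.repeat(200);
const roundTripped = gunzipContent(gzipContent(sample));
```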

I7 — content_hash stability

Once set, content_hash equals sha256(original_content) and MUST NOT be recomputed from a decompressed Warm value. (The Warm path stores the hash at transition-time so Cold can preserve it without needing the plaintext.)
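
A sketch of hashContent with Node's crypto module, plus a hypothetical resolveContentHash helper that illustrates the rule: an existing hash is always reused, and plaintext is hashed only when no hash has been stored yet.

```typescript
import { createHash } from 'node:crypto';

// SHA-256 of the UTF-8 bytes, as 64 lowercase hex characters.
function hashContent(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex');
}

// Hypothetical helper: prefer the stored hash; hash plaintext only on first transition.
function resolveContentHash(storedHash: string | null, plaintext: string | null): string {
  if (storedHash !== null) return storedHash; // I7: never recompute
  if (plaintext === null) {
    throw new Error('no stored hash and no plaintext to hash');
  }
  return hashContent(plaintext);
}
```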


6. Acceptance criteria → test mapping

| AC (task-breakdown §P0.8.2) | Test name in retention.test.ts |
| --- | --- |
| Hot zone: last 100 records, full content in DB | computeZone returns 'hot' for positions 1..100; archiveRecord on position-1 record leaves zone='hot', content unchanged |
| Warm zone: records 101–1000, content compressed | computeZone returns 'warm' for positions 101..1000; archiveRecord transitions hot→warm: content NULL, content_compressed non-NULL, content_hash non-NULL |
| Cold zone: records 1001+, content hash only, full content deleted | computeZone returns 'cold' for positions >= 1001; archiveRecord transitions to cold: content + content_compressed both NULL, content_hash non-NULL |
| archiveRecord(id) moves record to next zone | archiveRecord hot→warm + archiveRecord warm→cold + archiveRecord hot→cold direct |
| retrieveRecord(id) decompresses if Warm, hash stub if Cold | retrieveRecord returns content for hot; retrieveRecord decompresses warm; retrieveRecord returns hash_only for cold |
| Test: hot → warm → cold transitions; content availability per zone | hot → warm → cold transition sequence preserves hash and flips content availability flags |

Additional non-AC tests that pin invariants I1–I7:

  • Idempotency (I2): archiveRecord on already-warm record returns changed:false and does not re-compress.
  • Integrity (I3): archival preserves hash, prev_hash, id, task_id, agent_id, timestamp (snapshot-and-compare).
  • Row preservation (I4): archive to cold does not DELETE row; SELECT COUNT(*) unchanged.
  • Compression round-trip (I6): gunzipContent(gzipContent(s)) === s for sample payloads.
  • Hash stability (I7): hashContent is deterministic; content_hash set at warm transition matches sha256 of original.
  • Error path: retrieveRecord returns null for unknown id.
  • Error path: archiveRecord throws for unknown id.
  • Error path: RetentionIntegrityError thrown when zone='warm' but content_compressed IS NULL (tampered row).
  • Zone computation: computeZone(0) throws; computeZone(1) is hot; computeZone(100) is hot; computeZone(101) is warm; computeZone(1000) is warm; computeZone(1001) is cold.

7. Non-contract (explicitly deferred)

  • No MCP tool registration. retention.ts is a library surface consumed by other Phase 0 code (future).
  • No bulk / batch archival. Callers iterate record ids themselves.
  • No “promote on read” (Cold → Warm on retrieveRecord); transitions are forward-only (I1) and controlled by archiveRecord alone.
  • No metrics / counters. Observability is out of P0.8.2 scope (donor extraction mentions them; task spec does not).
  • No index on zone. Can be added by a later migration if a bulk query pattern emerges.

8. Contract gate

  • Public surface enumerated with types + signatures
  • Storage contract (columns, defaults, per-zone invariants) specified
  • Error classes named
  • I1–I7 invariants stated
  • Acceptance criteria mapped 1:1 to named tests
  • Non-contract fenced

Proceed to packet.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
