Packet: P0.8.2 η Three-Zone Retention

Task: P0.8.2
Branch: feature/p0-8-2-retention
Date: 2026-04-17
Base: origin/main@dc660381


1. Files

1.1 Create

| Path | Notes |
| --- | --- |
| src/db/migrations/005_retention.sql | 3 ALTER TABLE statements adding zone, content_compressed, content_hash to thought_records. All nullable, no defaults; the app layer assigns defaults on write. |
| src/domains/proof/retention.ts | Implements contract §2 exports. Pure module: no top-level state, no console, no env reads. Depends on node:crypto, node:zlib, better-sqlite3 (type only), zod. |
| src/__tests__/domains/proof/retention.test.ts | Jest test file using the P0.7.2 in-memory-DB pattern. Loads BOTH 003_thought_records.sql AND 005_retention.sql per test. |
| docs/audits/p0-8-2-retention-audit.md | SHIPPED Step 1. |
| docs/contracts/p0-8-2-retention-contract.md | SHIPPED Step 2. |
| docs/verification/p0-8-2-retention-verification.md | Step 5. |

1.2 Modify

None. src/db/index.ts already auto-discovers any NNN_*.sql file in src/db/migrations/ and applies it in order; we add migration 005 without touching the runner.
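
As a rough illustration of the auto-discovery described above, the sketch below filters and orders migration file names. The function name orderMigrations and the regular expression are assumptions made for this example; the actual runner in src/db/index.ts may work differently.

```typescript
// Hypothetical sketch: select NNN_*.sql files and apply them in numeric order.
// Neither the pattern nor the function name is taken from src/db/index.ts.
const MIGRATION_PATTERN = /^\d{3}_.+\.sql$/;

function orderMigrations(fileNames: readonly string[]): string[] {
  // Zero-padded three-digit prefixes sort correctly as plain strings.
  return fileNames.filter((name) => MIGRATION_PATTERN.test(name)).sort();
}

const discovered = orderMigrations([
  '005_retention.sql',
  'README.md', // ignored: does not match NNN_*.sql
  '003_thought_records.sql',
]);

console.log(discovered);
```

Under this assumption, dropping 005_retention.sql into the directory is enough for it to run after 003, which is why no runner change is listed here.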

src/domains/proof/merkle.ts: not modified. Its only connection to this task is the shared η domain directory.

src/domains/trail/repository.ts: not modified. P0.7.2 createThoughtRecord writes no zone column, so the column stays NULL and retrieveRecord treats such rows as hot.

src/db/migrations/003_thought_records.sql: not modified. The migration semantics guarantee pre-005 rows see NULL in the new columns.


2. Migration (005_retention.sql)

-- 005_retention — η Proof Store three-zone retention schema delta (P0.8.2).
--
-- Adds three nullable columns to thought_records:
--   zone                TEXT — 'hot'|'warm'|'cold' (NULL = legacy hot).
--   content_compressed  TEXT — base64(gzip(content)) for warm rows.
--   content_hash        TEXT — sha256(content) lowercase hex; set at first
--                              transition away from hot and preserved in cold.
--
-- Values are written by src/domains/proof/retention.ts; this migration only
-- introduces the storage.
--
-- Canonical references:
--   - docs/guides/implementation/task-breakdown.md § P0.8.2
--   - docs/audits/p0-8-2-retention-audit.md §4
--   - docs/contracts/p0-8-2-retention-contract.md §3
--   - docs/packets/p0-8-2-retention-packet.md §2

ALTER TABLE thought_records ADD COLUMN zone TEXT;
ALTER TABLE thought_records ADD COLUMN content_compressed TEXT;
ALTER TABLE thought_records ADD COLUMN content_hash TEXT;

No CHECK constraint on zone; the Zod layer in retention.ts owns shape validation. No default value; new rows from P0.7.2 repository (createThoughtRecord) keep NULL, and retrieveRecord treats NULL as hot.


3. src/domains/proof/retention.ts skeleton

/**
 * Colibri — Phase 0 η Proof Store: Three-Zone Retention (P0.8.2).
 *
 * Forward-only retention state machine over ζ thought_records:
 *   hot (positions 1..100)   → warm (101..1000)   → cold (1001+)
 *
 * Zones are computed from rowid position within the same task_id chain.
 * Archival compresses (warm) or hash-only (cold) the record's content while
 * preserving every column that participates in chain integrity.
 */

import type Database from 'better-sqlite3';
import { createHash } from 'node:crypto';
import { gunzipSync, gzipSync } from 'node:zlib';
import { z } from 'zod';

// Types, constants, RETENTION_ZONES, HOT_MAX_POSITION, WARM_MAX_POSITION
// RetentionIntegrityError extends Error { constructor(msg) }
// computeZone(position) — throws if position < 1
// getRecordPosition(db, id): SELECT COUNT(*) of newer rows in the same chain (rowid > target), plus 1
// hashContent(content) — sha256 hex
// gzipContent / gunzipContent — node:zlib + base64
// archiveRecord(db, id) — transaction: lookup row, compute zone, UPDATE
// retrieveRecord(db, id) — SELECT, dispatch on zone

3.1 getRecordPosition SQL

SELECT COUNT(*) + 1 AS position
  FROM thought_records
 WHERE task_id = :task_id
   AND rowid  > :target_rowid

“Number of rows in the same chain that are newer than this one, plus 1.” Position 1 = newest; position N = oldest. Uses rowid, not created_at (P0.7.2 lesson).

Implementation: two prepared statements inside getRecordPosition:

  • SELECT rowid, task_id FROM thought_records WHERE id = ?
  • SELECT COUNT(*) FROM thought_records WHERE task_id = ? AND rowid > ?
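
The position rule can be checked without a database. The following is a minimal sketch over an in-memory array that stands in for thought_records; the RowRef type and the positionOf name are hypothetical and exist only for this example.

```typescript
// In-memory stand-in for thought_records; only the columns the query touches.
interface RowRef {
  rowid: number;
  id: string;
  task_id: string;
}

// Mirrors the two-statement approach: look up the target row, then count
// rows in the same chain with a larger rowid and add 1.
function positionOf(rows: readonly RowRef[], id: string): number | null {
  const target = rows.find((r) => r.id === id);
  if (target === undefined) return null; // unknown id
  const newer = rows.filter(
    (r) => r.task_id === target.task_id && r.rowid > target.rowid,
  ).length;
  return newer + 1;
}

const rows: RowRef[] = [
  { rowid: 1, id: 'a', task_id: 't1' },
  { rowid: 2, id: 'x', task_id: 't2' }, // different chain, not counted for t1
  { rowid: 3, id: 'b', task_id: 't1' },
];

console.log(positionOf(rows, 'a')); // 2 (one newer row in t1)
console.log(positionOf(rows, 'b')); // 1 (newest in t1)
```

Because only rows of the same task_id are counted, interleaved inserts from other chains do not shift a record's position.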

3.2 archiveRecord state machine

function archiveRecord(db, id) {
  return db.transaction(() => {
    const row = db.prepare(
      'SELECT rowid, id, task_id, content, zone, content_compressed, content_hash FROM thought_records WHERE id = ?'
    ).get(id);
    if (!row) throw new Error(`Record not found: ${id}`);

    const position = computePositionInline(db, row.task_id, row.rowid);
    const target = computeZone(position);
    const current = normalizeZone(row.zone);  // NULL → 'hot'

    if (current === target) return { id, from_zone: current, to_zone: target, changed: false };

    if (target === 'warm') {
      // current === 'hot'
      const compressed = gzipContent(row.content);
      const hash = row.content_hash ?? hashContent(row.content);
      db.prepare('UPDATE thought_records SET zone=?, content=NULL, content_compressed=?, content_hash=? WHERE id=?').run('warm', compressed, hash, id);
    } else if (target === 'cold') {
      let hash = row.content_hash;
      if (!hash) {
        // current could be hot (content present) or warm (content_compressed present)
        if (row.content !== null) hash = hashContent(row.content);
        else if (row.content_compressed !== null) hash = hashContent(gunzipContent(row.content_compressed));
        else throw new RetentionIntegrityError(...);
      }
      db.prepare('UPDATE thought_records SET zone=?, content=NULL, content_compressed=NULL, content_hash=? WHERE id=?').run('cold', hash, id);
    }
    return { id, from_zone: current, to_zone: target, changed: true };
  })();
}

Note: row.rowid comes from a SELECT that includes rowid explicitly (SQLite returns it when requested).
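
To make the column changes easier to reason about, the transitions can be modelled as a pure function over the three content columns. This is a sketch under assumptions: planTransition, ContentState and the explicit forward-only guard are illustrative names and choices, not part of the contract, and generic Error stands in for RetentionIntegrityError.

```typescript
import { createHash } from 'node:crypto';
import { gunzipSync, gzipSync } from 'node:zlib';

type Zone = 'hot' | 'warm' | 'cold';

// Column values relevant to archival; names follow migration 005.
interface ContentState {
  zone: Zone;
  content: string | null;
  content_compressed: string | null;
  content_hash: string | null;
}

const ZONE_ORDER: Record<Zone, number> = { hot: 0, warm: 1, cold: 2 };

const sha256 = (s: string): string =>
  createHash('sha256').update(s, 'utf8').digest('hex');
const gzipB64 = (s: string): string =>
  gzipSync(Buffer.from(s, 'utf8')).toString('base64');
const gunzipB64 = (s: string): string =>
  gunzipSync(Buffer.from(s, 'base64')).toString('utf8');

// Hypothetical helper: returns the next column values, or the input unchanged
// when the target zone is not strictly colder than the current one.
function planTransition(state: ContentState, target: Zone): ContentState {
  if (ZONE_ORDER[target] <= ZONE_ORDER[state.zone]) return state;

  // Recover the plaintext from whichever column still holds it.
  const plain =
    state.content ??
    (state.content_compressed !== null
      ? gunzipB64(state.content_compressed)
      : null);

  // Reuse an existing hash; otherwise derive it from the plaintext.
  const hash = state.content_hash ?? (plain !== null ? sha256(plain) : null);
  if (hash === null) throw new Error('row has no content, compressed content, or hash');

  if (target === 'warm') {
    if (plain === null) throw new Error('cannot compress: content missing');
    return { zone: 'warm', content: null, content_compressed: gzipB64(plain), content_hash: hash };
  }
  return { zone: 'cold', content: null, content_compressed: null, content_hash: hash };
}

const hot: ContentState = { zone: 'hot', content: 'hello', content_compressed: null, content_hash: null };
const warm = planTransition(hot, 'warm');
const cold = planTransition(warm, 'cold');
console.log(warm.zone, cold.zone, cold.content_hash === sha256('hello'));
```

The guard on ZONE_ORDER is one way to keep the machine forward-only: a request to move a cold row back to warm returns the row untouched instead of attempting to compress content that no longer exists.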

3.3 retrieveRecord

function retrieveRecord(db, id) {
  const row = db.prepare(
    'SELECT id, zone, content, content_compressed, content_hash FROM thought_records WHERE id = ?'
  ).get(id);
  if (!row) return null;
  const zone = normalizeZone(row.zone);
  if (zone === 'hot') {
    return { id, zone: 'hot', content: row.content, content_available: true };
  }
  if (zone === 'warm') {
    if (row.content_compressed === null) throw new RetentionIntegrityError(...);
    return { id, zone: 'warm', content: gunzipContent(row.content_compressed), content_available: true };
  }
  // cold
  if (row.content_hash === null) throw new RetentionIntegrityError(...);
  return { id, zone: 'cold', hash_only: row.content_hash, content_available: false };
}
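
The return statements above suggest a discriminated union keyed on content_available. The sketch below shows how a caller might narrow on that flag; the RetrieveResult type and the summarize function are assumptions inferred from the pseudocode, not names taken from the contract.

```typescript
// Hypothetical result shape inferred from the return statements above.
type RetrieveResult =
  | { id: string; zone: 'hot' | 'warm'; content: string; content_available: true }
  | { id: string; zone: 'cold'; hash_only: string; content_available: false };

// Narrowing on content_available tells the compiler which fields exist.
function summarize(result: RetrieveResult | null): string {
  if (result === null) return 'not found';
  if (result.content_available) {
    return `${result.zone}: ${result.content.length} chars`;
  }
  return `cold: sha256 ${result.hash_only.slice(0, 8)}`;
}

console.log(summarize({ id: 'a', zone: 'hot', content: 'hello', content_available: true }));
// prints "hot: 5 chars"
```

With this shape, reading content from a cold result is a compile-time error rather than a runtime surprise.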

3.4 Helpers

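function normalizeZone(zone: string | null): RetentionZone {
  // NULL marks a legacy row written before migration 005; treat it as hot.
  // The parameter is named `zone` so it does not shadow the zod import `z`.
  return zone === null ? 'hot' : (zone as RetentionZone);
}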

export function computeZone(position: number): RetentionZone {
  if (!Number.isInteger(position) || position < 1) {
    throw new Error(`computeZone: position must be a positive integer, got ${position}`);
  }
  if (position <= HOT_MAX_POSITION) return 'hot';
  if (position <= WARM_MAX_POSITION) return 'warm';
  return 'cold';
}

export function hashContent(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex');
}

export function gzipContent(content: string): string {
  return gzipSync(Buffer.from(content, 'utf8')).toString('base64');
}

export function gunzipContent(compressed: string): string {
  return gunzipSync(Buffer.from(compressed, 'base64')).toString('utf8');
}
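
A short usage sketch of the encoding helpers follows. The three functions are local copies of the helpers above so that the snippet runs on its own; the sample string is arbitrary.

```typescript
import { createHash } from 'node:crypto';
import { gunzipSync, gzipSync } from 'node:zlib';

// Local copies of the helpers above, repeated so this snippet is standalone.
const hashContent = (content: string): string =>
  createHash('sha256').update(content, 'utf8').digest('hex');
const gzipContent = (content: string): string =>
  gzipSync(Buffer.from(content, 'utf8')).toString('base64');
const gunzipContent = (compressed: string): string =>
  gunzipSync(Buffer.from(compressed, 'base64')).toString('utf8');

const original = 'zone check: héllo ✓';
const stored = gzipContent(original); // what a warm row keeps in content_compressed
const restored = gunzipContent(stored);

// The hash is taken over the plaintext, so it is the same before and after
// compression and can be carried unchanged into the cold zone.
const hashBefore = hashContent(original);
const hashAfter = hashContent(restored);

console.log(restored === original, hashBefore === hashAfter);
```

Hashing the plaintext rather than the compressed bytes keeps content_hash independent of gzip settings, which may differ between zlib versions.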

4. Test plan (retention.test.ts)

Test groups following the P0.7.2 pattern:

  1. Migration fixture — load 003_thought_records.sql AND 005_retention.sql per test via module-scope readFileSync + beforeEach. Seed helper uses real createThoughtRecord from the trail repository for realistic rows.

  2. computeZone (pure)
    • position 0 throws
    • position 1 → hot
    • position 100 → hot
    • position 101 → warm
    • position 1000 → warm
    • position 1001 → cold
    • large position (100_000) → cold
  3. Compression helpers
    • gunzipContent(gzipContent(s)) === s for: empty, ASCII, unicode, 64-KiB random-ish.
    • hashContent('') pinned to e3b0c442...b855 (known sha256 of empty).
    • hashContent deterministic (same input → same output).
  4. getRecordPosition
    • Unknown id → null.
    • Single record per task → position 1.
    • Two records same task, fetching the older → position 2; newer → position 1.
  5. archiveRecord — hot path (no-op)
    • Create 1 record. archiveRecord on it → changed:false, row unchanged, content preserved.
  6. archiveRecord — hot → warm
    • Seed 101 records (via createThoughtRecord) for the same task. Archive the oldest (position 101) → zone='warm', content=NULL, content_compressed!=NULL, content_hash!=NULL. Chain hash preserved.
    • Decompressed content round-trips to original.
  7. archiveRecord — warm → cold
    • Start from a warm row (from previous step pattern, in-test seed). Add 900 more records so that original now sits at position 1001. Archive → zone='cold', content=NULL, content_compressed=NULL, content_hash preserved.
  8. archiveRecord — hot → cold direct
    • Seed 1001 records. Archive the oldest directly from hot → cold. content_hash equals sha256(original_content).
  9. archiveRecord — idempotency (I2)
    • Archive same id twice. Second call returns changed:false. Row bytes unchanged by the second call (compare content_compressed before/after).
  10. Chain integrity (I3)
    • Capture {id, hash, prev_hash, type, task_id, agent_id, timestamp, created_at} pre-archive.
    • Archive hot→warm and warm→cold.
    • Re-read row; all 8 fields byte-identical.
  11. Row preservation (I4)
    • SELECT COUNT(*) unchanged after any archival operation.
  12. retrieveRecord dispatch
    • Unknown id → null.
    • Hot record (fresh insert) → {zone:'hot', content, content_available:true}.
    • Warm record → decompresses; content_available: true.
    • Cold record → {zone:'cold', hash_only, content_available:false}.
  13. Integrity error paths
    • Tampered row: zone='warm', content_compressed=NULL → retrieveRecord throws RetentionIntegrityError.
    • Tampered row: zone='cold', content_hash=NULL → retrieveRecord throws RetentionIntegrityError.
    • archiveRecord on unknown id → throws Record not found.
  14. End-to-end hot → warm → cold sequence (acceptance criterion)
    • Create 1 record. Assert hot (position 1). Retrieve content.
    • Seed 100 more (total 101). Archive oldest → warm. Retrieve decompressed.
    • Seed 900 more (total 1001). Archive same record → cold. Retrieve → hash_only.
    • content_hash at cold equals the original hashContent(original_content).

Target: ~30–40 assertions across the above groups.
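
The computeZone boundaries in group 2 lend themselves to a table-driven check. The sketch below repeats computeZone with the thresholds from §3 (100 and 1000) so it runs standalone; in the real test file the function would be imported and each row would presumably become a Jest assertion.

```typescript
type Zone = 'hot' | 'warm' | 'cold';

const HOT_MAX_POSITION = 100;
const WARM_MAX_POSITION = 1000;

// Local copy of computeZone from section 3.4.
function computeZone(position: number): Zone {
  if (!Number.isInteger(position) || position < 1) {
    throw new Error(`computeZone: position must be a positive integer, got ${position}`);
  }
  if (position <= HOT_MAX_POSITION) return 'hot';
  if (position <= WARM_MAX_POSITION) return 'warm';
  return 'cold';
}

// Boundary cases from group 2, expressed as [position, expected zone].
const cases: ReadonlyArray<readonly [number, Zone]> = [
  [1, 'hot'],
  [100, 'hot'],
  [101, 'warm'],
  [1000, 'warm'],
  [1001, 'cold'],
  [100_000, 'cold'],
];

const failures = cases.filter(([position, expected]) => computeZone(position) !== expected);

// Position 0 is invalid and should throw.
let zeroThrows = false;
try {
  computeZone(0);
} catch {
  zeroThrows = true;
}

console.log(failures.length, zeroThrows);
```

Keeping the boundaries in one table makes an off-by-one in either threshold show up as a single failing row.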

Seeding 1001 records may take a moment on Windows CI; wrap the bulk insert in db.transaction to keep it fast. Rough cost estimate: under 300 ms per test that seeds 1001 records, well within Jest's default 5-second test timeout.


5. Commit plan

One commit for Step 4 (per CLAUDE.md §6), batching the three new files:

  • src/db/migrations/005_retention.sql
  • src/domains/proof/retention.ts
  • src/__tests__/domains/proof/retention.test.ts

Commit message: feat(p0-8-2): η three-zone retention (hot/warm/cold).

Then Step 5 Verify: one commit for docs/verification/p0-8-2-retention-verification.md.


6. Rollback strategy

If the tests fail or CI breaks:

  1. Local fix, re-commit (not amend), re-push. Never amend.
  2. If the migration is wrong, add a 006_retention_fix.sql rather than mutating 005 after merge. Pre-merge we can still rewrite 005.
  3. No data migration needed: since zero live Colibri DBs exist yet (Phase 0 pre-production), running the migration against an empty DB is trivial; pre-existing test DBs are all :memory: and thrown away after each test.

No downgrade path is provided: ALTER TABLE ... DROP COLUMN is only available from SQLite 3.35.0 onward, and there's no production data to lose.


7. Known hazards revisited

  • Cross-worktree leak — addressed: git status was clean on entry; will re-check after each step.
  • rowid ordering — addressed: every SQL in retention.ts uses rowid, never created_at.
  • Jest ESM + zod — no jest.isolateModulesAsync used. Tests use direct imports.
  • better-sqlite3 + empty exec — 005 migration has real SQL (3 ALTERs), safe.
  • Sandboxed Write on new file — if Write fails, fall back to touch && Edit.
  • CHECK constraint — intentionally omitted on zone per existing codebase pattern (the type column in 003_thought_records.sql also has no CHECK).

8. Acceptance → artifact map

| Packet section | Gate |
| --- | --- |
| §1 Files | Audit §2 + §3 |
| §2 Migration | Contract §3.1 |
| §3 retention.ts | Contract §2 exports |
| §4 Tests | Contract §6 |
| §5 Commits | CLAUDE.md §6 |
| §6 Rollback | Contract §5 invariants |

Ready to implement.


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.