Backup & Restore

Phase 0 Status: This document describes target behavior. No Colibri TypeScript code exists yet. Implementation begins at Phase 0. The commands below assume the Phase 0 server entry point (src/server.ts → dist/server.js) and primary database (data/colibri.db) have been created by P0.2.1 and P0.2.2.

What the backup covers

Phase 0 Colibri is a single-writer SQLite system. All durable state lives in data/colibri.db — tasks, skills, thought records, Merkle tree, audit chain. A backup of that one file is a full backup of the system.

Three tables are load-bearing and must round-trip without loss:

| Table | Role |
| --- | --- |
| `tasks` | β pipeline state (status, progress, deps) |
| `thought_records` | ζ decision trail (audit chain, HMAC-linked) |
| `merkle_nodes` | η proof tree (canonical hash roots) |

If any of these three is missing or corrupt after restore, the backup failed.
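That round-trip requirement can be spot-checked mechanically after a restore. A minimal sketch using Python's stdlib `sqlite3` module (illustrative tooling only, not Phase 0 project code; `spot_check` is a hypothetical helper, and the table names are the three from the table above):

```python
import sqlite3

LOAD_BEARING = ("tasks", "thought_records", "merkle_nodes")

def spot_check(db_path):
    """Return row counts for the load-bearing tables, raising if any is missing."""
    conn = sqlite3.connect(db_path)
    try:
        counts = {}
        for table in LOAD_BEARING:
            # Look the table up in sqlite_master first so a missing table
            # produces a clear error rather than a generic SQL failure.
            row = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
                (table,),
            ).fetchone()
            if row is None:
                raise RuntimeError(f"load-bearing table missing: {table}")
            counts[table] = conn.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

A zero row count is not by itself a failure (a fresh Phase 0 DB starts empty); a missing table is.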

SQLite WAL mode note

Phase 0 runs SQLite in WAL mode (set at boot by src/db/index.ts). A live DB has three files on disk:

  • data/colibri.db (main)
  • data/colibri.db-wal (write-ahead log)
  • data/colibri.db-shm (shared memory index)

A naive cp data/colibri.db … while the server is running captures the main file without the WAL, losing recent writes. Use the SQLite .backup command instead — it is WAL-aware and produces a consistent snapshot without stopping the server.

Backup command (canonical)

```sh
sqlite3 data/colibri.db ".backup data/backups/colibri-<round>-<date>.db"
```

Example:

```sh
mkdir -p data/backups
sqlite3 data/colibri.db ".backup data/backups/colibri-r75-20260416.db"
```

The <round> slug (e.g. r75) lets you tie a snapshot to a sealed round; <date> is YYYYMMDD.
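The same snapshot can also be taken programmatically through SQLite's Online Backup API, which is the mechanism behind the CLI's `.backup` and is likewise WAL-aware. A sketch in Python's stdlib `sqlite3` module (illustrative, not Phase 0 code; `snapshot_path` and `backup` are invented names, the filename convention is the one above):

```python
import datetime
import os
import sqlite3

def snapshot_path(round_slug, backups_dir="data/backups"):
    """Build the colibri-<round>-<date>.db name, with <date> as YYYYMMDD."""
    date = datetime.date.today().strftime("%Y%m%d")
    return os.path.join(backups_dir, f"colibri-{round_slug}-{date}.db")

def backup(db_path, dest_path):
    """Consistent snapshot via the Online Backup API (what .backup uses):
    safe against a live writer, recent WAL content included."""
    os.makedirs(os.path.dirname(dest_path), exist_ok=True)
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return dest_path
```

Usage: `backup("data/colibri.db", snapshot_path("r75"))`.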

Cadence

| Trigger | Retention tier | Action |
| --- | --- | --- |
| End of every round (Sigma seal) | hot | Fresh `.backup` into `data/backups/` |
| End of every session | hot | Fresh `.backup` into `data/backups/` |
| Phase seal (Phase 0 → Phase 1, etc.) | cold | Frozen snapshot, copied off-host |
| Ad-hoc before destructive migration | hot | Extra `.backup` named `colibri-premigrate-…` |

Retention tiers

| Tier | Location | Age | Policy |
| --- | --- | --- | --- |
| hot | `data/backups/` | 0–7 days | Keep every round + session snapshot |
| warm | `data/backups/` | 7–30 days | Keep one per round only |
| cold | Off-host (external drive, cloud) | ≥ phase seal | Keep forever; one per phase seal |
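The hot → warm transition reduces to a pure pruning decision over the snapshot filenames. A hypothetical sketch (Python; `plan_retention` and its `(filename, age_days)` input shape are invented for illustration — a real pruner would derive ages from file mtimes and delete everything not in the returned set):

```python
import re

# Matches the colibri-<round>-<date>.db convention, e.g. colibri-r75-20260416.db
SNAP_RE = re.compile(r"colibri-(r\d+)-(\d{8})\.db$")

def plan_retention(snapshots):
    """snapshots: list of (filename, age_days) pairs from data/backups/.
    Returns the set of filenames to keep: everything under 7 days (hot),
    the newest snapshot per round for 7-30 days (warm), nothing older --
    the cold tier lives off-host, not in data/backups/."""
    keep = set()
    newest_per_round = {}
    for name, age in snapshots:
        m = SNAP_RE.search(name)
        if m is None:
            keep.add(name)  # unrecognized file: never auto-prune it
            continue
        if age < 7:
            keep.add(name)
        elif age <= 30:
            rnd = m.group(1)
            prev = newest_per_round.get(rnd)
            if prev is None or age < prev[1]:
                newest_per_round[rnd] = (name, age)
    keep.update(name for name, _ in newest_per_round.values())
    return keep
```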

Integrity check

After every backup, and before every restore, run:

```sh
sqlite3 data/backups/colibri-r75-20260416.db "PRAGMA integrity_check;"
```

Expected output: exactly `ok`.

Any other result (missing pages, malformed index, row/page checksum mismatch) means the snapshot is not safe to restore from. Discard it and fall back to the previous hot-tier snapshot.

For a deeper check that also validates foreign keys:

```sh
sqlite3 data/backups/colibri-r75-20260416.db "PRAGMA foreign_key_check;"
```

Expected output: empty (no violations).
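Both pragmas can be wrapped into a single pass/fail gate. A sketch with Python's stdlib `sqlite3` module (illustrative; `verify_snapshot` is an invented name, the two checks are exactly the ones above):

```python
import sqlite3

def verify_snapshot(path):
    """Gate a snapshot: PRAGMA integrity_check must return exactly 'ok',
    and PRAGMA foreign_key_check must return no rows."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
        if rows != [("ok",)]:
            return False, [r[0] for r in rows]  # malformed pages / indexes
        violations = conn.execute("PRAGMA foreign_key_check;").fetchall()
        if violations:
            return False, violations  # dangling references
        return True, []
    finally:
        conn.close()
```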

Restore runbook

Phase 0 has a single MCP stdio server with no hot-standby. Restore is a stop-the-world operation.

  1. Stop the Colibri server. Send SIGINT to the process; the signal handler runs writeback, seals the active Merkle tree, closes the DB, and exits. Wait for the process to fully terminate.
  2. Verify the backup you intend to restore. Run PRAGMA integrity_check; on the candidate file (see above). If the result is not ok, pick a different snapshot.
  3. Move the current DB aside (do not delete — it may be needed for forensic diff):
    mv data/colibri.db data/colibri.db.broken-$(date +%Y%m%d-%H%M%S)
    rm -f data/colibri.db-wal data/colibri.db-shm
    
  4. Copy the backup into place:
    cp data/backups/colibri-r75-20260416.db data/colibri.db
    
  5. Re-check integrity on the restored file:
    sqlite3 data/colibri.db "PRAGMA integrity_check;"
    
  6. Restart the server (node dist/server.js, or the .vscode/mcp-settings.example.json launcher). First boot after restore re-opens the DB in WAL mode and rebuilds colibri.db-wal / colibri.db-shm from scratch.
  7. Verify the audit chain by calling audit_verify_chain via the MCP client. A clean restore returns ok. A break_at index means the chosen backup pre-dates a chain extension and cannot be trusted — try a newer snapshot.
  8. Verify the Merkle tree by calling merkle_root and comparing against the externally-stored root for that snapshot (see “External root anchoring” below).
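Steps 3–5 — the pure file shuffle between stop and restart — can be scripted. A sketch assuming the server is already stopped (Python stdlib; `restore` is a hypothetical helper, not a shipped command):

```python
import os
import shutil
import sqlite3
import time

def restore(snapshot, db_path="data/colibri.db"):
    """Runbook steps 3-5: set the current DB aside, copy the verified
    snapshot into place, re-check integrity. Server must be stopped first."""
    def is_ok(path):
        conn = sqlite3.connect(path)
        try:
            return conn.execute("PRAGMA integrity_check;").fetchone() == ("ok",)
        finally:
            conn.close()

    if not is_ok(snapshot):
        raise RuntimeError(f"snapshot failed integrity_check: {snapshot}")
    if os.path.exists(db_path):
        stamp = time.strftime("%Y%m%d-%H%M%S")
        os.replace(db_path, f"{db_path}.broken-{stamp}")  # keep for forensics
    for suffix in ("-wal", "-shm"):  # transient; rebuilt at next DB open
        try:
            os.remove(db_path + suffix)
        except FileNotFoundError:
            pass
    shutil.copyfile(snapshot, db_path)
    if not is_ok(db_path):
        raise RuntimeError("restored file failed integrity_check")
    return db_path
```

Steps 6–8 (restart, audit-chain and Merkle verification) stay manual: they need the live server and an MCP client.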

Corruption detection

Signs the live DB is corrupt:

  • Startup logs show SQLITE_CORRUPT or SQLITE_NOTADB.
  • PRAGMA integrity_check; returns anything other than ok.
  • Tool calls fail with database disk image is malformed (SQLITE_CORRUPT).
  • audit_verify_chain returns a break_at index from a tool call that should have extended the chain cleanly.

If any of these fire, stop writing immediately — SQLite does not self-heal a corrupt page — and go to the restore runbook.

Recovery from corruption

  1. Do not attempt repair-in-place. Phase 0 has no DB-repair tool (the donor-era npm run db:repair is not a Phase 0 feature).
  2. Follow the restore runbook against the most recent hot-tier snapshot that passes PRAGMA integrity_check.
  3. Keep the corrupt file (data/colibri.db.broken-…) for root-cause analysis. Do not overwrite it.
  4. If no hot-tier snapshot is clean, walk back through warm, then cold. Each generation you walk back loses work that occurred after that snapshot — log the loss in a thought_record at next boot so the gap is visible in the audit chain.
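The walk-back within one tier can be automated. A sketch (Python stdlib; `newest_clean_snapshot` is an invented helper that scans a single directory newest-first — continuing into warm and cold locations stays a manual decision):

```python
import os
import sqlite3

def newest_clean_snapshot(backups_dir="data/backups"):
    """Walk snapshots newest-first and return the first one that passes
    PRAGMA integrity_check, or None if every generation is bad."""
    candidates = sorted(
        (os.path.join(backups_dir, n) for n in os.listdir(backups_dir)
         if n.endswith(".db")),
        key=os.path.getmtime,
        reverse=True,
    )
    for path in candidates:
        conn = sqlite3.connect(path)
        try:
            if conn.execute("PRAGMA integrity_check;").fetchone() == ("ok",):
                return path
        except sqlite3.DatabaseError:
            pass  # SQLITE_NOTADB / malformed image: keep walking back
        finally:
            conn.close()
    return None
```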

External root anchoring

Merkle roots finalized by merkle_finalize are the proof anchors for what the state was at a given moment. They are small (a single hash) and cheap to store off-host.

At every round seal, record the finalized root somewhere outside data/colibri.db:

  • A line in the round’s seal document under docs/.
  • A line in the session seal.
  • Optionally an external append-only log (file, signed commit).

If the DB is lost entirely and no backup restores cleanly, the external root still proves what the canonical state was at the anchor point, even if the corresponding records cannot be recovered.
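Anchoring itself is just an append. A sketch of the file-based option (Python; `anchor_root` and the one-line format are invented for illustration — a line in the seal document or a signed commit serves the same purpose):

```python
import datetime

def anchor_root(root_hex, round_slug, log_path):
    """Append one line per seal -- '<ISO date> <round> <root>' -- to a log
    that lives outside data/colibri.db. Append mode only: never truncate,
    never rewrite history."""
    line = f"{datetime.date.today().isoformat()} {round_slug} {root_hex}\n"
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(line)
    return line
```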

What NOT to back up

| Path | Why skip |
| --- | --- |
| `.worktrees/` | Ephemeral per-task feature branches (SCRATCH zone) |
| `temp/` | Round staging + vault staging (SCRATCH zone; gitignored) |
| `node_modules/` | Re-installable from `package-lock.json` |
| `data/backups/` | Don't back up backups into themselves; use off-host for the cold tier |
| `data/colibri.db-wal`, `data/colibri.db-shm` | Transient; recreated at next DB open |

Heritage note

data/ams.db (72 MB) is donor runtime state from pre-R53 AMS. It is kept read-only during Phase 0 bootstrap as a task-store + writeback target until R78, after which it is frozen. It is not a backup target for Phase 0 and is not the primary DB. Do not restore into it and do not point Phase 0 code at it. Its presence in the tree is HERITAGE zone genealogy, not a live fallback. See data/README.md for the zone rules.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
