Boot Sequence

The Colibri server boots in a strict 6-step sequence. Understanding this sequence is critical because it explains why the transport layer connects before the database is opened, and how tool calls are queued until the server is ready.

The 6 Steps

Step 1: Create Server

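import { Server } from '@modelcontextprotocol/sdk/server/index.js';

const server = new Server(
  { name: 'colibri', version: '0.1.0' },
  // Declare the tools capability up front; the SDK checks it when tool handlers are registered.
  { capabilities: { tools: {} } },
);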

A new MCP Server object is instantiated from @modelcontextprotocol/sdk. This object is stateless and does not yet have a transport or database.

Step 2: Register Handlers

The server registers the 14 shipped tool handlers:

  • 5 β Task tools: task_create, task_list, task_get, task_update (accepts status; routes through the state machine), task_next_actions
  • 4 ζ Audit tools: audit_session_start, thought_record, thought_record_list, audit_verify_chain
  • 2 η Proof tools: merkle_finalize, merkle_root
  • 1 ε Skill tool: skill_list
  • 2 α/γ System tools: server_ping, server_health

Each handler is registered with its Zod input schema. The schema is the contract; a tool without a schema cannot be registered. Handlers at this stage are disconnected from the database — they are just function references.
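
The schema-as-contract rule can be sketched as a registry that refuses any tool lacking a schema. This is a minimal illustration, not Colibri's actual code: ToolRegistry, Schema, and taskGetInput are hypothetical names, and the small Schema interface only stands in for a Zod schema (it mirrors Zod's parse method so the sketch runs without dependencies).

```typescript
// Stand-in for a Zod schema: anything whose parse() validates input or throws.
interface Schema<T> {
  parse(input: unknown): T;
}

class ToolRegistry {
  // Each entry validates the raw input first, then calls the handler.
  private tools = new Map<string, (raw: unknown) => Promise<unknown>>();

  register<T>(
    name: string,
    schema: Schema<T> | undefined,
    handler: (args: T) => Promise<unknown>,
  ): void {
    if (!schema) {
      throw new Error(`Tool "${name}" has no input schema; refusing to register`);
    }
    if (this.tools.has(name)) {
      throw new Error(`Tool "${name}" is already registered`);
    }
    const validated: Schema<T> = schema;
    this.tools.set(name, async (raw) => handler(validated.parse(raw)));
  }

  async call(name: string, raw: unknown): Promise<unknown> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool "${name}"`);
    return tool(raw);
  }

  get size(): number {
    return this.tools.size;
  }
}

// Hypothetical input schema for task_get: requires a string `id`.
const taskGetInput: Schema<{ id: string }> = {
  parse(input) {
    const id = (input as { id?: unknown } | null)?.id;
    if (typeof id !== 'string') {
      throw new Error('task_get: expected { id: string }');
    }
    return { id };
  },
};

const registry = new ToolRegistry();
// The handler is only a function reference here; no database is involved yet.
registry.register('task_get', taskGetInput, async ({ id }) => ({ id, status: 'stub' }));
```

Because validation is wrapped around the handler at registration time, a handler in this sketch can never see input that its schema has not accepted.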

Step 3: Connect Transport

The stdio transport is connected now, before the database is opened:

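import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';

// The SDK constructor takes the streams positionally and defaults to process.stdin/stdout.
const transport = new StdioServerTransport(process.stdin, process.stdout);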

await server.connect(transport);

The client immediately sends an MCP initialize request. The server responds with:

{
  "protocolVersion": "2024-11-05",
  "capabilities": { "tools": {} },
  "serverInfo": {
    "name": "colibri",
    "version": "0.1.0"
  }
}

At this point the transport is live and the client can send tool requests. But the database is not yet open. Any tool calls that arrive before Step 5 completes are queued behind a Promise gate.

Why transport first? The MCP handshake has a timeout (typically 10–30 seconds, depending on the client). If we wait to connect the transport until after the database is open, and the database takes longer than the timeout to initialize, the client gives up and closes the connection. By connecting first, we guarantee the handshake completes before the database initialization begins. This prevents the client from timing out during boot.

Step 4: Resolve initReady

The server waits for the MCP initialize handshake to fully complete:

await server.initReady;

This ensures the client has acknowledged the server’s capabilities and is ready to send tool calls. The handshake is now complete.

Step 5: Load Database

Now the database is opened:

const db = openDatabase('data/colibri.db');
db.pragma('journal_mode = WAL');

// Run migrations
runMigrations(db);

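// Verify integrity: the pragma reports problems instead of throwing, so inspect the result.
// Assumes a better-sqlite3-style pragma(); 'ok' is SQLite's "no errors found" value.
if (db.pragma('integrity_check', { simple: true }) !== 'ok') {
  process.exit(75);
}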

The database:

  • Is created if it does not exist.
  • Is opened with WAL (Write-Ahead Log) mode for durability.
  • Runs all pending schema migrations (stored in src/db/schema.sql and any incremental migration files).
  • Verifies integrity with PRAGMA integrity_check.

If any of these operations fails, the server exits with code 75 (resource error). Otherwise the Promise gate is released and all tool calls queued since Step 3 begin executing.

Step 6: Load Domains

Each domain loads its state from the database:

  1. ζ Decision Trail — verify the hash chain is intact (first few and last few records).
  2. η Proof Store — verify the Merkle tree structure is intact (spot-check).
  3. ε Skill Registry — parse all .agents/skills/SKILL.md files and register them with the server.
  4. β Task Pipeline — no pre-load needed; task state lives in the database.
  5. α/γ System — start the 30-second health check loop.
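
The head-and-tail spot check in item 1 can be sketched as follows. This is an illustration under stated assumptions, not Colibri's implementation: the ChainRecord shape, the all-zero GENESIS value, the SHA-256 hashing scheme, and the window size of 3 are all hypothetical.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical record shape: each record stores the hash of its predecessor.
interface ChainRecord {
  payload: string;
  prevHash: string;
  hash: string;
}

const GENESIS = '0'.repeat(64);

function computeHash(prevHash: string, payload: string): string {
  return createHash('sha256').update(prevHash).update(payload).digest('hex');
}

// True when record i has the right hash and links to record i - 1.
function linkIsValid(records: ChainRecord[], i: number): boolean {
  const expectedPrev = i === 0 ? GENESIS : records[i - 1].hash;
  const r = records[i];
  return r.prevHash === expectedPrev && r.hash === computeHash(r.prevHash, r.payload);
}

// Spot-check: verify only the first `window` and the last `window` links.
function spotCheckChain(records: ChainRecord[], window = 3): boolean {
  const n = records.length;
  const indices = new Set<number>();
  for (let i = 0; i < Math.min(window, n); i++) indices.add(i);
  for (let i = Math.max(0, n - window); i < n; i++) indices.add(i);
  return Array.from(indices).every((i) => linkIsValid(records, i));
}

// Build a small valid chain for demonstration.
function buildChain(payloads: string[]): ChainRecord[] {
  const out: ChainRecord[] = [];
  let prev = GENESIS;
  for (const payload of payloads) {
    const hash = computeHash(prev, payload);
    out.push({ payload, prevHash: prev, hash });
    prev = hash;
  }
  return out;
}
```

A spot check keeps boot fast but trades away coverage: in this sketch, tampering with a record in the middle of a long chain goes unnoticed until a full walk over every index is run.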

At the end of Step 6, the server is in one of 4 runtime modes (determined by COLIBRI_MODE):

  • FULL: all 14 shipped tools active (default).
  • READONLY: read-only tools only.
  • TEST: all tools active with deterministic randomness.
  • MINIMAL: server_ping + server_health only.

The Promise gate was already released at the end of Step 5, so by this point any queued tool calls are executing against a live database.


The Promise Gate

Tool calls that arrive between Step 3 (transport connected) and Step 5 (database open) do not fail. Instead, they are queued behind a Promise gate:

// The "!" tells TypeScript that dbReady is assigned inside the Promise executor below.
let dbReady!: () => void;
const dbReadyPromise = new Promise<void>((resolve) => {
  dbReady = resolve;
});

// In step 5, after database is open:
dbReady();

// In every tool handler, at the start:
await dbReadyPromise;

This means:

  • The client never sees a “server not ready” error. Tool calls succeed or fail on their merits, not because of boot timing.
  • No retry logic is needed on the client side. The server buffers early calls internally.
  • Ordering is preserved. If the client sends 5 tool calls before boot completes, they pass the gate in the order they arrived.

In practice, because Steps 1–5 typically take less than 100 milliseconds, the gate rarely matters. But it guarantees that boot timing never causes a tool call to fail.
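
The ordering claim can be checked with a small runnable demonstration. It is a sketch only: handleToolCall, executionOrder, and demo are hypothetical names, and a 20 ms timer stands in for the database open.

```typescript
// Deferred promise: resolved exactly once, when the (simulated) database is open.
let dbReady!: () => void;
const dbReadyPromise = new Promise<void>((resolve) => {
  dbReady = resolve;
});

const executionOrder: string[] = [];

// Hypothetical handler: every tool call waits at the gate before doing any work.
async function handleToolCall(name: string): Promise<string> {
  await dbReadyPromise;
  executionOrder.push(name);
  return `${name}: ok`;
}

async function demo(): Promise<string[]> {
  // Three calls arrive while the database is still closed.
  const pending = [
    handleToolCall('task_list'),
    handleToolCall('task_get'),
    handleToolCall('server_ping'),
  ];

  // Simulate a slow database open, then release the gate.
  await new Promise((resolve) => setTimeout(resolve, 20));
  dbReady();

  await Promise.all(pending);
  return executionOrder;
}
```

Waiters on a single promise resume in the order they started waiting, which is what preserves arrival order here. Once past the gate, handlers that await further asynchronous work may still interleave with each other.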


Boot Failure Paths

If any step fails, the server exits with a specific code:

Step | Failure | Exit Code | Meaning
1 | Out of memory | 1 | Generic error (Node.js killed the process)
2 | Handler registration failed | 73 | CONFIG: one of the 14 tool schemas is malformed
3 | Transport connection failed | 1 | Generic error (e.g., stdin closed)
4 | Handshake timeout | 1 | Client closed connection before handshake completed
5 | Database open failed | 75 | RESOURCE: cannot open data/colibri.db
5 | Migration failed | 75 | RESOURCE: a migration failed (schema corruption?)
5 | Integrity check failed | 75 | RESOURCE: database integrity error
6 | Domain load failed | 75 | RESOURCE: e.g., malformed SKILL.md files

Exit codes follow a pattern:

Code | Meaning | When
0 | Success | Clean shutdown via SIGINT/SIGTERM
1 | Generic error | Unexpected crash, unhandled exception
71 | Lock error | Process-level lock contention (if using file locks)
73 | Config error | Configuration validation failure
75 | Resource error | Database, file, or memory issue

These exit codes are consumed by orchestration tools (systemd, Kubernetes, PM2, etc.) to determine whether to restart automatically or escalate to an operator.
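
One way to keep these codes consistent across boot steps is to attach the code to the error and translate it in a single place. The sketch below assumes a hypothetical BootError class and exitCodeFor helper; only the numeric values come from the table above.

```typescript
// Exit codes as documented in the table above.
const ExitCode = {
  OK: 0,
  GENERIC: 1,
  LOCK: 71,
  CONFIG: 73,
  RESOURCE: 75,
} as const;

type ExitCodeValue = (typeof ExitCode)[keyof typeof ExitCode];

// Hypothetical error type that carries the exit code a failing boot step wants.
class BootError extends Error {
  readonly exitCode: ExitCodeValue;

  constructor(message: string, exitCode: ExitCodeValue) {
    super(message);
    this.name = 'BootError';
    this.exitCode = exitCode;
  }
}

// Map any thrown value to an exit code; unrecognized failures become GENERIC.
function exitCodeFor(err: unknown): ExitCodeValue {
  return err instanceof BootError ? err.exitCode : ExitCode.GENERIC;
}
```

A top-level boot function could then end with a single catch that logs the error and calls process.exit(exitCodeFor(err)), so no individual step needs to exit the process itself.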


Configuration at Boot

The server reads configuration exactly once at boot from environment variables and an optional .env file:

Variable | Default | Purpose
COLIBRI_MODE | FULL | Runtime mode: FULL, READONLY, TEST, MINIMAL
COLIBRI_DB_PATH | data/colibri.db | Path to SQLite database
COLIBRI_LOG_LEVEL | info | Log level: silent, error, warn, info, debug
COLIBRI_STARTUP_TIMEOUT_MS | 30000 | Hard ceiling for boot; fail fast if exceeded

Configuration is frozen after boot. No tool call can change the mode, database path, or log level. If a config change is needed, the server must be restarted.

No AMS_* variables are read. The donor runtime used AMS_DB_PATH, AMS_MODE, AMS_LOG_LEVEL, etc. Colibri uses COLIBRI_* only.
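
The read-once, frozen-afterwards behavior can be sketched like this. The variable names and defaults come from the table above; BootConfig, loadConfig, and oneOf are hypothetical names, and the validation rules (for example, requiring a positive timeout) are assumptions.

```typescript
const MODES = ['FULL', 'READONLY', 'TEST', 'MINIMAL'] as const;
const LOG_LEVELS = ['silent', 'error', 'warn', 'info', 'debug'] as const;

type Mode = (typeof MODES)[number];
type LogLevel = (typeof LOG_LEVELS)[number];

interface BootConfig {
  readonly mode: Mode;
  readonly dbPath: string;
  readonly logLevel: LogLevel;
  readonly startupTimeoutMs: number;
}

type Env = Record<string, string | undefined>;

// Return `value` if it is one of `allowed`; otherwise throw a validation error.
function oneOf<T extends string>(name: string, value: string, allowed: readonly T[]): T {
  const match = allowed.find((a) => a === value);
  if (match === undefined) {
    throw new Error(`${name} must be one of ${allowed.join(', ')}; got "${value}"`);
  }
  return match;
}

// Reads only COLIBRI_* keys; AMS_* keys in `env` are never consulted.
function loadConfig(env: Env): BootConfig {
  const timeout = Number(env.COLIBRI_STARTUP_TIMEOUT_MS ?? '30000');
  if (!Number.isFinite(timeout) || timeout <= 0) {
    throw new Error('COLIBRI_STARTUP_TIMEOUT_MS must be a positive number');
  }
  // Freeze so that no later code path can mutate the configuration.
  return Object.freeze({
    mode: oneOf('COLIBRI_MODE', env.COLIBRI_MODE ?? 'FULL', MODES),
    dbPath: env.COLIBRI_DB_PATH ?? 'data/colibri.db',
    logLevel: oneOf('COLIBRI_LOG_LEVEL', env.COLIBRI_LOG_LEVEL ?? 'info', LOG_LEVELS),
    startupTimeoutMs: timeout,
  });
}
```

Calling loadConfig(process.env) once at the start of boot and passing the result around avoids any later reads of the environment. A validation failure here would presumably map to exit code 73 (config error).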


Startup Timeout

The COLIBRI_STARTUP_TIMEOUT_MS variable enforces a hard ceiling on boot. If the boot sequence has not finished within this many milliseconds, the server logs an error and exits with code 75 (resource error):

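// Read once at boot; falls back to the documented default of 30000 ms.
const startupTimeout = Number(process.env.COLIBRI_STARTUP_TIMEOUT_MS ?? 30000);
const deadline = Date.now() + startupTimeout;

// Periodically check: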
if (Date.now() > deadline) {
  console.error('Boot timeout exceeded');
  process.exit(75);
}

This prevents a hung migration or stuck domain load from blocking the server forever. Default is 30 seconds; this is typically generous for a local SQLite database.
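
For asynchronous boot steps, the same ceiling can be expressed by racing the step against a timer instead of polling a deadline. This is a sketch of that alternative, not the server's actual mechanism; withDeadline is a hypothetical helper.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withDeadline<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} exceeded ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

One caveat applies to this sketch and to any timer-based check: timers only fire when the event loop is free. If a step blocks synchronously (for example, a synchronous SQLite call inside a migration), neither a race nor a periodic deadline check can interrupt it until that call returns.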


Health Check Loop

At the end of Step 6, the server starts a health check loop that runs every 30 seconds:

  1. Sample SQLite PRAGMA integrity_check.
  2. Check memory usage.
  3. Measure event loop latency.
  4. Verify watcher status.
  5. Check task queue depth.

If a health check fails, the server transitions to SAFE_MODE or DIAGNOSE mode, reducing the available tool set to read-only or diagnostics-only. This prevents a degraded server from corrupting the database.
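
A single pass of such a loop might look like the sketch below. The probe names, the Probe type, and both helper functions are hypothetical; the source does not specify thresholds or how a failure selects SAFE_MODE versus DIAGNOSE, so the sketch only reports which probes failed.

```typescript
// One probe resolves true when healthy. Probe names are supplied by the caller.
type Probe = () => Promise<boolean>;

// Event loop latency: how late a 0 ms timer actually fires.
function eventLoopLagMs(): Promise<number> {
  const start = performance.now();
  return new Promise((resolve) => {
    setTimeout(() => resolve(performance.now() - start), 0);
  });
}

// Run every probe in order; a probe that throws counts as a failure.
async function runHealthChecks(probes: Record<string, Probe>): Promise<string[]> {
  const failed: string[] = [];
  for (const [name, probe] of Object.entries(probes)) {
    try {
      if (!(await probe())) failed.push(name);
    } catch {
      failed.push(name);
    }
  }
  return failed;
}
```

A scheduler could call runHealthChecks from a setInterval with a 30000 ms period and hand a non-empty result to whatever logic decides the mode transition. A probe such as async () => (await eventLoopLagMs()) < 100 would cover item 3, with the 100 ms threshold being an arbitrary illustrative value.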



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
