Boot Sequence
The Colibri server boots in a strict 6-step sequence. Understanding this sequence is critical because it explains why the transport layer connects before the database is opened, and how tool calls are queued until the server is ready.
The 6 Steps
Step 1: Create Server
const server = new Server({
name: 'colibri',
version: '0.1.0',
});
A new MCP Server object is instantiated from @modelcontextprotocol/sdk. This object is stateless and does not yet have a transport or database.
Step 2: Register Handlers
The server registers the 14 shipped tool handlers:
- 5 β Task tools: task_create, task_list, task_get, task_update (accepts status; routes through the state machine), task_next_actions
- 4 ζ Audit tools: audit_session_start, thought_record, thought_record_list, audit_verify_chain
- 2 η Proof tools: merkle_finalize, merkle_root
- 1 ε Skill tool: skill_list
- 2 α/γ System tools: server_ping, server_health
Each handler is registered with its Zod input schema. The schema is the contract; a tool without a schema cannot be registered. Handlers at this stage are disconnected from the database — they are just function references.
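As a rough illustration of the "no schema, no registration" rule, here is a minimal sketch. It does not use the MCP SDK or Zod directly: InputSchema, ToolRegistry, and pingSchema are hypothetical names, and the schema interface only mirrors the parse() method that Zod schemas expose.

```typescript
// Hypothetical stand-in for a Zod schema: anything with a parse() method
// that returns validated input or throws.
interface InputSchema<T> {
  parse(input: unknown): T;
}

type Handler<T> = (args: T) => Promise<unknown>;

interface RegisteredTool {
  schema: InputSchema<unknown>;
  handler: Handler<unknown>;
}

// Hypothetical registry: refuses any tool that arrives without a schema.
class ToolRegistry {
  private readonly tools = new Map<string, RegisteredTool>();

  register<T>(name: string, schema: InputSchema<T> | undefined, handler: Handler<T>): void {
    if (!schema || typeof schema.parse !== 'function') {
      throw new Error(`Tool "${name}" has no input schema`);
    }
    if (this.tools.has(name)) {
      throw new Error(`Tool "${name}" is already registered`);
    }
    this.tools.set(name, { schema, handler: handler as Handler<unknown> });
  }

  // Validate the raw arguments against the schema, then run the handler.
  async call(name: string, rawArgs: unknown): Promise<unknown> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool "${name}"`);
    return tool.handler(tool.schema.parse(rawArgs));
  }

  get size(): number {
    return this.tools.size;
  }
}

// Illustrative schema; the { message } shape is an assumption, not Colibri's.
const pingSchema: InputSchema<{ message: string }> = {
  parse(input) {
    const candidate = input as { message?: unknown } | null;
    if (typeof candidate === 'object' && candidate !== null && typeof candidate.message === 'string') {
      return { message: candidate.message };
    }
    throw new Error('Invalid input: expected { message: string }');
  },
};
```

Registering with `undefined` in place of a schema throws at boot, which is the kind of failure the document maps to exit code 73.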
Step 3: Connect Transport
The stdio transport is connected now, before the database is opened:
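// StdioServerTransport takes the streams as positional arguments
// (both optional; they default to process.stdin and process.stdout).
const transport = new StdioServerTransport(process.stdin, process.stdout);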
await server.connect(transport);
The client immediately sends an MCP initialize request. The server responds with:
{
"protocolVersion": "2024-11-05",
"capabilities": { "tools": {} },
"serverInfo": {
"name": "colibri",
"version": "0.1.0"
}
}
At this point the transport is live and the client can send tool requests. But the database is not yet open. Any tool calls that arrive before Step 5 completes are queued behind a Promise gate.
Why transport first? The MCP handshake has a client-side timeout (typically 10–30 seconds, depending on the client). If the transport were connected only after the database is open, a slow database initialization could outlast that timeout, and the client would give up and close the connection. Connecting first means the handshake completes before database initialization begins, so boot time cannot cause a handshake timeout.
Step 4: Resolve initReady
The server waits for the MCP initialize handshake to fully complete:
await server.initReady;
This ensures the client has acknowledged the server’s capabilities and is ready to send tool calls. The handshake is now complete.
Step 5: Load Database
Now the database is opened:
const db = openDatabase('data/colibri.db');
db.pragma('journal_mode = WAL');
// Run migrations
runMigrations(db);
// Verify integrity
db.pragma('integrity_check');
The database:
- Is created if it does not exist.
- Is opened in WAL (write-ahead logging) mode, so readers do not block the writer.
- Runs all pending schema migrations (stored in src/db/schema.sql and any incremental migration files).
- Verifies integrity with PRAGMA integrity_check.
If any of these sub-steps fails, the server exits with code 75 (resource error). Otherwise the Promise gate is released, and all tool calls queued since Step 3 begin executing.
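The Step 5 snippet calls the integrity pragma without showing how its result is inspected. SQLite reports success as a single row containing 'ok' and reports problems as one or more rows of text. The sketch below interprets such a result, assuming the pragma call returns an array of row objects (as better-sqlite3 does); integrityProblems is a hypothetical helper name.

```typescript
// One row of pragma output, e.g. { integrity_check: 'ok' }.
type PragmaRow = Record<string, unknown>;

// Returns an empty array when the database is healthy,
// otherwise the list of problem descriptions reported by SQLite.
function integrityProblems(rows: readonly PragmaRow[]): string[] {
  const messages: string[] = [];
  for (const row of rows) {
    for (const value of Object.values(row)) {
      messages.push(String(value));
    }
  }
  if (messages.length === 0) return ['integrity_check returned no rows'];
  if (messages.length === 1 && messages[0] === 'ok') return [];
  return messages;
}
```

A boot routine could treat any non-empty result as a Step 5 failure and exit with code 75.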
Step 6: Load Domains
Each domain loads its state from the database:
- ζ Decision Trail: verify the hash chain is intact (first few and last few records).
- η Proof Store: verify the Merkle tree structure is intact (spot-check).
- ε Skill Registry: parse all .agents/skills/SKILL.md files and register them with the server.
- β Task Pipeline: no pre-load needed; task state lives in the database.
- α/γ System: start the 30-second health check loop.
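To make the "first few and last few records" check concrete, here is a minimal sketch of a hash-chain spot-check. The record layout (payload, prevHash, hash), the SHA-256 construction, and the window size of 3 are all assumptions made for illustration; the actual ζ record format is not described here.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical record shape: each record commits to its predecessor's hash.
interface ChainRecord {
  payload: string;
  prevHash: string; // '' for the first record
  hash: string; // hex sha256 over prevHash followed by payload
}

function hashRecord(prevHash: string, payload: string): string {
  return createHash('sha256').update(prevHash).update(payload).digest('hex');
}

function appendRecord(chain: ChainRecord[], payload: string): void {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : '';
  chain.push({ payload, prevHash, hash: hashRecord(prevHash, payload) });
}

// Verify records in [start, end): each hash must match its own content
// and each prevHash must match the preceding record's hash.
function verifyRange(chain: readonly ChainRecord[], start: number, end: number): boolean {
  for (let i = start; i < end; i++) {
    const rec = chain[i];
    const expectedPrev = i === 0 ? '' : chain[i - 1].hash;
    if (rec.prevHash !== expectedPrev) return false;
    if (rec.hash !== hashRecord(rec.prevHash, rec.payload)) return false;
  }
  return true;
}

// Boot-time spot-check: only the first and last `window` records.
function spotCheck(chain: readonly ChainRecord[], window = 3): boolean {
  const headEnd = Math.min(window, chain.length);
  const tailStart = Math.max(headEnd, chain.length - window);
  return verifyRange(chain, 0, headEnd) && verifyRange(chain, tailStart, chain.length);
}
```

The trade-off is speed against coverage: a spot-check keeps boot fast but cannot detect tampering in the middle of the chain, which only a full pass such as `verifyRange(chain, 0, chain.length)` would catch.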
At the end of Step 6, the server is in one of 4 runtime modes (determined by COLIBRI_MODE):
- FULL: all 14 shipped tools active (default).
- READONLY: read-only tools only.
- TEST: all tools active with deterministic randomness.
- MINIMAL: server_ping + server_health only.
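One way the mode could translate into an active tool set is sketched below. parseMode and activeTools are hypothetical names. Which tools count as read-only is not specified in this document, so the sketch takes that set as a parameter, and rejecting unknown mode values (rather than falling back to FULL) is an assumption.

```typescript
const MODES = ['FULL', 'READONLY', 'TEST', 'MINIMAL'] as const;
type Mode = (typeof MODES)[number];

// Parse COLIBRI_MODE; an unset or empty value falls back to the documented default.
function parseMode(raw: string | undefined): Mode {
  if (raw === undefined || raw === '') return 'FULL';
  const match = MODES.find((m) => m === raw);
  if (match === undefined) {
    throw new Error(`Invalid COLIBRI_MODE "${raw}"; expected one of ${MODES.join(', ')}`);
  }
  return match;
}

// Select the tools exposed in a given mode.
// `readOnlyTools` is supplied by the caller because the document does not list them.
function activeTools(
  mode: Mode,
  allTools: readonly string[],
  readOnlyTools: ReadonlySet<string>,
): string[] {
  switch (mode) {
    case 'FULL':
    case 'TEST':
      return [...allTools];
    case 'READONLY':
      return allTools.filter((t) => readOnlyTools.has(t));
    case 'MINIMAL':
      return allTools.filter((t) => t === 'server_ping' || t === 'server_health');
  }
}
```

For example, `activeTools(parseMode('MINIMAL'), tools, new Set())` keeps only server_ping and server_health from the full tool list.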
Tool calls released from the gate in Step 5 now run against a live database with all domains loaded.
The Promise Gate
Tool calls that arrive between Step 3 (transport connected) and Step 5 (database open) do not fail. Instead, they are queued behind a Promise gate:
let dbReady: () => void;
const dbReadyPromise = new Promise<void>((resolve) => {
dbReady = resolve;
});
// In step 5, after database is open:
dbReady();
// In every tool handler, at the start:
await dbReadyPromise;
This means:
- The client never sees a “server not ready” error. All tool calls succeed or fail on their merits, not due to boot timing.
- No retry logic is needed on the client side. The server handles the buffering internally.
- Ordering is preserved. If the client sends 5 tool calls before boot completes, they resume in arrival order once the gate opens.
In practice, because Steps 1–5 typically take less than 100 milliseconds, the gate rarely matters. But it guarantees that boot timing never causes a tool call to fail.
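The ordering claim can be checked with a small self-contained demo. This is a sketch of the same pattern, not the server's code: createGate and demoGate are hypothetical names, and the "handlers" only record the order in which they resume.

```typescript
interface Gate {
  ready: Promise<void>;
  open: () => void;
}

// One-shot gate: every caller awaits `ready`; `open()` releases all of them.
function createGate(): Gate {
  let open!: () => void;
  const ready = new Promise<void>((resolve) => {
    open = resolve;
  });
  return { ready, open };
}

async function demoGate(): Promise<{ ranBeforeOpen: number; order: number[] }> {
  const gate = createGate();
  const order: number[] = [];

  // Stand-in for a tool handler: wait for the gate, then do the work.
  const handleToolCall = async (id: number): Promise<void> => {
    await gate.ready;
    order.push(id);
  };

  // Three calls arrive while the "database" is still closed.
  const pending = [1, 2, 3].map((id) => handleToolCall(id));
  await Promise.resolve(); // give pending microtasks a chance to run
  const ranBeforeOpen = order.length;

  gate.open(); // Step 5 finished: release everything that was waiting
  await Promise.all(pending);
  return { ranBeforeOpen, order };
}
```

Continuations waiting on the same promise resume in the order they started waiting, which is what preserves arrival order. Note that this covers only the order in which handlers resume; handlers that do further asynchronous work may still finish in a different order.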
Boot Failure Paths
If any step fails, the server exits with a specific code:
| Step | Failure | Exit Code | Meaning |
|---|---|---|---|
| 1 | Out of memory | 1 | Generic error (Node.js killed the process) |
| 2 | Handler registration failed | 73 | CONFIG — one of the 14 tool schemas is malformed |
| 3 | Transport connection failed | 1 | Generic error (e.g., stdin closed) |
| 4 | Handshake timeout | 1 | Client closed connection before handshake completed |
| 5 | Database open failed | 75 | RESOURCE — cannot open data/colibri.db |
| 5 | Migration failed | 75 | RESOURCE — a migration failed (schema corruption?) |
| 5 | Integrity check failed | 75 | RESOURCE — database integrity error |
| 6 | Domain load failed | 75 | RESOURCE — e.g., malformed SKILL.md files |
Exit codes follow a pattern:
| Code | Meaning | When |
|---|---|---|
| 0 | Success | Clean shutdown via SIGINT/SIGTERM |
| 1 | Generic error | Unexpected crash, unhandled exception |
| 71 | Lock error | Process-level lock contention (if using file locks) |
| 73 | Config error | Configuration validation failure |
| 75 | Resource error | Database, file, or memory issue |
These exit codes are consumed by orchestration tools (systemd, Kubernetes, PM2, etc.) to determine whether to restart automatically or escalate to an operator.
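The two tables above can be expressed as a small lookup. This is a sketch only: ExitCode, BootStep, and exitCodeForStep are hypothetical names, and the mapping simply restates the table rows.

```typescript
// Exit codes from the table above.
const ExitCode = {
  SUCCESS: 0,
  GENERIC: 1,
  LOCK: 71,
  CONFIG: 73,
  RESOURCE: 75,
} as const;

type ExitCodeValue = (typeof ExitCode)[keyof typeof ExitCode];
type BootStep = 1 | 2 | 3 | 4 | 5 | 6;

// Map the boot step that failed to the exit code listed for it.
function exitCodeForStep(step: BootStep): ExitCodeValue {
  switch (step) {
    case 2:
      return ExitCode.CONFIG; // malformed tool schema
    case 5:
    case 6:
      return ExitCode.RESOURCE; // database or domain load failure
    default:
      return ExitCode.GENERIC; // steps 1, 3, and 4
  }
}
```

A boot routine could catch a failure, record which step it occurred in, and pass `exitCodeForStep(step)` to `process.exit`.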
Configuration at Boot
The server reads configuration exactly once at boot from environment variables and an optional .env file:
| Variable | Default | Purpose |
|---|---|---|
| COLIBRI_MODE | FULL | Runtime mode: FULL, READONLY, TEST, MINIMAL |
| COLIBRI_DB_PATH | data/colibri.db | Path to SQLite database |
| COLIBRI_LOG_LEVEL | info | silent, error, warn, info, debug |
| COLIBRI_STARTUP_TIMEOUT_MS | 30000 | Hard ceiling for boot; fail fast if exceeded |
Configuration is frozen after boot. No tool call can change the mode, database path, or log level. If a config change is needed, the server must be restarted.
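A read-once, frozen configuration might look like the sketch below. loadConfig, BootConfig, and oneOf are hypothetical names, the environment is passed in as a plain map (parsing of the optional .env file is out of scope here), and rejecting invalid values outright is an assumption.

```typescript
interface BootConfig {
  readonly mode: 'FULL' | 'READONLY' | 'TEST' | 'MINIMAL';
  readonly dbPath: string;
  readonly logLevel: 'silent' | 'error' | 'warn' | 'info' | 'debug';
  readonly startupTimeoutMs: number;
}

const CONFIG_MODES = ['FULL', 'READONLY', 'TEST', 'MINIMAL'] as const;
const LOG_LEVELS = ['silent', 'error', 'warn', 'info', 'debug'] as const;

// Pick a value from a fixed set, using the fallback when the variable is unset.
function oneOf<T extends string>(
  name: string,
  raw: string | undefined,
  allowed: readonly T[],
  fallback: T,
): T {
  if (raw === undefined || raw === '') return fallback;
  const match = allowed.find((v) => v === raw);
  if (match === undefined) {
    throw new Error(`${name} must be one of ${allowed.join(', ')}; got "${raw}"`);
  }
  return match;
}

// Read COLIBRI_* variables once and return an immutable config object.
function loadConfig(env: Record<string, string | undefined>): BootConfig {
  const rawTimeout = env['COLIBRI_STARTUP_TIMEOUT_MS'];
  const startupTimeoutMs =
    rawTimeout === undefined || rawTimeout === '' ? 30000 : Number(rawTimeout);
  if (!Number.isInteger(startupTimeoutMs) || startupTimeoutMs <= 0) {
    throw new Error(`COLIBRI_STARTUP_TIMEOUT_MS must be a positive integer; got "${rawTimeout}"`);
  }
  return Object.freeze({
    mode: oneOf('COLIBRI_MODE', env['COLIBRI_MODE'], CONFIG_MODES, 'FULL'),
    dbPath: env['COLIBRI_DB_PATH'] || 'data/colibri.db',
    logLevel: oneOf('COLIBRI_LOG_LEVEL', env['COLIBRI_LOG_LEVEL'], LOG_LEVELS, 'info'),
    startupTimeoutMs,
  });
}
```

Calling `loadConfig(process.env)` once at boot and passing the frozen object around keeps later code from mutating it by accident; AMS_* variables are never consulted.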
No AMS_* variables are read. The donor runtime used AMS_DB_PATH, AMS_MODE, AMS_LOG_LEVEL, etc. Colibri uses COLIBRI_* only.
Startup Timeout
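The COLIBRI_STARTUP_TIMEOUT_MS variable enforces a hard ceiling on boot. The deadline is computed once when boot starts, so it bounds the whole sequence rather than each step separately. If boot is still running when the deadline passes, the server logs an error and exits with code 75 (resource):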
const deadline = Date.now() + startupTimeout;
// Periodically check:
if (Date.now() > deadline) {
console.error('Boot timeout exceeded');
process.exit(75);
}
This prevents a hung migration or stuck domain load from blocking the server forever. Default is 30 seconds; this is typically generous for a local SQLite database.
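An alternative to periodic polling is to race the boot work against a timer. The sketch below shows that pattern under stated assumptions: withDeadline is a hypothetical helper, and the caller is expected to translate the rejection into exit code 75.

```typescript
// Reject if `work` has not settled within `timeoutMs`.
function withDeadline<T>(work: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} exceeded ${timeoutMs} ms`));
    }, timeoutMs);
  });
  // Whichever settles first wins; always clear the timer afterwards
  // so a finished boot does not keep a stray timer alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

One caveat applies to both this pattern and the periodic check: a timer cannot fire while synchronous code is blocking the event loop. If a migration runs as one long synchronous call, the deadline is only noticed after that call returns.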
Health Check Loop
At the end of Step 6, the server starts a health check loop that runs every 30 seconds:
- Sample SQLite PRAGMA integrity_check.
- Check memory usage.
- Measure event loop latency.
- Verify watcher status.
- Check task queue depth.
If a health check fails, the server transitions to SAFE_MODE or DIAGNOSE mode, reducing the available tool set to read-only or diagnostics-only. This prevents a degraded server from corrupting the database.
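Of the checks listed, event loop latency is the least obvious to measure. A common approach is to schedule a timer and see how late it fires. The sketch below assumes that approach; measureEventLoopLag is a hypothetical name, and the thresholds that would trigger a mode change are not specified here.

```typescript
// Schedule a timer for `intervalMs` and report how much later than that it fired.
// A busy event loop delays the callback, so the overshoot approximates loop latency.
function measureEventLoopLag(intervalMs = 10): Promise<number> {
  return new Promise<number>((resolve) => {
    const start = performance.now();
    setTimeout(() => {
      const lag = performance.now() - start - intervalMs;
      resolve(Math.max(0, lag)); // clamp small negative values from timer rounding
    }, intervalMs);
  });
}
```

A health loop could await this once per cycle and compare the result against a limit of its choosing. Node.js also provides `monitorEventLoopDelay` in `node:perf_hooks` for histogram-based sampling.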
See Also
- Database: data/colibri.db — migrations and schema initialization
- α System Core — middleware chain that runs after boot
- γ Server Lifecycle — shutdown sequence (mirror of boot)
- colibri-system.md — canonical vision including boot as a concept