The Six Promises

Colibri’s existence is justified by six promises the runtime makes to every caller. They are not features; they are invariants. If any one is broken, the system stops being Colibri and becomes something weaker — a script runner, a log collector, or a decoration over LLM calls.

Each promise is enforced by specific Greek-letter concepts. As of R75 Wave H, 7 concepts ship code at colibri_code: partial (α β γ ε ζ η ν) and 8 remain colibri_code: none (δ deferred per ADR-005; θ ι κ λ μ ξ π spec-only for later phases). The promises below are enforced today wherever the enforcing concept ships, and planned for later phases where the enforcing concept is still spec-only.

1. Every mutation is gated

No state changes without passing the α middleware chain: tool-lock → schema-validate → audit-enter → dispatch → audit-exit. Five stages, same order, no exceptions, no bypass path for “trusted” callers.

Who enforces: α (middleware chain). See 2-plugin/middleware.md.
What this rules out: direct database writes, side-channel mutations, admin back doors, “quick fix” handlers that skip validation.
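
The fixed stage order above can be sketched as a single loop with no alternate entry point. This is a minimal illustration, not the actual α implementation: `CallContext`, `StageFn`, `runChain`, and the stub handlers are hypothetical names, and the sketch assumes a stage rejects a call by throwing.

```typescript
// Hypothetical types for illustration; the real chain is specified in 2-plugin/middleware.md.
type Stage = "tool-lock" | "schema-validate" | "audit-enter" | "dispatch" | "audit-exit";

interface CallContext {
  tool: string;
  args: unknown;
  trace: Stage[]; // stages that completed, in order
  result?: unknown;
}

// Assumption: a stage rejects the call by throwing.
type StageFn = (ctx: CallContext) => void;

const STAGE_ORDER: readonly Stage[] = [
  "tool-lock",
  "schema-validate",
  "audit-enter",
  "dispatch",
  "audit-exit",
];

// The only way to reach `dispatch` is through the stages listed before it.
function runChain(ctx: CallContext, handlers: Record<Stage, StageFn>): CallContext {
  for (const stage of STAGE_ORDER) {
    handlers[stage](ctx); // a throw here aborts the remaining stages
    ctx.trace.push(stage);
  }
  return ctx;
}

// Stub handlers: only schema-validate and dispatch do anything in this sketch.
const handlers: Record<Stage, StageFn> = {
  "tool-lock": () => {},
  "schema-validate": (ctx) => {
    if (typeof ctx.args !== "object" || ctx.args === null) {
      throw new Error(`invalid args for ${ctx.tool}`);
    }
  },
  "audit-enter": () => {},
  "dispatch": (ctx) => {
    ctx.result = "ok";
  },
  "audit-exit": () => {},
};

const gated = runChain({ tool: "task_update", args: { status: "done" }, trace: [] }, handlers);
console.log(gated.trace.join(" → "));
```

Because callers receive only `runChain`, there is no code path that reaches `dispatch` without first passing the lock and validation stages. How the real chain records a failed dispatch is not covered by this sketch.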

2. Every task leaves a trail

A task that starts must either finish with a proof-anchored record, or be explicitly cancelled. There is no orphan state — no “it ran, we don’t know what happened.”

Who enforces: β (task pipeline) + ζ (decision trail). See 3-world/execution/task-pipeline.md and 3-world/execution/decision-trail.md.
FSM states: INIT → GATHER → ANALYZE → PLAN → APPLY → VERIFY → DONE, plus CANCELLED as a terminal state.
Writeback rule: task_update(status=done) + final thought_record are mandatory before a task is counted complete. In Phase 0 this is enforced at convention level (orphan scan); runtime hard-block is a later phase.
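
The FSM and the writeback rule can be sketched as a transition check plus a completion predicate. This is an illustrative sketch under stated assumptions: the function names are hypothetical, transitions are assumed to be strictly forward one step at a time, and cancellation is assumed to be allowed from any non-terminal state. The source specifies only the state list and that CANCELLED is terminal.

```typescript
type TaskState =
  | "INIT" | "GATHER" | "ANALYZE" | "PLAN" | "APPLY" | "VERIFY" | "DONE" | "CANCELLED";

// Forward path taken from the FSM state list.
const PIPELINE: readonly TaskState[] = [
  "INIT", "GATHER", "ANALYZE", "PLAN", "APPLY", "VERIFY", "DONE",
];

function isTerminal(state: TaskState): boolean {
  return state === "DONE" || state === "CANCELLED";
}

function canTransition(from: TaskState, to: TaskState): boolean {
  if (isTerminal(from)) return false;  // terminal states never advance
  if (to === "CANCELLED") return true; // assumption: cancel allowed from any live state
  const next = PIPELINE[PIPELINE.indexOf(from) + 1];
  return next === to;                  // assumption: no skipping, no going back
}

// Writeback rule: DONE alone is not enough, a final thought_record must exist.
function countsAsComplete(state: TaskState, hasFinalThoughtRecord: boolean): boolean {
  return state === "DONE" && hasFinalThoughtRecord;
}

console.log(canTransition("VERIFY", "DONE")); // true
console.log(canTransition("INIT", "PLAN"));   // false
console.log(countsAsComplete("DONE", false)); // false
```

A Phase 0 orphan scan could apply `countsAsComplete` to stored tasks after the fact; a later-phase hard block would apply the same predicate before allowing the transition.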

3. Every claim is auditable

Every decision, every reflection, every verification step is recorded as a thought_record whose hash participates in a chain: chain_hash = SHA-256(content_hash + parent_chain_hash). No record is forgeable without breaking SHA-256.

Who enforces: ζ (decision trail) + η (proof store). See 3-world/physics/laws/proof-store.md.
Four record types: plan, analysis, decision, reflection.
Merkle anchor: records are sealed into a Merkle tree; the root is returned by merkle_root and pinned into the next round’s genesis.
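
The chain rule chain_hash = SHA-256(content_hash + parent_chain_hash) can be sketched with Node's built-in crypto module. Several details here are assumptions, not taken from the spec: `+` is read as concatenation of lowercase hex strings, content_hash is taken to be SHA-256 of the UTF-8 content, and the first record's parent is a string of 64 zeros. The `ThoughtRecord` shape, `append`, and `verify` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

const sha256 = (input: string): string =>
  createHash("sha256").update(input, "utf8").digest("hex");

// Hypothetical record shape; the real thought_record has more fields.
interface ThoughtRecord {
  content: string;
  contentHash: string;
  parentChainHash: string;
  chainHash: string;
}

// Assumption: the first record chains from an all-zero parent hash.
const GENESIS = "0".repeat(64);

function append(chain: ThoughtRecord[], content: string): ThoughtRecord[] {
  const last = chain[chain.length - 1];
  const parentChainHash = last ? last.chainHash : GENESIS;
  const contentHash = sha256(content);
  const chainHash = sha256(contentHash + parentChainHash);
  return [...chain, { content, contentHash, parentChainHash, chainHash }];
}

// Recompute every hash from the start; any edited record breaks the walk.
function verify(chain: ThoughtRecord[]): boolean {
  let parent = GENESIS;
  for (const record of chain) {
    if (record.contentHash !== sha256(record.content)) return false;
    if (record.parentChainHash !== parent) return false;
    if (record.chainHash !== sha256(record.contentHash + parent)) return false;
    parent = record.chainHash;
  }
  return true;
}

let chain: ThoughtRecord[] = [];
chain = append(chain, "plan: split the migration into two steps");
chain = append(chain, "decision: run step one first");
chain = append(chain, "reflection: step one passed verification");
console.log(verify(chain)); // true
```

Editing any record's content changes its content_hash, which no longer matches the stored chain_hash, and every later record inherits the mismatch through parent_chain_hash.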

4. Order survives the network

When multiple arbiters see the same events, they agree on a single order — or they mark it unresolved. No silent reordering, no “last writer wins.”

Who enforces (spec-only in Phase 0): θ (consensus), κ (rule engine), λ (reputation). See 3-world/physics/laws/consensus.md.
Phase 0 reality: single-writer SQLite with WAL. The consensus promise is specified but not activated; a lone-arbiter run is trivially consistent.
Quorum rule (future): quorum = floor(2n/3) + 1, tolerating f < n/3 Byzantine arbiters.
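
The quorum arithmetic can be worked through directly. The sketch below evaluates the formula for a few arbiter counts; `quorum` and `maxFaulty` are hypothetical helper names, and `maxFaulty` returns the largest integer f satisfying f < n/3.

```typescript
// quorum = floor(2n/3) + 1, as stated in the quorum rule.
function quorum(n: number): number {
  return Math.floor((2 * n) / 3) + 1;
}

// Largest integer f with f < n/3.
function maxFaulty(n: number): number {
  return Math.ceil(n / 3) - 1;
}

for (const n of [1, 4, 7, 10]) {
  console.log(`n=${n} quorum=${quorum(n)} tolerated=${maxFaulty(n)}`);
}
```

For n = 4 this gives a quorum of 3 tolerating 1 faulty arbiter, and for n = 7 a quorum of 5 tolerating 2. Any two quorums overlap in at least 2·quorum − n arbiters, which always exceeds f, so at least one honest arbiter sits in both. For n = 1 the quorum is 1, matching the lone-arbiter case described for Phase 0.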

5. Skills are discoverable, not hardcoded

Agent capabilities are declared as skill directories with a SKILL.md contract and an optional scripts/ bundle. The registry knows what exists and where; it does not decide what agents are allowed to want.

Who enforces: β (dispatch layer) + ε (skill registry). See 3-world/execution/skill-registry.md.
Phase 0 spawning: via the Claude Task tool. Multi-agent runtime pools are deferred to Phase 1.5 per ADR-005.
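
Discovery by directory contract can be sketched as a filesystem scan: a directory counts as a skill only if it contains SKILL.md, and scripts/ is recorded when present. This is a minimal illustration, not the ε registry; `SkillEntry`, `discoverSkills`, and the sample skill names are hypothetical, and the sketch assumes all skills sit as direct children of one root directory.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical registry entry: what exists and where, nothing about permissions.
interface SkillEntry {
  name: string;
  contractPath: string;
  scriptsDir?: string;
}

function discoverSkills(root: string): SkillEntry[] {
  const entries: SkillEntry[] = [];
  for (const dirent of readdirSync(root, { withFileTypes: true })) {
    if (!dirent.isDirectory()) continue;
    const dir = join(root, dirent.name);
    const contractPath = join(dir, "SKILL.md");
    if (!existsSync(contractPath)) continue; // no contract, so not a skill
    const scripts = join(dir, "scripts");
    entries.push({
      name: dirent.name,
      contractPath,
      ...(existsSync(scripts) ? { scriptsDir: scripts } : {}), // scripts/ is optional
    });
  }
  return entries.sort((a, b) => a.name.localeCompare(b.name));
}

// Usage against a throwaway directory tree.
const root = mkdtempSync(join(tmpdir(), "skills-"));
mkdirSync(join(root, "review", "scripts"), { recursive: true });
writeFileSync(join(root, "review", "SKILL.md"), "# review\n");
mkdirSync(join(root, "summarize"));
writeFileSync(join(root, "summarize", "SKILL.md"), "# summarize\n");
mkdirSync(join(root, "notes")); // no SKILL.md, ignored

const found = discoverSkills(root);
console.log(found.map((s) => s.name)); // [ 'review', 'summarize' ]
```

Adding a skill is then a filesystem change rather than a code change, which is the sense in which skills are discoverable and not hardcoded.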

6. Identity is owned, not rented

An agent’s identity — its Soul Vector — is bound to an Ed25519 keypair whose private half is split across Shamir shares. No central registry can impersonate an agent; no one operator can recover an identity alone.

Who enforces (spec-only in Phase 0): ξ (identity) + λ (reputation). See 3-world/social/identity.md.
Shamir parameters: N = 5, M = 3 (3-of-5 recovery).
Phase 0 reality: keys are scaffolded but not load-bearing; the rest of the runtime still works with plain task records.
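
The 3-of-5 recovery rule can be illustrated with textbook Shamir secret sharing over a prime field. This is a sketch of the general technique, not Colibri's key-splitting code: the field modulus (the Mersenne prime 2^521 − 1, chosen here only because it exceeds any 32-byte secret), the `Share` shape, and the `split` / `recover` helpers are all assumptions. A production implementation would also need constant-time arithmetic and authenticated shares.

```typescript
import { randomBytes } from "node:crypto";

const ZERO = BigInt(0);
const ONE = BigInt(1);
// Assumed field modulus: the Mersenne prime 2^521 - 1.
const P = (ONE << BigInt(521)) - ONE;

const mod = (a: bigint): bigint => ((a % P) + P) % P;

function modPow(base: bigint, exp: bigint): bigint {
  let result = ONE;
  let b = mod(base);
  let e = exp;
  while (e > ZERO) {
    if ((e & ONE) === ONE) result = mod(result * b);
    b = mod(b * b);
    e >>= ONE;
  }
  return result;
}

// Modular inverse via Fermat's little theorem: a^(P-2) mod P.
const inverse = (a: bigint): bigint => modPow(a, P - BigInt(2));

// 640 random bits reduced mod P; the bias from reduction is negligible.
const randomFieldElement = (): bigint =>
  mod(BigInt("0x" + randomBytes(80).toString("hex")));

interface Share {
  x: bigint;
  y: bigint;
}

// Split `secret` into n shares, any m of which recover it.
function split(secret: bigint, n: number, m: number): Share[] {
  // Random polynomial of degree m-1 whose constant term is the secret.
  const coeffs: bigint[] = [mod(secret)];
  for (let i = 1; i < m; i++) coeffs.push(randomFieldElement());
  const shares: Share[] = [];
  for (let i = 1; i <= n; i++) {
    const x = BigInt(i);
    let y = ZERO;
    // Horner evaluation, highest-degree coefficient first.
    for (const c of [...coeffs].reverse()) y = mod(y * x + c);
    shares.push({ x, y });
  }
  return shares;
}

// Lagrange interpolation evaluated at x = 0.
function recover(shares: Share[]): bigint {
  let secret = ZERO;
  for (const si of shares) {
    let num = ONE;
    let den = ONE;
    for (const sj of shares) {
      if (sj.x === si.x) continue;
      num = mod(num * (ZERO - sj.x));
      den = mod(den * (si.x - sj.x));
    }
    secret = mod(secret + si.y * num * inverse(den));
  }
  return secret;
}

const seed = BigInt("0x" + randomBytes(32).toString("hex")); // stand-in for a private key seed
const shares = split(seed, 5, 3); // N = 5, M = 3
console.log(recover(shares.slice(0, 3)) === seed); // true
```

With M = 3, any three shares determine the degree-2 polynomial and therefore the secret, while two shares are consistent with every possible secret. That is the property behind "no one operator can recover an identity alone."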

Enforcement matrix

Each row below names the mechanism that makes an obligation real. Phase 0 enforces rows 1/2/3 at runtime; rows 4/5/6/7 activate as later-phase concepts land. The parenthesized citation in each row names the spec section that carries the obligation.

Promise	Enforcement (how it’s guaranteed)
Every tool call is audited	α chain `audit-enter` / `audit-exit` stages write to `audit_events` (`spec/s17` §4); middleware failure = tool rejection.
No silent rule application	ζ records `thought_type: "rule_applied"` with full `rule_id` + inputs + outputs (`spec/s02`).
Writeback after every task	β FSM does not advance to `DONE` without a matching `thought_record` within 60s of `task_update(status="done")`; convention in Phase 0, hard block in Phase 1 (`spec/s15`).
Determinism of outcomes	κ rule engine forbids float / clock / rand (`spec/s11` §forbidden-ops); test-corpus byte-identity required for rule updates.
Merkle anchor per session	η `merkle_finalize` at session close; root hash logged to ζ + stored in `mcp_merkle_roots` (`spec/s13` HS-01).
Identity persists across forks	ι fork shares Ed25519 keypair; `soul_id = SHA-256(initial_pubkey)` stable (`spec/s18`).
Right to exit	ι voluntary-exit fork; penalty capped at 10% per AX-06 (`spec/s01` §axioms).
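
The identity row's rule, `soul_id = SHA-256(initial_pubkey)`, can be sketched with Node's built-in Ed25519 support. The byte encoding is an assumption: this sketch hashes the raw 32-byte public key and renders the digest as hex, while `spec/s18` may define a different encoding. `rawEd25519PublicKey` and `soulId` are hypothetical helper names.

```typescript
import { createHash, generateKeyPairSync, KeyObject } from "node:crypto";

// An Ed25519 SPKI DER encoding is a fixed 12-byte prefix plus the 32-byte raw key.
function rawEd25519PublicKey(publicKey: KeyObject): Buffer {
  const der = publicKey.export({ type: "spki", format: "der" });
  return der.subarray(der.length - 32);
}

// Assumption: soul_id is the hex SHA-256 digest of the raw initial public key.
function soulId(initialPublicKey: KeyObject): string {
  return createHash("sha256").update(rawEd25519PublicKey(initialPublicKey)).digest("hex");
}

const parent = generateKeyPairSync("ed25519");
const parentSoul = soulId(parent.publicKey);

// A fork that shares the keypair derives the same soul_id from the same initial key.
const forkSoul = soulId(parent.publicKey);

// An unrelated keypair yields a different soul_id.
const strangerSoul = soulId(generateKeyPairSync("ed25519").publicKey);

console.log(parentSoul === forkSoul);     // true
console.log(parentSoul === strangerSoul); // false
```

Because the identifier depends only on the initial public key, it stays stable across forks that carry the same keypair forward.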

Phase 0 reality stamp

All six promises are specified. Two and a half will be enforced by the Phase 0 build: promise 1 through α (gating), promise 3 through ζ (decision trail), and promise 2 through β (task FSM) only at convention level, which is the half. Promises 4 and 6 are single-writer-trivial in Phase 0 and gain real teeth in later phases. Promise 3 (audit chain) is fully load-bearing from day one: three tables (thought_records, merkle_nodes, audit_events) carry it.

Do not quote any of these promises as a running capability until the corresponding TypeScript lands in src/.