Hardening

This document is split into two parts:

  1. Input validation and sanitization (Phase 0, target). This is the src/security/ module family that Phase 0 will re-implement against the Zod 4 + sanitizer surface. The design shape is ported from the AMS donor; the code does not yet exist. Sections: Input validation framework through Integrity verification.
  2. Authentication, API keys, ACL, and rate limiting (Phase 2+, heritage). This is the AMS donor auth surface, which Phase 0 does not ship. It is reproduced here as the target shape for Phase 2+ so that the design decisions and env-var surface are in one place. Sections: Auth modes through Audit integration.

⚠ HERITAGE — auth is not Phase 0.

Every AMS_AUTH_* env var, ~/.ams/token path, ~/.ams/auth-secret path, JWT scheme, API-key format, ACL role matrix, and auth rate-limit table in the sections marked Phase 2+ below is imported from the pre-R53 AMS donor. Phase 0 Colibri has no authentication layer — the MCP transport runs over stdio with no token at all, because the transport itself isolates a single agent per process (see ../../2-plugin/boot.md and ADR-004).

When Phase 2+ ports this design into Colibri:

  • AMS_AUTH_* names will become COLIBRI_AUTH_* per s17 — Environment §3.
  • ~/.ams/ paths will become ~/.colibri/.
  • src/config-auth.js, src/claude/admin/api-keys.js, src/claude/admin/organization.js refer to the deleted donor tree. The Phase 2+ target path is src/domains/auth/.
  • The donor’s four-mode AMS_AUTH_MODE (trust/token/hybrid/required) is a starting point; the Phase 2+ design will decide whether to keep all four modes or collapse them.

Treat every AMS_AUTH_* table row below as a target shape, not a current-runtime contract. If you are looking for what Phase 0 actually enforces at the call boundary, the answer is: the 5-stage α chain in ../../2-plugin/middleware.md, which has neither JWT nor ACL.


Auth modes

(Phase 2+ target, ported from AMS donor. See heritage banner above.)

Source (pre-R53 donor, deleted): src/config-auth.js, src/claude/admin/api-keys.js, src/claude/admin/organization.js. Phase 2+ target path: src/domains/auth/.

Phase 2+ will read an env var that controls the enforcement level. The target name is COLIBRI_AUTH_MODE; the donor name AMS_AUTH_MODE is retained in the tables below for continuity with the donor extraction. Phase 0 does not read this variable. The donor defines four modes:

| Mode | Behavior |
|------|----------|
| trust | No authentication (default, single-user desktop) |
| token | Token auth optional; used if provided |
| hybrid | Token auth preferred, falls back to legacy ACL |
| required | Token auth mandatory; unauthenticated requests rejected |

The donor auto-detected desktop mode when neither AMS_AUTH_SECRET nor AMS_AUTH_SECRET_FILE was configured. In that mode, tokens were stored at ~/.ams/token and secrets at ~/.ams/auth-secret. Phase 2+ Colibri will migrate these to ~/.colibri/token and ~/.colibri/auth-secret. Phase 0 writes neither file.

Methods

| Method | Transport | Use case |
|--------|-----------|----------|
| JWT (HMAC) | HTTP | User sessions, dashboard |
| API key | HTTP | Scripts, automation, CI |
| MCP identity | stdio | AI clients (authenticated by process isolation) |

JWT implementation

Token structure

Tokens are signed with HMAC using the secret from AMS_AUTH_SECRET (base64url-encoded) or read from AMS_AUTH_SECRET_FILE. If AMS_AUTH_AUTO_GENERATE=true, the server generates a secret on first start.

Token claims:

| Claim | Source | Default |
|-------|--------|---------|
| iss | AMS_AUTH_ISSUER | ams-server |
| aud | AMS_AUTH_AUDIENCE | ams-mcp-tools |
| exp | AMS_AUTH_TOKEN_LIFETIME | 3600s (1 hour) |
| type | AMS_AUTH_DEFAULT_TOKEN_TYPE | access |

Refresh tokens have a separate lifetime controlled by AMS_AUTH_REFRESH_LIFETIME (default: 604800s / 7 days).

Token flow

  1. Client sends credentials to /api/auth/login
  2. Server returns signed JWT with configured claims
  3. Client includes Authorization: Bearer <token> on subsequent requests
  4. Middleware validates signature, expiry, issuer, and audience on every request
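
The signing and per-request validation steps above can be sketched as follows. This is a minimal illustration, not the donor code: the source only says "HMAC", so HS256 is an assumption, and signToken, verifyToken, the Claims interface, and the reason strings are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Claim names follow the claims table; the interface itself is hypothetical.
interface Claims {
  iss: string;
  aud: string;
  exp: number; // expiry, seconds since the Unix epoch
  type: string;
}

type VerifyResult =
  | { valid: true; claims: Claims }
  | { valid: false; reason: string };

const b64url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");

// HMAC-SHA256 over "header.payload" (HS256 is an assumption).
const hmac = (data: string, secret: Buffer): Buffer =>
  createHmac("sha256", secret).update(data).digest();

function signToken(claims: Claims, secret: Buffer): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify(claims));
  const signature = hmac(`${header}.${payload}`, secret).toString("base64url");
  return `${header}.${payload}.${signature}`;
}

function verifyToken(
  token: string,
  secret: Buffer,
  expected: { iss: string; aud: string },
  nowSec: number,
): VerifyResult {
  const parts = token.split(".");
  if (parts.length !== 3) return { valid: false, reason: "malformed" };
  const [header, payload, signature] = parts as [string, string, string];

  // Constant-time comparison of the recomputed and presented signatures.
  const want = hmac(`${header}.${payload}`, secret);
  const got = Buffer.from(signature, "base64url");
  if (got.length !== want.length || !timingSafeEqual(got, want)) {
    return { valid: false, reason: "bad_signature" };
  }

  const claims = JSON.parse(
    Buffer.from(payload, "base64url").toString("utf8"),
  ) as Claims;
  if (claims.exp <= nowSec) return { valid: false, reason: "expired" };
  if (claims.iss !== expected.iss) return { valid: false, reason: "bad_issuer" };
  if (claims.aud !== expected.aud) return { valid: false, reason: "bad_audience" };
  return { valid: true, claims };
}
```

The signature is checked before the payload is parsed, so unauthenticated JSON never reaches JSON.parse.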

Token revocation

Revocation is enabled by default (AMS_AUTH_REVOCATION_ENABLED). The revocation list is cleaned up on a configurable interval (AMS_AUTH_REVOCATION_CLEANUP_INTERVAL, default 1 hour). If the list reaches AMS_AUTH_REVOCATION_MAX_ENTRIES (default 10,000), a cleanup is forced.

API key implementation

Key generation

Keys are generated as ams_<64-hex-chars> using crypto.randomBytes(32). Only the SHA-256 hash (crypto.createHash("sha256")) is persisted in the api_keys table. A preview field stores first8...last4 for identification without exposing the full key.

The plaintext key is returned exactly once at creation time. It cannot be recovered after that.
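
A minimal sketch of the generation step described above. It assumes that the hash covers the full key including the ams_ prefix, and that the preview is taken from the full key string; the source states neither. The GeneratedKey shape is hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical return shape: plaintext goes to the caller once,
// hash and preview are what would be persisted.
interface GeneratedKey {
  plaintext: string;
  hash: string;
  preview: string;
}

function generateApiKey(): GeneratedKey {
  // 32 random bytes rendered as 64 hex characters, with the ams_ prefix.
  const plaintext = `ams_${randomBytes(32).toString("hex")}`;
  // Only this SHA-256 digest is stored (assumed to cover the whole key).
  const hash = createHash("sha256").update(plaintext).digest("hex");
  // first8...last4, taken from the full key string (assumption).
  const preview = `${plaintext.slice(0, 8)}...${plaintext.slice(-4)}`;
  return { plaintext, hash, preview };
}
```

Because only the digest is persisted, a lost key can be replaced but never recovered, which matches the return-once behavior.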

Validation flow

validateApiKey(plaintextKey) performs three checks:

  1. Hash the provided key and look up the hash in api_keys where status = 'active'
  2. Check expires_at against current time
  3. Verify status is not revoked

On success, last_used_at is updated. Returns { valid: true, apiKey } or { valid: false, reason }.
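
The validation flow can be sketched against an in-memory table. The Map keyed by hash stands in for the api_keys table, and the row shape, the extra table and now parameters, and the reason strings are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape; a Map keyed by hash stands in for api_keys.
interface ApiKeyRow {
  id: string;
  status: "active" | "revoked" | "rotated";
  expiresAt: number | null; // epoch milliseconds, null means no expiry
  lastUsedAt: number | null;
}

type ValidationResult =
  | { valid: true; apiKey: ApiKeyRow }
  | { valid: false; reason: string };

const sha256Hex = (text: string): string =>
  createHash("sha256").update(text).digest("hex");

function validateApiKey(
  plaintextKey: string,
  table: Map<string, ApiKeyRow>,
  now: number = Date.now(),
): ValidationResult {
  // 1. Look the key up by its hash; the plaintext is never stored.
  const row = table.get(sha256Hex(plaintextKey));
  if (row === undefined) return { valid: false, reason: "not_found" };
  // 2. Check expiry against the current time.
  if (row.expiresAt !== null && row.expiresAt <= now) {
    return { valid: false, reason: "expired" };
  }
  // 3. Reject anything that is not active (revoked or rotated).
  if (row.status !== "active") return { valid: false, reason: row.status };
  // Success: record the usage time.
  row.lastUsedAt = now;
  return { valid: true, apiKey: row };
}
```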

Key rotation

rotateApiKey(id, rotatedBy, options) creates a new key inheriting the old key’s permissions, scopes, and rate limits. The old key is marked revoked (default) or rotated. Both events are logged to api_key_audit_log. The rotation is atomic: if creation fails, the old key remains active.

Permission sets

read_only:  [read]
read_write: [read, write]
admin:      [read, write, delete, admin]
full:       [read, write, delete, admin, super]

apiKeyHasPermission(apiKey, permission) and apiKeyHasScope(apiKey, scope) check array membership. Permissions and scopes are stored as JSON arrays in the api_keys table.

Usage tracking

Every API call records an entry in api_key_usage with endpoint, method, status code, latency (ms), IP address, and timestamp. getApiKeyUsage() aggregates daily request counts and latency statistics. getUsageSummary() provides organization-wide breakdowns.

Multi-actor auth

JWT and API-key authentication can be active simultaneously. The middleware resolves the caller identity from whichever credential is present, with JWT taking priority when both are supplied. The resolved identity carries a role that feeds into ACL checks.

ACL (Access Control)

Organization-level permissions

The ORG_PERMISSIONS map in organization.js defines 14 discrete permissions:

| Permission | Allowed roles |
|------------|---------------|
| org_read | owner, admin, member, viewer |
| org_update | owner, admin |
| org_delete | owner |
| member_add | owner, admin |
| member_remove | owner, admin |
| member_role_update | owner, admin |
| workspace_create | owner, admin, member |
| workspace_read | owner, admin, member, viewer |
| workspace_update | owner, admin |
| workspace_delete | owner, admin |
| api_key_create | owner, admin |
| api_key_read_own | owner, admin, member, viewer |
| api_key_read_all | owner, admin |
| api_key_revoke | owner, admin |

checkOrgAccess(orgId, userId, requiredRole) compares the user’s numeric role level (owner=4, admin=3, member=2, viewer=1) against the required level.
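
The level comparison can be sketched as below. The role levels come from the text; the in-memory membership store, the extra memberships parameter, and the boolean return value are assumptions, since the source does not describe the return shape.

```typescript
type Role = "owner" | "admin" | "member" | "viewer";

// Numeric levels as given in the text.
const ROLE_LEVEL: Record<Role, number> = {
  owner: 4,
  admin: 3,
  member: 2,
  viewer: 1,
};

// Hypothetical membership store keyed by "orgId:userId".
type Memberships = Map<string, Role>;

function checkOrgAccess(
  memberships: Memberships,
  orgId: string,
  userId: string,
  requiredRole: Role,
): boolean {
  const role = memberships.get(`${orgId}:${userId}`);
  // Users who are not members of the organization have no access.
  if (role === undefined) return false;
  return ROLE_LEVEL[role] >= ROLE_LEVEL[requiredRole];
}
```

With this ordering, a permission that lists owner and admin corresponds to requiredRole "admin", and one that lists all four roles corresponds to "viewer".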

Tool-level ACL

Each tool has an ACL entry defining which roles can call it. The ACL middleware checks this entry on every tool call, before the tool executes.

Default roles:

  • admin – all tools
  • executor – task/GSD/thought/memory/merkle tools
  • operator – read-only tools + sync
  • viewer – read-only tools only

Rate limiting

Auth-specific rate limits

| Setting | Env var | Default |
|---------|---------|---------|
| Max validation attempts per window | AMS_AUTH_RATE_LIMIT_MAX | 100 |
| Rate limit window | AMS_AUTH_RATE_LIMIT_WINDOW | 60000ms (1 min) |

API key rate limits

Per-key rate limits are stored in the rate_limit JSON field with configurable thresholds:

| Limit | Default |
|-------|---------|
| requests_per_minute | 100 |
| requests_per_hour | 1,000 |
| requests_per_day | 10,000 |

checkRateLimit(keyId) queries recent usage from api_key_usage and returns { allowed: true, remaining: { minute, hour } } or { allowed: false, reason, retry_after }.
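
A sketch of the window arithmetic behind such a check. Instead of querying api_key_usage by keyId, this hypothetical checkRateLimitSketch takes the usage timestamps directly. It covers only the minute and hour limits, because those are the two that appear in the documented return shape. The reason strings and the choice of seconds for retry_after are assumptions.

```typescript
interface Limits {
  requests_per_minute: number;
  requests_per_hour: number;
}

type RateResult =
  | { allowed: true; remaining: { minute: number; hour: number } }
  | { allowed: false; reason: string; retry_after: number };

const MINUTE_MS = 60_000;
const HOUR_MS = 3_600_000;

// usage: epoch-millisecond timestamps of earlier calls made with this key.
function checkRateLimitSketch(usage: number[], limits: Limits, now: number): RateResult {
  const within = (ms: number): number[] =>
    usage.filter((t) => t > now - ms && t <= now);
  // Seconds until the oldest call in the window ages out (unit is an assumption).
  const retryAfter = (hits: number[], ms: number): number =>
    Math.ceil((Math.min(...hits) + ms - now) / 1000);

  const minuteHits = within(MINUTE_MS);
  if (minuteHits.length >= limits.requests_per_minute) {
    return { allowed: false, reason: "minute_limit", retry_after: retryAfter(minuteHits, MINUTE_MS) };
  }
  const hourHits = within(HOUR_MS);
  if (hourHits.length >= limits.requests_per_hour) {
    return { allowed: false, reason: "hour_limit", retry_after: retryAfter(hourHits, HOUR_MS) };
  }
  return {
    allowed: true,
    remaining: {
      minute: limits.requests_per_minute - minuteHits.length,
      hour: limits.requests_per_hour - hourHits.length,
    },
  };
}
```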

Audit integration

Auth events are logged to the audit system when AMS_AUTH_AUDIT_ENABLED=true (default). Configuration:

| Setting | Env var | Default |
|---------|---------|---------|
| Log successful auth | AMS_AUTH_AUDIT_SUCCESS | false |
| Log failed auth | AMS_AUTH_AUDIT_FAILURES | true |

API key audit logs track every lifecycle event (created, updated, revoked, rotated) with performed_by, details (JSON), and timestamp. getOrgAuditLog() aggregates across all keys in an organization.


Phase 0 hardening surface — input validation & sanitization

The remainder of this document describes the Phase 0 target for input hardening: the Zod 4 validation framework the α chain’s schema-validate stage enforces (../../2-plugin/middleware.md Stage 2), plus the sanitizer and injection detectors that Phase 0 ships as part of src/domains/system/ (target path). The design is ported from the AMS donor (src/security/*.js, deleted R53); the Phase 0 re-implementation targets are src/security/validator.ts, src/security/sanitizer.ts, src/security/audit.ts.

Unlike the auth sections above, this content IS Phase 0 scope — it is what P0.2.4 delivers alongside the middleware chain.

Node.js server controls

| Control | Implementation |
|---------|----------------|
| Input validation | Zod schemas on every tool input |
| Input sanitization | Context-aware deep sanitization (HTML, SQL, shell, email, URL) |
| Injection detection | SQL, command, path traversal, NoSQL, XPath, LDAP, XXE, SSTI |
| Secret scanning | 8 secret patterns with severity classification |
| Audit trail | Every tool call logged (caller, params hash, result hash, duration) |
| Rate limiting | Per-caller token bucket + per-API-key limits |
| Circuit breaker | Auto-trips on repeated failures, prevents cascade |
| Error isolation | Global unhandled rejection/exception handlers |
| Stdout protection | MCP JSON-RPC on stdout; all logging to stderr |
| Timeout | Configurable per-tool execution timeout |

Input validation framework

src/security/validator.js (donor path; the Phase 0 target is src/security/validator.ts) provides Zod-based validation with integrated security scanning.

Schema helpers

sanitizedString(options) builds a Zod string schema with optional constraints: min, max, pattern, email, url, trim, nonempty.

withSecurity(baseSchema, options) wraps any Zod schema with superRefine checks that reject SQL injection (confidence > 0.7) and critical secrets in string values.

Tool input validation

validateToolInput(toolName, args, schema) runs three layers:

  1. Schema validation – Zod safeParse against the tool’s schema
  2. Injection detection – SQL injection (confidence > 0.6), path traversal (on path/file/dir fields), command injection (confidence > 0.5)
  3. Secret scanning – warnings for detected secrets (does not block)

Returns { valid, errors, warnings, sanitized }.

Specialized validators

| Validator | Checks |
|-----------|--------|
| validateId(id) | Max length (100), alphanumeric + _- pattern, path traversal detection |
| validateFilePath(path) | Path traversal, null bytes, blocked extensions (.exe, .bat, .cmd, .sh, .dll, .so) |
| validateJson(content) | Parse validity, max depth (10), max keys (1000) |
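
As an illustration of the validateJson row, the sketch below parses the content and then walks the tree to enforce the depth and key limits. Counting keys across the whole document rather than per object, treating the outermost container as depth 1, and the return shape are all assumptions.

```typescript
type JsonCheck =
  | { valid: true; value: unknown }
  | { valid: false; error: string };

function validateJson(content: string, maxDepth = 10, maxKeys = 1000): JsonCheck {
  let value: unknown;
  try {
    value = JSON.parse(content);
  } catch {
    return { valid: false, error: "invalid_json" };
  }

  let keyCount = 0; // total keys across the whole document (assumption)

  // Returns an error code, or null when the subtree is within limits.
  const walk = (node: unknown, depth: number): string | null => {
    if (node === null || typeof node !== "object") return null;
    if (depth > maxDepth) return "max_depth_exceeded";
    const children: unknown[] = Array.isArray(node) ? node : Object.values(node);
    if (!Array.isArray(node)) {
      keyCount += Object.keys(node).length;
      if (keyCount > maxKeys) return "max_keys_exceeded";
    }
    for (const child of children) {
      const err = walk(child, depth + 1);
      if (err !== null) return err;
    }
    return null;
  };

  const err = walk(value, 1); // the outermost container counts as depth 1
  return err === null ? { valid: true, value } : { valid: false, error: err };
}
```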

Custom error types

  • ValidationError – carries details array and code: "VALIDATION_ERROR"
  • SecurityError – carries threatType, confidence, and code: "SECURITY_VIOLATION"

Validation rate limiter

ValidationRateLimiter tracks attempts per key within a time window (default: 100 attempts per 60 seconds). Exceeding the limit blocks the key entirely until reset() is called. A global instance globalRateLimiter is exported for shared use.
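
The block-until-reset behavior can be sketched like this. The check method name, the per-key reset(key) signature, and the injectable now parameter are assumptions; the source only names the class, its defaults, and reset().

```typescript
class ValidationRateLimiter {
  private readonly attempts = new Map<string, number[]>();
  private readonly blocked = new Set<string>();
  private readonly maxAttempts: number;
  private readonly windowMs: number;

  constructor(maxAttempts = 100, windowMs = 60_000) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Records one attempt; returns false when the key is (or becomes) blocked.
  check(key: string, now: number = Date.now()): boolean {
    if (this.blocked.has(key)) return false;
    const recent = (this.attempts.get(key) ?? []).filter(
      (t) => t > now - this.windowMs,
    );
    recent.push(now);
    this.attempts.set(key, recent);
    if (recent.length > this.maxAttempts) {
      // The block does not expire with the window; only reset() lifts it.
      this.blocked.add(key);
      return false;
    }
    return true;
  }

  reset(key: string): void {
    this.blocked.delete(key);
    this.attempts.delete(key);
  }
}
```

Unlike a sliding-window limiter that recovers on its own, this design requires an explicit reset, so a blocked key stays blocked even after the window has passed.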

Input sanitization

src/security/sanitizer.js (donor path; the Phase 0 target is src/security/sanitizer.ts) provides sanitization tailored to the output context in which a value will be used.

Context-specific sanitizers

| Context | Function | Strategy |
|---------|----------|----------|
| HTML | escapeHtml() | Replace & < > " ' / with HTML entities |
| SQL | sanitizeForSql() | Remove null bytes, double single quotes, escape backslashes |
| Shell | sanitizeShellArg() | Strip ; & \| \ $ ( ) { } [ ] \n \r |
| Email | sanitizeEmail() | Lowercase, trim, allow only a-z 0-9 . @ _ + - |
| URL | sanitizeUrl() | Parse with URL(), allow only http/https, strip credentials |
| Path | sanitizePath() | Detect traversal, normalize separators, strip control chars, enforce base directory |
| Strict | stripDangerousChars() | Remove < > & " ' / ` |

Deep sanitization

deepSanitize(data, { context }) recursively sanitizes all values in an object/array tree. Object keys are sanitized to prevent prototype pollution (only a-zA-Z0-9_$ allowed in key names).

Tool input sanitization

sanitizeToolInput(input) applies context-aware sanitization based on field name heuristics:

  • Fields containing path, file, or dir – path sanitization with traversal detection
  • Fields containing email – email sanitization
  • Fields containing url or link – URL sanitization with protocol restriction
  • Numbers – clamped to Number.MAX_SAFE_INTEGER / MIN_SAFE_INTEGER; non-finite values converted to 0
  • Nested objects/arrays – recursive sanitization

Returns { success, sanitized, warnings, errors }.
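
Two of the rules above, the field-name heuristic and the number clamping, can be sketched as small pure functions. The function names, the FieldContext type, and the precedence order (path, then email, then url) are assumptions.

```typescript
type FieldContext = "path" | "email" | "url" | "default";

// Picks a sanitizer context from the field name (substring match, case-insensitive).
function fieldContext(fieldName: string): FieldContext {
  const name = fieldName.toLowerCase();
  if (["path", "file", "dir"].some((s) => name.includes(s))) return "path";
  if (name.includes("email")) return "email";
  if (name.includes("url") || name.includes("link")) return "url";
  return "default";
}

// Non-finite values become 0; everything else is clamped to the safe-integer range.
function sanitizeNumber(value: number): number {
  if (!Number.isFinite(value)) return 0;
  return Math.min(
    Number.MAX_SAFE_INTEGER,
    Math.max(Number.MIN_SAFE_INTEGER, value),
  );
}
```

Substring heuristics are coarse: under this sketch a field named profile is routed to the path sanitizer because it contains file.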

Safe filename generation

createSafeFilename(filename) replaces invalid characters with underscores, collapses runs, trims edges, and caps at 200 characters. Extensions are lowercased and restricted to alphanumeric characters.
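
A sketch of those steps, under stated assumptions: the set of characters treated as valid (letters, digits, dot, underscore, hyphen), the "file" fallback for an empty base name, and truncating the base rather than the extension are all choices made for this example.

```typescript
const MAX_LENGTH = 200;

function createSafeFilename(filename: string): string {
  // Split on the last dot; a leading or trailing dot is not an extension.
  const dot = filename.lastIndexOf(".");
  const hasExt = dot > 0 && dot < filename.length - 1;
  const rawBase = hasExt ? filename.slice(0, dot) : filename;
  const rawExt = hasExt ? filename.slice(dot + 1) : "";

  // Extension: lowercase, alphanumeric only.
  const ext = rawExt.toLowerCase().replace(/[^a-z0-9]/g, "");

  const base = rawBase
    .replace(/[^A-Za-z0-9._-]/g, "_") // replace invalid characters
    .replace(/_+/g, "_") // collapse runs of underscores
    .replace(/^[_.]+|[_.]+$/g, ""); // trim underscores and dots at the edges

  // Truncate the base so the extension survives the 200-character cap.
  const suffix = ext ? `.${ext}` : "";
  const room = Math.max(1, MAX_LENGTH - suffix.length);
  return `${(base || "file").slice(0, room)}${suffix}`.slice(0, MAX_LENGTH);
}
```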

Injection detection

src/security/audit.js and src/security/audit-comprehensive.js implement multi-vector injection detection.

SQL injection

12+ regex patterns covering:

  • Classic injection (', --, #, %27)
  • UNION-based (UNION SELECT)
  • Tautology (' OR '1'='1)
  • Stacked queries (; separators)
  • Time-based blind (SLEEP(), BENCHMARK(), WAITFOR DELAY, PG_SLEEP)
  • Error-based (CONVERT(), CAST())
  • Out-of-band (LOAD_FILE(), INTO OUTFILE)

Confidence scoring: each matching pattern adds 0.2; each SQL keyword found adds 0.1. Capped at 1.0.
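
The scoring rule can be illustrated with a reduced pattern set. The five regexes and six keywords below are stand-ins chosen for this example, not the donor's actual list; only the 0.2 and 0.1 weights and the 1.0 cap come from the text.

```typescript
// Illustrative subset; the donor's full pattern list is not reproduced here.
const SQL_PATTERNS: RegExp[] = [
  /\bunion\s+select\b/i, // UNION-based
  /'\s*or\s*'1'\s*=\s*'1/i, // tautology
  /;\s*(drop|delete|insert|update)\b/i, // stacked queries
  /\b(sleep|benchmark|pg_sleep)\s*\(/i, // time-based blind
  /--/, // comment terminator
];

const SQL_KEYWORDS = ["select", "union", "drop", "insert", "update", "delete"];

function sqlInjectionConfidence(input: string): number {
  const patternHits = SQL_PATTERNS.filter((p) => p.test(input)).length;
  const lower = input.toLowerCase();
  const keywordHits = SQL_KEYWORDS.filter((k) =>
    new RegExp(`\\b${k}\\b`).test(lower),
  ).length;
  // Work in tenths to avoid floating-point drift: 0.2 per pattern, 0.1 per keyword.
  const tenths = Math.min(10, patternHits * 2 + keywordHits);
  return tenths / 10;
}
```

For example, `1 UNION SELECT password FROM users --` matches two patterns and two keywords, giving 0.6. That value sits exactly at the validateToolInput threshold and, because the comparison is strictly greater than 0.6, would not trip it under this reduced pattern set.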

Additional injection types (comprehensive scanner)

| Type | Pattern count | Examples |
|------|---------------|----------|
| NoSQL | 13 | $where, $regex, $ne, $gt, $or |
| XPath | 6 | ' or '1'='1, count(/), position() |
| LDAP | 6 | *), *)(, (\|( |
| XXE | 10 | <!ENTITY, SYSTEM, file://, php:// |
| SSTI | 8 | {{...}}, {%...%}, ${...}, <%=...%> |

Path traversal

10 patterns detecting ../, ..\, URL-encoded variants (%2e%2e%2f), double-encoded variants (%252e%252e%252f), and null byte injection (\0, %00).

Command injection

7 patterns detecting shell metacharacters (;, &, |, backticks), subshell syntax ($(...), backtick-commands), and dangerous commands (rm, cat, echo after semicolons).

Secret scanning

scanForSecrets(content) checks against 8 secret patterns:

| Type | Severity | Pattern example |
|------|----------|-----------------|
| API key | critical | api_key = "AKIA..." |
| Secret key | critical | secret_key = "..." |
| Private key | critical | -----BEGIN PRIVATE KEY----- |
| Password | high | password = "..." |
| Token | high | token = "eyJ..." |
| AWS key | critical | AKIA[0-9A-Z]{16} |
| GitHub token | critical | gh[pousr]_[A-Za-z0-9_]{36,} |
| Slack token | critical | xox[baprs]-... |
Exclusions: strings containing example, test, fake, mock, or dummy are skipped. The comprehensive scanner adds CWE identifiers to each finding (e.g. CWE-798 for hardcoded credentials).
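
A reduced sketch of the scanner using three of the eight patterns, those whose regex or literal form appears in the table. Applying the exclusion words to the whole content rather than to each matched string is an assumption, as are the type labels and the finding shape.

```typescript
type Severity = "critical" | "high";

interface SecretFinding {
  type: string;
  severity: Severity;
}

// Three of the eight patterns, copied from the table above.
const SECRET_PATTERNS: { type: string; severity: Severity; regex: RegExp }[] = [
  { type: "aws_key", severity: "critical", regex: /AKIA[0-9A-Z]{16}/ },
  { type: "github_token", severity: "critical", regex: /gh[pousr]_[A-Za-z0-9_]{36,}/ },
  { type: "private_key", severity: "critical", regex: /-----BEGIN PRIVATE KEY-----/ },
];

const EXCLUSIONS = ["example", "test", "fake", "mock", "dummy"];

function scanForSecrets(content: string): SecretFinding[] {
  // Skip content that looks like a placeholder or fixture (whole-content check is an assumption).
  const lower = content.toLowerCase();
  if (EXCLUSIONS.some((word) => lower.includes(word))) return [];
  return SECRET_PATTERNS.filter((p) => p.regex.test(content)).map(
    ({ type, severity }) => ({ type, severity }),
  );
}
```

The exclusion list trades recall for fewer false positives: a real credential that happens to sit next to the word test would be skipped.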

Comprehensive input audit

auditInput(data) runs all detectors against every string field in the input object and produces a risk score:

| Finding | Risk points |
|---------|-------------|
| Critical secret | +25 |
| Any injection detected | +15 |

The score is capped at 100. The passed flag is false if any finding is detected.
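
The scoring arithmetic can be sketched as follows. The Finding shape is hypothetical, and adding 15 points per injection finding (rather than once per input) is an assumption, since the table does not say which is meant.

```typescript
// Hypothetical finding shape for illustration.
interface Finding {
  kind: "secret" | "injection";
  severity?: "critical" | "high";
}

function riskScore(findings: Finding[]): { score: number; passed: boolean } {
  let score = 0;
  for (const f of findings) {
    if (f.kind === "secret" && f.severity === "critical") score += 25;
    else if (f.kind === "injection") score += 15; // per finding (assumption)
  }
  // Capped at 100; any finding at all fails the audit, even one worth 0 points.
  return { score: Math.min(100, score), passed: findings.length === 0 };
}
```

Note that passed and score are independent: a high-severity (non-critical) secret adds no points in this sketch but still fails the audit.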

createAuditReport(auditLogs) aggregates multiple audit results into a summary with pass/fail counts, risk distribution (low/medium/high/critical), finding type counts, and auto-generated recommendations.

Integrity verification

calculateIntegrityHash(content) returns a SHA-256 hex digest for content integrity verification, used throughout the audit and Merkle proof systems.
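
A minimal sketch using Node's built-in crypto module. Accepting either a string or a Buffer is an assumption; the source does not state the parameter type.

```typescript
import { createHash } from "node:crypto";

// SHA-256 hex digest of the content (64 lowercase hex characters).
function calculateIntegrityHash(content: string | Buffer): string {
  return createHash("sha256").update(content).digest("hex");
}

// Verification is a recompute-and-compare against the stored digest.
function verifyIntegrity(content: string | Buffer, expectedHash: string): boolean {
  return calculateIntegrityHash(content) === expectedHash.toLowerCase();
}
```

verifyIntegrity is a hypothetical helper added only to show how the digest would be used.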

P2P layer (Phase 3+ target)

⚠ Not Phase 0. P2P / multi-node consensus is Phase 3+ territory. The table below is the design shape for src/domains/p2p/ (target path, not scheduled); Phase 0 is single-node, single-process, and enforces none of these controls at the transport layer.

| Control | Implementation | Spec |
|---------|----------------|------|
| Merkle integrity | All events in Merkle tree with inclusion proofs | s13 HS-01 |
| Rate limiting | Token bucket per identity per epoch | s13 HS-02 |
| VRF audit | 5-20% probabilistic deep verification | s13 HS-02 |
| Tenant isolation | Row-level security by tenant_id | s13 HS-03 |
| Finality windows | >=24h dispute window, no irreversible effects before HARD | s13 HS-04 |
| Key recovery | Shamir’s Secret Sharing (5/3 or 7/5, i.e. 3-of-5 or 5-of-7 shares) | s13 HS-05 |
| Signature verification | Ed25519 on every event | s06 |
| Equivocation detection | Conflicting signatures -> votes invalidated, reputation penalty | s06 |
| Clock drift protection | Signed Time Anchors, deprioritize drifted nodes | s08 |

Threat model

(Phase 3+ target — the Phase 0 threat model is narrower, since there is neither a network transport nor a multi-actor surface. Included here for continuity with the donor design.)

Assume hostile network, compromised device, phishable user. Design for:

  • MITM – all messages signed (Ed25519)
  • Replay – event IDs include nonce + timestamp
  • Eclipse – gossip with multiple peers, adaptive fanout
  • Key theft – Shamir recovery, scar on compromised identity
  • Social engineering – commit-reveal voting prevents vote copying


Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.