Hardening
This document is split into two parts:
- Input validation and sanitization (Phase 0, target). The src/security/ module family that Phase 0 will re-implement against the Zod 4 + sanitizer surface. The design shape is ported from the AMS donor; the code does not yet exist. Section: Input validation framework down through Integrity verification.
- Authentication, API keys, ACL, rate-limiting (Phase 2+, heritage). The AMS donor auth surface. Not shipped by Phase 0. Reproduced here as the target shape for Phase 2+ so the design decisions and env-var surface are in one place. Section: Auth modes down through Audit integration.
⚠ HERITAGE — auth is not Phase 0.
Every AMS_AUTH_* env var, ~/.ams/token path, ~/.ams/auth-secret path, JWT scheme, API-key format, ACL role matrix, and auth rate-limit table in the sections marked Phase 2+ below is imported from the pre-R53 AMS donor. Phase 0 Colibri has no authentication layer: the MCP transport runs over stdio with no token at all, because the transport itself isolates a single agent per process (see ../../2-plugin/boot.md and ADR-004).
When Phase 2+ ports this design into Colibri:
- AMS_AUTH_* names will become COLIBRI_AUTH_* per s17 — Environment §3.
- ~/.ams/ paths will become ~/.colibri/.
- src/config-auth.js, src/claude/admin/api-keys.js, src/claude/admin/organization.js refer to the deleted donor tree. The Phase 2+ target path is src/domains/auth/.
- The donor’s four-mode AMS_AUTH_MODE (trust/token/hybrid/required) is a starting point; the Phase 2+ design will decide whether to keep all four modes or collapse them.
Treat every AMS_AUTH_* table row below as a target shape, not a current-runtime contract. If you are looking for what Phase 0 actually enforces at the call boundary, the answer is: the 5-stage α chain in ../../2-plugin/middleware.md, which has neither JWT nor ACL.
Auth modes
(Phase 2+ target, ported from AMS donor. See heritage banner above.)
Source (pre-R53 donor, deleted): src/config-auth.js, src/claude/admin/api-keys.js, src/claude/admin/organization.js. Phase 2+ target path: src/domains/auth/.
Phase 2+ will read an env var (target name COLIBRI_AUTH_MODE; donor name AMS_AUTH_MODE is retained in tables below for continuity with the donor extraction) that controls the enforcement level. Phase 0 does not read this variable. The donor defines four modes:
| Mode | Behavior |
|---|---|
| trust | No authentication (default, single-user desktop) |
| token | Token auth optional; used if provided |
| hybrid | Token auth preferred, falls back to legacy ACL |
| required | Token auth mandatory; unauthenticated requests rejected |
In the donor’s desktop mode, auto-detection triggered when no AMS_AUTH_SECRET or AMS_AUTH_SECRET_FILE was configured, and tokens were stored at ~/.ams/token with secrets at ~/.ams/auth-secret. Phase 2+ Colibri will migrate these to ~/.colibri/token and ~/.colibri/auth-secret. Phase 0 writes neither file.
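A minimal sketch of how a Phase 2+ loader might resolve the mode from the environment. The function name resolveAuthMode, the precedence of the target name over the donor name, and the choice to throw on unknown values are all assumptions made for illustration; only the four mode names and the trust default come from the table above.

```typescript
// Hypothetical sketch: neither donor code nor Phase 0 code.
type AuthMode = "trust" | "token" | "hybrid" | "required";

const AUTH_MODES: readonly AuthMode[] = ["trust", "token", "hybrid", "required"];

// Resolve the enforcement mode from an env-like record.
// Assumption: the target name wins over the donor name when both are set.
function resolveAuthMode(env: Record<string, string | undefined>): AuthMode {
  const raw = (env.COLIBRI_AUTH_MODE ?? env.AMS_AUTH_MODE ?? "trust").trim().toLowerCase();
  const match = AUTH_MODES.find((mode) => mode === raw);
  if (match === undefined) {
    // Assumption: fail closed on typos instead of silently falling back to "trust".
    throw new Error(`Unknown auth mode: ${raw}`);
  }
  return match;
}
```

Throwing on an unrecognized value keeps a misspelled "required" from quietly degrading to no authentication; a real design may prefer a different failure policy.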
Methods
| Method | Transport | Use case |
|---|---|---|
| JWT (HMAC) | HTTP | User sessions, dashboard |
| API key | HTTP | Scripts, automation, CI |
| MCP identity | stdio | AI clients (authenticated by process isolation) |
JWT implementation
Token structure
Tokens are signed with HMAC using the secret from AMS_AUTH_SECRET (base64url-encoded) or read from AMS_AUTH_SECRET_FILE. If AMS_AUTH_AUTO_GENERATE=true, the server generates a secret on first start.
Token claims:
| Claim | Source | Default |
|---|---|---|
| iss | AMS_AUTH_ISSUER | ams-server |
| aud | AMS_AUTH_AUDIENCE | ams-mcp-tools |
| exp | AMS_AUTH_TOKEN_LIFETIME | 3600s (1 hour) |
| type | AMS_AUTH_DEFAULT_TOKEN_TYPE | access |
Refresh tokens have a separate lifetime controlled by AMS_AUTH_REFRESH_LIFETIME (default: 604800s / 7 days).
Token flow
- Client sends credentials to /api/auth/login
- Server returns signed JWT with configured claims
- Client includes Authorization: Bearer <token> on subsequent requests
- Middleware validates signature, expiry, issuer, and audience on every request
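The signing and validation steps in the flow above can be sketched with Node's built-in crypto module. This is an illustration under stated assumptions, not the donor implementation: the source only says "HMAC", so HMAC-SHA-256 (JWT alg HS256) is assumed, and the names signToken, verifyToken, and TokenClaims are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Claim names follow the claims table; the shape itself is an assumption.
interface TokenClaims {
  iss: string;
  aud: string;
  exp: number; // expiry as Unix seconds
  type: string;
}

const toBase64Url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");

function hmacSha256(input: string, secret: Buffer): Buffer {
  return createHmac("sha256", secret).update(input).digest();
}

function signToken(claims: TokenClaims, secret: Buffer): string {
  const header = toBase64Url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = toBase64Url(JSON.stringify(claims));
  const signature = hmacSha256(`${header}.${payload}`, secret).toString("base64url");
  return `${header}.${payload}.${signature}`;
}

// Returns the claims when signature, expiry, issuer, and audience all check out, else null.
function verifyToken(
  token: string,
  secret: Buffer,
  expected: { iss: string; aud: string },
  nowSeconds: number,
): TokenClaims | null {
  const parts = token.split(".");
  const [header, payload, signature] = parts;
  if (parts.length !== 3 || !header || !payload || !signature) return null;

  // Signature: recompute with the fixed algorithm (the header's alg is never trusted)
  // and compare in constant time.
  const expectedSig = hmacSha256(`${header}.${payload}`, secret);
  const actualSig = Buffer.from(signature, "base64url");
  if (actualSig.length !== expectedSig.length || !timingSafeEqual(actualSig, expectedSig)) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  } catch {
    return null;
  }
  if (parsed === null || typeof parsed !== "object") return null;
  const claims = parsed as Partial<TokenClaims>;

  // Expiry, issuer, audience.
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return null;
  if (claims.iss !== expected.iss || claims.aud !== expected.aud) return null;
  return claims as TokenClaims;
}
```

Passing the current time in as a parameter keeps the check deterministic in tests; a production middleware would also consult the revocation list before accepting the token.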
Token revocation
Revocation is enabled by default (AMS_AUTH_REVOCATION_ENABLED). The revocation list is cleaned up on a configurable interval (AMS_AUTH_REVOCATION_CLEANUP_INTERVAL, default 1 hour) and caps at AMS_AUTH_REVOCATION_MAX_ENTRIES (default 10,000) before forced cleanup.
API key implementation
Key generation
Keys are generated as ams_<64-hex-chars> using crypto.randomBytes(32). Only the SHA-256 hash (crypto.createHash("sha256")) is persisted in the api_keys table. A preview field stores first8...last4 for identification without exposing the full key.
The plaintext key is returned exactly once at creation time. It cannot be recovered after that.
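As a sketch of the generation step described above: the key format, the randomBytes(32) call, and the SHA-256 hashing come from the text, while the function names, the GeneratedKey shape, and the reading of first8...last4 as slices of the full key (prefix included) are assumptions.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical result shape: only `hash` and `preview` would be persisted.
interface GeneratedKey {
  plaintext: string; // shown to the caller once, never stored
  hash: string;      // SHA-256 hex digest, stored in api_keys
  preview: string;   // first8...last4, for identification in listings
}

function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext).digest("hex");
}

function generateApiKey(): GeneratedKey {
  // 32 random bytes encode to exactly 64 hex characters.
  const plaintext = `ams_${randomBytes(32).toString("hex")}`;
  return {
    plaintext,
    hash: hashApiKey(plaintext),
    // Assumption: the preview slices the full key, so the first 8 chars include "ams_".
    preview: `${plaintext.slice(0, 8)}...${plaintext.slice(-4)}`,
  };
}
```

Because only the digest is stored, a leaked api_keys table does not reveal usable keys, which is also why the plaintext cannot be recovered after creation.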
Validation flow
validateApiKey(plaintextKey) performs three checks:
- Hash the provided key and look up the hash in api_keys where status = 'active'
- Check expires_at against current time
- Verify status is not revoked
On success, last_used_at is updated. Returns { valid: true, apiKey } or { valid: false, reason }.
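The validation flow can be sketched against an in-memory map standing in for the api_keys table. The row shape, the reason strings, and the extra rows and now parameters are hypothetical; the return shape follows the text.

```typescript
import { createHash } from "node:crypto";

type KeyStatus = "active" | "revoked" | "rotated";

// Hypothetical row shape; the map is keyed by the stored SHA-256 hash.
interface ApiKeyRow {
  id: string;
  status: KeyStatus;
  expiresAt: number | null; // epoch ms, null means no expiry
  lastUsedAt: number | null;
}

type KeyValidation = { valid: true; apiKey: ApiKeyRow } | { valid: false; reason: string };

const sha256Hex = (text: string): string => createHash("sha256").update(text).digest("hex");

function validateApiKey(
  plaintextKey: string,
  rows: Map<string, ApiKeyRow>,
  now: number = Date.now(),
): KeyValidation {
  // 1. Hash the key and look it up; the plaintext is never compared directly.
  const row = rows.get(sha256Hex(plaintextKey));
  if (row === undefined) return { valid: false, reason: "unknown_key" };
  // 2 + 3. Status and expiry checks (a SQL lookup would fold the status filter into the query).
  if (row.status !== "active") return { valid: false, reason: `key_${row.status}` };
  if (row.expiresAt !== null && row.expiresAt <= now) return { valid: false, reason: "key_expired" };
  row.lastUsedAt = now; // recorded only on success
  return { valid: true, apiKey: row };
}
```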
Key rotation
rotateApiKey(id, rotatedBy, options) creates a new key inheriting the old key’s permissions, scopes, and rate limits. The old key is marked revoked (default) or rotated. Both events are logged to api_key_audit_log. The rotation is atomic: if creation fails, the old key remains active.
Permission sets
read_only: [read]
read_write: [read, write]
admin: [read, write, delete, admin]
full: [read, write, delete, admin, super]
apiKeyHasPermission(apiKey, permission) and apiKeyHasScope(apiKey, scope) check array membership. Permissions and scopes are stored as JSON arrays in the api_keys table.
Usage tracking
Every API call records an entry in api_key_usage with endpoint, method, status code, latency (ms), IP address, and timestamp. getApiKeyUsage() aggregates daily request counts and latency statistics. getUsageSummary() provides organization-wide breakdowns.
Multi-actor auth
Both JWT and API key can be active simultaneously. The middleware resolves the caller identity from whichever is present, with JWT taking priority. The resolved identity carries a role that feeds into ACL checks.
ACL (Access Control)
Organization-level permissions
The ORG_PERMISSIONS map in organization.js defines 14 discrete permissions:
| Permission | Allowed roles |
|---|---|
| org_read | owner, admin, member, viewer |
| org_update | owner, admin |
| org_delete | owner |
| member_add | owner, admin |
| member_remove | owner, admin |
| member_role_update | owner, admin |
| workspace_create | owner, admin, member |
| workspace_read | owner, admin, member, viewer |
| workspace_update | owner, admin |
| workspace_delete | owner, admin |
| api_key_create | owner, admin |
| api_key_read_own | owner, admin, member, viewer |
| api_key_read_all | owner, admin |
| api_key_revoke | owner, admin |
checkOrgAccess(orgId, userId, requiredRole) compares the user’s numeric role level (owner=4, admin=3, member=2, viewer=1) against the required level.
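The numeric role comparison can be sketched as follows. The role levels come from the text; the in-memory Memberships map and the extra first parameter are assumptions standing in for whatever store the donor queried.

```typescript
type OrgRole = "owner" | "admin" | "member" | "viewer";

// Role levels as stated above.
const ROLE_LEVEL: Record<OrgRole, number> = { owner: 4, admin: 3, member: 2, viewer: 1 };

// Hypothetical stand-in for the membership table: orgId -> userId -> role.
type Memberships = Map<string, Map<string, OrgRole>>;

function checkOrgAccess(
  memberships: Memberships,
  orgId: string,
  userId: string,
  requiredRole: OrgRole,
): boolean {
  const role = memberships.get(orgId)?.get(userId);
  if (role === undefined) return false; // not a member of this organization
  return ROLE_LEVEL[role] >= ROLE_LEVEL[requiredRole];
}
```

Under this scheme a permission such as org_update (owner, admin) reduces to a requiredRole of admin, since every role at or above that level passes.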
Tool-level ACL
Each tool has an ACL entry defining which roles can call it. The ACL middleware checks on every tool call before execution.
Default roles:
- admin – all tools
- executor – task/GSD/thought/memory/merkle tools
- operator – read-only tools + sync
- viewer – read-only tools only
Rate limiting
Auth-specific rate limits
| Setting | Env var | Default |
|---|---|---|
| Max validation attempts per window | AMS_AUTH_RATE_LIMIT_MAX | 100 |
| Rate limit window | AMS_AUTH_RATE_LIMIT_WINDOW | 60000ms (1 min) |
API key rate limits
Per-key rate limits are stored in the rate_limit JSON field with configurable thresholds:
| Limit | Default |
|---|---|
| requests_per_minute | 100 |
| requests_per_hour | 1,000 |
| requests_per_day | 10,000 |
checkRateLimit(keyId) queries recent usage from api_key_usage and returns { allowed: true, remaining: { minute, hour } } or { allowed: false, reason, retry_after }.
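A sketch of the per-key check using sliding windows over recorded timestamps. The default thresholds and the return shapes follow the text; passing the usage timestamps in directly (instead of querying api_key_usage by keyId), the reason strings, and expressing retry_after in seconds are assumptions.

```typescript
interface RateLimit {
  requests_per_minute: number;
  requests_per_hour: number;
  requests_per_day: number;
}

const DEFAULT_RATE_LIMIT: RateLimit = {
  requests_per_minute: 100,
  requests_per_hour: 1000,
  requests_per_day: 10000,
};

type RateDecision =
  | { allowed: true; remaining: { minute: number; hour: number } }
  | { allowed: false; reason: string; retry_after: number };

// usageTimestamps: epoch-ms times of earlier requests for one key.
function checkRateLimit(
  usageTimestamps: number[],
  now: number,
  limit: RateLimit = DEFAULT_RATE_LIMIT,
): RateDecision {
  const within = (ms: number): number[] => usageTimestamps.filter((t) => t > now - ms && t <= now);
  const deny = (label: string, hits: number[], ms: number): RateDecision => ({
    allowed: false,
    reason: `${label}_limit_exceeded`,
    // Seconds until the oldest request in the window ages out.
    retry_after: Math.ceil((Math.min(...hits) + ms - now) / 1000),
  });

  const minute = within(60000);
  if (minute.length >= limit.requests_per_minute) return deny("minute", minute, 60000);
  const hour = within(3600000);
  if (hour.length >= limit.requests_per_hour) return deny("hour", hour, 3600000);
  const day = within(86400000);
  if (day.length >= limit.requests_per_day) return deny("day", day, 86400000);

  return {
    allowed: true,
    remaining: {
      minute: limit.requests_per_minute - minute.length,
      hour: limit.requests_per_hour - hour.length,
    },
  };
}
```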
Audit integration
Auth events are logged to the audit system when AMS_AUTH_AUDIT_ENABLED=true (default). Configuration:
| Setting | Env var | Default |
|---|---|---|
| Log successful auth | AMS_AUTH_AUDIT_SUCCESS | false |
| Log failed auth | AMS_AUTH_AUDIT_FAILURES | true |
API key audit logs track every lifecycle event (created, updated, revoked, rotated) with performed_by, details (JSON), and timestamp. getOrgAuditLog() aggregates across all keys in an organization.
Phase 0 hardening surface — input validation & sanitization
The remainder of this document describes the Phase 0 target for input hardening: the Zod 4 validation framework the α chain’s schema-validate stage enforces (../../2-plugin/middleware.md Stage 2), plus the sanitizer and injection detectors that Phase 0 ships as part of src/domains/system/ (target path). The design is ported from the AMS donor (src/security/*.js, deleted R53); the Phase 0 re-implementation targets are src/security/validator.ts, src/security/sanitizer.ts, src/security/audit.ts.
Unlike the auth sections above, this content IS Phase 0 scope — it is what P0.2.4 delivers alongside the middleware chain.
Node.js server controls
| Control | Implementation |
|---|---|
| Input validation | Zod schemas on every tool input |
| Input sanitization | Context-aware deep sanitization (HTML, SQL, shell, email, URL) |
| Injection detection | SQL, command, path traversal, NoSQL, XPath, LDAP, XXE, SSTI |
| Secret scanning | 8 secret patterns with severity classification |
| Audit trail | Every tool call logged (caller, params hash, result hash, duration) |
| Rate limiting | Per-caller token bucket + per-API-key limits |
| Circuit breaker | Auto-trips on repeated failures, prevents cascade |
| Error isolation | Global unhandled rejection/exception handlers |
| Stdout protection | MCP JSON-RPC on stdout; all logging to stderr |
| Timeout | Configurable per-tool execution timeout |
Input validation framework
src/security/validator.js provides Zod-based validation with integrated security scanning.
Schema helpers
sanitizedString(options) builds a Zod string schema with optional constraints: min, max, pattern, email, url, trim, nonempty.
withSecurity(baseSchema, options) wraps any Zod schema with superRefine checks that reject SQL injection (confidence > 0.7) and critical secrets in string values.
Tool input validation
validateToolInput(toolName, args, schema) runs three layers:
- Schema validation – Zod safeParse against the tool’s schema
- Injection detection – SQL injection (confidence > 0.6), path traversal (on path/file/dir fields), command injection (confidence > 0.5)
- Secret scanning – warnings for detected secrets (does not block)
Returns { valid, errors, warnings, sanitized }.
Specialized validators
| Validator | Checks |
|---|---|
| validateId(id) | Max length (100), alphanumeric + _- pattern, path traversal detection |
| validateFilePath(path) | Path traversal, null bytes, blocked extensions (.exe, .bat, .cmd, .sh, .dll, .so) |
| validateJson(content) | Parse validity, max depth (10), max keys (1000) |
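A dependency-free sketch of two of these validators. The limits (100 characters, depth 10, 1000 keys) come from the table; the CheckResult shape, the error strings, counting the outermost container as depth 1, and counting keys across all nested objects are assumptions.

```typescript
interface CheckResult {
  valid: boolean;
  errors: string[];
}

function validateId(id: string): CheckResult {
  const errors: string[] = [];
  if (id.length > 100) errors.push("id longer than 100 characters");
  if (!/^[A-Za-z0-9_-]+$/.test(id)) errors.push("id must match [A-Za-z0-9_-]+");
  // Redundant with the pattern above, but kept as a separate, explicit signal.
  if (id.includes("..") || id.includes("/") || id.includes("\\")) errors.push("path traversal sequence in id");
  return { valid: errors.length === 0, errors };
}

function validateJson(content: string): CheckResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return { valid: false, errors: ["content is not valid JSON"] };
  }
  let keyCount = 0;
  let maxDepthSeen = 0;
  // Walk the tree, tracking container depth and the total number of object keys.
  const walk = (node: unknown, depth: number): void => {
    if (node === null || typeof node !== "object") return;
    maxDepthSeen = Math.max(maxDepthSeen, depth);
    const children: unknown[] = Array.isArray(node) ? node : Object.values(node);
    if (!Array.isArray(node)) keyCount += children.length;
    for (const child of children) walk(child, depth + 1);
  };
  walk(parsed, 1);
  const errors: string[] = [];
  if (maxDepthSeen > 10) errors.push("nesting deeper than 10 levels");
  if (keyCount > 1000) errors.push("more than 1000 keys");
  return { valid: errors.length === 0, errors };
}
```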
Custom error types
- ValidationError – carries details array and code: "VALIDATION_ERROR"
- SecurityError – carries threatType, confidence, and code: "SECURITY_VIOLATION"
Validation rate limiter
ValidationRateLimiter tracks attempts per key within a time window (default: 100 attempts per 60 seconds). Exceeding the limit blocks the key entirely until reset() is called. A global instance globalRateLimiter is exported for shared use.
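The blocking behavior can be sketched as below. The defaults and the block-until-reset rule come from the text; the attempt method name, the explicit now parameter, and the per-key reset signature are assumptions.

```typescript
class ValidationRateLimiter {
  private readonly maxAttempts: number;
  private readonly windowMs: number;
  private readonly attempts = new Map<string, number[]>();
  private readonly blocked = new Set<string>();

  constructor(maxAttempts: number = 100, windowMs: number = 60000) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Record one attempt; returns false when the key is, or has just become, blocked.
  attempt(key: string, now: number = Date.now()): boolean {
    if (this.blocked.has(key)) return false;
    const recent = (this.attempts.get(key) ?? []).filter((t) => t > now - this.windowMs);
    recent.push(now);
    this.attempts.set(key, recent);
    if (recent.length > this.maxAttempts) {
      // The block is sticky: it does not expire with the window.
      this.blocked.add(key);
      return false;
    }
    return true;
  }

  // Assumption: reset clears one key when given, otherwise everything.
  reset(key?: string): void {
    if (key === undefined) {
      this.attempts.clear();
      this.blocked.clear();
      return;
    }
    this.attempts.delete(key);
    this.blocked.delete(key);
  }
}

// Shared instance, mirroring the exported globalRateLimiter named above.
const globalRateLimiter = new ValidationRateLimiter();
```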
Input sanitization
src/security/sanitizer.js provides context-aware sanitization for different output contexts.
Context-specific sanitizers
| Context | Function | Strategy |
|---|---|---|
| HTML | escapeHtml() | Replace & < > " ' / with HTML entities |
| SQL | sanitizeForSql() | Remove null bytes, double single quotes, escape backslashes |
| Shell | sanitizeShellArg() | Strip ; & \| \ $ ( ) { } [ ] \n \r |
| Email | sanitizeEmail() | Lowercase, trim, allow only a-z 0-9 . @ _ + - |
| URL | sanitizeUrl() | Parse with URL(), allow only http/https, strip credentials |
| Path | sanitizePath() | Detect traversal, normalize separators, strip control chars, enforce base directory |
| Strict | stripDangerousChars() | Remove < > & " ' / ` |
Deep sanitization
deepSanitize(data, { context }) recursively sanitizes all values in an object/array tree. Object keys are sanitized to prevent prototype pollution (only a-zA-Z0-9_$ allowed in key names).
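A sketch of the recursive walk for the HTML context only. The key allow-list comes from the text; the specific entity encodings and the helper names are assumptions. Note that the allow-list alone does not remove a key named __proto__, since it consists only of permitted characters, so this sketch also skips a small set of reserved key names, which goes beyond what the source states.

```typescript
// Assumed entity encodings; the source names the characters but not the entities.
const HTML_ENTITIES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;",
  "/": "&#x2F;",
};

function escapeHtml(value: string): string {
  return value.replace(/[&<>"'\/]/g, (ch) => HTML_ENTITIES[ch] ?? ch);
}

// Keep only a-zA-Z0-9_$ in key names, as described above.
function sanitizeKey(key: string): string {
  return key.replace(/[^a-zA-Z0-9_$]/g, "");
}

// Assumption beyond the source: these names pass the allow-list but are unsafe as keys.
const RESERVED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function deepSanitize(data: unknown): unknown {
  if (typeof data === "string") return escapeHtml(data);
  if (Array.isArray(data)) return data.map((item) => deepSanitize(item));
  if (data !== null && typeof data === "object") {
    const out: Record<string, unknown> = {};
    for (const [rawKey, value] of Object.entries(data)) {
      const key = sanitizeKey(rawKey);
      if (key === "" || RESERVED_KEYS.has(key)) continue; // drop unusable keys
      out[key] = deepSanitize(value);
    }
    return out;
  }
  return data; // numbers, booleans, null, undefined pass through
}
```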
Tool input sanitization
sanitizeToolInput(input) applies context-aware sanitization based on field name heuristics:
- Fields containing path, file, or dir – path sanitization with traversal detection
- Fields containing email – email sanitization
- Fields containing url or link – URL sanitization with protocol restriction
- Numbers – clamped to Number.MAX_SAFE_INTEGER / MIN_SAFE_INTEGER; non-finite values converted to 0
- Nested objects/arrays – recursive sanitization
Returns { success, sanitized, warnings, errors }.
Safe filename generation
createSafeFilename(filename) replaces invalid characters with underscores, collapses runs, trims edges, and caps at 200 characters. Extensions are lowercased and restricted to alphanumeric characters.
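One possible reading of these rules, as a sketch. The source does not define which characters count as invalid, so this version assumes anything outside A-Z, a-z, 0-9, dot, underscore, and hyphen; it also assumes directory components are dropped first, that the 200-character cap includes the extension, and that an empty result falls back to the hypothetical name "file".

```typescript
function createSafeFilename(filename: string): string {
  // Assumption: keep only the last path segment.
  const leaf = filename.split(/[\\/]/).pop() ?? "";
  const dot = leaf.lastIndexOf(".");
  const hasExt = dot > 0 && dot < leaf.length - 1;
  const rawBase = hasExt ? leaf.slice(0, dot) : leaf;
  const rawExt = hasExt ? leaf.slice(dot + 1) : "";

  // Extension: lowercase, alphanumeric only.
  const ext = rawExt.toLowerCase().replace(/[^a-z0-9]/g, "");
  const suffix = ext ? `.${ext}` : "";

  const base = rawBase
    .replace(/[^A-Za-z0-9._-]/g, "_") // replace invalid characters
    .replace(/_+/g, "_")              // collapse runs of underscores
    .replace(/^[_.]+|[_.]+$/g, "");   // trim underscores and dots at the edges

  // Cap the whole name at 200 characters, keeping the extension intact.
  const safeBase = (base || "file").slice(0, 200 - suffix.length);
  return `${safeBase}${suffix}`;
}
```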
Injection detection
src/security/audit.js and src/security/audit-comprehensive.js implement multi-vector injection detection.
SQL injection
12+ regex patterns covering:
- Classic injection (', --, #, %27)
- UNION-based (UNION SELECT)
- Tautology (' OR '1'='1)
- Stacked queries (; separators)
- Time-based blind (SLEEP(), BENCHMARK(), WAITFOR DELAY, PG_SLEEP)
- Error-based (CONVERT(), CAST())
- Out-of-band (LOAD_FILE(), INTO OUTFILE)
Confidence scoring: each matching pattern adds 0.2; each SQL keyword found adds 0.1. Capped at 1.0.
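The scoring rule can be illustrated with a small subset of patterns. The weights (0.2 per pattern, 0.1 per keyword, cap at 1.0) come from the text; the five patterns and the keyword list below are illustrative stand-ins, not the donor's 12+ patterns.

```typescript
// Illustrative subset only.
const SQL_PATTERNS: RegExp[] = [
  /\bunion\s+select\b/i,                        // UNION-based
  /'\s*or\s*'?\d+'?\s*=\s*'?\d+/i,              // tautology such as ' OR '1'='1
  /;\s*(select|insert|update|delete|drop)\b/i,  // stacked queries
  /\b(sleep|benchmark|pg_sleep)\s*\(/i,         // time-based blind
  /--|#|%27/,                                   // comment markers and encoded quote
];

// Assumed keyword list.
const SQL_KEYWORDS = ["select", "union", "insert", "update", "delete", "drop", "where", "from"];

function sqlInjectionConfidence(input: string): number {
  const patternHits = SQL_PATTERNS.filter((pattern) => pattern.test(input)).length;
  const lower = input.toLowerCase();
  const keywordHits = SQL_KEYWORDS.filter((kw) => new RegExp(`\\b${kw}\\b`).test(lower)).length;
  // Work in tenths so the result avoids floating-point drift (0.2 * n + 0.1 * m).
  return Math.min(10, patternHits * 2 + keywordHits) / 10;
}
```

With these stand-in patterns, the tautology `1' OR '1'='1` matches one pattern and no keywords, so it scores 0.2, which is below both the 0.6 and 0.7 thresholds mentioned earlier; a full pattern set would be expected to score such inputs higher.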
Additional injection types (comprehensive scanner)
| Type | Pattern count | Examples |
|---|---|---|
| NoSQL | 13 | $where, $regex, $ne, $gt, $or |
| XPath | 6 | ' or '1'='1, count(/), position() |
| LDAP | 6 | *), *)(, (\|( |
| XXE | 10 | <!ENTITY, SYSTEM, file://, php:// |
| SSTI | 8 | {{...}}, {%...%}, ${...}, <%=...%> |
Path traversal
10 patterns detecting ../, ..\, URL-encoded variants (%2e%2e%2f), double-encoded variants (%252e%252e%252f), and null byte injection (\0, %00).
Command injection
7 patterns detecting shell metacharacters (;, &, |, backticks), subshell syntax ($(...), backtick-commands), and dangerous commands (rm, cat, echo after semicolons).
Secret scanning
scanForSecrets(content) checks against 8 secret patterns:
| Type | Severity | Pattern example |
|---|---|---|
| API key | critical | api_key = "AKIA..." |
| Secret key | critical | secret_key = "..." |
| Private key | critical | -----BEGIN PRIVATE KEY----- |
| Password | high | password = "..." |
| Token | high | token = "eyJ..." |
| AWS key | critical | AKIA[0-9A-Z]{16} |
| GitHub token | critical | gh[pousr]_[A-Za-z0-9_]{36,} |
| Slack token | critical | xox[baprs]-... |
Exclusions: strings containing example, test, fake, mock, dummy are skipped. The comprehensive scanner adds CWE identifiers to each finding (e.g. CWE-798 for hardcoded credentials).
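A sketch of the scan with four of the eight pattern types. The AWS and GitHub expressions and the exclusion words are taken from the table and text above; the private-key and password expressions, the finding shape, and the case-insensitive, whole-string handling of exclusions are assumptions.

```typescript
type Severity = "critical" | "high";

interface SecretFinding {
  type: string;
  severity: Severity;
}

// Subset of the pattern table; type labels are hypothetical.
const SECRET_PATTERNS: Array<{ type: string; severity: Severity; pattern: RegExp }> = [
  { type: "private_key", severity: "critical", pattern: /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/ },
  { type: "aws_key", severity: "critical", pattern: /AKIA[0-9A-Z]{16}/ },
  { type: "github_token", severity: "critical", pattern: /gh[pousr]_[A-Za-z0-9_]{36,}/ },
  { type: "password", severity: "high", pattern: /password\s*=\s*["'][^"']+["']/i },
];

const EXCLUSION_WORDS = ["example", "test", "fake", "mock", "dummy"];

function scanForSecrets(content: string): SecretFinding[] {
  const lower = content.toLowerCase();
  // Assumption: a single exclusion word anywhere skips the whole string.
  if (EXCLUSION_WORDS.some((word) => lower.includes(word))) return [];
  return SECRET_PATTERNS
    .filter((entry) => entry.pattern.test(content))
    .map(({ type, severity }) => ({ type, severity }));
}
```

Whole-string exclusion is easy to bypass (an attacker can add the word "test" next to a real credential), which is one reason to treat scanner output as a warning signal and not as a guarantee.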
Comprehensive input audit
auditInput(data) runs all detectors against every string field in the input object and produces a risk score:
| Finding | Risk points |
|---|---|
| Critical secret | +25 |
| Any injection detected | +15 |
Score is capped at 100. passed is false if any finding is detected.
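The scoring table leaves open whether the injection points apply once or per finding; the sketch below assumes per finding. The Finding shape and the scoreAudit name are hypothetical.

```typescript
// Hypothetical finding shape for illustration.
interface Finding {
  kind: "secret" | "injection";
  severity?: "critical" | "high";
}

function scoreAudit(findings: Finding[]): { riskScore: number; passed: boolean } {
  let score = 0;
  for (const finding of findings) {
    if (finding.kind === "secret" && finding.severity === "critical") score += 25;
    else if (finding.kind === "injection") score += 15; // assumption: added per finding
  }
  return {
    riskScore: Math.min(100, score),
    // Any finding fails the audit, including ones that add no points.
    passed: findings.length === 0,
  };
}
```

Note that under these rules a high-severity secret contributes no points yet still sets passed to false, so callers should not infer a pass from a zero score.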
createAuditReport(auditLogs) aggregates multiple audit results into a summary with pass/fail counts, risk distribution (low/medium/high/critical), finding type counts, and auto-generated recommendations.
Integrity verification
calculateIntegrityHash(content) returns a SHA-256 hex digest for content integrity verification, used throughout the audit and Merkle proof systems.
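A minimal sketch using Node's crypto module; accepting either a string or a Buffer is an assumption about the input type.

```typescript
import { createHash } from "node:crypto";

// SHA-256 hex digest of the content; strings are hashed as UTF-8.
function calculateIntegrityHash(content: string | Buffer): string {
  return createHash("sha256").update(content).digest("hex");
}
```

The same content always yields the same 64-character digest, so a stored hash can later be compared against a freshly computed one to detect modification.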
P2P layer (Phase 3+ target)
⚠ Not Phase 0. P2P / multi-node consensus is Phase 3+ territory. The table below is the design shape for src/domains/p2p/ (target path, not scheduled); Phase 0 is single-node, single-process, and enforces none of these controls at the transport layer.
| Control | Implementation | Spec |
|---|---|---|
| Merkle integrity | All events in Merkle tree with inclusion proofs | s13 HS-01 |
| Rate limiting | Token bucket per identity per epoch | s13 HS-02 |
| VRF audit | 5-20% probabilistic deep verification | s13 HS-02 |
| Tenant isolation | Row-level security by tenant_id | s13 HS-03 |
| Finality windows | >=24h dispute window, no irreversible effects before HARD | s13 HS-04 |
| Key recovery | Shamir’s Secret Sharing (5/3 or 7/5) | s13 HS-05 |
| Signature verification | Ed25519 on every event | s06 |
| Equivocation detection | Conflicting signatures -> votes invalidated, reputation penalty | s06 |
| Clock drift protection | Signed Time Anchors, deprioritize drifted nodes | s08 |
Threat model
(Phase 3+ target — the Phase 0 threat model is narrower, since there is neither a network transport nor a multi-actor surface. Included here for continuity with the donor design.)
Assume hostile network, compromised device, phishable user. Design for:
- MITM – all messages signed (Ed25519)
- Replay – event IDs include nonce + timestamp
- Eclipse – gossip with multiple peers, adaptive fanout
- Key theft – Shamir recovery, scar on compromised identity
- Social engineering – commit-reveal voting prevents vote copying
Cross-references
- Middleware — α chain — the Stage 2 schema-validate enforcement point the Phase 0 validator plugs into
- μ — Integrity Monitor — circuit-breaker + audit-drift detection (Phase 4)
- s13 — Hardening, s06 — Correlation, s17 — Environment §3 (auth env namespace)
- ADR-004 — Tool Surface