Agent Handoff Protocol
When an agent completes a task, the next agent must be able to pick up cleanly. This document defines what the completing agent leaves behind and what the next agent reads first.
The 4-Stage Handoff
Stage 1 — Completing Agent Finishes Work
The executing agent:
- Completes the task implementation (writes code, creates files, makes changes)
- Runs the full test suite: `npm test && npm run lint`
- Verifies all acceptance criteria are met (against the task-prompts file)
- Creates a commit with a clear message: `task: P0.X.Y — [title]`
- Pushes to the feature branch (e.g., `feature/p0-x-y-slug`)
Success signal: All tests pass, no lint errors, acceptance criteria verified.
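The naming conventions above can be sketched as two small helpers. This is a minimal illustration, not part of the protocol: the function names `branchName` and `commitMessage` are hypothetical, and the slug rule (lowercase, hyphen-separated) is an assumption inferred from the example `feature/p0-1-1-package-setup`.

```typescript
// Hypothetical helpers; the slug rule is inferred from the examples in this document.
function branchName(taskId: string, title: string): string {
  // "P0.1.1" -> "p0-1-1"
  const id = taskId.toLowerCase().replace(/\./g, "-");
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  return `feature/${id}-${slug}`;
}

// Builds the commit subject in the "task: P0.X.Y — [title]" format.
function commitMessage(taskId: string, title: string): string {
  return `task: ${taskId} — ${title}`;
}
```

For example, `branchName("P0.1.1", "Package Setup")` yields the branch name used throughout this document.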
Stage 2 — Completing Agent Writes Back
The completing agent uses the MCP writeback tools, in this order. The `thought_record` comes first because a `task_update` to `done` is rejected when no `thought_record` exists for the task (see Writeback Enforcement).
- Call `thought_record` with:
  - `task_id`: e.g., `P0.1.1`
  - `branch`: e.g., `feature/p0-1-1-package-setup`
  - `commit_sha`: Full commit SHA
  - `tests_run`: Array of test names (e.g., `["smoke.test.ts", "eslint", "tsc"]`)
  - `summary`: What was implemented (1 paragraph, 3-5 sentences)
  - `blockers`: Empty array if clean; structured objects if blockers exist
- Call `task_update` with:
  - `task_id`: e.g., `P0.1.1`
  - `status`: `"done"`
  - `progress`: `100`
  - `notes`: Brief summary (1-2 sentences)
- For proof-grade work, also run the full chain, in this order:
  - `audit_session_start` (marks start of auditable session)
  - Execute work (already done in Stage 1)
  - `audit_verify_chain` (verify hash chain integrity)
  - `thought_record` (as above)
  - `merkle_finalize` (seal proof into Merkle tree)
  - `merkle_root` (retrieve final root hash)
Critical: Do not finalize the Merkle tree before writing the final thought record.
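The ordering constraints in this stage can be sketched as a function that returns the post-work tool sequence. This is a hedged illustration: `writebackSequence` is a hypothetical name, `audit_session_start` is omitted because it runs before the work begins, and placing `task_update` directly after `thought_record` is an assumption based on the enforcement rule that a `thought_record` must exist before a task can be marked done.

```typescript
// Tool names are taken from this document; the function itself is hypothetical.
type ToolName =
  | "audit_verify_chain"
  | "thought_record"
  | "task_update"
  | "merkle_finalize"
  | "merkle_root";

// Returns the tools to call after the work is committed, in order.
function writebackSequence(proofGrade: boolean): ToolName[] {
  const steps: ToolName[] = [];
  if (proofGrade) steps.push("audit_verify_chain");
  // The thought record is written before the status change and before any sealing.
  steps.push("thought_record", "task_update");
  if (proofGrade) steps.push("merkle_finalize", "merkle_root");
  return steps;
}
```

Keeping the sequence in one place makes it harder to call `merkle_finalize` before the final `thought_record` by accident.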
Stage 3 — PR Review & Merge
A reviewer agent (or human PM):
- Fetches the branch and reviews code
- Runs `npm test && npm run lint` locally to verify
- Checks that the task-prompts file has a verification checklist and verifies all items
- Confirms that a `thought_record` was written (by checking the AMS database or task log)
- Approves and merges the PR to `main`
Hard rule: If writeback is missing, the merge is rejected. The code may be perfect, but without writeback, the task is silently flipped back to in_progress and cannot be promoted to done.
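The reviewer's checks can be read as a single gate. The sketch below is an assumption-laden illustration: the `ReviewEvidence` shape and `decideMerge` function are hypothetical names, and how each flag is obtained (local test run, AMS database lookup) is left to the reviewer.

```typescript
// Hypothetical evidence gathered by the reviewer during Stage 3.
interface ReviewEvidence {
  testsPassed: boolean;
  lintPassed: boolean;
  checklistVerified: boolean;
  thoughtRecordFound: boolean;
}

type MergeDecision = { merge: true } | { merge: false; reasons: string[] };

// Collects every failed check so the rejection explains itself.
function decideMerge(e: ReviewEvidence): MergeDecision {
  const reasons: string[] = [];
  if (!e.testsPassed) reasons.push("tests failed");
  if (!e.lintPassed) reasons.push("lint failed");
  if (!e.checklistVerified) reasons.push("verification checklist incomplete");
  if (!e.thoughtRecordFound) reasons.push("writeback missing");
  return reasons.length === 0 ? { merge: true } : { merge: false, reasons };
}
```

A branch with passing tests but no `thought_record` is still rejected, which matches the hard rule above.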
Stage 4 — Next Agent Starts
The next agent (or continuation of the same agent on a new task):
- Reads CLAUDE.md: worktree rules and execution constraints
- Reads AGENTS.md: agent roles and role-specific entry points
- Runs `task_next_actions`: retrieves unblocked todo tasks
- Picks an unblocked task: preference for critical-path tasks first
- Creates a new worktree: follows the git worktree pattern from CLAUDE.md
- Reads the task-prompts file, e.g., `docs/guides/implementation/task-prompts/p0.1-infrastructure.md`
- Copies the ready-to-paste prompt: cold-starts the new task with full context
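The task-selection step can be sketched as follows. The `TodoTask` shape and its `criticalPath` flag are assumptions made for illustration; this document does not specify what `task_next_actions` actually returns beyond a list of unblocked todo tasks.

```typescript
// Hypothetical shape for one entry returned by task_next_actions.
interface TodoTask {
  id: string;
  criticalPath: boolean; // assumed flag; the real output may encode this differently
}

// Prefers the first critical-path task, otherwise the first unblocked task.
// Input order is preserved, so dependency ordering from the tool is respected.
function pickNextTask(unblocked: TodoTask[]): TodoTask | undefined {
  const critical = unblocked.filter((t) => t.criticalPath);
  return critical.length > 0 ? critical[0] : unblocked[0];
}
```

An empty list returns `undefined`, which the caller can treat as "nothing to pick up right now".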
The Handoff Packet
The handoff packet is everything the completing agent leaves behind. The fields are specified precisely here:
task_update fields
| Field | Type | Purpose | Example |
|---|---|---|---|
| `task_id` | string | Task identifier | `P0.1.1` |
| `status` | enum | Current state | `done` |
| `progress` | number | Percent complete (0–100) | `100` |
| `notes` | string | Brief human summary | Package.json configured, ESLint passes, zero errors. |
thought_record fields
| Field | Type | Purpose | Example |
|---|---|---|---|
| `task_id` | string | Task identifier | `P0.1.1` |
| `branch` | string | Feature branch name | `feature/p0-1-1-package-setup` |
| `commit_sha` | string | Full 40-char commit SHA | `abc123def456...` |
| `tests_run` | string[] | Test files / suites executed | `["smoke.test.ts", "eslint", "tsc --noEmit"]` |
| `summary` | string | 1-paragraph narrative (3–5 sentences) | Initialized package.json with ESM-first setup (type: module), added TypeScript 5.3+ with strict mode, configured ESLint + Prettier, created .env.example with 85 documented vars. All dev tools included: tsx for development, tsc for build. npm install succeeds with zero warnings. |
| `blockers` | object[] | Empty if clean; else structured blockers | `[]` or `[{type: "peer-dep", package: "zod", issue: "...", resolution: "..."}]` |
| `files_changed` | string[] | Paths created or modified | `["package.json", "tsconfig.json", ".eslintrc.json", ".prettierrc", ".env.example", ".gitignore", "src/__tests__/smoke.test.ts"]` |
| `related_thought_records` | string[] | IDs of earlier records in dependency chain | `[]` (for P0.1.1, first in critical path) |
Complete example
{
  "task_id": "P0.1.1",
  "branch": "feature/p0-1-1-package-setup",
  "commit_sha": "abc123def456789abcdef456789abcdef456789a",
  "tests_run": ["smoke.test.ts", "eslint", "tsc --noEmit"],
  "summary": "Initialized package.json with ESM-first setup (type: module), added TypeScript 5.3+ with strict mode, configured ESLint + Prettier, created .env.example with 85 documented vars. All dev tools included: tsx for development, tsc for build. npm install succeeds with zero warnings.",
  "blockers": [],
  "files_changed": [
    "package.json",
    "tsconfig.json",
    ".eslintrc.json",
    ".prettierrc",
    ".env.example",
    ".gitignore",
    "src/__tests__/smoke.test.ts"
  ],
  "related_thought_records": []
}
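The field tables can be mirrored as TypeScript types with a small pre-flight check. This is a sketch under stated assumptions: the `Blocker` shape is inferred from the single example in the table (which fields are optional is a guess), and `validateThoughtRecord` is a hypothetical helper, not an AMS API.

```typescript
// Blocker shape inferred from the example; optionality is an assumption.
interface Blocker {
  type: string;
  package?: string;
  issue: string;
  resolution?: string;
}

interface ThoughtRecord {
  task_id: string;
  branch: string;
  commit_sha: string;
  tests_run: string[];
  summary: string;
  blockers: Blocker[];
  files_changed: string[];
  related_thought_records: string[];
}

// Returns a list of problems; an empty list means the record looks well-formed.
function validateThoughtRecord(r: ThoughtRecord): string[] {
  const errors: string[] = [];
  if (r.task_id.trim().length === 0) errors.push("task_id is empty");
  if (r.branch.trim().length === 0) errors.push("branch is empty");
  if (!/^[0-9a-f]{40}$/.test(r.commit_sha)) {
    errors.push("commit_sha must be a full 40-character hex SHA");
  }
  if (r.summary.trim().length === 0) errors.push("summary is empty");
  return errors;
}
```

Running such a check before calling `thought_record` catches an abbreviated SHA locally instead of at review time.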
What the Next Agent Reads First
Ordered list of reading assignments for the next agent:
- Latest `thought_record` for the project (gives current state, blockers, what was just done)
- `task_next_actions` tool output (returns unblocked `todo` tasks in dependency order)
- Task-prompts file for the chosen task (e.g., `docs/guides/implementation/task-prompts/p0.1-infrastructure.md`)
- Extraction file referenced in the task-prompts (e.g., `docs/reference/extractions/task-pipeline-algorithm.md` if needed)
- CLAUDE.md: re-read every session (worktree rules, execution constraints, writeback contract)
- AGENTS.md: agent roles and role-specific entry points
- docs/guides/agent-bootstrap.md: master bootstrap prompt (for cold-start agents)
Writeback Enforcement
Writeback is not optional. It is a hard runtime block.
Rule: Any tool that moves a task from in_progress or todo to done must enforce writeback before returning success.
Enforcement mechanism: After the agent calls task_update with status="done":
- The AMS database checks: does a `thought_record` exist for this task_id with a recent timestamp?
- If YES: permit the status change. Mark the task as `done` in the database.
- If NO: reject the status change. The task stays `in_progress`. The agent receives an error: `"Writeback required: thought_record must be recorded before task completion."` No exceptions.
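The enforcement check can be sketched with an in-memory list standing in for the AMS database. Everything here except the status values and the error text is hypothetical, and the 24-hour recency window is an assumption, since this document does not define "recent".

```typescript
type Status = "todo" | "in_progress" | "done";

// Minimal stand-in for a stored thought_record row.
interface StoredThought {
  task_id: string;
  recordedAtMs: number;
}

type UpdateResult =
  | { ok: true; status: Status }
  | { ok: false; status: Status; error: string };

// Assumed recency window; the real threshold is not specified in this document.
const RECENT_WINDOW_MS = 24 * 60 * 60 * 1000;

function applyTaskUpdate(
  current: Status,
  requested: Status,
  taskId: string,
  thoughts: StoredThought[],
  nowMs: number
): UpdateResult {
  // Only the transition to "done" is gated on writeback.
  if (requested !== "done") return { ok: true, status: requested };
  const hasRecent = thoughts.some(
    (t) => t.task_id === taskId && nowMs - t.recordedAtMs <= RECENT_WINDOW_MS
  );
  if (hasRecent) return { ok: true, status: "done" };
  return {
    ok: false,
    status: current, // the task keeps its previous status
    error: "Writeback required: thought_record must be recorded before task completion.",
  };
}
```

Passing `nowMs` in explicitly keeps the check deterministic and easy to test.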
What this prevents:
- Agents finishing work and disappearing without audit trail
- Orphaned tasks (code exists but no decision trail)
- Loss of accountability
What happens if writeback fails:
- Agent finishes coding and calls `thought_record` ahead of `task_update(status="done")`, but the DB write fails (connection error, etc.)
- Agent receives error: `"Writeback failed: unable to insert thought_record. Retry with fresh DB connection or escalate."`
- Agent must retry writeback (with fresh connection) OR escalate to PM.
- The task stays `in_progress` until writeback succeeds.
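The retry-or-escalate rule can be sketched as a bounded retry loop. The `writeBackWithRetry` name, the attempt limit, and the idea that each call to `insert` opens a fresh connection are all assumptions for illustration.

```typescript
type WritebackOutcome =
  | { kind: "recorded"; attempts: number }
  | { kind: "escalate"; attempts: number; lastError: string };

// `insert` is assumed to open a fresh DB connection on every call and to throw on failure.
function writeBackWithRetry(insert: () => void, maxAttempts: number): WritebackOutcome {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      insert();
      return { kind: "recorded", attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  // Out of attempts: hand the problem to the PM along with the last error seen.
  return { kind: "escalate", attempts: maxAttempts, lastError };
}
```

On an `escalate` outcome the agent would pass the PM the branch name, commit SHA, and the unsent `thought_record` JSON, as described under Failure Mode 1.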
Failure Modes
Failure Mode 1: Agent A finishes, writeback fails
Scenario: Agent A completes task P0.1.1, calls thought_record, but database write times out.
Behavior:
- Agent receives: `Error: Writeback failed: unable to insert thought_record`
- Task status remains `in_progress` in the database
- Code is already committed to feature branch
Resolution:
- Agent retries writeback with fresh DB connection
- OR escalates to PM with: branch name, commit SHA, and the thought_record JSON
- PM manually writes the record via the `thought_record` tool or direct database access
Failure Mode 2: Agent B picks a task A was working on
Scenario: Agent A is working on P0.1.1 (branch: feature/p0-1-1-package-setup). Agent B simultaneously picks P0.1.1.
Behavior:
- Agent B runs `git worktree add .worktrees/claude/p0-1-1-package-setup ...`
- Git detects that the branch already exists
- Error: `fatal: a branch named 'feature/p0-1-1-package-setup' already exists`
Resolution:
- Agent B checks `task_next_actions` again
- Agent B picks a different unblocked task
- Once Agent A completes and merges, the worktree is cleaned up and available again
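The resolution can also be applied before creating the worktree, by skipping any task whose branch is already taken. This is a hypothetical sketch: the `Candidate` shape is invented for illustration, and `existingBranches` is assumed to be gathered by the caller (for instance from `git branch --list`).

```typescript
// Hypothetical pairing of a task with the branch it would use.
interface Candidate {
  id: string;
  branch: string;
}

// Returns the first candidate whose branch does not exist yet, or undefined if all are taken.
function firstUnclaimed(
  candidates: Candidate[],
  existingBranches: string[]
): Candidate | undefined {
  const free = candidates.filter((c) => existingBranches.indexOf(c.branch) === -1);
  return free[0];
}
```

This does not replace the git error as the authoritative signal, since another agent could still create the branch between the check and `git worktree add`.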
Failure Mode 3: Agent B picks a task with a reverted PR
Scenario: Agent A completes P0.1.1, PR is merged, then a later PR reverts it (merge conflict resolution, regression found). Task status in DB still says done.
Behavior:
- Agent B reads the task status (`done`) and the latest thought_record (`files_changed: [...]`)
- Agent B checks the main branch: the files don't exist
- Agent B runs `git log --oneline -n 20` to see recent reverts
Resolution:
- Agent B or PM calls `task_update` with `status: todo, notes: "Previous PR reverted, task reinstated"`
- Task is moved back to `todo` in the database
- Agent B (or next available agent) picks it up and re-executes
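The revert check can be sketched as a comparison between the recorded `files_changed` and the files present on main. This is an illustration under assumptions: the helper names are hypothetical, the file list for main is supplied by the caller (for example from `git ls-tree -r --name-only main`), the status change is assumed to be carried by a `task_update`-shaped payload, and resetting `progress` to 0 is a guess.

```typescript
// Mirrors the task_update fields listed in the Handoff Packet.
interface TaskUpdate {
  task_id: string;
  status: "todo" | "in_progress" | "done";
  progress: number;
  notes: string;
}

// Files the thought_record claims were written but that main no longer has.
function missingOnMain(filesChanged: string[], filesOnMain: string[]): string[] {
  return filesChanged.filter((f) => filesOnMain.indexOf(f) === -1);
}

// Builds the reinstatement payload, or null when nothing appears to be reverted.
function reinstatement(
  taskId: string,
  filesChanged: string[],
  filesOnMain: string[]
): TaskUpdate | null {
  if (missingOnMain(filesChanged, filesOnMain).length === 0) return null;
  return {
    task_id: taskId,
    status: "todo",
    progress: 0, // assumption: a reinstated task restarts from zero
    notes: "Previous PR reverted, task reinstated",
  };
}
```

Missing files are only a hint; confirming the revert in `git log` before reinstating the task avoids false positives from legitimate later deletions.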
Failure Mode 4: Two agents finalize Merkle simultaneously
Scenario: Agent A and Agent B both call merkle_finalize at the exact same time after P0.2.2 completes.
Behavior:
- Merkle finalization is single-writer (SQLite lock)
- First writer wins, commits the proof tree
- Second writer receives: `Error: Merkle finalization in progress. Retry in 5s.`
Resolution:
- Agent B retries `merkle_finalize` after 5 seconds
- Merkle tree is already finalized; the call becomes a no-op (idempotent)
- Both agents can safely continue
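The single-writer, idempotent behavior can be modeled with a small in-memory class. This is a sketch only: `MerkleFinalizer` and its methods are hypothetical, the lock is a boolean rather than a SQLite lock, and the root is an opaque placeholder string instead of a computed hash.

```typescript
type BeginResult = "acquired" | "busy" | "already_finalized";

class MerkleFinalizer {
  private locked = false;
  private root: string | null = null;

  // Attempts to become the single writer.
  tryBegin(): BeginResult {
    if (this.root !== null) return "already_finalized"; // idempotent no-op
    if (this.locked) return "busy"; // another writer holds the lock; retry later
    this.locked = true;
    return "acquired";
  }

  // Stores the root and releases the lock; only valid for the lock holder.
  commit(root: string): void {
    if (!this.locked) throw new Error("commit called without holding the lock");
    this.root = root;
    this.locked = false;
  }

  getRoot(): string | null {
    return this.root;
  }
}
```

In this model Agent A gets `acquired`, Agent B gets `busy` while A is still writing, and B's retry after A commits returns `already_finalized`, so both agents end up reading the same root.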
Diagram: Typical Handoff Sequence
sequenceDiagram
participant AgentA as Agent A<br/>(P0.1.1)
participant DB as AMS Database
participant AgentB as Agent B<br/>(P0.1.2)
AgentA->>AgentA: Code: P0.1.1
AgentA->>AgentA: npm test ✓
AgentA->>AgentA: git commit + push
AgentA->>DB: thought_record(id=P0.1.1, branch=..., commit_sha=..., summary=...)
AgentA->>DB: task_update(id=P0.1.1, status=done, progress=100)
DB-->>DB: Verify thought_record exists
DB-->>AgentA: ✓ Status updated to "done"
DB->>DB: Writeback check: thought_record ✓ exists
DB-->>AgentA: Merge permitted
AgentA->>AgentA: PR reviewed, merged to main
AgentB->>DB: task_next_actions()
DB-->>AgentB: [P0.1.2, P0.1.4, ...]
AgentB->>AgentB: Pick P0.1.2 (unblocked)
AgentB->>AgentB: Read docs/guides/implementation/task-prompts/p0.1-infrastructure.md
AgentB->>AgentB: git worktree add .worktrees/claude/p0-1-2-linter
AgentB->>AgentB: Code: P0.1.2
AgentB->>AgentB: npm test ✓
AgentB->>AgentB: git commit + push
AgentB->>DB: thought_record(id=P0.1.2, ...)
AgentB->>DB: task_update(id=P0.1.2, status=done)
DB-->>AgentB: ✓ Done
Why This Protocol Exists
The 28 Phase 0 tasks would take 6 weeks for a single human developer to ship. With proper handoff, a swarm of 5–8 agents can execute them in parallel once the critical path is laid down.
Key benefits:
- Auditability: Every task has a thought_record. No orphaned code.
- Parallelism: The next agent can start as soon as the critical path is unblocked. With asynchronous review, Stage 4 can begin before Stage 3 completes.
- Accountability: Merkle proofs anchor every decision. Tamper detection is cryptographic.
- Recovery: Failed tasks are explicitly tracked. Rollback scenarios have decision trails.
- Memory: Each agent inherits the full context from the previous agent’s thought_record. No context loss.
Without this protocol, agent A finishes work and agent B has no way to know what was done, what was tested, or what failures occurred. With it, B can cold-start on any task and inherit all context instantly.
See Also
- CLAUDE.md — Execution rules and worktree pattern
- AGENTS.md — Agent roles and dispatch
- agent-bootstrap.md — Master bootstrap prompt
- docs/guides/implementation/task-prompts/index.md — Per-task agent prompts
- docs/guides/implementation/first-7-prs.md — Critical-path PR sequence