Agent Handoff Protocol

When an agent completes a task, the next agent must be able to pick up cleanly. This document defines what the completing agent leaves behind and what the next agent reads first.

The 4-Stage Handoff

Stage 1 — Completing Agent Finishes Work

The executing agent:

Completes the task implementation (writes code, creates files, makes changes)
Runs the full test suite and linter: npm test && npm run lint
Verifies that all acceptance criteria are met (against the task-prompts file)
Creates a commit with a clear message: task: P0.X.Y — [title]
Pushes to the feature branch (e.g., feature/p0-x-y-slug)

Success signal: All tests pass, no lint errors, acceptance criteria verified.
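The naming conventions in this stage can be sketched as a few small helpers. This is an illustration only, not the project's tooling: the function names and the slug rule (lowercase, runs of non-alphanumerics collapsed to hyphens) are assumptions; only the commit-subject and branch-name shapes come from the steps above.

```typescript
// Hypothetical helpers; names and the slug rule are assumptions for illustration.

/** Lowercase a title and collapse runs of non-alphanumerics into single hyphens. */
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

/** Builds a branch name in the feature/p0-x-y-slug shape. */
function branchName(taskId: string, title: string): string {
  const idPart = taskId.toLowerCase().replace(/\./g, "-");
  return `feature/${idPart}-${slugify(title)}`;
}

/** Builds a commit subject in the "task: <id> <em dash> <title>" shape. */
function commitSubject(taskId: string, title: string): string {
  // \u2014 is the em dash used in the Stage 1 commit format.
  return `task: ${taskId} \u2014 ${title}`;
}
```

For example, branchName("P0.1.1", "Package setup") yields feature/p0-1-1-package-setup, matching the branch used in the examples later in this document.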

Stage 2 — Completing Agent Writes Back

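The completing agent uses the MCP writeback tools. Write the thought_record first: a task_update to "done" is rejected when no thought_record exists for the task (see Writeback Enforcement).

Call thought_record with:
- task_id: e.g., P0.1.1
- branch: e.g., feature/p0-1-1-package-setup
- commit_sha: Full commit SHA
- tests_run: Array of test names (e.g., ["smoke.test.ts", "eslint", "tsc --noEmit"])
- summary: What was implemented (1 paragraph, 3-5 sentences)
- blockers: Empty array if clean; structured objects if blockers exist
- files_changed: Paths created or modified
- related_thought_records: IDs of earlier records in the dependency chain
Call task_update with:
- task_id: e.g., P0.1.1
- status: "done"
- progress: 100
- notes: Brief summary (1-2 sentences)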
For proof-grade work, also run the full chain:
- audit_session_start (marks start of auditable session)
- Execute work (already done in Stage 1)
- audit_verify_chain (verify hash chain integrity)
- thought_record (as above)
- merkle_finalize (seal proof into Merkle tree)
- merkle_root (retrieve final root hash)

Critical: Do not finalize the Merkle tree before writing the final thought record.
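The ordering constraint on the proof-grade chain can be checked mechanically. The sketch below is hypothetical: the tool names come from the list above, but the checker, its name, and the idea of validating a call log after the fact are assumptions made for illustration.

```typescript
// Tool names are taken from the proof-grade chain above; the checker is hypothetical.
const PROOF_CHAIN = [
  "audit_session_start",
  "audit_verify_chain",
  "thought_record",
  "merkle_finalize",
  "merkle_root",
] as const;

type ProofStep = (typeof PROOF_CHAIN)[number];

/**
 * Returns the first ordering problem found in a log of tool calls,
 * or null when the calls follow the proof-grade chain order.
 */
function findChainViolation(calls: ProofStep[]): string | null {
  let lastIndex = -1;
  for (const call of calls) {
    const index = PROOF_CHAIN.indexOf(call);
    if (index < lastIndex) {
      return `${call} was called after ${PROOF_CHAIN[lastIndex]}`;
    }
    lastIndex = index;
  }
  // Every step of the chain must appear at least once.
  for (const step of PROOF_CHAIN) {
    if (calls.indexOf(step) === -1) return `${step} was never called`;
  }
  return null;
}
```

A log that finalizes the Merkle tree before the final thought_record is reported as a violation, which is the case the Critical note warns about.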

Stage 3 — PR Review & Merge

A reviewer agent (or human PM):

Fetches the branch and reviews code
Runs npm test && npm run lint locally to verify
Works through the verification checklist in the task-prompts file and verifies every item
Confirms that a thought_record was written (by checking the AMS database or the task log)
Approves and merges the PR to main

Hard rule: If writeback is missing, the merge is rejected. The code may be perfect, but without writeback the task stays in_progress and cannot be promoted to done.

Stage 4 — Next Agent Starts

The next agent (or continuation of the same agent on a new task):

Reads CLAUDE.md — worktree rules and execution constraints
Reads AGENTS.md — agent roles and role-specific entry points
Runs task_next_actions — retrieves unblocked todo tasks
Picks an unblocked task — preference for critical-path tasks first
Creates a new worktree — follows the git worktree pattern from CLAUDE.md
Reads the task-prompts file — e.g., docs/guides/implementation/task-prompts/p0.1-infrastructure.md
Copies the ready-to-paste prompt — cold-starts the new task with full context
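The task-selection step (unblocked tasks, critical path first) might look like the sketch below. The task shape is an assumption: this document does not specify what task_next_actions returns, so the criticalPath flag and the function name are hypothetical.

```typescript
// Hypothetical task shape; the real task_next_actions output is not specified here.
interface CandidateTask {
  id: string;            // e.g. "P0.1.2"
  criticalPath: boolean; // assumed flag marking critical-path tasks
}

/**
 * Picks the first critical-path task if there is one, otherwise the first
 * task in the list. The input order (dependency order) is preserved.
 */
function pickTask(unblocked: CandidateTask[]): CandidateTask | undefined {
  const critical = unblocked.filter((task) => task.criticalPath);
  return critical.length > 0 ? critical[0] : unblocked[0];
}
```

Returning undefined for an empty list leaves the caller to decide whether to wait or escalate when nothing is unblocked.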

The Handoff Packet

The handoff packet is everything the completing agent leaves behind. Its fields are specified below:

task_update fields

Field	Type	Purpose	Example
`task_id`	string	Task identifier	`P0.1.1`
`status`	enum	Current state	`done`
`progress`	number	0–100	`100`
`notes`	string	Brief human summary	`Package.json configured, ESLint passes, zero errors.`

thought_record fields

Field	Type	Purpose	Example
`task_id`	string	Task identifier	`P0.1.1`
`branch`	string	Feature branch name	`feature/p0-1-1-package-setup`
`commit_sha`	string	Full 40-char commit SHA	`abc123def456...`
`tests_run`	string[]	Test files / suites executed	`["smoke.test.ts", "eslint", "tsc --noEmit"]`
`summary`	string	1-paragraph narrative (3–5 sentences)	`Initialized package.json with ESM-first setup (type: module), added TypeScript 5.3+ with strict mode, configured ESLint + Prettier, created .env.example with 85 documented vars. All dev tools included: tsx for development, tsc for build. npm install succeeds with zero warnings.`
`blockers`	object[]	Empty if clean; else structured blockers	`[]` or `[{type: "peer-dep", package: "zod", issue: "...", resolution: "..."}]`
`files_changed`	string[]	Paths created or modified	`["package.json", "tsconfig.json", ".eslintrc.json", ".prettierrc", ".env.example", ".gitignore", "src/__tests__/smoke.test.ts"]`
`related_thought_records`	string[]	IDs of earlier records in dependency chain	`[]` (for P0.1.1, first in critical path)

Complete example

{
  "task_id": "P0.1.1",
  "branch": "feature/p0-1-1-package-setup",
  "commit_sha": "abc123def456789abcdef456789abcdef456789a",
  "tests_run": ["smoke.test.ts", "eslint", "tsc --noEmit"],
  "summary": "Initialized package.json with ESM-first setup (type: module), added TypeScript 5.3+ with strict mode, configured ESLint + Prettier, created .env.example with 85 documented vars. All dev tools included: tsx for development, tsc for build. npm install succeeds with zero warnings.",
  "blockers": [],
  "files_changed": [
    "package.json",
    "tsconfig.json",
    ".eslintrc.json",
    ".prettierrc",
    ".env.example",
    ".gitignore",
    "src/__tests__/smoke.test.ts"
  ],
  "related_thought_records": []
}
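The field tables above map naturally onto a typed record with a light validation pass. The following is a sketch under assumptions: the interface and function names are hypothetical, the optionality of the blocker fields is guessed from the single example in the table, and the checks cover only what the tables state (a full 40-character SHA, non-empty tests and summary).

```typescript
// Hypothetical types mirroring the thought_record field table.
interface Blocker {
  type: string;
  package?: string;    // optionality is an assumption
  issue: string;
  resolution?: string; // optionality is an assumption
}

interface ThoughtRecord {
  task_id: string;
  branch: string;
  commit_sha: string;
  tests_run: string[];
  summary: string;
  blockers: Blocker[];
  files_changed: string[];
  related_thought_records: string[];
}

/** Returns a list of problems; an empty list means the record looks well-formed. */
function validateThoughtRecord(record: ThoughtRecord): string[] {
  const problems: string[] = [];
  if (!/^[0-9a-f]{40}$/i.test(record.commit_sha)) {
    problems.push("commit_sha is not a full 40-character hex SHA");
  }
  if (record.tests_run.length === 0) {
    problems.push("tests_run is empty");
  }
  if (record.summary.trim() === "") {
    problems.push("summary is empty");
  }
  return problems;
}
```

Running the validator before calling thought_record catches an abbreviated SHA early, before it reaches the database.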

What the Next Agent Reads First

Ordered list of reading assignments for the next agent:

Latest thought_record for the project (gives current state, blockers, what was just done)
task_next_actions tool output (returns unblocked todo tasks in dependency order)
Task-prompts file for the chosen task (e.g., docs/guides/implementation/task-prompts/p0.1-infrastructure.md)
Extraction file referenced in the task-prompts (e.g., docs/reference/extractions/task-pipeline-algorithm.md if needed)
CLAUDE.md — re-read every session (worktree rules, execution constraints, writeback contract)
AGENTS.md — agent roles and role-specific entry points
docs/guides/agent-bootstrap.md — master bootstrap prompt (for cold-start agents)

Writeback Enforcement

Writeback is not optional. It is a hard runtime block.

Rule: Any tool that moves a task from in_progress or todo to done must enforce writeback before returning success.

Enforcement mechanism: After the agent calls task_update with status="done":

The AMS database checks: does a thought_record exist for this task_id with a recent timestamp?
If YES: permit the status change. Mark task as done in the database.
If NO: reject the status change. The task stays in_progress and the agent receives an error: "Writeback required: thought_record must be recorded before task completion." No exceptions.
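The enforcement check can be modeled in a few lines. This is an in-memory sketch, not the AMS implementation: the class and method names are hypothetical, and maxAgeMs is an assumed definition of "recent", since this document does not give a time window.

```typescript
// In-memory stand-in for the AMS database; all names here are hypothetical.
type TaskStatus = "todo" | "in_progress" | "done";

interface StoredThought {
  task_id: string;
  recordedAt: number; // epoch milliseconds
}

class WritebackGate {
  private thoughts: StoredThought[] = [];
  private statuses = new Map<string, TaskStatus>();
  private maxAgeMs: number;

  // maxAgeMs is an assumed definition of "recent timestamp".
  constructor(maxAgeMs: number) {
    this.maxAgeMs = maxAgeMs;
  }

  recordThought(taskId: string, now: number): void {
    this.thoughts.push({ task_id: taskId, recordedAt: now });
  }

  /** Applies the status change, or throws when a "done" transition lacks writeback. */
  updateStatus(taskId: string, status: TaskStatus, now: number): void {
    if (status === "done") {
      const hasRecent = this.thoughts.some(
        (t) => t.task_id === taskId && now - t.recordedAt <= this.maxAgeMs
      );
      if (!hasRecent) {
        throw new Error(
          "Writeback required: thought_record must be recorded before task completion."
        );
      }
    }
    this.statuses.set(taskId, status);
  }

  statusOf(taskId: string): TaskStatus | undefined {
    return this.statuses.get(taskId);
  }
}
```

Because the check runs inside updateStatus, a rejected call leaves the stored status untouched, which is how the task stays in_progress.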

What this prevents:

Agents finishing work and disappearing without audit trail
Orphaned tasks (code exists but no decision trail)
Loss of accountability

What happens if writeback fails:

Agent finishes coding and calls thought_record, but the DB write fails (connection error, etc.)
Agent receives error: "Writeback failed: unable to insert thought_record. Retry with fresh DB connection or escalate."
Agent must retry writeback (with fresh connection) OR escalate to PM.
The task stays in_progress until writeback succeeds.

Failure Modes

Failure Mode 1: Agent A finishes, writeback fails

Scenario: Agent A completes task P0.1.1, calls thought_record, but database write times out.

Behavior:

Agent receives: Error: Writeback failed: unable to insert thought_record
Task status remains in_progress in the database
Code is already committed to feature branch

Resolution:

Agent retries writeback with fresh DB connection
OR escalates to PM with: branch name, commit SHA, and the thought_record JSON
PM manually writes the record via thought_record tool or database direct access
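The retry-or-escalate resolution can be sketched as follows. Everything named here is hypothetical: the write callback stands in for the thought_record tool call and is synchronous only to keep the sketch short, and maxAttempts is an assumed retry budget that this document does not specify.

```typescript
// Hypothetical sketch: the write callback stands in for the thought_record tool call.
interface EscalationPacket {
  branch: string;
  commit_sha: string;
  thought_record_json: string;
}

type WritebackResult =
  | { kind: "written"; attempts: number }
  | { kind: "escalate"; attempts: number; packet: EscalationPacket };

function writeBackOrEscalate(
  record: { branch: string; commit_sha: string },
  write: () => void,   // throws on failure; a real tool call would likely be async
  maxAttempts: number  // assumed retry budget
): WritebackResult {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      write();
      return { kind: "written", attempts: attempt };
    } catch (err) {
      // Retry; a real client would open a fresh DB connection before the next attempt.
    }
  }
  // All attempts failed: hand the PM what they need to write the record manually.
  return {
    kind: "escalate",
    attempts: maxAttempts,
    packet: {
      branch: record.branch,
      commit_sha: record.commit_sha,
      thought_record_json: JSON.stringify(record),
    },
  };
}
```

The escalation packet carries the same three items listed in the resolution: branch name, commit SHA, and the thought_record JSON.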

Failure Mode 2: Agent B picks a task A was working on

Scenario: Agent A is working on P0.1.1 (branch: feature/p0-1-1-package-setup). Agent B simultaneously picks P0.1.1.

Behavior:

Agent B runs git worktree add .worktrees/claude/p0-1-1-package-setup ...
Git detects that the branch already exists
Error: fatal: a branch named 'feature/p0-1-1-package-setup' already exists

Resolution:

Agent B checks task_next_actions again
Agent B picks a different unblocked task
Once Agent A completes and merges, its worktree is removed and the branch name is free again

Failure Mode 3: Agent B picks a task with a reverted PR

Scenario: Agent A completes P0.1.1 and the PR is merged, then a later PR reverts it (for example, during merge-conflict resolution or after a regression is found). Task status in the DB still says done.

Behavior:

Agent B sees task status done and reads the latest thought_record: files_changed: [...]
Agent B checks main branch: files don’t exist
Agent B runs: git log --oneline -n 20 to see recent reverts

Resolution:

Agent B or PM calls task_update with: status: todo, notes: "Previous PR reverted, task reinstated"
Task is moved back to todo in the database
Agent B (or next available agent) picks it up and re-executes
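The "files don't exist on main" check in this scenario can be expressed as a small heuristic. The sketch is hypothetical: existsOnMain is an assumed lookup (for example, backed by a file listing of main), and the heuristic only catches reverts of newly created files, since a file that was merely modified still exists after a revert.

```typescript
/**
 * Returns the paths from files_changed that are absent on main.
 * existsOnMain is a hypothetical lookup supplied by the caller.
 */
function missingOnMain(
  filesChanged: string[],
  existsOnMain: (path: string) => boolean
): string[] {
  return filesChanged.filter((path) => !existsOnMain(path));
}

/** A done task whose recorded files are all gone from main is a likely revert. */
function looksReverted(
  filesChanged: string[],
  existsOnMain: (path: string) => boolean
): boolean {
  return (
    filesChanged.length > 0 &&
    missingOnMain(filesChanged, existsOnMain).length === filesChanged.length
  );
}
```

A positive result is a prompt to inspect git log for the revert commit, not proof on its own.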

Failure Mode 4: Two agents finalize Merkle simultaneously

Scenario: Agent A and Agent B both call merkle_finalize at the exact same time after P0.2.2 completes.

Behavior:

Merkle finalization is single-writer (SQLite lock)
First writer wins, commits the proof tree
Second writer receives: Error: Merkle finalization in progress. Retry in 5s.

Resolution:

Agent B retries merkle_finalize after 5 seconds
Merkle tree is already finalized; the call becomes a no-op (idempotent)
Both agents can safely continue
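Single-writer, idempotent finalization can be modeled as a small state machine. This is an in-memory illustration only: the class, its begin/complete split, and the computeRoot callback are hypothetical, and the real mechanism described above relies on a SQLite lock rather than an in-process flag.

```typescript
// Hypothetical in-memory model of single-writer, idempotent finalization.
class MerkleFinalizer {
  private state: "open" | "finalizing" | "finalized" = "open";
  private root: string | null = null;

  /** Marks finalization as started; a concurrent caller gets the retry error. */
  begin(): void {
    if (this.state === "finalizing") {
      throw new Error("Merkle finalization in progress. Retry in 5s.");
    }
    if (this.state === "open") {
      this.state = "finalizing";
    }
    // Already finalized: begin() is a no-op.
  }

  /** Seals the tree on first completion; later calls return the stored root. */
  complete(computeRoot: () => string): string {
    if (this.root !== null) {
      return this.root; // already sealed: no-op
    }
    const root = computeRoot();
    this.root = root;
    this.state = "finalized";
    return root;
  }
}
```

In this model the second agent's retry returns the root the first agent sealed, so both agents end up holding the same value.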

Diagram: Typical Handoff Sequence

sequenceDiagram
    participant AgentA as Agent A<br/>(P0.1.1)
    participant DB as AMS Database
    participant AgentB as Agent B<br/>(P0.1.2)

    AgentA->>AgentA: Code: P0.1.1
    AgentA->>AgentA: npm test ✓
    AgentA->>AgentA: git commit + push

    AgentA->>DB: thought_record(task_id=P0.1.1, branch=..., commit_sha=..., summary=...)
    AgentA->>DB: task_update(task_id=P0.1.1, status=done, progress=100)
    DB-->>DB: Verify thought_record exists
    DB-->>AgentA: ✓ Status updated to "done"

    DB->>DB: Writeback check: thought_record ✓ exists
    DB-->>AgentA: Merge permitted

    AgentA->>AgentA: PR reviewed, merged to main

    AgentB->>DB: task_next_actions()
    DB-->>AgentB: [P0.1.2, P0.1.4, ...]

    AgentB->>AgentB: Pick P0.1.2 (unblocked)
    AgentB->>AgentB: Read docs/guides/implementation/task-prompts/p0.1-infrastructure.md
    AgentB->>AgentB: git worktree add .worktrees/claude/p0-1-2-linter

    AgentB->>AgentB: Code: P0.1.2
    AgentB->>AgentB: npm test ✓
    AgentB->>AgentB: git commit + push

    AgentB->>DB: thought_record(task_id=P0.1.2, ...)
    AgentB->>DB: task_update(task_id=P0.1.2, status=done)
    DB-->>AgentB: ✓ Done

Why This Protocol Exists

The 28 Phase 0 tasks would take 6 weeks for a single human developer to ship. With proper handoff, a swarm of 5–8 agents can execute them in parallel once the critical path is laid down.

Key benefits:

Auditability: Every task has a thought_record. No orphaned code.
Parallelism: The next agent can start as soon as the critical path unblocks (Stage 4 can begin before Stage 3 completes when review is asynchronous).
Accountability: Merkle proofs anchor every decision. Tamper detection is cryptographic.
Recovery: Failed tasks are explicitly tracked. Rollback scenarios have decision trails.
Memory: Each agent inherits the recorded context from the previous agent's thought_record: branch, commit, tests run, summary, and blockers.

Without this protocol, Agent A finishes work and Agent B has no way to know what was done, what was tested, or what failures occurred. With it, B can cold-start on any task from the recorded context.
