Agent Handoff Protocol

When an agent completes a task, the next agent must be able to pick up cleanly. This document defines what the completing agent leaves behind and what the next agent reads first.


The 4-Stage Handoff

Stage 1 — Completing Agent Finishes Work

The executing agent:

  1. Completes the task implementation (writes code, creates files, makes changes)
  2. Runs full test suite: npm test && npm run lint
  3. Verifies all acceptance criteria are met (against task-prompts file)
  4. Creates a commit with clear message: task: P0.X.Y — [title]
  5. Pushes to the feature branch (e.g., feature/p0-x-y-slug)

Success signal: All tests pass, no lint errors, acceptance criteria verified.
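
The naming conventions in steps 4 and 5 can be derived mechanically from the task ID and title. The following is a minimal sketch, not project code: the helper names (slugify, branchName, commitMessage) and the slug rules are assumptions made for illustration.

```typescript
// Hypothetical helpers: names and slug rules are assumptions, not part of the protocol.

/** Lowercases a title and joins its alphanumeric runs with hyphens. */
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

/** Builds the feature branch name from a task ID such as "P0.1.1". */
function branchName(taskId: string, title: string): string {
  const idPart = taskId.toLowerCase().replace(/\./g, "-");
  return `feature/${idPart}-${slugify(title)}`;
}

/** Builds the commit message; "\u2014" is the dash used in the documented format. */
function commitMessage(taskId: string, title: string): string {
  return `task: ${taskId} \u2014 ${title}`;
}

console.log(branchName("P0.1.1", "Package setup")); // feature/p0-1-1-package-setup
console.log(commitMessage("P0.1.1", "Package setup"));
```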


Stage 2 — Completing Agent Writes Back

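The completing agent uses the MCP writeback tools. Because task_update with status "done" is rejected unless a thought_record already exists for the task (see Writeback Enforcement), record the thought first:

  1. Call thought_record with:
    • task_id: e.g., P0.1.1
    • branch: e.g., feature/p0-1-1-package-setup
    • commit_sha: Full 40-character commit SHA
    • tests_run: Array of test names (e.g., ["smoke.test.ts", "eslint", "tsc --noEmit"])
    • summary: What was implemented (1 paragraph, 3-5 sentences)
    • blockers: Empty array if clean; structured objects if blockers exist
    • files_changed: Paths created or modified
    • related_thought_records: IDs of earlier records in the dependency chain
  2. Call task_update with:
    • task_id: e.g., P0.1.1
    • status: "done"
    • progress: 100
    • notes: Brief summary (1-2 sentences)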
  3. For proof-grade work, also run the full chain:
    • audit_session_start (marks start of auditable session)
    • Execute work (already done in Stage 1)
    • audit_verify_chain (verify hash chain integrity)
    • thought_record (as above)
    • merkle_finalize (seal proof into Merkle tree)
    • merkle_root (retrieve final root hash)

Critical: Do not finalize the Merkle tree before writing the final thought record.
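
The ordering constraint in the proof-grade chain can be sketched as a single function that issues the tool calls in sequence. This is an illustration under stated assumptions: ToolCaller is a hypothetical stand-in for whatever MCP client the agent uses, and the tools are called without arguments here because their real parameters are not specified in this document. audit_session_start is assumed to have been called before the Stage 1 work began.

```typescript
// Sketch only: ToolCaller is a hypothetical abstraction over the agent's MCP client.
type ToolCaller = (tool: string, args?: Record<string, unknown>) => Promise<unknown>;

/** Runs the closing half of the proof-grade chain in the documented order. */
async function proofGradeWriteback(
  call: ToolCaller,
  record: Record<string, unknown>,
): Promise<unknown> {
  await call("audit_verify_chain");     // verify hash chain integrity
  await call("thought_record", record); // write the final record before sealing
  await call("merkle_finalize");        // seal only after the record exists
  return call("merkle_root");           // retrieve the final root hash
}

// Usage with a recording stub in place of a real client.
const calls: string[] = [];
const recorder: ToolCaller = async (tool) => {
  calls.push(tool);
  return null;
};
proofGradeWriteback(recorder, { task_id: "P0.1.1" }).then(() => {
  console.log(calls.join(" -> "));
});
```

Keeping the sequence in one function makes it harder to call merkle_finalize before the final thought_record by accident.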


Stage 3 — PR Review & Merge

A reviewer agent (or human PM):

  1. Fetches the branch and reviews code
  2. Runs npm test && npm run lint locally to verify
  3. Checks that the task-prompts file has a verification checklist — verifies all items
  4. Confirms that a thought_record was written (by checking the AMS database or task log)
  5. Approves and merges the PR to main

Hard rule: If writeback is missing, the merge is rejected. The code may be perfect, but without writeback the task remains in_progress and cannot be promoted to done.


Stage 4 — Next Agent Starts

The next agent (or continuation of the same agent on a new task):

  1. Reads CLAUDE.md — worktree rules and execution constraints
  2. Reads AGENTS.md — agent roles and role-specific entry points
  3. Runs task_next_actions — retrieves unblocked todo tasks
  4. Picks an unblocked task — preference for critical-path tasks first
  5. Creates a new worktree — follows the git worktree pattern from CLAUDE.md
  6. Reads the task-prompts file — e.g., docs/guides/implementation/task-prompts/p0.1-infrastructure.md
  7. Copies the ready-to-paste prompt — cold-starts the new task with full context
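
Step 4 (prefer critical-path tasks among the unblocked ones) might look like the sketch below. The NextAction shape is an assumption: the real task_next_actions output is not specified in this document and may use different field names.

```typescript
// Hypothetical shape: the real task_next_actions output may differ.
interface NextAction {
  task_id: string;
  blocked: boolean;
  critical_path: boolean;
}

/** Returns the first unblocked critical-path task, else the first unblocked task. */
function pickTask(actions: NextAction[]): NextAction | undefined {
  const unblocked = actions.filter((a) => !a.blocked);
  return unblocked.find((a) => a.critical_path) ?? unblocked[0];
}

const choice = pickTask([
  { task_id: "P0.1.4", blocked: false, critical_path: false },
  { task_id: "P0.1.2", blocked: false, critical_path: true },
]);
console.log(choice?.task_id); // P0.1.2
```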

The Handoff Packet

The handoff packet is everything the completing agent leaves behind. Its fields are specified below.

task_update fields

| Field | Type | Purpose | Example |
|---|---|---|---|
| task_id | string | Task identifier | P0.1.1 |
| status | enum | Current state | done |
| progress | number | 0–100 | 100 |
| notes | string | Brief human summary | Package.json configured, ESLint passes, zero errors. |

thought_record fields

| Field | Type | Purpose | Example |
|---|---|---|---|
| task_id | string | Task identifier | P0.1.1 |
| branch | string | Feature branch name | feature/p0-1-1-package-setup |
| commit_sha | string | Full 40-char commit SHA | abc123def456... |
| tests_run | string[] | Test files / suites executed | ["smoke.test.ts", "eslint", "tsc --noEmit"] |
| summary | string | 1-paragraph narrative (3–5 sentences) | Initialized package.json with ESM-first setup (type: module), added TypeScript 5.3+ with strict mode, configured ESLint + Prettier, created .env.example with 85 documented vars. All dev tools included: tsx for development, tsc for build. npm install succeeds with zero warnings. |
| blockers | object[] | Empty if clean; else structured blockers | [] or [{type: "peer-dep", package: "zod", issue: "...", resolution: "..."}] |
| files_changed | string[] | Paths created or modified | ["package.json", "tsconfig.json", ".eslintrc.json", ".prettierrc", ".env.example", ".gitignore", "src/__tests__/smoke.test.ts"] |
| related_thought_records | string[] | IDs of earlier records in dependency chain | [] (for P0.1.1, first in critical path) |

Complete example

{
  "task_id": "P0.1.1",
  "branch": "feature/p0-1-1-package-setup",
  "commit_sha": "abc123def456789abcdef456789abcdef456789a",
  "tests_run": ["smoke.test.ts", "eslint", "tsc --noEmit"],
  "summary": "Initialized package.json with ESM-first setup (type: module), added TypeScript 5.3+ with strict mode, configured ESLint + Prettier, created .env.example with 85 documented vars. All dev tools included: tsx for development, tsc for build. npm install succeeds with zero warnings.",
  "blockers": [],
  "files_changed": [
    "package.json",
    "tsconfig.json",
    ".eslintrc.json",
    ".prettierrc",
    ".env.example",
    ".gitignore",
    "src/__tests__/smoke.test.ts"
  ],
  "related_thought_records": []
}
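
For agents written in TypeScript, the thought_record fields above could be captured as a type with a small pre-flight check. This is a sketch, not the project's schema: the Blocker field optionality and the specific validation rules (other than the 40-character SHA, which the table states) are assumptions.

```typescript
// Assumed shapes based on the field tables above; not the project's actual schema.
interface Blocker {
  type: string;
  issue: string;
  package?: string;
  resolution?: string;
}

interface ThoughtRecord {
  task_id: string;
  branch: string;
  commit_sha: string;
  tests_run: string[];
  summary: string;
  blockers: Blocker[];
  files_changed: string[];
  related_thought_records: string[];
}

/** Returns a list of problems; an empty list means the record passed these checks. */
function validateThoughtRecord(r: ThoughtRecord): string[] {
  const problems: string[] = [];
  if (r.task_id.trim() === "") problems.push("task_id is empty");
  if (r.branch.trim() === "") problems.push("branch is empty");
  if (!/^[0-9a-f]{40}$/.test(r.commit_sha)) problems.push("commit_sha is not a full 40-char hex SHA");
  if (r.summary.trim() === "") problems.push("summary is empty");
  return problems;
}
```

Running a check like this before calling thought_record catches an abbreviated SHA locally instead of at review time.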

What the Next Agent Reads First

The next agent reads the following sources, in this order:

  1. Latest thought_record for the project (gives current state, blockers, what was just done)
  2. task_next_actions tool output (returns unblocked todo tasks in dependency order)
  3. Task-prompts file for the chosen task (e.g., docs/guides/implementation/task-prompts/p0.1-infrastructure.md)
  4. Extraction file referenced in the task-prompts (e.g., docs/reference/extractions/task-pipeline-algorithm.md if needed)
  5. CLAUDE.md — re-read every session (worktree rules, execution constraints, writeback contract)
  6. AGENTS.md — agent roles and role-specific entry points
  7. docs/guides/agent-bootstrap.md — master bootstrap prompt (for cold-start agents)

Writeback Enforcement

Writeback is not optional. It is a hard runtime block.

Rule: Any tool that moves a task from in_progress or todo to done must enforce writeback before returning success.

Enforcement mechanism: After the agent calls task_update with status="done":

  1. The AMS database checks: does a thought_record exist for this task_id with a recent timestamp?
  2. If YES: permit the status change. Mark task as done in the database.
  3. If NO: reject the status change. The task stays in_progress and the agent receives an error: "Writeback required: thought_record must be recorded before task completion." No exceptions.
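
The check above can be illustrated with an in-memory model. This is a sketch only: TaskStore is a hypothetical class, not the AMS database, and it omits the "recent timestamp" condition because this document does not define what counts as recent.

```typescript
// In-memory illustration of the enforcement rule; not the real AMS implementation.
type Status = "todo" | "in_progress" | "done";

class TaskStore {
  private status = new Map<string, Status>();
  private recorded = new Set<string>();

  /** Stands in for a successful thought_record call. */
  recordThought(taskId: string): void {
    this.recorded.add(taskId);
  }

  /** Stands in for task_update; refuses "done" when no thought_record exists. */
  updateStatus(taskId: string, next: Status): void {
    if (next === "done" && !this.recorded.has(taskId)) {
      throw new Error("Writeback required: thought_record must be recorded before task completion.");
    }
    this.status.set(taskId, next);
  }

  statusOf(taskId: string): Status {
    return this.status.get(taskId) ?? "todo";
  }
}

const store = new TaskStore();
store.updateStatus("P0.1.1", "in_progress");
store.recordThought("P0.1.1");
store.updateStatus("P0.1.1", "done"); // permitted: a thought_record exists
console.log(store.statusOf("P0.1.1")); // done
```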

What this prevents:

  • Agents finishing work and disappearing without audit trail
  • Orphaned tasks (code exists but no decision trail)
  • Loss of accountability

What happens if writeback fails:

  1. Agent finishes coding and calls thought_record, but the DB write fails (connection error, etc.)
  2. Agent receives error: "Writeback failed: unable to insert thought_record. Retry with fresh DB connection or escalate."
  3. Agent must retry writeback (with fresh connection) OR escalate to PM.
  4. The task stays in_progress until writeback succeeds.
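
The retry-or-escalate rule could be wrapped in a small helper like the one below. It is a sketch under assumptions: writebackWithRetry, the default of three attempts, and the escalate callback are illustrative and not defined by the protocol. The attempt function is expected to open a fresh connection each time it runs.

```typescript
/**
 * Retries a writeback a bounded number of times, then escalates and rethrows.
 * Hypothetical helper: the attempt limit and callback shape are assumptions.
 */
async function writebackWithRetry<T>(
  attempt: () => Promise<T>,
  maxAttempts = 3,
  escalate: (err: unknown) => void = () => {},
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt(); // each attempt should use a fresh DB connection
    } catch (err) {
      lastError = err;
    }
  }
  escalate(lastError); // e.g. notify the PM with branch, commit SHA and record JSON
  throw lastError;
}
```

Rethrowing after escalation keeps the caller from treating the task as done while the record is still missing.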

Failure Modes

Failure Mode 1: Agent A finishes, writeback fails

Scenario: Agent A completes task P0.1.1, calls thought_record, but database write times out.

Behavior:

  • Agent receives: Error: Writeback failed: unable to insert thought_record
  • Task status remains in_progress in the database
  • Code is already committed to feature branch

Resolution:

  • Agent retries writeback with fresh DB connection
  • OR escalates to PM with: branch name, commit SHA, and the thought_record JSON
  • PM manually writes the record via thought_record tool or database direct access

Failure Mode 2: Agent B picks a task A was working on

Scenario: Agent A is working on P0.1.1 (branch: feature/p0-1-1-package-setup). Agent B simultaneously picks P0.1.1.

Behavior:

  • Agent B runs git worktree add .worktrees/claude/p0-1-1-package-setup ...
  • Git detects that the branch already exists and refuses to create the worktree
  • Error (when the branch is created with -b): fatal: a branch named 'feature/p0-1-1-package-setup' already exists

Resolution:

  • Agent B checks task_next_actions again
  • Agent B picks a different unblocked task
  • Once Agent A completes and merges, the worktree is cleaned up and available again

Failure Mode 3: Agent B picks a task with a reverted PR

Scenario: Agent A completes P0.1.1, PR is merged, then a later PR reverts it (merge conflict resolution, regression found). Task status in DB still says done.

Behavior:

  • Agent B reads the task status (done) and the latest thought_record (files_changed: [...])
  • Agent B checks main branch: files don’t exist
  • Agent B runs: git log --oneline -n 20 to see recent reverts

Resolution:

  • Agent B or PM calls task_update with status: todo and notes: "Previous PR reverted, task reinstated", and writes a new thought_record describing the revert
  • Task is moved back to todo in the database
  • Agent B (or next available agent) picks it up and re-executes
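
The "files don't exist" check can be automated by comparing files_changed from the thought_record against a checkout of main. A minimal sketch, assuming a hypothetical helper named missingFiles; the exists parameter is injectable only so the function can be exercised without touching the disk.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

/**
 * Returns the recorded paths that are absent under repoRoot.
 * Hypothetical helper; a non-empty result suggests the task's PR was reverted.
 */
function missingFiles(
  filesChanged: string[],
  repoRoot: string,
  exists: (path: string) => boolean = existsSync,
): string[] {
  return filesChanged.filter((file) => !exists(join(repoRoot, file)));
}

// Example: check the recorded files against the current working directory.
console.log(missingFiles(["package.json", "tsconfig.json"], "."));
```

A missing file is only a hint. The git log check above is still needed to confirm that a revert, rather than a later rename, explains the absence.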

Failure Mode 4: Two agents finalize Merkle simultaneously

Scenario: Agent A and Agent B both call merkle_finalize at the exact same time after P0.2.2 completes.

Behavior:

  • Merkle finalization is single-writer (SQLite lock)
  • First writer wins, commits the proof tree
  • Second writer receives: Error: Merkle finalization in progress. Retry in 5s.

Resolution:

  • Agent B retries merkle_finalize after 5 seconds
  • Merkle tree is already finalized; the call becomes a no-op (idempotent)
  • Both agents can safely continue
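
The idempotent behaviour described above (a second finalize becomes a no-op) can be sketched with an in-memory ledger. Everything here is an assumption made for illustration: ProofLedger is hypothetical, and the tree construction (SHA-256, pairwise hashing, an odd node paired with itself) is one common scheme, not necessarily the one this project uses.

```typescript
import { createHash } from "node:crypto";

const sha256 = (s: string): string => createHash("sha256").update(s).digest("hex");

/** Folds leaves pairwise into a single root; an odd node is paired with itself. */
function merkleRoot(leaves: string[]): string {
  if (leaves.length === 0) return sha256("");
  let level = leaves.map(sha256);
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(sha256(level[i] + (level[i + 1] ?? level[i])));
    }
    level = next;
  }
  return level[0];
}

class ProofLedger {
  private leaves: string[] = [];
  private sealedRoot: string | null = null;

  append(leaf: string): void {
    if (this.sealedRoot !== null) throw new Error("ledger already finalized");
    this.leaves.push(leaf);
  }

  /** The first call seals the tree; later calls return the same root unchanged. */
  finalize(): string {
    if (this.sealedRoot === null) this.sealedRoot = merkleRoot(this.leaves);
    return this.sealedRoot;
  }
}
```

With this shape, the retrying agent gets back the root sealed by the first writer instead of an error.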

Diagram: Typical Handoff Sequence

sequenceDiagram
    participant AgentA as Agent A<br/>(P0.1.1)
    participant DB as AMS Database
    participant AgentB as Agent B<br/>(P0.1.2)

    AgentA->>AgentA: Code: P0.1.1
    AgentA->>AgentA: npm test ✓
    AgentA->>AgentA: git commit + push

    AgentA->>DB: thought_record(id=P0.1.1, branch=..., commit_sha=..., summary=...)
    AgentA->>DB: task_update(id=P0.1.1, status=done, progress=100)
    DB-->>DB: Verify thought_record exists
    DB-->>AgentA: ✓ Status updated to "done"

    DB->>DB: Writeback check: thought_record ✓ exists
    DB-->>AgentA: Merge permitted

    AgentA->>AgentA: PR reviewed, merged to main

    AgentB->>DB: task_next_actions()
    DB-->>AgentB: [P0.1.2, P0.1.4, ...]

    AgentB->>AgentB: Pick P0.1.2 (unblocked)
    AgentB->>AgentB: Read docs/guides/implementation/task-prompts/p0.1-infrastructure.md
    AgentB->>AgentB: git worktree add .worktrees/claude/p0-1-2-linter

    AgentB->>AgentB: Code: P0.1.2
    AgentB->>AgentB: npm test ✓
    AgentB->>AgentB: git commit + push

    AgentB->>DB: thought_record(id=P0.1.2, ...)
    AgentB->>DB: task_update(id=P0.1.2, status=done)
    DB-->>AgentB: ✓ Done

Why This Protocol Exists

The 28 Phase 0 tasks would take 6 weeks for a single human developer to ship. With proper handoff, a swarm of 5–8 agents can execute them in parallel once the critical path is laid down.

Key benefits:

  1. Auditability: Every task has a thought_record. No orphaned code.
  2. Parallelism: The next agent can start as soon as the critical path unblocks its task. With asynchronous review, Stage 4 can begin before Stage 3 completes.
  3. Accountability: Merkle proofs anchor every decision. Tamper detection is cryptographic.
  4. Recovery: Failed tasks are explicitly tracked. Rollback scenarios have decision trails.
  5. Memory: Each agent inherits the full context from the previous agent’s thought_record. No context loss.

Without this protocol, agent A finishes work and agent B has no way to know what was done, what was tested, or what failures occurred. With it, B can cold-start on any task and inherit all context instantly.



Colibri — documentation-first MCP runtime. Apache 2.0 + Commons Clause.
