Company Agents used to call this “the heartbeat protocol” during early development, because agents sent a tick every few seconds to prove they were alive. That design was part of what made Paperclip painful, so it is gone. The current protocol is lease-based, not heartbeat-based, and the URL kept the old name out of habit. Here is what the protocol actually looks like.

The three moving parts

  1. The lease — the orchestrator’s statement that a specific agent owns a specific task for a specific time window.
  2. Progress reports — structured checkpoints the agent writes when it has something to say, not on a timer.
  3. Fencing tokens — monotonic counters that prevent a revived zombie from writing over a fresh owner.
The adapter handles most of the wire-level details. This guide covers the parts you need to reason about when writing the agent’s own logic.

Lease acquisition

When the orchestrator assigns a task, it hands the adapter a lease record:
{
  "taskId": "task_a1b2c3",
  "runId": "run_d4e5f6",
  "leaseExpiresAt": "2026-04-11T10:42:17Z",
  "fencingToken": 17,
  "budgetEnvelope": { "tokens": 250000, "usd": 1.5 },
  "workspacePath": "/Users/you/.company-agents/instances/default/runs/run_d4e5f6"
}
The adapter launches the agent process with the lease information available (usually as environment variables). The agent attaches the fencing token to every write it sends to the tool proxy. The proxy rejects any write whose token is older than the one on the task’s current lease.
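The fencing check can be sketched in a few lines. This is a minimal in-memory illustration, not the real tool proxy: the ToolProxy class, its method names, and the StaleTokenError exception are all hypothetical, and the sketch assumes the token is bumped by one each time a task is leased.

```python
from dataclasses import dataclass, field


class StaleTokenError(Exception):
    """Raised when a write carries a token older than the task's current one."""


@dataclass
class ToolProxy:
    # Hypothetical in-memory stand-in for the tool proxy.
    current_token: dict[str, int] = field(default_factory=dict)
    writes: list[tuple[str, int, str]] = field(default_factory=list)

    def grant_lease(self, task_id: str) -> int:
        """Issue a new lease for a task and return its strictly larger token."""
        token = self.current_token.get(task_id, 0) + 1
        self.current_token[task_id] = token
        return token

    def write(self, task_id: str, fencing_token: int, payload: str) -> None:
        """Accept the write only if its token is not older than the current one."""
        if fencing_token < self.current_token.get(task_id, 0):
            raise StaleTokenError(f"token {fencing_token} is stale for {task_id}")
        self.writes.append((task_id, fencing_token, payload))


proxy = ToolProxy()
old = proxy.grant_lease("task_a1b2c3")  # first owner
new = proxy.grant_lease("task_a1b2c3")  # task reassigned after the lease expired
proxy.write("task_a1b2c3", new, "fresh owner's write")
try:
    proxy.write("task_a1b2c3", old, "zombie's write")
except StaleTokenError as exc:
    print("rejected:", exc)
```

The point of the comparison is that the old owner does not need to know it has been replaced. Its writes fail at the proxy regardless of what it believes about its own lease.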

Lease renewal

The default lease window is 5 minutes. The adapter renews automatically at the 60% mark (3 minutes into a 5-minute window), not the 100% mark, to give itself room to retry if the renewal call fails. Renewal is asynchronous and cheap (a single HTTP call to the tool proxy). You do not call it from your agent logic. If you find yourself thinking about renewals, something is wrong.
If the adapter cannot renew (network dropped, orchestrator down, the agent’s process is hung), the lease expires and the orchestrator reassigns the task. Any writes the old agent attempts after that point are rejected.
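The renewal timing is simple arithmetic, worked through here for the expiry timestamp in the sample lease record. The function and constant names are hypothetical, and the sketch assumes the lease began exactly one default window before leaseExpiresAt, which the record itself does not state.

```python
from datetime import datetime, timedelta, timezone

LEASE_WINDOW = timedelta(minutes=5)  # default window stated in the text
RENEW_FRACTION = 0.6                 # renew at the 60% mark


def renewal_time(lease_start: datetime, window: timedelta = LEASE_WINDOW) -> datetime:
    """Return the moment the adapter would first attempt renewal."""
    return lease_start + window * RENEW_FRACTION


expires_at = datetime(2026, 4, 11, 10, 42, 17, tzinfo=timezone.utc)
lease_start = expires_at - LEASE_WINDOW  # assumption: a full default window
renew_at = renewal_time(lease_start)
retry_slack = expires_at - renew_at      # time left for retries if renewal fails
print(renew_at.isoformat(), retry_slack)
```

Under these assumptions the first renewal attempt lands at 10:40:17 UTC, which leaves two minutes of slack for retries before the lease expires.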

Progress reports (structured checkpoints)

The agent writes progress reports using the report_progress tool. A report looks like this:
{
  "summary": "Drafted the hero section and gathered brand assets",
  "beliefs": [
    "Client brand color is #3ecf8e",
    "Hero should show the org chart animation, not a product shot"
  ],
  "attempted": [
    { "action": "fetch brand assets", "outcome": "success" },
    { "action": "draft hero copy", "outcome": "pending review" }
  ],
  "nextStep": "Write three variants of the hero copy and pick one",
  "blockers": []
}
Reports are written when the agent has something new to say, not on a fixed clock. Typical triggers:
  • At the start of a task, after reading the context
  • When the agent has finished one sub-step and is about to start another
  • When the agent hits a decision point and wants a record of what it decided and why
  • Before any tool call that will take a long time (a crawl, a build, a model run)
  • Whenever the agent thinks the current context might get compacted
The adapter’s context-compaction code treats the most recent progress report as a pinned message: it will survive the compaction and give the agent something to rehydrate from.
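The pinning behaviour can be sketched as a compaction pass that keeps the newest messages and always carries the latest progress report along. This illustrates the idea only and is not the adapter’s code: the Message type, the "progress_report" label, and the keep_last parameter are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    kind: str  # "chat" or "progress_report" (hypothetical labels)
    text: str


def compact(history: list[Message], keep_last: int) -> list[Message]:
    """Keep the newest `keep_last` messages plus the most recent progress report."""
    tail = history[-keep_last:] if keep_last > 0 else []
    if any(m.kind == "progress_report" for m in tail):
        return tail  # the latest report is already inside the kept tail
    for message in reversed(history):
        if message.kind == "progress_report":
            return [message] + tail  # pin it ahead of the kept tail
    return tail  # no report was ever written


history = [
    Message("chat", "read the task context"),
    Message("progress_report", "Gathered brand assets"),
    Message("chat", "drafting hero copy"),
    Message("progress_report", "Drafted the hero section"),
    Message("chat", "starting a long crawl"),
    Message("chat", "crawl output, page 1"),
]
compacted = compact(history, keep_last=2)
print([m.text for m in compacted])
```

This is why writing a report before a long tool call pays off: whatever the call floods the context with, the report is the one message guaranteed to remain.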

Release, not finish

When an agent thinks the task is done, it calls mark_complete with its final output. That call releases the lease and moves the task to the review state for a parent agent or a human to decide whether it actually is done.
A completed task cannot be edited. If the reviewer decides the work is wrong, they create a new task for the fix. This is a deliberate choice: mutable task history is a source of audit trail drift.
If the agent wants to hand off to a human mid-task, it calls request_human_help, which releases the lease and parks the task in the handoff queue. If the agent wants to hand off to another agent mid-task, it calls delegate_subtask, which creates a child task under the current one and continues to hold the parent lease.
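The three calls differ mainly in what they do to the lease, which a small state model makes explicit. The tool names come from the text above. The Task class, the State enum, and the assumption that a new child task starts without a lease are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    RUNNING = "running"
    REVIEW = "review"
    HANDOFF = "handoff"


@dataclass
class Task:
    task_id: str
    state: State = State.RUNNING
    lease_held: bool = True
    output: Optional[str] = None
    children: list["Task"] = field(default_factory=list)


def _require_running(task: Task) -> None:
    # Completed or parked tasks are not edited; fixes go into a new task.
    if task.state is not State.RUNNING:
        raise ValueError(f"{task.task_id} is in {task.state.value}, not running")


def mark_complete(task: Task, output: str) -> None:
    _require_running(task)
    task.output, task.state, task.lease_held = output, State.REVIEW, False


def request_human_help(task: Task) -> None:
    _require_running(task)
    task.state, task.lease_held = State.HANDOFF, False


def delegate_subtask(parent: Task, child_id: str) -> Task:
    _require_running(parent)
    # Assumption: the child has no lease until the orchestrator assigns it.
    child = Task(child_id, lease_held=False)
    parent.children.append(child)  # the parent lease is untouched
    return child


parent = Task("task_a1b2c3")
child = delegate_subtask(parent, "task_child")
held_after_delegate = parent.lease_held
mark_complete(parent, "final hero copy")
print(held_after_delegate, parent.state.value, parent.lease_held)
```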

What happens on failure

If the agent process dies:
  1. The adapter notices via process group monitoring and sends a release to the tool proxy
  2. The orchestrator marks the run as failed and captures the exit code
  3. The workspace is preserved (failed runs are kept for 7 days by default)
  4. The task is either retried (if the workflow says so) or escalated to a human
If the adapter itself dies:
  1. The lease expires on its own schedule
  2. The orchestrator reaps the lease via the lease reaper job
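  3. From there the flow matches steps 3 and 4 of the previous list: the workspace is preserved and the task is retried or escalated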
If the orchestrator dies:
  1. Nothing writes. The agent’s next tool call fails with a clear “orchestrator unavailable” error.
  2. The agent’s adapter should exit cleanly; if it does not, the user kills it manually.
  3. When the orchestrator restarts, it resumes pending runs, and each agent restarts from its last checkpoint rather than from scratch.
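The adapter-death path depends on the lease reaper, which might look roughly like the sketch below. The real job’s schedule, storage, and interface are not described here, so the Lease type and the reap_expired function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Lease:
    task_id: str
    expires_at: datetime
    fencing_token: int


def reap_expired(leases: dict[str, Lease], now: datetime) -> list[str]:
    """Drop expired leases and return the task ids that need a new owner."""
    expired = [task_id for task_id, lease in leases.items() if lease.expires_at <= now]
    for task_id in expired:
        del leases[task_id]
    return expired


leases = {
    "task_a": Lease("task_a", datetime(2026, 4, 11, 10, 42, 17, tzinfo=timezone.utc), 17),
    "task_b": Lease("task_b", datetime(2026, 4, 11, 10, 45, 0, tzinfo=timezone.utc), 4),
}
now = datetime(2026, 4, 11, 10, 43, 0, tzinfo=timezone.utc)
to_reassign = reap_expired(leases, now)
print(to_reassign, list(leases))
```

Because expiry is judged against the orchestrator’s own clock, a dead adapter needs no cooperation to lose its lease. It simply stops renewing.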

Tool calls that cost money

Any tool call that touches real infrastructure (spawns a subprocess, makes a network request, writes to a real database) is metered. The tool proxy charges the cost against the agent’s budget envelope before returning the result. If the call would put the agent over its envelope, the call is refused with a budget_exceeded error. The agent can catch this, report progress with the refusal listed under blockers, and either self-escalate or wait for a top-up.
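A sketch of the metering check and of an agent reacting to a refusal follows. The budget_exceeded error name and the blockers field come from this guide. The BudgetEnvelope class, the BudgetExceeded exception, and the dollar amounts are assumptions made for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical exception standing in for the budget_exceeded error."""


@dataclass
class BudgetEnvelope:
    usd_limit: float
    usd_spent: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Refuse the call if it would exceed the envelope, else record the cost."""
        if self.usd_spent + cost_usd > self.usd_limit:
            raise BudgetExceeded("budget_exceeded")
        self.usd_spent += cost_usd


envelope = BudgetEnvelope(usd_limit=1.5)
envelope.charge(1.0)  # a metered call that fits within the envelope

report = {
    "summary": "Crawl finished, build not started",
    "nextStep": "Run the build once budget is available",
    "blockers": [],
}
try:
    envelope.charge(0.75)  # would bring the total to 1.75, over the 1.5 limit
except BudgetExceeded as exc:
    report["blockers"].append(str(exc))  # record the refusal, then escalate or wait

print(envelope.usd_spent, report["blockers"])
```

A refused call leaves the spent amount unchanged, so the agent can still write its progress report and wait for a top-up without losing track of what it has used.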

Summary

  • Lease holds for 5 minutes by default, renewed automatically at 60% of the window
  • Progress reports are written when something changes, not on a timer
  • Fencing tokens let the tool proxy reject writes from a revived zombie
  • mark_complete releases the lease and moves the task to review
  • request_human_help releases the lease and parks the task
  • delegate_subtask creates a child and holds the parent lease
  • Tool calls are metered and can be refused if over-budget
