Cost control is the difference between an agent that can run in production and one that can only run under supervision. The orchestrator does most of the accounting, but the agent has a few jobs of its own.

Automatic accounting

The orchestrator captures cost automatically for:
  • LLM tokens through the adapter layer (every model call goes through the adapter, which reports tokens to the tool proxy)
  • Tool calls with priced outputs (any tool that was registered with a cost model reports its own cost on each call)
  • Runtime services (web preview minutes, crawler sessions, storage GB, priced per minute or per GB)
You do not have to instrument any of this. If you call llm.chat, tool.run, or service.preview, the cost lands in the task’s budget envelope before you see the result.

The budget envelope

Every task run comes with a budget envelope:
{
  "tokens": 250000,
  "usd": 1.5,
  "remaining": { "tokens": 180000, "usd": 1.1 }
}
You can read the envelope at any time with get_budget. The orchestrator also pushes an update into the progress report whenever the remaining budget drops below a threshold. Before any expensive action (a long LLM call, a large tool call, a service spin-up), check what remains. If the remaining budget is close to the action’s expected cost, do one of the following:
  • Shrink the action (smaller context, cheaper model, narrower scope)
  • Request a top-up via request_approval
  • Stop and escalate
Do not charge ahead and let the orchestrator kill you. A killed run produces nothing. A self-aware stop produces a report that explains why you stopped, which another agent can use.
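The pre-action check above can be sketched as a small decision function. This is a minimal illustration, not the orchestrator's API: the Remaining type, the choose_action name, and the safety_factor used to define "close to the expected cost" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Remaining:
    """Mirror of the "remaining" part of the budget envelope."""
    tokens: int
    usd: float


def choose_action(remaining, expected_tokens, expected_usd,
                  safety_factor=1.5, can_shrink=True, can_top_up=True):
    """Decide what to do before an expensive action.

    safety_factor is a hypothetical knob: the action only proceeds if the
    remaining budget covers the expected cost times this factor.
    Returns "proceed", "shrink", "request_top_up", or "stop".
    """
    enough_tokens = remaining.tokens >= expected_tokens * safety_factor
    enough_usd = remaining.usd >= expected_usd * safety_factor
    if enough_tokens and enough_usd:
        return "proceed"
    if can_shrink:
        return "shrink"          # smaller context, cheaper model, narrower scope
    if can_top_up:
        return "request_top_up"  # go through request_approval
    return "stop"                # stop and escalate with a report


remaining = Remaining(tokens=180000, usd=1.1)
print(choose_action(remaining, expected_tokens=20000, expected_usd=0.2))
print(choose_action(remaining, expected_tokens=150000, expected_usd=0.9))
```

The order of the fallbacks (shrink first, then top-up, then stop) follows the order of the list above; a real agent may prefer a different order depending on the task.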

Writing a cost note

When you finish a sub-step and want the reviewer to see its cost cleanly, call report_cost with a note:
{
  "step": "draft hero section",
  "tokens": 12400,
  "usd": 0.23,
  "notes": "Two rounds with the editor, one with the brand checker. The editor round was the expensive part."
}
Cost notes are not how costs are tracked (the orchestrator is already tracking them). They are how costs are explained to the reviewer. A cost note is worth writing when:
  • A sub-step was unexpectedly expensive and you want to say why
  • You chose an expensive path over a cheap one and want to justify the choice
  • Two similar sub-steps had very different costs and the reviewer would otherwise wonder why
Without the note, the reviewer sees the number and has to guess. With the note, the number is legible.

Budget-aware planning

A good agent plans with the budget in mind. Before starting a task:
  1. Read the task’s budget envelope
  2. Estimate the cost of each planned step (roughly, based on past runs or the skill’s cost.estimatedUsd field)
  3. Sum the estimates
  4. If the sum is more than 70% of the envelope, cut scope
The 30% slack exists for the things you did not plan for: extra tool calls, retries, context reloads, rescue loops. Running a plan at 100% of envelope leaves no room to recover from surprises. If your plan is honestly too small for the envelope (you estimate you will use $0.30 of a $1.50 budget), that is fine. Budgets are ceilings, not quotas. Do not spend money to use up the envelope.
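As a rough sketch of the planning steps, the check below sums per-step estimates and trims the plan to 70% of the envelope. The function names and the trimming strategy (keep steps in priority order, skip any step that would push the total over the limit) are assumptions for illustration; how you cut scope in practice is a judgment call.

```python
def plan_total(steps):
    """Sum the estimated USD cost of a list of (name, estimated_usd) steps."""
    return sum(cost for _, cost in steps)


def fit_plan(steps, envelope_usd, max_fraction=0.70):
    """Keep steps, in priority order, while the running total stays within
    max_fraction of the envelope. Steps that would exceed the limit are cut.

    Returns (kept_steps, cut_steps).
    """
    limit = envelope_usd * max_fraction
    kept, cut, running = [], [], 0.0
    for name, cost in steps:
        if running + cost <= limit:
            kept.append((name, cost))
            running += cost
        else:
            cut.append((name, cost))
    return kept, cut


# Hypothetical plan against a $1.50 envelope (limit: about $1.05).
steps = [("research", 0.40), ("draft", 0.45), ("review", 0.15), ("polish", 0.30)]
kept, cut = fit_plan(steps, envelope_usd=1.50)
print([name for name, _ in kept])  # steps that fit under the limit
print([name for name, _ in cut])   # steps cut from scope
```

In this example the full plan sums to about $1.30, which is over the 70% limit, so the lowest-priority step is cut and the remaining plan comes to about $1.00.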

Attributing cost to the right owner

When an agent delegates a sub-task, the orchestrator has to decide which budget envelope pays for the sub-task. The default is “the parent’s envelope”, which means the delegating agent is still responsible for the total cost. If you want the sub-task to be billed to a different team (for example, cross-team delegation under an interface agreement), set the costOwner field when you call delegate_subtask:
{
  "subtaskId": "task_b2c3d4",
  "costOwner": "team:engineering",
  "budget": { "tokens": 50000, "usd": 0.30 }
}
The orchestrator validates the cost owner against the interface agreement. If the two teams have no agreement, the call is rejected with cost_owner_unauthorized and the agent should fall back to parent-owned cost.
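The fallback to parent-owned cost might look like the sketch below. It assumes delegate_subtask is available as a callable that takes the request as a dictionary and returns a dictionary with an "error" key on rejection; the source does not say how the rejection is surfaced, so treat that shape as hypothetical.

```python
def delegate_with_fallback(delegate_subtask, subtask_id, budget, cost_owner=None):
    """Try to bill the sub-task to cost_owner; if the orchestrator rejects the
    owner, retry once without costOwner so the parent's envelope pays.

    delegate_subtask is passed in as a callable (assumed shape: dict in,
    dict out) so the sketch stays independent of any real client library.
    """
    request = {"subtaskId": subtask_id, "budget": budget}
    if cost_owner is not None:
        response = delegate_subtask({**request, "costOwner": cost_owner})
        if response.get("error") != "cost_owner_unauthorized":
            return response
        # No interface agreement between the teams: fall through to the default.
    return delegate_subtask(request)


# Stand-in for the orchestrator, for demonstration only.
def fake_delegate(request):
    if "costOwner" in request:
        return {"error": "cost_owner_unauthorized"}
    return {"status": "accepted", "billedTo": "parent"}


result = delegate_with_fallback(
    fake_delegate, "task_b2c3d4", {"tokens": 50000, "usd": 0.30},
    cost_owner="team:engineering",
)
print(result)
```

Remember that after the fallback the sub-task's budget counts against your own envelope, so re-check your remaining budget before retrying.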

Breaches

There are three budget breach states, covered in full under costs and budgets. From the agent’s side:
  • Soft breach — you get a warning event; tighten your plan
  • Hard breach — your next expensive action will be refused; call request_approval for a top-up or escalate
  • Catastrophic breach — your process is killed by the orchestrator; you do not get to handle this one
Handle soft breaches in your own logic. Expect hard breaches only if your planning was off. Never count on handling a catastrophic breach; the kill is non-recoverable.
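From the agent's side, only two of the three states ever reach your code. A minimal sketch of that dispatch follows; the level strings and the returned action names are invented for the example and are not the orchestrator's actual event schema.

```python
def respond_to_breach(level):
    """Map a breach level to the agent's next move.

    Only "soft" and "hard" are handled: a catastrophic breach kills the
    process, so no handler for it would ever run.
    """
    if level == "soft":
        return "tighten_plan"
    if level == "hard":
        return "request_approval_or_escalate"
    raise ValueError(f"no agent-side handling for breach level: {level!r}")


print(respond_to_breach("soft"))
print(respond_to_breach("hard"))
```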

What to put in your final report

When you call mark_complete, include a cost summary:
{
  "output": { /* ... */ },
  "cost": {
    "totalUsd": 1.12,
    "breakdown": {
      "llmTokens": 0.78,
      "toolCalls": 0.22,
      "runtimeServices": 0.12
    },
    "notes": "Spent more on LLM than expected because the brand voice checker ran three rounds instead of one."
  }
}
The orchestrator already has the numbers. Your job is to explain them in one or two lines. A one-line cost explanation at mark_complete time is the single highest-leverage thing you can do for future reviewers reading your runs.
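If you assemble the cost summary in code, deriving the total from the breakdown keeps the two from drifting apart. The helper below is a hypothetical sketch; the field names follow the example above, while the function name and the rounding to cents are assumptions.

```python
def build_cost_summary(breakdown, notes):
    """Build the "cost" object for mark_complete.

    totalUsd is computed from the breakdown and rounded to cents, so the
    reported total always matches the sum of its parts.
    """
    total = round(sum(breakdown.values()), 2)
    return {"totalUsd": total, "breakdown": dict(breakdown), "notes": notes}


summary = build_cost_summary(
    {"llmTokens": 0.78, "toolCalls": 0.22, "runtimeServices": 0.12},
    "Spent more on LLM than expected because the brand voice checker "
    "ran three rounds instead of one.",
)
print(summary["totalUsd"])
```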
