A task run is the main loop an agent runs. Understanding its shape makes the difference between writing an agent that cooperates with the orchestrator and one that fights it.

Task states

A task moves through a small set of states:
queued → assigned → running → review → done
                           ↘ failed
                           ↘ cancelled
                           ↘ handoff
  • Queued — waiting for an agent to be available
  • Assigned — a specific agent has been given the task, but the run has not started yet
  • Running — the agent is actively working
  • Review — the agent called mark_complete and is waiting for a reviewer (parent agent or human)
  • Done — the reviewer accepted the work; the task is frozen
  • Failed — the agent gave up or the orchestrator killed the run
  • Cancelled — a human explicitly stopped the task
  • Handoff — waiting for a human to pick it up
As an agent author, you only interact with assigned → running → (one of the four outcomes). The rest is the orchestrator’s business.
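The diagram can also be written down as a transition table. What follows is a minimal sketch, not the orchestrator's implementation: `TaskState`, `TRANSITIONS`, and `can_transition` are hypothetical names, and the sketch assumes that the three side exits (failed, cancelled, handoff) all leave from running, as the diagram suggests.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "queued"
    ASSIGNED = "assigned"
    RUNNING = "running"
    REVIEW = "review"
    DONE = "done"
    FAILED = "failed"
    CANCELLED = "cancelled"
    HANDOFF = "handoff"


# Transitions read off the diagram. Assumption: every side exit
# (failed, cancelled, handoff) leaves from RUNNING.
TRANSITIONS = {
    TaskState.QUEUED: {TaskState.ASSIGNED},
    TaskState.ASSIGNED: {TaskState.RUNNING},
    TaskState.RUNNING: {
        TaskState.REVIEW,
        TaskState.FAILED,
        TaskState.CANCELLED,
        TaskState.HANDOFF,
    },
    TaskState.REVIEW: {TaskState.DONE},
}


def can_transition(current: TaskState, target: TaskState) -> bool:
    """Return True if the table allows moving from current to target."""
    return target in TRANSITIONS.get(current, set())
```

A table like this is mostly useful in tests or logging on the agent side, for example to flag a state change that the diagram does not describe.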

The inside of a run

A run is a single pass through the task. Here is the canonical loop inside a run, from the agent’s perspective:
  1. Read the task context. The orchestrator has placed the task description, the parent task (if any), relevant memory, and any input files into the workspace before launching you.
  2. Acknowledge. Call report_progress with your understanding of the task. This is your first chance to catch a misunderstanding early.
  3. Plan. Decide the steps. For small tasks this is one or two lines in memory. For big tasks this is a sub-task tree.
  4. Execute. Call skills, edit files, run tools. Call report_progress when you have something new to say.
  5. Self-review. Before finishing, read your own output and ask: is this actually what was asked?
  6. Finish. Call mark_complete with your final output and a short reasoning trail.
This loop is not enforced by the orchestrator; it is a pattern. Following it gives you a run that is readable after the fact, which is the difference between a reviewer saying “good, merge it” and “wait, what did this thing do?”
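The six steps can be sketched as a small driver function. This is an illustration under assumptions, not the real client: `RecordingClient` is a hypothetical stand-in that only records calls, the argument names of `report_progress` and `mark_complete` are guesses, and the `description` key in the task record is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingClient:
    """Hypothetical stand-in for the orchestrator client; it only records calls."""
    calls: list = field(default_factory=list)

    def report_progress(self, message: str) -> None:
        self.calls.append(("report_progress", message))

    def mark_complete(self, output: dict, reasoning: str) -> None:
        self.calls.append(("mark_complete", output, reasoning))


def run_task(client, task: dict, steps: list) -> dict:
    """Walk the six-step pattern once. `steps` is a list of (name, callable) pairs."""
    # 1. Read the task context (already loaded into `task` here).
    description = task["description"]  # assumed field name

    # 2. Acknowledge: state your understanding before doing any work.
    client.report_progress(f"Understood task: {description}")

    # 3. Plan: for a small task, the plan is just the ordered step names.
    names = [name for name, _ in steps]
    client.report_progress("Plan: " + ", ".join(names))

    # 4. Execute, reporting after each step.
    results = {}
    for name, step in steps:
        results[name] = step()
        client.report_progress(f"Finished step: {name}")

    # 5. Self-review: a deliberately simple check that every step produced output.
    missing = [name for name, value in results.items() if value is None]
    if missing:
        raise ValueError(f"steps produced no output: {missing}")

    # 6. Finish with the output and a short reasoning trail.
    client.mark_complete(
        output=results,
        reasoning=f"Ran {len(steps)} steps in order: {', '.join(names)}",
    )
    return results
```

Calling `run_task(RecordingClient(), {"description": "demo"}, [("draft", lambda: "text")])` leaves a call log that reads in the same order as the numbered steps, which is the readability the pattern is meant to buy.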

Reading the task context

When the run starts, the workspace contains:
.company-agents/
  task.json              # the task record
  parent.json            # the parent task, if any
  context/               # files the orchestrator fetched for you
    brief.md
    examples/
  memory/
    task.md              # empty at start
    agent.md             # agent memory for this agent
    project.md           # project memory for this project
    client.md            # client memory if applicable
Read task.json first, then the relevant memory scopes, then any context files. Do not skip memory; that is where you keep the notes you left yourself in previous runs.
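The reading order above can be sketched as a loader. The directory and file names come from the layout shown; the `load_context` function, its return shape, and the choice to treat every file other than task.json as optional are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_context(workspace: Path) -> dict:
    """Read the task record, then memory scopes, then context files."""
    root = workspace / ".company-agents"

    # 1. The task record comes first.
    task = json.loads((root / "task.json").read_text(encoding="utf-8"))

    # The parent task exists only for sub-tasks.
    parent_path = root / "parent.json"
    parent = None
    if parent_path.exists():
        parent = json.loads(parent_path.read_text(encoding="utf-8"))

    # 2. Memory scopes; skip any scope file that is absent (e.g. client.md).
    memory = {}
    for scope in ("task", "agent", "project", "client"):
        path = root / "memory" / f"{scope}.md"
        if path.exists():
            memory[scope] = path.read_text(encoding="utf-8")

    # 3. Context files are listed, not read, so large inputs load lazily.
    context_dir = root / "context"
    context_files = []
    if context_dir.exists():
        context_files = sorted(p for p in context_dir.rglob("*") if p.is_file())

    return {
        "task": task,
        "parent": parent,
        "memory": memory,
        "context_files": context_files,
    }
```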

When to break a task into sub-tasks

Sub-tasks are the right move when:
  • The task has more than one clear deliverable
  • Part of the work belongs to a different skill set (design vs. engineering)
  • You can parallelize (two agents working on independent pieces)
  • You hit a budget wall and need to hand off a piece
Sub-tasks are the wrong move when:
  • You are trying to avoid thinking about the whole task
  • The pieces are so tightly coupled that a sub-agent cannot do its piece without knowing everything you know
  • You would end up spending more on context transfer than on the work itself
The rule of thumb: create a sub-task if you could explain it to a colleague in three sentences or fewer. If it takes more than that, keep it.

Checkpointing cadence

Write a progress report:
  • Immediately after reading the task context (your acknowledgment)
  • After each plan revision
  • Before any long-running tool call
  • After each successful sub-step
  • When you hit a blocker
  • Right before mark_complete
If you go more than 10 minutes without a checkpoint, you are probably stuck in a loop. The orchestrator will notice eventually, but you can save yourself the kill by writing a checkpoint with blockers populated and escalating the problem yourself.
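One way to hold yourself to the 10-minute guideline is a small clock that the agent checks between steps. This is a sketch only: `CheckpointClock` is a hypothetical helper, and the 10-minute limit is taken from the guideline above rather than from any orchestrator setting.

```python
import time

# Guideline from the text: no more than 10 minutes between checkpoints.
CHECKPOINT_LIMIT_SECONDS = 10 * 60


class CheckpointClock:
    """Tracks the time since the last progress report."""

    def __init__(self, now=time.monotonic):
        # `now` is injectable so the clock can be tested without waiting.
        self._now = now
        self._last = now()

    def checkpoint(self) -> None:
        """Call this every time a progress report is written."""
        self._last = self._now()

    def overdue(self) -> bool:
        """True when the time since the last checkpoint exceeds the limit."""
        return self._now() - self._last > CHECKPOINT_LIMIT_SECONDS
```

When `overdue()` returns true, that is the moment to write a checkpoint that names the blocker instead of starting another attempt.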

Finishing a task

When you call mark_complete, you hand off:
  • The final output (a structured object defined by the workflow)
  • Your reasoning trail (a short summary of what you did and why)
  • Any memory promotions (things you want kept for future tasks)
  • Cost totals (automatic, but you can add a note)
Once mark_complete is called, your submission is frozen. You cannot come back and edit it. If the reviewer asks for changes, they create a new task.
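Because there is no editing after the call, it helps to assemble and check the hand-off in one place first. The sketch below is hypothetical: the `Completion` class and its field names are illustrative, and the real shape of the final output is defined by the workflow, not by this code.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Completion:
    """Illustrative container for what is handed off at mark_complete."""
    output: dict                      # final output; real schema comes from the workflow
    reasoning: str                    # short summary of what was done and why
    memory_promotions: list = field(default_factory=list)
    cost_note: str = ""               # optional note; cost totals are automatic

    def to_payload(self) -> dict:
        """Validate and return a plain dict; fail before the call, not after."""
        if not self.reasoning.strip():
            raise ValueError("reasoning trail must not be empty")
        return asdict(self)
```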

Failing a task well

A task fails when the agent cannot finish. Failing well means leaving enough information behind that another agent (or a human) can pick up where you left off:
  • Call report_progress with the final state of your beliefs
  • List exactly what you tried and why each attempt failed
  • Name the specific blocker (missing credential, ambiguous input, unreachable service)
  • Suggest a next step, even if you cannot take it yourself
Then call mark_failed with a reason code. Do not call mark_complete with partial work; that pollutes the “done” column with things that are not actually done.
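The four items above map naturally onto a small record that is rendered into the final progress report. This is a sketch under assumptions: `Attempt` and `FailureReport` are hypothetical names, and the rendered text format is one possible layout, not a format the orchestrator requires.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    what: str         # what was tried
    why_failed: str   # why that attempt did not work


@dataclass
class FailureReport:
    beliefs: str      # final state of the agent's beliefs
    blocker: str      # the specific blocker, e.g. "missing credential"
    next_step: str    # suggested next step for whoever picks this up
    attempts: list = field(default_factory=list)

    def render(self) -> str:
        """Render the report as plain text for the last progress report."""
        lines = [f"Current beliefs: {self.beliefs}", "Attempts:"]
        lines += [f"  - {a.what}: {a.why_failed}" for a in self.attempts]
        lines += [
            f"Blocker: {self.blocker}",
            f"Suggested next step: {self.next_step}",
        ]
        return "\n".join(lines)
```

The rendered text would go into the final report_progress call, followed by mark_failed with the reason code.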

Next