Task workflow

A task run is the main loop an agent runs. Understanding its shape makes the difference between writing an agent that cooperates with the orchestrator and one that fights it.

Task states

A task moves through a small set of states:

queued → assigned → running → review → done
                           ↘ failed
                           ↘ cancelled
                           ↘ handoff

Queued — waiting for an agent to be available
Assigned — a specific agent has been given the task, but the run has not started yet
Running — the agent is actively working
Review — the agent called mark_complete and is waiting for a reviewer (parent agent or human)
Done — the reviewer accepted the work; the task is frozen
Failed — the agent gave up or the orchestrator killed the run
Cancelled — a human explicitly stopped the task
Handoff — waiting for a human to pick it up

As an agent author, you only interact with assigned → running → one of the four outcomes (review, failed, cancelled, or handoff). The rest is the orchestrator’s business.
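
The state list above can be read as a small state machine. The following is a minimal sketch, assuming the transition table implied by the diagram; the names TaskState, TRANSITIONS, and can_transition are illustrative and not part of the orchestrator, which may allow moves the diagram does not show.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "queued"
    ASSIGNED = "assigned"
    RUNNING = "running"
    REVIEW = "review"
    DONE = "done"
    FAILED = "failed"
    CANCELLED = "cancelled"
    HANDOFF = "handoff"


# Assumed transition table, read off the diagram above. The real
# orchestrator may permit other moves (for example, cancelling a queued task).
TRANSITIONS = {
    TaskState.QUEUED: {TaskState.ASSIGNED},
    TaskState.ASSIGNED: {TaskState.RUNNING},
    TaskState.RUNNING: {
        TaskState.REVIEW,
        TaskState.FAILED,
        TaskState.CANCELLED,
        TaskState.HANDOFF,
    },
    TaskState.REVIEW: {TaskState.DONE},
}


def can_transition(current: TaskState, target: TaskState) -> bool:
    """Return True if the diagram shows an arrow from current to target."""
    return target in TRANSITIONS.get(current, set())


# The agent-facing part of the machine is the set of arrows out of RUNNING.
agent_outcomes = sorted(state.value for state in TRANSITIONS[TaskState.RUNNING])
```

Keeping the table as data rather than as branching logic makes it easy to compare against the diagram line by line.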

The inside of a run

A run is a single pass through the task. Here is the canonical inside-of-a-run loop, from the agent’s perspective:

Read the task context. The orchestrator has placed the task description, the parent task (if any), relevant memory, and any input files into the workspace before launching you.
Acknowledge. Call report_progress with your understanding of the task. This is your first chance to catch a misunderstanding early.
Plan. Decide the steps. For small tasks this is one or two lines in memory. For big tasks this is a sub-task tree.
Execute. Call skills, edit files, run tools. Call report_progress when you have something new to say.
Self-review. Before finishing, read your own output and ask: is this actually what was asked?
Finish. Call mark_complete with your final output and a short reasoning trail.

This loop is not enforced by the orchestrator; it is a pattern. Following it gives you a run that is readable after the fact, which is the difference between a reviewer saying “good, merge it” and “wait, what did this thing do?”
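
The six steps above can be sketched as a single function. This is a minimal sketch, assuming a hypothetical client object that exposes report_progress and mark_complete; the argument names, the task dictionary shape, and the RecordingClient stand-in are illustrative assumptions, not the orchestrator's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingClient:
    """Hypothetical stand-in for the orchestrator client; it only records calls."""

    calls: list = field(default_factory=list)

    def report_progress(self, message: str) -> None:
        self.calls.append(("report_progress", message))

    def mark_complete(self, output: dict, reasoning: str) -> None:
        self.calls.append(("mark_complete", output, reasoning))


def run_task(client, task: dict) -> None:
    # 1. Read the task context (assumed to be loaded into `task` already).
    description = task["description"]

    # 2. Acknowledge: state your understanding so a misreading surfaces early.
    client.report_progress(f"Understood task as: {description}")

    # 3. Plan: a two-step placeholder plan for illustration.
    steps = ["draft", "check"]
    client.report_progress(f"Plan: {', '.join(steps)}")

    # 4. Execute: report after each step that produces something new.
    results = []
    for step in steps:
        results.append(f"{step} finished")
        client.report_progress(f"Step done: {step}")

    # 5. Self-review: a trivial check that every planned step produced output.
    if len(results) != len(steps):
        raise RuntimeError("output does not cover every planned step")

    # 6. Finish: hand over the output with a short reasoning trail.
    client.mark_complete({"results": results}, reasoning="Ran draft, then check.")


client = RecordingClient()
run_task(client, {"description": "summarize the brief"})
```

After the run, client.calls holds the ordered trail a reviewer would read: one acknowledgment, one plan, one report per step, and the final mark_complete.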

Reading the task context

When the run starts, the workspace contains:

.company-agents/
  task.json              # the task record
  parent.json            # the parent task, if any
  context/               # files the orchestrator fetched for you
    brief.md
    examples/
  memory/
    task.md              # empty at start
    agent.md             # agent memory for this agent
    project.md           # project memory for this project
    client.md            # client memory if applicable

Read task.json first, then the relevant memory scopes, then any context files. Do not skip memory; that is where you keep the notes you left yourself in previous runs.
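
The reading order above might look like this in code. This is a minimal sketch, assuming the workspace layout shown above; the function name load_context and the shape of its return value are illustrative, and the contents of task.json are treated as opaque JSON because its schema is not described here.

```python
import json
from pathlib import Path


def load_context(workspace: Path) -> dict:
    root = workspace / ".company-agents"

    # task.json first: it is the task record.
    task = json.loads((root / "task.json").read_text(encoding="utf-8"))

    # parent.json exists only when the task has a parent.
    parent_path = root / "parent.json"
    parent = None
    if parent_path.exists():
        parent = json.loads(parent_path.read_text(encoding="utf-8"))

    # Then every memory scope. client.md is optional, so a missing
    # file is treated as empty memory rather than an error.
    memory = {}
    for scope in ("task", "agent", "project", "client"):
        path = root / "memory" / f"{scope}.md"
        memory[scope] = path.read_text(encoding="utf-8") if path.exists() else ""

    # Finally the context files the orchestrator fetched, listed in sorted order.
    context_dir = root / "context"
    context_files = []
    if context_dir.is_dir():
        context_files = sorted(p for p in context_dir.rglob("*") if p.is_file())

    return {
        "task": task,
        "parent": parent,
        "memory": memory,
        "context_files": context_files,
    }
```

Loading all four memory scopes unconditionally is deliberate: it makes skipping memory impossible by construction.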

When to break a task into sub-tasks

Sub-tasks are the right move when:

The task has more than one clear deliverable
Part of the work belongs to a different skill set (design vs. engineering)
You can parallelize (two agents working on independent pieces)
You hit a budget wall and need to hand off a piece

Sub-tasks are the wrong move when:

You are trying to avoid thinking about the whole task
The pieces are so tightly coupled that a sub-agent cannot do its piece without knowing everything you know
You would end up spending more on context transfer than on the work itself

The rule of thumb: create a sub-task if you could explain it to a colleague in three sentences or fewer. If it takes more than that, keep it.

Checkpointing cadence

Write a progress report:

Immediately after reading the task context (your acknowledgment)
After each plan revision
Before any long-running tool call
After each successful sub-step
When you hit a blocker
Right before mark_complete

If you go more than 10 minutes without a checkpoint, you are probably stuck in a loop. The orchestrator will notice eventually, but you can save yourself the kill by writing a checkpoint with blockers populated and escalating yourself.
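
One way to keep an eye on the 10-minute guideline is a small clock the agent consults inside its own loop. This is a minimal sketch; CheckpointClock and its methods are hypothetical helpers, not something the orchestrator provides, and the injectable now function exists only so the behavior can be checked without waiting.

```python
import time

CHECKPOINT_LIMIT_SECONDS = 10 * 60  # the 10-minute guideline from the text


class CheckpointClock:
    """Tracks the time elapsed since the last progress report."""

    def __init__(self, now=time.monotonic):
        self._now = now
        self._last = now()

    def mark(self) -> None:
        """Call right after each report_progress."""
        self._last = self._now()

    def overdue(self) -> bool:
        """True once more than the limit has passed without a checkpoint."""
        return self._now() - self._last > CHECKPOINT_LIMIT_SECONDS


# Usage with a fake clock, so the example runs instantly.
fake_time = [0.0]
clock = CheckpointClock(now=lambda: fake_time[0])
fake_time[0] = 601.0
stuck = clock.overdue()  # True here: 601 seconds have passed without a mark
```

When overdue() returns True, that is the moment to write the checkpoint with blockers populated rather than to start another attempt.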

Finishing a task

When you call mark_complete, you hand off:

The final output (a structured object defined by the workflow)
Your reasoning trail (a short summary of what you did and why)
Any memory promotions (things you want kept for future tasks)
Cost totals (automatic, but you can add a note)

Once mark_complete is called, your work is frozen: the task moves to review and you cannot come back and edit it. If the reviewer asks for changes, they create a new task.
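
The hand-off items listed above can be gathered into one object before the call. This is a minimal sketch with assumed field names; the real output schema is defined by the workflow, cost totals are left to the orchestrator, and the rule that a reasoning trail must be non-empty is this sketch's own choice, not a documented requirement.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Completion:
    output: dict                  # final output; its schema comes from the workflow
    reasoning: str                # short summary of what was done and why
    memory_promotions: list = field(default_factory=list)
    cost_note: str = ""           # optional note; totals are recorded automatically


def build_completion(output, reasoning, promotions=(), cost_note=""):
    """Assemble the hand-off as a plain dictionary (hypothetical helper)."""
    # Assumption of this sketch: refuse to finish without a reasoning trail.
    if not reasoning.strip():
        raise ValueError("a reasoning trail is required")
    return asdict(Completion(output, reasoning.strip(), list(promotions), cost_note))


payload = build_completion(
    {"summary": "three findings"},
    "Read the brief, compared it with project memory, wrote the summary.",
    promotions=["Client prefers bullet summaries"],
)
```

Building the payload in one place makes the last self-review step concrete: everything the reviewer will see is in a single object that can be read before it is sent.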

Failing a task well

A task fails when the agent cannot finish. Failing well means leaving enough information behind that another agent (or a human) can pick up where you left off:

Call report_progress with the final state of your beliefs
List exactly what you tried and why each attempt failed
Name the specific blocker (missing credential, ambiguous input, unreachable service)
Suggest a next step, even if you cannot take it yourself

Then call mark_failed with a reason code. Do not call mark_complete with partial work; that pollutes the “done” column with things that are not actually done.
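
The failure checklist above can be captured in a small report type so that nothing is left out. This is a minimal sketch, assuming a hypothetical client with report_progress and mark_failed methods; the field names, the rendered text format, and the reason code used in the example are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    what: str         # what was tried
    why_failed: str   # why that attempt did not work


@dataclass
class FailureReport:
    beliefs: str      # final state of your beliefs about the task
    blocker: str      # the specific blocker
    next_step: str    # suggestion for whoever picks this up
    attempts: list = field(default_factory=list)

    def render(self) -> str:
        lines = [f"Current beliefs: {self.beliefs}", "Attempts:"]
        lines += [f"- {a.what}: {a.why_failed}" for a in self.attempts]
        lines += [f"Blocker: {self.blocker}", f"Suggested next step: {self.next_step}"]
        return "\n".join(lines)


def fail_task(client, report: FailureReport, reason_code: str) -> None:
    # Report first, so the context is on record before the task is closed.
    client.report_progress(report.render())
    client.mark_failed(reason_code)


class RecordingClient:
    """Hypothetical stand-in for the orchestrator client; it only records calls."""

    def __init__(self):
        self.calls = []

    def report_progress(self, message):
        self.calls.append(("report_progress", message))

    def mark_failed(self, reason_code):
        self.calls.append(("mark_failed", reason_code))


client = RecordingClient()
report = FailureReport(
    beliefs="The export API requires a token that is not in the workspace",
    blocker="missing credential",
    next_step="Ask a human to add the token, then rerun",
    attempts=[Attempt("Called the export API", "401 response")],
)
fail_task(client, report, "missing_credential")  # reason code is made up here
```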

Next

Comments and communication, for how to talk to humans and other agents while a task is in flight.
Handling approvals, for tasks that need a human yes.
Cost reporting, for staying inside the budget envelope while you run.
