The HTTP adapter is for agents that do not run on the same host as the orchestrator. The target runs somewhere else (a GPU box, a managed service, another server on your network) and exposes an HTTP endpoint. The adapter holds a session, streams events over Server-Sent Events, and relays tool calls through the MCP server. Use it for:
  • Pi — the Pi inference engine, usually on a GPU host
  • Hermes — the Hermes remote agent service
  • Any custom remote agent that implements the HTTP adapter protocol

Why not just use the Process adapter remotely?

You could run a Process adapter on the remote host and point it at the same orchestrator. That works for one-off setups. The HTTP adapter is better when:
  • The remote runs as a long-lived service (not per-run)
  • The runtime is expensive to start (GPU warm-up, model load) and you want to pool it across runs
  • The remote is behind a firewall where only HTTP is allowed
  • You want the remote to run as a managed service with its own scale-out independent of the orchestrator host

Configuring the HTTP adapter

In config.json:
{
  "adapters": {
    "pi": {
      "type": "http",
      "baseUrl": "http://gpu-box.internal:7001",
      "auth": {
        "type": "bearer",
        "tokenEnv": "PI_API_TOKEN"
      },
      "protocol": "ca-http-v1",
      "costModel": "custom/pi-2.5"
    },
    "hermes": {
      "type": "http",
      "baseUrl": "https://hermes.example.com",
      "auth": {
        "type": "bearer",
        "tokenEnv": "HERMES_API_TOKEN"
      },
      "protocol": "ca-http-v1",
      "costModel": "custom/hermes-1.0"
    }
  }
}
Required fields:
  • baseUrl — where the remote listens
  • auth — how the adapter authenticates (bearer token, mTLS, or none for trusted networks)
  • protocol — the adapter protocol version; right now only ca-http-v1 is defined
  • costModel — the pricing table key for cost attribution
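The required-field rules above can be checked before the adapter is started. The following is a minimal sketch, not the orchestrator's own validation; `validateHttpAdapterConfig` and its error messages are hypothetical names introduced for illustration.

```typescript
// Required fields for an adapter entry with "type": "http", per the list above.
const REQUIRED_FIELDS = ["baseUrl", "auth", "protocol", "costModel"] as const;

// Hypothetical helper: returns a list of problems, empty when the entry looks valid.
function validateHttpAdapterConfig(
  name: string,
  entry: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    if (entry[field] === undefined || entry[field] === null) {
      errors.push(`${name}: missing required field "${field}"`);
    }
  }
  // Only a shallow scheme check; a real validator would parse the URL fully.
  const baseUrl = entry["baseUrl"];
  if (typeof baseUrl === "string" && !/^https?:\/\//.test(baseUrl)) {
    errors.push(`${name}: baseUrl must start with http:// or https://`);
  }
  // ca-http-v1 is the only protocol version the documentation defines.
  const protocol = entry["protocol"];
  if (protocol !== undefined && protocol !== null && protocol !== "ca-http-v1") {
    errors.push(`${name}: unsupported protocol "${String(protocol)}"`);
  }
  return errors;
}
```

Running it over each entry under "adapters" at startup would surface a missing costModel or a mistyped protocol before the first run is scheduled.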

The HTTP protocol

The adapter calls four endpoints on the remote:

POST /v1/sessions

Creates a new agent session. The adapter calls this at the start of a run, passing:
{
  "runId": "run_a1b2c3",
  "taskId": "task_d4e5f6",
  "leaseToken": "lease_...",
  "fencingToken": 17,
  "mcpUrl": "http://orchestrator.internal:4200",
  "agent": {
    "id": "agent_g7h8i9",
    "name": "Pi Writer",
    "role": "content-writer",
    "policy": {...}
  },
  "task": {...},
  "context": {...}
}
The remote responds with a session ID that subsequent calls use.
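A session-creation call might look like the sketch below. The response shape is not specified here, so the `sessionId` field is an assumption, as are the names `createSession`, `FetchLike`, and `CreateSessionBody`. The fetch function is injected so the sketch does not depend on a particular runtime.

```typescript
// Minimal fetch-like signature; the global fetch in recent Node versions should fit it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Mirrors the request body shown above; nested objects are left loosely typed.
interface CreateSessionBody {
  runId: string;
  taskId: string;
  leaseToken: string;
  fencingToken: number;
  mcpUrl: string;
  agent: Record<string, unknown>;
  task: Record<string, unknown>;
  context: Record<string, unknown>;
}

async function createSession(
  fetchImpl: FetchLike,
  baseUrl: string,
  authHeaders: Record<string, string>,
  body: CreateSessionBody,
): Promise<string> {
  // Strip trailing slashes so the path is not doubled up.
  const url = `${baseUrl.replace(/\/+$/, "")}/v1/sessions`;
  const res = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...authHeaders },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`session creation failed with HTTP ${res.status}`);
  }
  // ASSUMPTION: the remote answers with {"sessionId": "..."}.
  const payload = (await res.json()) as { sessionId?: unknown };
  if (typeof payload.sessionId !== "string") {
    throw new Error("response did not include a session ID");
  }
  return payload.sessionId;
}
```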

GET /v1/sessions/{id}/events

Opens a Server-Sent Events stream. The remote emits events in the same JSON shape as the Process adapter’s stdio protocol:
event: progress
data: {"summary": "...", "beliefs": [...]}

event: tool_call
data: {"id": "1", "tool": "read_task", "args": {}}

event: complete
data: {"output": {...}, "cost": {...}}
The adapter holds the stream open for the lifetime of the run.
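Because the stream arrives in arbitrary chunks, a consumer has to buffer text until a blank line closes each event. Here is a minimal incremental parser as a sketch; `SseParser` and `SseEvent` are hypothetical names, and the parser handles only the `event:` and `data:` fields used above (no `id:`, `retry:`, or comment lines).

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

class SseParser {
  private buffer = "";

  // Feed a decoded text chunk; returns every event completed by this chunk.
  push(chunk: string): SseEvent[] {
    // Normalize CRLF so that a blank line is always "\n\n".
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let end = this.buffer.indexOf("\n\n");
    while (end !== -1) {
      const block = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      let event = "message"; // SSE default when no event field is present
      const dataLines: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith("event:")) {
          event = line.slice("event:".length).trim();
        } else if (line.startsWith("data:")) {
          // One optional space after the colon is part of the framing.
          dataLines.push(line.slice("data:".length).replace(/^ /, ""));
        }
      }
      if (dataLines.length > 0) {
        events.push({ event, data: JSON.parse(dataLines.join("\n")) });
      }
      end = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

An event split across two chunks is held in the buffer and returned only once its terminating blank line arrives.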

POST /v1/sessions/{id}/input

The adapter sends events back to the remote (tool results, budget updates, shutdown signals):
{"type": "tool_result", "id": "1", "ok": true, "value": "..."}
{"type": "shutdown", "reason": "lease_expired"}
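The round trip from a tool_call event to a tool_result input can be sketched as follows. In the real adapter the call is relayed through the MCP server; here a local map of handlers stands in for it. The `error` field on a failed result is an assumption (only the success shape is shown above), and budget updates are left out because their shape is not documented here.

```typescript
// Input events the adapter posts to /v1/sessions/{id}/input.
type InputEvent =
  | { type: "tool_result"; id: string; ok: boolean; value?: unknown; error?: string }
  | { type: "shutdown"; reason: string };

interface ToolCall {
  id: string;
  tool: string;
  args: Record<string, unknown>;
}

// Hypothetical stand-in for the MCP relay: tool name -> local handler.
type ToolHandler = (args: Record<string, unknown>) => unknown;

function runToolCall(call: ToolCall, tools: Map<string, ToolHandler>): InputEvent {
  const handler = tools.get(call.tool);
  if (handler === undefined) {
    return { type: "tool_result", id: call.id, ok: false, error: `unknown tool: ${call.tool}` };
  }
  try {
    // Echo the call id so the remote can match the result to its request.
    return { type: "tool_result", id: call.id, ok: true, value: handler(call.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { type: "tool_result", id: call.id, ok: false, error: message };
  }
}
```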

DELETE /v1/sessions/{id}

Tears down the session. Called on normal exit or kill.

Authentication

Two supported schemes:
  • Bearer token: the adapter sends Authorization: Bearer <token> on every request. The token is read from the env var named in auth.tokenEnv. Rotate the token the same way you rotate any other secret; the adapter picks it up on the next session (not on in-flight ones).
  • Mutual TLS: the adapter presents a client cert and the remote verifies. Configure cert paths in auth.certPath and auth.keyPath. Use this when running inside a service mesh.
Running without auth is also supported on trusted networks, but avoid it outside development.
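A sketch of how per-request headers could be derived from the auth config. Only "bearer" appears as a type value in the config example; "mtls" and "none" are assumed names. The environment is passed in as a plain object so the token is resolved each time the function is called, which matches the behavior that a rotated token takes effect on the next session.

```typescript
// ASSUMPTION: "mtls" and "none" are illustrative type values, not confirmed ones.
type AuthConfig =
  | { type: "bearer"; tokenEnv: string }
  | { type: "mtls"; certPath: string; keyPath: string }
  | { type: "none" };

function buildAuthHeaders(
  auth: AuthConfig,
  env: Record<string, string | undefined>,
): Record<string, string> {
  switch (auth.type) {
    case "bearer": {
      const token = env[auth.tokenEnv];
      if (!token) {
        throw new Error(`environment variable ${auth.tokenEnv} is not set`);
      }
      return { Authorization: `Bearer ${token}` };
    }
    case "mtls":
      // The client certificate is presented at the TLS layer, not in a header.
      return {};
    case "none":
      return {};
  }
}
```

Calling it once per session, for example with `process.env` in Node, and reusing the result for that session's requests keeps in-flight sessions on the token they started with.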

Cost reporting

Remote agents report cost in their events the same way the Process adapter does: each progress or complete event can include a cost field with the model and token counts. The adapter looks up the configured costModel in the pricing table and computes the cost in USD. If the remote's pricing does not fit a per-token table entry (for example, Pi runs its own engine with custom pricing), the remote can report the cost in USD directly:
{"type": "complete", "output": {...}, "cost": {"usd": 0.42, "notes": "..."}}
The adapter takes the USD at face value and skips the model lookup.
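The two cost paths (a direct USD figure versus a pricing-table lookup) can be sketched as a single function. The token-count field names and the per-million-token pricing shape are assumptions made for illustration; the actual pricing table format is not described here.

```typescript
// ASSUMPTION: pricing is expressed per million input/output tokens.
interface PricingEntry {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// ASSUMPTION: inputTokens/outputTokens are illustrative field names.
type CostReport =
  | { usd: number; notes?: string }
  | { model: string; inputTokens: number; outputTokens: number };

function costInUsd(
  cost: CostReport,
  costModel: string,
  pricing: Record<string, PricingEntry>,
): number {
  // A direct USD figure is taken at face value; no lookup happens.
  if ("usd" in cost) {
    return cost.usd;
  }
  const entry = pricing[costModel];
  if (entry === undefined) {
    throw new Error(`no pricing entry for cost model "${costModel}"`);
  }
  return (
    (cost.inputTokens * entry.inputPerMillionUsd +
      cost.outputTokens * entry.outputPerMillionUsd) /
    1e6
  );
}
```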

Pooling and concurrency

The HTTP adapter maintains a pool of sessions per remote, sized by maxConcurrentSessions. If the pool is full, new runs wait until a session becomes available.
{
  "pi": {
    "type": "http",
    "baseUrl": "...",
    "maxConcurrentSessions": 8
  }
}
A single GPU box might support 8 concurrent Pi sessions; set the number to match.
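The wait-when-full behavior is essentially a counting semaphore. The sketch below shows that general technique rather than the adapter's actual pool; `SessionPool`, `acquire`, and `release` are hypothetical names.

```typescript
class SessionPool {
  private readonly max: number;
  private active = 0;
  // Resolvers for runs waiting on a free slot, oldest first.
  private readonly waiters: Array<() => void> = [];

  constructor(maxConcurrentSessions: number) {
    this.max = maxConcurrentSessions;
  }

  // Resolves immediately if a slot is free, otherwise when one is released.
  acquire(): Promise<void> {
    if (this.active < this.max) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Frees a slot, or hands it directly to the oldest waiter.
  release(): void {
    const next = this.waiters.shift();
    if (next !== undefined) {
      next(); // slot changes owner, so the active count stays the same
    } else if (this.active > 0) {
      this.active--;
    }
  }

  get activeCount(): number {
    return this.active;
  }

  get waitingCount(): number {
    return this.waiters.length;
  }
}
```

A run would `await pool.acquire()` before creating its session and call `pool.release()` in a `finally` block once the session is deleted, so a failed run does not leak a slot.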

Failure modes

  • Remote unreachable at start: the run fails fast with remote_unavailable. The task is requeued per the workflow’s retry policy.
  • Remote unreachable mid-run: the adapter marks the run as failed and releases the lease. The lease reaper on the orchestrator side then picks up any stranded state.
  • Lease expired while remote is busy: the remote’s next write is rejected with a fencing token error, and the remote is expected to abort the run on its side. If it does not, the adapter force-kills the session by calling DELETE /v1/sessions/{id} and flags the run.
  • Auth failure: the session creation fails with 401. The run is marked failed with a clear “auth” reason.
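The lease-expiry case depends on fencing tokens. How the orchestrator checks them is not described here; the sketch below shows the general technique, in which a write carrying a token older than the newest one seen is rejected. `FencedStore` is a hypothetical name.

```typescript
class FencedStore {
  // Highest fencing token accepted so far.
  private highestToken = 0;

  // Applies the write only if the caller's lease is not stale.
  write(fencingToken: number, apply: () => void): boolean {
    if (fencingToken < this.highestToken) {
      // A newer lease holder has already written; reject the stale writer.
      return false;
    }
    this.highestToken = fencingToken;
    apply();
    return true;
  }
}
```

With tokens issued in increasing order per lease, a remote still holding token 17 is rejected as soon as a holder of token 18 has written, which is the signal for the remote to abort.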

Writing a remote that targets this adapter

To build a remote agent service:
  1. Expose the four endpoints above
  2. Accept session creation
  3. Run your agent loop
  4. Emit SSE events for progress, tool calls, comments, and completion
  5. Accept input events for tool results and shutdown
  6. Clean up on DELETE /v1/sessions/{id}
The SDK includes a TypeScript reference implementation in packages/adapters/sdk/http/server.ts that you can use as a starting point.
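Independent of the reference implementation, the two pieces every remote needs are request routing for the four endpoints and SSE framing for outgoing events. A framework-agnostic sketch follows; `matchRoute`, `Route`, and `formatSseEvent` are hypothetical names and are not taken from the SDK.

```typescript
type Route =
  | { name: "createSession" }
  | { name: "events"; sessionId: string }
  | { name: "input"; sessionId: string }
  | { name: "deleteSession"; sessionId: string }
  | { name: "notFound" };

// Maps an HTTP method and path onto one of the four protocol endpoints.
function matchRoute(method: string, path: string): Route {
  if (method === "POST" && path === "/v1/sessions") {
    return { name: "createSession" };
  }
  const match = /^\/v1\/sessions\/([^/]+)(\/events|\/input)?$/.exec(path);
  if (match === null) {
    return { name: "notFound" };
  }
  const [, sessionId = "", suffix] = match;
  if (method === "GET" && suffix === "/events") {
    return { name: "events", sessionId };
  }
  if (method === "POST" && suffix === "/input") {
    return { name: "input", sessionId };
  }
  if (method === "DELETE" && suffix === undefined) {
    return { name: "deleteSession", sessionId };
  }
  return { name: "notFound" };
}

// Serializes one event in SSE framing: event line, data line, blank line.
function formatSseEvent(event: string, data: unknown): string {
  // JSON.stringify without indentation never emits a newline, so one data line suffices.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

The events handler would respond with Content-Type: text/event-stream and write `formatSseEvent("progress", ...)` strings to the response as the agent loop advances.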

Next