- Pi — the Pi inference engine, usually on a GPU host
- Hermes — the Hermes remote agent service
- Any custom remote agent that implements the HTTP adapter protocol
Why not just use the Process adapter remotely?
You could run a Process adapter on the remote host and point it at the same orchestrator. That works for one-off setups. The HTTP adapter is better when:
- The remote runs as a long-lived service (not per-run)
- The runtime is expensive to start (GPU warm-up, model load) and you want to pool it across runs
- The remote is behind a firewall where only HTTP is allowed
- You want the remote to run as a managed service with its own scale-out independent of the orchestrator host
Configuring the HTTP adapter
In config.json:
- baseUrl: where the remote listens
- auth: how the adapter authenticates (bearer token, mTLS, or none for trusted networks)
- protocol: the adapter protocol version; right now only ca-http-v1 is defined
- costModel: the pricing table key for cost attribution
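As an illustration, the fields above could be modeled and checked as in the sketch below. This is not the adapter's actual schema: the `type` discriminator inside `auth`, the example URL, the env var name, and the `validateHttpAdapterConfig` helper are assumptions. Only `baseUrl`, `auth`, `protocol`, `costModel`, `auth.tokenEnv`, `auth.certPath`, and `auth.keyPath` come from this page.

```typescript
// Hypothetical typing of the HTTP adapter block in config.json.
// The `type` discriminator is an assumption made for this sketch.
type HttpAdapterAuth =
  | { type: "bearer"; tokenEnv: string }
  | { type: "mtls"; certPath: string; keyPath: string }
  | { type: "none" };

interface HttpAdapterConfig {
  baseUrl: string; // where the remote listens
  auth: HttpAdapterAuth; // how the adapter authenticates
  protocol: string; // adapter protocol version
  costModel: string; // pricing table key for cost attribution
}

// Returns a list of human-readable problems; an empty list means the
// config passed these (illustrative) checks.
function validateHttpAdapterConfig(cfg: HttpAdapterConfig): string[] {
  const problems: string[] = [];
  if (!/^https?:\/\//.test(cfg.baseUrl)) {
    problems.push("baseUrl must start with http:// or https://");
  }
  if (cfg.protocol !== "ca-http-v1") {
    problems.push(`unsupported protocol: ${cfg.protocol}`);
  }
  if (cfg.auth.type === "bearer" && cfg.auth.tokenEnv.length === 0) {
    problems.push("auth.tokenEnv must name an environment variable");
  }
  if (cfg.costModel.length === 0) {
    problems.push("costModel must name a pricing table key");
  }
  return problems;
}

// Placeholder values, not defaults shipped with the adapter.
const exampleConfig: HttpAdapterConfig = {
  baseUrl: "https://agents.example.internal",
  auth: { type: "bearer", tokenEnv: "REMOTE_AGENT_TOKEN" },
  protocol: "ca-http-v1",
  costModel: "example-model",
};
```

Validating the config before the first session is created would surface typos such as an unknown protocol version before a run starts.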
The HTTP protocol
The adapter calls four endpoints on the remote:
POST /v1/sessions
Creates a new agent session. The adapter calls this at the
start of a run, passing:
GET /v1/sessions/{id}/events
Opens a Server-Sent Events stream. The remote emits events in
the same JSON shape as the Process adapter’s stdio protocol:
POST /v1/sessions/{id}/input
The adapter sends events back to the remote (tool results,
budget updates, shutdown signals):
DELETE /v1/sessions/{id}
Tears down the session. Called on normal exit or kill.
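The session lifecycle above can be sketched from the adapter's side as a set of request descriptors plus a small parser for the event stream. This is a rough illustration, not the adapter's code: `sessionRequests` and `parseSseEvents` are hypothetical helpers, the parser assumes it is handed a buffer containing only complete events, and the `type` field used in the usage note is an assumed event field.

```typescript
// Describes one HTTP call the adapter makes; no network I/O happens here.
interface RequestSpec {
  method: "GET" | "POST" | "DELETE";
  path: string;
}

interface SessionRequests {
  create: RequestSpec;
  events: RequestSpec;
  input: RequestSpec;
  teardown: RequestSpec;
}

// Builds the four calls for one session, using the paths listed above.
function sessionRequests(sessionId: string): SessionRequests {
  const id = encodeURIComponent(sessionId);
  return {
    create: { method: "POST", path: "/v1/sessions" },
    events: { method: "GET", path: `/v1/sessions/${id}/events` },
    input: { method: "POST", path: `/v1/sessions/${id}/input` },
    teardown: { method: "DELETE", path: `/v1/sessions/${id}` },
  };
}

// Splits a Server-Sent Events buffer into JSON payloads.
// In the SSE format, events are separated by a blank line and the payload
// is carried on one or more "data:" lines, which are joined with "\n".
// This sketch ignores "event:", "id:", "retry:", and comment lines.
function parseSseEvents(buffer: string): unknown[] {
  const events: unknown[] = [];
  for (const block of buffer.split(/\r?\n\r?\n/)) {
    const dataLines = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length > 0) {
      events.push(JSON.parse(dataLines.join("\n")));
    }
  }
  return events;
}
```

For example, `parseSseEvents('data: {"type":"progress"}\n\n')` yields a single parsed object. A real client would also need to buffer partial events across network chunks.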
Authentication
Two supported schemes:
- Bearer token: the adapter sends Authorization: Bearer <token> on every request. The token is read from the env var named in auth.tokenEnv. Rotate the token the same way you rotate any other secret; the adapter picks it up on the next session (not on in-flight ones).
- Mutual TLS: the adapter presents a client cert and the remote verifies. Configure cert paths in auth.certPath and auth.keyPath. Use this when running inside a service mesh.
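A minimal sketch of how the bearer header might be derived from the config. The `AuthConfig` shape (in particular the `type` discriminator) and the `buildAuthHeaders` helper are assumptions for illustration. The environment is passed in as a plain object rather than read from `process.env` so the lookup is easy to test.

```typescript
// Hypothetical auth config shape; only tokenEnv, certPath, and keyPath
// are named on this page.
type AuthConfig =
  | { type: "bearer"; tokenEnv: string }
  | { type: "mtls"; certPath: string; keyPath: string }
  | { type: "none" };

// Builds the headers to attach to each request of a session.
function buildAuthHeaders(
  auth: AuthConfig,
  env: Record<string, string | undefined>
): Record<string, string> {
  if (auth.type !== "bearer") {
    // mTLS authenticates at the TLS layer, and "none" sends nothing.
    return {};
  }
  const token = env[auth.tokenEnv];
  if (token === undefined || token === "") {
    throw new Error(`environment variable ${auth.tokenEnv} is not set`);
  }
  return { Authorization: `Bearer ${token}` };
}
```

Calling the helper once when a session is created, and reusing the result for that session's requests, would match the rotation behavior described above: a rotated token takes effect on the next session, not on in-flight ones.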
Cost reporting
Remote agents report cost in their events the same way the Process adapter does: each progress or complete event can
include a cost field with model and token counts. The
adapter looks up the costModel in the pricing table and
computes USD.
If the remote does not have a predictable model (for example,
Pi runs its own engine with custom pricing), the remote can
send cost in USD directly:
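The two reporting paths can be sketched as follows. The field names (`inputTokens`, `outputTokens`, `usd`), the per-million-token pricing unit, and the `computeCostUsd` helper are assumptions for illustration; this page does not specify the actual event or pricing table schema.

```typescript
// Hypothetical shapes: the real cost field and pricing table may differ.
type CostReport =
  | { model: string; inputTokens: number; outputTokens: number } // token counts
  | { usd: number }; // cost already expressed in USD by the remote

interface Price {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Resolves a cost report to USD, either by passing through a direct USD
// amount or by pricing the token counts with the costModel entry.
function computeCostUsd(
  report: CostReport,
  pricing: Record<string, Price>,
  costModel: string
): number {
  if ("usd" in report) {
    return report.usd;
  }
  const price = pricing[costModel];
  if (price === undefined) {
    throw new Error(`no pricing entry for costModel "${costModel}"`);
  }
  return (
    (report.inputTokens / 1e6) * price.inputPerMillionUsd +
    (report.outputTokens / 1e6) * price.outputPerMillionUsd
  );
}
```

With an assumed price of 2 USD per million input tokens and 4 USD per million output tokens, a report of 500,000 input and 250,000 output tokens comes to 1 + 1 = 2 USD.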
Pooling and concurrency
The HTTP adapter maintains a pool of sessions per remote, sized by maxConcurrentSessions. If the pool is full, new runs wait
until a session becomes available.
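The waiting behavior can be pictured as a counting semaphore with a FIFO queue. The sketch below is one possible shape, not the adapter's implementation: the `SessionPool` class, its callback-based API, and the FIFO ordering are all assumptions.

```typescript
// At most `max` holders at once; extra callers queue in arrival order.
class SessionPool {
  private active = 0;
  private readonly max: number;
  private readonly waiters: Array<() => void> = [];

  constructor(max: number) {
    if (!Number.isInteger(max) || max < 1) {
      throw new Error("max must be a positive integer");
    }
    this.max = max;
  }

  // Runs `onReady` immediately if a slot is free, otherwise once one is released.
  acquire(onReady: () => void): void {
    if (this.active < this.max) {
      this.active += 1;
      onReady();
    } else {
      this.waiters.push(onReady);
    }
  }

  // Frees a slot; if a run is waiting, the slot passes straight to it.
  release(): void {
    if (this.active === 0) {
      throw new Error("release without acquire");
    }
    const next = this.waiters.shift();
    if (next !== undefined) {
      next(); // active count is unchanged: the slot changes hands
    } else {
      this.active -= 1;
    }
  }

  // Number of runs currently waiting for a slot.
  get waiting(): number {
    return this.waiters.length;
  }
}
```

A pool created with `new SessionPool(2)` lets two runs proceed at once; a third `acquire` is queued until one of the first two calls `release`.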
Failure modes
- Remote unreachable at start: the run fails fast with remote_unavailable. The task is requeued per the workflow’s retry policy.
- Remote unreachable mid-run: the adapter marks the run as failed, releases the lease, and the lease reaper on the orchestrator side picks up any stranded state.
- Lease expired while remote is busy: the remote’s next write is rejected with a fencing token error, and the remote is expected to abort the run on its side. If it does not, the adapter force-kills the session by calling DELETE and flagging the run.
- Auth failure: the session creation fails with 401. The run is marked failed with a clear “auth” reason.
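The two start-of-run failures can be told apart from the result of the session creation call alone. The sketch below shows one way to do that. The `CreateResult` and `RunFailure` types, the `classifyCreateFailure` helper, the `http_<status>` fallback reason, and the choice not to requeue auth failures are assumptions; only the `remote_unavailable` and "auth" reasons and the 401 status come from the list above.

```typescript
// Outcome of trying to create a session (POST /v1/sessions).
type CreateResult =
  | { kind: "network_error" } // e.g. connection refused, DNS failure, timeout
  | { kind: "http"; status: number };

interface RunFailure {
  reason: string;
  requeue: boolean; // whether to hand the task back to the retry policy
}

// Returns null when session creation succeeded.
function classifyCreateFailure(result: CreateResult): RunFailure | null {
  if (result.kind === "network_error") {
    return { reason: "remote_unavailable", requeue: true };
  }
  if (result.status === 401) {
    // Assumption: retrying with the same bad credentials would not help.
    return { reason: "auth", requeue: false };
  }
  if (result.status >= 200 && result.status < 300) {
    return null;
  }
  // Assumption: other statuses are surfaced verbatim and not requeued.
  return { reason: `http_${result.status}`, requeue: false };
}
```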
Writing a remote that targets this adapter
To build a remote agent service:
- Expose the four endpoints above
- Accept session creation
- Run your agent loop
- Emit SSE events for progress, tool calls, comments, and completion
- Accept input events for tool results and shutdown
- Clean up on DELETE /v1/sessions/{id}
See packages/adapters/sdk/http/server.ts for a server you can copy from.
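The routing a remote needs for the four endpoints can be sketched as a pure function that any HTTP framework could call. The `Route` type and `matchRoute` helper are hypothetical and are not taken from the SDK; session creation, the agent loop, and event emission are left out.

```typescript
// Which of the four protocol endpoints a request targets.
type Route =
  | { name: "createSession" }
  | { name: "events"; id: string }
  | { name: "input"; id: string }
  | { name: "deleteSession"; id: string };

// Maps an HTTP method and path to a protocol endpoint, or null if the
// request does not belong to the protocol.
function matchRoute(method: string, path: string): Route | null {
  if (method === "POST" && path === "/v1/sessions") {
    return { name: "createSession" };
  }
  const m = /^\/v1\/sessions\/([^/]+)(?:\/(events|input))?$/.exec(path);
  if (m === null) {
    return null;
  }
  const id = m[1];
  const tail = m[2]; // undefined at runtime when the path ends at the session id
  if (id === undefined) {
    return null;
  }
  if (method === "GET" && tail === "events") {
    return { name: "events", id };
  }
  if (method === "POST" && tail === "input") {
    return { name: "input", id };
  }
  if (method === "DELETE" && tail === undefined) {
    return { name: "deleteSession", id };
  }
  return null;
}
```

Keeping the matcher free of I/O makes it straightforward to unit test before wiring it into a server.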
Next
- Process adapter for local runtimes
- Creating an adapter for runtimes that do not fit either pattern