Add a Provider
SandboxProvider interface, capabilities, and registry.
The donmai runtime's capability matrix and base-contract validator live in the OSS layer. See donmai.dev/docs/sandbox-capability-matrix for the complementary OSS reference. This page covers the Rensei platform's provider interface, registration, and capability routing.
Rensei routes agent sessions to execution sandboxes through a typed provider interface. This page describes how to implement and register a SandboxProvider, what the capability matrix controls, and how the runner-in-box execution model affects your implementation.
Dispatch flow
Every agent session travels this path from trigger to worker:
flowchart TD
A([Workflow trigger / dispatch API]) --> B[Resolve model profile]
B --> C{Auth mode\nlocal / host-session?}
C -- yes --> D[validateResolvedProfile\nlocal pool required]
C -- no --> E[Filter active pools\nby orgId + providerId]
D --> E
E --> F[filterPoolsForRuntimeRequirements\nagent card substrate vs pool capabilities]
F --> G{Pool found?}
G -- no --> H([Dispatch error - no eligible pool])
G -- yes --> I[Select pool\nhighest-priority match]
I --> J[SandboxProvider.provision\nspec incl. toolchainDemand]
J --> K[SandboxHandle returned\nexternalId, metadata]
K --> L[Work item queued\nregistrationToken injected]
L --> M([donmai runtime polls,\nclaims, installs kits, runs agent])
The scheduler never calls a provider directly by identity - it filters by capability flags, substrate requirements, and auth-mode constraints, then calls provision on the winning pool's provider. Your provider only needs to implement the interface; routing is handled by the platform.
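The selection steps in the flow above could be sketched roughly as follows. This is an illustration, not the platform's scheduler code: the PoolRow and DispatchRequest shapes, the numeric priority field (larger wins), and selectPool are hypothetical names, and the providerId filter from the diagram is left out for brevity. Only the order of the checks follows the diagram.

```typescript
// Hypothetical pool row; field names are assumptions for illustration.
interface PoolRow {
  id: string
  orgId: string
  providerId: string
  active: boolean
  priority: number // assumption: larger number = higher priority
  requirementKinds: string[] // requirement kinds the pool can satisfy
}

// Hypothetical dispatch input.
interface DispatchRequest {
  orgId: string
  authMode: string // e.g. 'local', 'host-session', 'byok'
  requires: string[] // agent card requirement kinds
}

// Auth modes that pin to providerId='local' pools.
const LOCAL_ONLY_AUTH_MODES = new Set(['local', 'host-session'])

function selectPool(pools: PoolRow[], req: DispatchRequest): PoolRow {
  const eligible = pools
    // 1. Active pools that belong to the dispatching org.
    .filter((p) => p.active && p.orgId === req.orgId)
    // 2. Auth-mode pin: local-only modes may use only local pools.
    .filter((p) => !LOCAL_ONLY_AUTH_MODES.has(req.authMode) || p.providerId === 'local')
    // 3. Substrate check: every requirement must be provided by the pool.
    .filter((p) => req.requires.every((kind) => p.requirementKinds.includes(kind)))
    // 4. Highest-priority match first.
    .sort((a, b) => b.priority - a.priority)

  const winner = eligible[0]
  if (!winner) throw new Error('Dispatch error - no eligible pool')
  return winner
}

const pools: PoolRow[] = [
  { id: 'p-local', orgId: 'org1', providerId: 'local', active: true, priority: 1, requirementKinds: ['git', 'host-binary'] },
  { id: 'p-e2b', orgId: 'org1', providerId: 'e2b', active: true, priority: 5, requirementKinds: ['git', 'long-running'] },
  { id: 'p-off', orgId: 'org1', providerId: 'docker', active: false, priority: 9, requirementKinds: ['git'] },
]

// A byok session that only needs git lands on the highest-priority active pool.
const chosen = selectPool(pools, { orgId: 'org1', authMode: 'byok', requires: ['git'] })
```

Here chosen is the p-e2b pool: p-off has a higher priority but is inactive, so it never reaches the sort.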
The SandboxProvider interface
Every sandbox provider implements SandboxProvider from src/lib/providers/sandbox/types.ts.
export interface SandboxProvider {
  readonly id: string // stable identifier - matches pool `providerId` column
  readonly displayName: string
  readonly capabilities: SandboxProviderCapabilities
  provision(spec: SandboxSpec): Promise<SandboxHandle>
  status(handle: SandboxHandle): Promise<SandboxStatus>
  terminate(handle: SandboxHandle): Promise<void>
  // Optional
  streamLogs?(handle: SandboxHandle): AsyncIterable<string>
  execCommand?(
    handle: SandboxHandle,
    command: string,
    env?: Record<string, string>,
  ): Promise<SandboxExecResult>
}
Required methods
provision(spec) - allocate a sandbox and return a SandboxHandle. The spec carries projectId, orgId, registrationToken, injected env, optional image / resources / toolchainDemand, and config (from the projects.sandbox_config JSONB column). If your provider is credential-gated, call resolveProviderCredential(orgId, 'yourProvider') to retrieve the stored API key.
status(handle) - return one of 'provisioning' | 'ready' | 'running' | 'terminated' | 'failed'.
terminate(handle) - shut down and clean up the sandbox. Call emitSandboxCostEvent({ handle, orgId }) here (non-fatal, fire-and-forget) so wall-clock costs land in the billing ledger.
Optional methods
streamLogs(handle) - async-iterable log stream for the session detail view.
execCommand(handle, command, env) - run a one-shot shell command inside a provisioned sandbox. Only E2B implements this today. This is not the kit-install path (see Runner-in-box below).
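Because streamLogs and execCommand are optional, any caller has to check for them before use. A minimal sketch of that check is below. ProviderLike is a trimmed local mirror of the interface, optionalFeatures is a hypothetical helper, and the SandboxExecResult fields are assumed, since the real result shape is not shown on this page.

```typescript
// Trimmed local mirror of the types above so the sketch is self-contained.
interface SandboxHandle {
  providerId: string
  externalId: string
  metadata: Record<string, unknown>
}

// Assumption: the real SandboxExecResult shape is not documented here.
interface SandboxExecResult {
  exitCode: number
  stdout: string
}

interface ProviderLike {
  readonly id: string
  streamLogs?(handle: SandboxHandle): AsyncIterable<string>
  execCommand?(
    handle: SandboxHandle,
    command: string,
    env?: Record<string, string>,
  ): Promise<SandboxExecResult>
}

// Hypothetical helper: report which optional methods a provider implements.
function optionalFeatures(provider: ProviderLike): { logs: boolean; exec: boolean } {
  return {
    logs: typeof provider.streamLogs === 'function',
    exec: typeof provider.execCommand === 'function',
  }
}

// A provider that implements execCommand only.
const withExec: ProviderLike = {
  id: 'with-exec',
  async execCommand() {
    return { exitCode: 0, stdout: '' }
  },
}

// A provider that implements neither optional method.
const bare: ProviderLike = { id: 'bare' }

const withExecFeatures = optionalFeatures(withExec)
const bareFeatures = optionalFeatures(bare)
```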
SandboxHandle
interface SandboxHandle {
  providerId: string // must match SandboxProvider.id
  externalId: string // provider-side ID (container ID, job name, sandbox UUID…)
  metadata: Record<string, unknown>
}
Always store orgId in metadata. The terminate and status methods need it to resolve credentials on subsequent calls.
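Since metadata is typed as Record<string, unknown>, a small guard can make a missing orgId fail loudly instead of producing an empty credential lookup. The helper below is a sketch; requireOrgId is a hypothetical name, not a platform function.

```typescript
// Local copy of the handle type so the sketch is self-contained.
interface SandboxHandle {
  providerId: string
  externalId: string
  metadata: Record<string, unknown>
}

// Hypothetical helper: read orgId from handle metadata, throwing if it is
// missing or not a non-empty string.
function requireOrgId(handle: SandboxHandle): string {
  const orgId = handle.metadata.orgId
  if (typeof orgId !== 'string' || orgId.length === 0) {
    throw new Error(
      `Handle ${handle.providerId}/${handle.externalId} has no orgId in metadata`,
    )
  }
  return orgId
}

const goodHandle: SandboxHandle = {
  providerId: 'my-custom',
  externalId: 'sbx-123',
  metadata: { orgId: 'org_abc' },
}

const resolvedOrgId = requireOrgId(goodHandle)
```

Calling this at the top of status and terminate keeps the `as string` cast out of provider code.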
SandboxSpec
interface SandboxSpec {
  projectId: string
  orgId: string
  registrationToken: string // one-time rsp_live_... token injected into worker env
  env: Record<string, string>
  image?: string
  resources?: { cpu?: string; memoryMB?: number; timeoutSec?: number }
  config: Record<string, unknown> // from projects.sandbox_config JSONB
  toolchainDemand?: ComposedToolchainDemand // runner-in-box kit demand
}
Capability matrix
The scheduler routes sessions to pools using SandboxProviderCapabilities. Declare these accurately - the scheduler trusts the declaration, so an over-declared capability surfaces only as a failure at session time.
Substrate capabilities
Beyond the capability matrix, the scheduler matches pools against agent card substrate requirements. These are declared per-pool in the execution_provider_pools.runtime_provides JSONB column, or derived from provider-class defaults when the column is null.
Runtime kinds
Declare which AgentCard.runtimes[].kind values the pool can satisfy:
| Kind | Description |
|---|---|
| native | Binary process on the host |
| npm | Node.js / npm package |
| python-pip | Python pip package |
| http | HTTP service |
| mcp-server | MCP protocol server |
| a2a-protocol | A2A protocol agent |
| host-binary | Host-installed binary (local only) |
| workarea | Persistent filesystem workarea |
| vendor-hosted | External SaaS (no local compute) |
| langchain-runnable | LangChain runnable |
Requirement kinds
Declare which AgentCard.requires[].kind values the pool can satisfy:
| Kind | Class default providers |
|---|---|
| persistent-storage | local, docker, kubernetes |
| long-running | local, docker, kubernetes, e2b |
| workarea | local, docker, kubernetes |
| host-binary | local only |
| network-egress | All cloud providers; local subject to host policy |
| gpu | Operator-declared only - no class default |
| git | All providers |
| full-history-clone | All providers |
| toolchain:go | local, docker, kubernetes (baked in worker image) |
| toolchain:node | local, docker, kubernetes, vercel (Node baked by default) |
For e2b, daytona, and modal, toolchain:go and toolchain:node are not class defaults. The bare base template bakes neither the donmai binary nor any language toolchain. Operators must bake the toolchain into the pool template and declare it via a runtime_provides override. See Capacity Pools.
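The effect of a runtime_provides override on substrate matching can be sketched as follows. This is an assumption-laden illustration: CLASS_DEFAULTS holds only a partial copy of the requirement-kind table above (runtime-kind defaults are not listed on this page, so they are left empty), deriveCapabilitiesSketch and satisfiesRequirements are hypothetical names, and treating a malformed override as absent is a guess rather than documented behaviour.

```typescript
interface PoolCapabilities {
  runtimeKinds: string[]
  requirementKinds: string[]
}

// Partial, illustrative class defaults copied from the requirement table.
// Runtime-kind defaults are not listed on this page, so they stay empty.
const CLASS_DEFAULTS: Record<string, PoolCapabilities> = {
  docker: {
    runtimeKinds: [],
    requirementKinds: [
      'persistent-storage', 'long-running', 'workarea',
      'git', 'full-history-clone', 'toolchain:go', 'toolchain:node',
    ],
  },
  e2b: {
    runtimeKinds: [],
    requirementKinds: ['long-running', 'network-egress', 'git', 'full-history-clone'],
  },
}

function isCapabilities(value: unknown): value is PoolCapabilities {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return Array.isArray(v.runtimeKinds) && Array.isArray(v.requirementKinds)
}

// Override wins when present; otherwise class default; otherwise empty.
function deriveCapabilitiesSketch(providerId: string, override: unknown): PoolCapabilities {
  if (isCapabilities(override)) return override
  return CLASS_DEFAULTS[providerId] ?? { runtimeKinds: [], requirementKinds: [] }
}

// A pool matches when it provides every requirement kind the agent card asks for.
function satisfiesRequirements(pool: PoolCapabilities, required: string[]): boolean {
  return required.every((kind) => pool.requirementKinds.includes(kind))
}

const needsNode = ['git', 'toolchain:node']

// Bare e2b pool: toolchain:node is not a class default, so it does not match.
const bareE2b = satisfiesRequirements(deriveCapabilitiesSketch('e2b', null), needsNode)

// Same pool with an operator override that declares the baked toolchain.
const overriddenE2b = satisfiesRequirements(
  deriveCapabilitiesSketch('e2b', {
    runtimeKinds: ['npm'],
    requirementKinds: ['long-running', 'network-egress', 'git', 'toolchain:node'],
  }),
  needsNode,
)
```

Note that the override replaces the class default rather than merging with it, so an override has to list every kind the pool should advertise.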
deriveCapabilities resolution order
// 1. runtime_provides column is non-null → operator override wins
// 2. PROVIDER_CLASS_DEFAULT_CAPABILITIES[providerId] → class default
// 3. Unknown provider → { runtimeKinds: [], requirementKinds: [] }
deriveCapabilities(providerId: string, runtimeProvidesOverride: unknown)
Runner-in-box execution model
execCommand is not the kit-install path. Do not route toolchain installation through it.
Cloud sandboxes boot bare. The platform threads ComposedToolchainDemand onto the work item the donmai runtime polls. The runner - running inside the box with donmai agent run as its entrypoint - executes installCommands then postAcquire scripts via its local shell, after cloning the repo. A non-zero exit aborts the session.
Your provision() does not need to run any install steps. The box entrypoint must be donmai agent run:
- Docker / Kubernetes: baked into ghcr.io/renseiai/donmai-worker:latest as the image ENTRYPOINT.
- E2B / Daytona / Modal: operators must build a custom template that places the donmai binary at /usr/local/bin/donmai and sets donmai agent run as the start command.
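The ordering the runner follows (installCommands, then postAcquire, stopping at the first non-zero exit) could be modelled like this. It is a sketch of the sequencing only, not the donmai runtime's code: runToolchainPhases and the Exec callback are hypothetical, the command strings are placeholders, and repo cloning is not modelled.

```typescript
// Only the two fields the sketch needs from the kit demand.
interface ToolchainDemandLike {
  installCommands: string[]
  postAcquire: string[]
}

// Runs one shell command and returns its exit code. Injected so the sketch
// does not depend on any particular process API.
type Exec = (command: string) => number

interface PhaseResult {
  ok: boolean
  ran: string[] // commands that were attempted, in order
  failedAt?: string // first command that exited non-zero, if any
}

function runToolchainPhases(demand: ToolchainDemandLike, exec: Exec): PhaseResult {
  const ran: string[] = []
  // installCommands run first, postAcquire hooks second.
  for (const command of [...demand.installCommands, ...demand.postAcquire]) {
    ran.push(command)
    if (exec(command) !== 0) {
      // A non-zero exit aborts: nothing after this command runs.
      return { ok: false, ran, failedAt: command }
    }
  }
  return { ok: true, ran }
}

// Placeholder command strings, for illustration only.
const sampleDemand: ToolchainDemandLike = {
  installCommands: ['install-go', 'install-node'],
  postAcquire: ['npm ci', 'go mod download'],
}

// Simulate `npm ci` failing; `go mod download` is never attempted.
const aborted = runToolchainPhases(sampleDemand, (command) => (command === 'npm ci' ? 1 : 0))
```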
Kit demand shape (threaded, not executed by provision)
interface ComposedToolchainDemand {
  kits: string[] // "kitId@version" entries, in compose order
  os: string // 'linux' for all cloud providers
  installCommands: string[] // ordered, deduplicated base toolchain installs
  postAcquire: string[] // post-clone hooks (e.g. `npm ci`, `go mod download`)
  preRelease: string[] // teardown hooks, best-effort
  env: Record<string, string> // PATH augmentation, etc.
}
Registering your provider
Add your provider class to SANDBOX_PROVIDERS in src/lib/providers/sandbox/registry.ts. No database migration is required.
// registry.ts
import { MyCustomSandboxProvider } from './my-custom'
const SANDBOX_PROVIDERS: SandboxProvider[] = [
  new LocalSandboxProvider(),
  new DockerSandboxProvider(),
  // ... existing providers ...
  new MyCustomSandboxProvider(), // add here
]
The provider is immediately available. Create a capacity pool with providerId: 'my-custom' from Settings → Execution to begin routing sessions to it.
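Because a pool finds its provider through the providerId column, every id in the array has to be unique and stable. The check below is a sketch of how a duplicate could be caught early. indexProviders is a hypothetical helper, and whether the real registry performs such a check is not stated on this page.

```typescript
// Minimal shape needed for the check.
interface ProviderLike {
  readonly id: string
  readonly displayName: string
}

// Hypothetical helper: index providers by id and reject duplicate ids.
function indexProviders<T extends ProviderLike>(providers: T[]): Map<string, T> {
  const byId = new Map<string, T>()
  for (const provider of providers) {
    if (byId.has(provider.id)) {
      throw new Error(`Duplicate sandbox provider id: ${provider.id}`)
    }
    byId.set(provider.id, provider)
  }
  return byId
}

const registered = indexProviders([
  { id: 'local', displayName: 'Local' },
  { id: 'docker', displayName: 'Docker' },
  { id: 'my-custom', displayName: 'My Custom Provider' },
])

// Lookup by the value stored in a pool's providerId column.
const found = registered.get('my-custom')
```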
Auth-mode constraints
host-session and local auth modes pin to providerId='local' pools exclusively. Sessions using these modes fail dispatch if routed to any cloud provider pool.
validateResolvedProfile enforces this at dispatch time. Your provider does not need to handle this case - the platform rejects the session before calling provision.
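The pin rule itself is small enough to state as code. The function below is a sketch of the rule as described here, not the signature or body of validateResolvedProfile. checkAuthModePin is a hypothetical name, and the AuthMode union lists only the modes mentioned on this page.

```typescript
// Auth modes named on this page; the real set may be larger.
type AuthMode = 'local' | 'host-session' | 'byok' | 'metered' | 'shared'

// Hypothetical check: returns an error message when the pin is violated,
// or null when the pairing is allowed.
function checkAuthModePin(authMode: AuthMode, poolProviderId: string): string | null {
  const pinnedToLocal = authMode === 'local' || authMode === 'host-session'
  if (pinnedToLocal && poolProviderId !== 'local') {
    return `Auth mode '${authMode}' requires a providerId='local' pool, got '${poolProviderId}'`
  }
  return null
}

const hostSessionOnE2b = checkAuthModePin('host-session', 'e2b') // rejected
const hostSessionOnLocal = checkAuthModePin('host-session', 'local') // allowed
const byokOnE2b = checkAuthModePin('byok', 'e2b') // allowed
```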
Minimal provider skeleton
import type {
  SandboxProvider,
  SandboxProviderCapabilities,
  SandboxSpec,
  SandboxHandle,
  SandboxStatus,
} from '../types'
import { resolveProviderCredential } from '../../resolve-credential'
import { emitSandboxCostEvent } from '@/lib/billing/sandbox-cost-emission'
export class MyCustomSandboxProvider implements SandboxProvider {
  readonly id = 'my-custom'
  readonly displayName = 'My Custom Provider'
  readonly capabilities: SandboxProviderCapabilities = {
    transportModel: 'dial-out',
    supportsFsSnapshot: false,
    supportsPauseResume: false,
    supportsCapacityQuery: false,
    maxConcurrent: null,
    maxSessionDurationSeconds: null,
    regions: ['*'],
    os: ['linux'],
    arch: ['x86_64'],
    idleCostModel: 'zero',
    billingModel: 'wall-clock',
    maxVCpu: null,
    maxMemoryMb: null,
    supportsGpu: false,
    supportsCustomNetworkPolicy: false,
    egressDefault: 'allow-all',
    isA2ARemote: false,
  }
  async provision(spec: SandboxSpec): Promise<SandboxHandle> {
    const apiKey = await resolveProviderCredential(spec.orgId, 'my-custom')
    // ... call your provider API ...
    return {
      providerId: 'my-custom',
      externalId: 'provider-returned-id',
      metadata: {
        orgId: spec.orgId,
        provisionedAt: new Date().toISOString(),
      },
    }
  }
  async status(_handle: SandboxHandle): Promise<SandboxStatus> {
    return 'running'
  }
  async terminate(handle: SandboxHandle): Promise<void> {
    const orgId = (handle.metadata.orgId as string) ?? ''
    // ... stop the sandbox ...
    await emitSandboxCostEvent({ handle, orgId })
  }
}
Capability matrix at a glance
The table below shows how the seven built-in providers declare their capabilities. Use it as a reference when writing your own declaration.
| Capability | local | docker | kubernetes | e2b | daytona | modal | vercel |
|---|---|---|---|---|---|---|---|
| transportModel | either | either | either | dial-in | dial-in | dial-in | dial-in |
| supportsFsSnapshot | - | - | - | yes | yes | yes | yes |
| supportsPauseResume | - | - | - | yes | - | yes* | - |
| supportsCapacityQuery | yes | yes | yes | - | - | - | - |
| supportsGpu | - | - | - | - | - | yes | - |
| supportsCustomNetworkPolicy | - | yes | yes | - | - | - | yes |
| idleCostModel | zero | zero | metered | zero | storage-only | metered | metered |
| billingModel | fixed | fixed | fixed | wall-clock | wall-clock | wall-clock | active-cpu |
| maxSessionDurationSeconds | - | - | - | - | - | - | 18 000 s |
| egressDefault | allow-all | allow-all | allow-all | allow-all | allow-all | allow-all | allow-all |
*Modal pause/resume is preview-tier.
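To see how a declaration narrows routing, three rows of the table can be encoded and queried. This is a sketch under assumptions: a "-" in the table is read as false for boolean flags and as null (no declared ceiling) for maxSessionDurationSeconds, and SessionNeeds and providersMeeting are hypothetical names, not scheduler APIs.

```typescript
// Three rows from the table above, one entry per built-in provider.
interface CapabilityRow {
  supportsGpu: boolean
  supportsPauseResume: boolean
  maxSessionDurationSeconds: number | null // null = no declared ceiling
}

const BUILT_IN: Record<string, CapabilityRow> = {
  local: { supportsGpu: false, supportsPauseResume: false, maxSessionDurationSeconds: null },
  docker: { supportsGpu: false, supportsPauseResume: false, maxSessionDurationSeconds: null },
  kubernetes: { supportsGpu: false, supportsPauseResume: false, maxSessionDurationSeconds: null },
  e2b: { supportsGpu: false, supportsPauseResume: true, maxSessionDurationSeconds: null },
  daytona: { supportsGpu: false, supportsPauseResume: false, maxSessionDurationSeconds: null },
  modal: { supportsGpu: true, supportsPauseResume: true, maxSessionDurationSeconds: null },
  vercel: { supportsGpu: false, supportsPauseResume: false, maxSessionDurationSeconds: 18000 },
}

// Hypothetical description of what a session needs.
interface SessionNeeds {
  gpu?: boolean
  durationSeconds?: number
}

// Return the provider ids whose declared capabilities cover the needs.
function providersMeeting(needs: SessionNeeds): string[] {
  return Object.entries(BUILT_IN)
    .filter(([, caps]) => !needs.gpu || caps.supportsGpu)
    .filter(
      ([, caps]) =>
        needs.durationSeconds === undefined ||
        caps.maxSessionDurationSeconds === null ||
        needs.durationSeconds <= caps.maxSessionDurationSeconds,
    )
    .map(([id]) => id)
}

const gpuCapable = providersMeeting({ gpu: true })
const sixHourCapable = providersMeeting({ durationSeconds: 6 * 60 * 60 })
```

With these rows, gpuCapable contains only modal, and sixHourCapable contains every provider except vercel, whose 18 000 s ceiling is below the 21 600 s requested.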
End-to-end worked example
This walkthrough adds a hypothetical acme-cloud provider, registers it, and verifies sessions reach it.
1. Implement the provider
Create src/lib/providers/sandbox/acme/index.ts:
import type {
  SandboxProvider,
  SandboxProviderCapabilities,
  SandboxSpec,
  SandboxHandle,
  SandboxStatus,
} from '../types'
import { resolveProviderCredential } from '../../resolve-credential'
import { emitSandboxCostEvent } from '@/lib/billing/sandbox-cost-emission'
export class AcmeCloudSandboxProvider implements SandboxProvider {
  readonly id = 'acme-cloud'
  readonly displayName = 'Acme Cloud'
  readonly capabilities: SandboxProviderCapabilities = {
    transportModel: 'dial-out', // worker dials out via registration token
    supportsFsSnapshot: false,
    supportsPauseResume: false,
    supportsCapacityQuery: false,
    maxConcurrent: null,
    maxSessionDurationSeconds: null,
    regions: ['us-east-1'],
    os: ['linux'],
    arch: ['x86_64'],
    idleCostModel: 'zero',
    billingModel: 'wall-clock',
    maxVCpu: null,
    maxMemoryMb: null,
    supportsGpu: false,
    supportsCustomNetworkPolicy: false,
    egressDefault: 'allow-all',
    isA2ARemote: false,
  }
  async provision(spec: SandboxSpec): Promise<SandboxHandle> {
    // 1. Resolve the org's stored API key (Settings → Integrations)
    const apiKey = await resolveProviderCredential(spec.orgId, 'acme-cloud')
    // resolveProviderCredential returns null when no credential is stored
    if (!apiKey) {
      throw new Error('No acme-cloud credential configured for this org')
    }
    // 2. Call Acme's sandbox-create endpoint
    const response = await fetch('https://api.acme.example.com/v1/sandboxes', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        image: spec.image ?? 'ghcr.io/renseiai/donmai-worker:latest',
        env: spec.env, // includes RENSEI_REGISTRATION_TOKEN
        cpu: spec.resources?.cpu ?? '1.0',
        memoryMb: spec.resources?.memoryMB ?? 2048,
      }),
    })
    // Fail here rather than returning a handle with an undefined externalId
    if (!response.ok) {
      throw new Error(`Acme sandbox create failed: HTTP ${response.status}`)
    }
    const { sandboxId } = (await response.json()) as { sandboxId: string }
    // 3. Return a handle - always store orgId in metadata for later calls
    return {
      providerId: 'acme-cloud',
      externalId: sandboxId,
      metadata: { orgId: spec.orgId, provisionedAt: new Date().toISOString() },
    }
  }
  async status(handle: SandboxHandle): Promise<SandboxStatus> {
    const apiKey = await resolveProviderCredential(
      handle.metadata.orgId as string,
      'acme-cloud',
    )
    const response = await fetch(
      `https://api.acme.example.com/v1/sandboxes/${handle.externalId}`,
      { headers: { Authorization: `Bearer ${apiKey}` } },
    )
    const { state } = (await response.json()) as { state: string }
    // Map provider-specific states to the platform's SandboxStatus enum
    const STATUS_MAP: Record<string, SandboxStatus> = {
      creating: 'provisioning',
      starting: 'provisioning',
      running: 'running',
      stopped: 'terminated',
      failed: 'failed',
    }
    // Unrecognized provider states are treated as terminated
    return STATUS_MAP[state] ?? 'terminated'
  }
  async terminate(handle: SandboxHandle): Promise<void> {
    const orgId = handle.metadata.orgId as string
    const apiKey = await resolveProviderCredential(orgId, 'acme-cloud')
    await fetch(
      `https://api.acme.example.com/v1/sandboxes/${handle.externalId}`,
      {
        method: 'DELETE',
        headers: { Authorization: `Bearer ${apiKey}` },
      },
    )
    // Always emit cost events - non-fatal fire-and-forget
    await emitSandboxCostEvent({ handle, orgId })
  }
}
2. Register the provider
Add one line to src/lib/providers/sandbox/registry.ts:
import { AcmeCloudSandboxProvider } from './acme'
const SANDBOX_PROVIDERS: SandboxProvider[] = [
  new LocalSandboxProvider(),
  new DockerSandboxProvider(),
  // ... existing providers ...
  new AcmeCloudSandboxProvider(), // new
]
No database migration is required. The provider is immediately available to capacity pools.
3. Add credentials for the provider
In Settings → Integrations, add an acme-cloud credential. Operators can also set ACME_CLOUD_API_KEY as a platform environment variable as an org-level fallback.
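The lookup order implied here (stored org credential first, platform environment variable as fallback) could look like the sketch below. It is an assumption about precedence, not the body of resolveProviderCredential. resolveAcmeKeySketch is a hypothetical name, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Environment passed in explicitly instead of reading a global.
type Env = Record<string, string | undefined>

// Hypothetical resolution order: the org's stored credential wins, then the
// platform-level ACME_CLOUD_API_KEY, then null when neither is set.
function resolveAcmeKeySketch(orgCredential: string | null, env: Env): string | null {
  if (orgCredential) return orgCredential
  const fallback = env['ACME_CLOUD_API_KEY']
  return fallback && fallback.length > 0 ? fallback : null
}

// Placeholder key values, for illustration only.
const fromOrg = resolveAcmeKeySketch('org-key', { ACME_CLOUD_API_KEY: 'platform-key' })
const fromEnv = resolveAcmeKeySketch(null, { ACME_CLOUD_API_KEY: 'platform-key' })
const missing = resolveAcmeKeySketch(null, {})
```

A null result corresponds to the "Credential missing" case in the troubleshooting list, so a provider should check for it before building an Authorization header.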
4. Create a capacity pool
In Settings → Execution → Capacity, click New pool and choose acme-cloud. Give it a name and activate it. Optionally set a runtime_provides override if your Acme image bakes Go or Node toolchains:
{
  "runtimeKinds": ["npm", "python-pip", "http", "mcp-server", "a2a-protocol"],
  "requirementKinds": [
    "long-running",
    "network-egress",
    "git",
    "full-history-clone",
    "toolchain:node"
  ]
}
5. Verify session routing
Dispatch a test session and confirm it reaches your provider:
# Dispatch with the CLI and watch logs
rensei dispatch \
  --project my-project \
  --sandbox-pool acme-prod \
  "echo hello from acme"
# Check the session detail
rensei session show <sessionId> --format json | jq '.sandboxProvider'
# → "acme-cloud"
If sessions are not reaching your pool, check:
- Auth-mode pin - sessions using host-session or local auth mode route exclusively to local pools. Use byok, metered, or shared for cloud providers.
- Substrate mismatch - the agent card's substrate requirements may not be satisfied by your pool's capability declaration. Inspect with rensei capacity pools show <poolId> --substrate.
- Credential missing - resolveProviderCredential returns null if no credential is stored. Check Settings → Integrations or the ACME_CLOUD_API_KEY env var.
Related pages
- Capacity Pools - pool management, runtime_provides overrides, substrate resolution
- Local - daemon workers, host-session / local auth mode pin
- Docker - dockerode-based container provider
- Kubernetes - Job-based provider
- E2B - pause/resume, the only execCommand implementation
- Daytona - REST workspace provider
- Modal - GPU, 24h ceiling
- Vercel Sandbox - Firecracker microVMs