Modal Sandbox
GPU support, 24h ceiling.
The Modal sandbox provider provisions agent sessions as Modal sandboxes via the Modal HTTP API. Modal is the only Rensei sandbox provider with first-class GPU billing, making it the right choice for agents that require GPU compute (model inference, image processing, ML training).
How Modal sessions work
When the platform dispatches a session to a modal pool, ModalSandboxProvider.provision():
- Resolves the org's modal credential (format token_id:token_secret, or MODAL_TOKEN_ID + MODAL_TOKEN_SECRET env vars).
- Calls POST https://api.modal.com/v1/sandboxes with the image, entrypoint, env vars, CPU/memory/GPU, and timeout.
- Returns a SandboxHandle with the sandbox_id.
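The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the provider's actual code: the helper names (resolve_credential, build_create_request) and the JSON field names in the request body are hypothetical. Only the endpoint URL, the credential format, the env var names, and the default values come from this page.

```python
import os
from typing import Optional, Tuple

MODAL_SANDBOXES_URL = "https://api.modal.com/v1/sandboxes"  # endpoint named on this page


def resolve_credential(stored: Optional[str], env=None) -> Tuple[str, str]:
    """Return (token_id, token_secret) from the org credential, else from env vars."""
    env = os.environ if env is None else env
    if stored:
        token_id, sep, token_secret = stored.partition(":")
        if not sep or not token_id or not token_secret:
            raise ValueError("expected credential in token_id:token_secret format")
        return token_id, token_secret
    try:
        return env["MODAL_TOKEN_ID"], env["MODAL_TOKEN_SECRET"]
    except KeyError as exc:
        raise ValueError(f"missing Modal credential: {exc.args[0]}") from None


def build_create_request(pool_config: dict, resources: dict, env_vars: dict) -> dict:
    """Assemble a request body. Field names are assumptions, not the real schema."""
    body = {
        "image": pool_config.get("image", "ghcr.io/renseiai/donmai-worker:latest"),
        "entrypoint": pool_config.get("entrypoint", ["donmai", "agent", "run"]),
        "env": dict(env_vars),
        "cpu": resources.get("cpu", 2.0),
        "memoryMB": resources.get("memoryMB", 2048),
        "timeoutSec": resources.get("timeoutSec", 3600),
    }
    if pool_config.get("gpu"):
        body["gpu"] = pool_config["gpu"]  # omitted entirely for CPU-only pools
    return body
```

The sketch stops short of the HTTP call itself, since the authentication header and response shape are not described here.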
The sandbox entrypoint defaults to donmai agent run. The runner self-registers, polls for work, clones the repo, installs kits in-box, and runs the agent.
Capability profile
| Capability | Value |
|---|---|
| transportModel | dial-in (Modal provides a direct connection endpoint) |
| supportsFsSnapshot | true |
| supportsPauseResume | true (preview tier) |
| supportsCapacityQuery | false (FaaS-style, opaque capacity) |
| maxConcurrent | null (tier-gated by Modal) |
| maxSessionDurationSeconds | null (tier-dependent; plan for a 24h practical ceiling) |
| os | linux |
| arch | x86_64, arm64 |
| idleCostModel | metered (warm containers are billed for standby time) |
| billingModel | wall-clock |
| supportsGpu | true (first-class GPU billing) |
| supportsCustomNetworkPolicy | false |
| egressDefault | allow-all |
Modal bills for warm container standby even when the container is not actively executing. Sessions left in a warm state after the agent exits will continue to accrue cost until the platform calls terminate. Ensure your workflow terminates sessions promptly.
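One way to make sure terminate is always called is to wrap the session in a try/finally. The following is a minimal sketch: the provider object with provision and terminate methods is a hypothetical stand-in for whatever client your workflow uses, not a documented API.

```python
from contextlib import contextmanager


@contextmanager
def modal_session(provider, spec):
    """Provision a sandbox and terminate it on every exit path."""
    handle = provider.provision(spec)
    try:
        yield handle
    finally:
        # Warm standby is billed, so terminate even if the agent run raised.
        provider.terminate(handle)
```

Used as `with modal_session(provider, spec) as handle: ...`, the sandbox is terminated whether the body returns normally or raises.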
Substrate defaults
Modal class defaults do not include language toolchains:
- Runtime kinds: npm, python-pip, http, mcp-server, a2a-protocol
- Requirement kinds: long-running, network-egress, git, full-history-clone
To add toolchain:go or toolchain:node, bake them into your custom image and add them to the pool's runtime_provides override.
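The routing consequence can be illustrated with a small set-difference check. This is a sketch, not the platform's scheduler: it assumes a runtime_provides override replaces the class defaults wholesale (which is how the GPU example on this page is written), and the function names are hypothetical.

```python
MODAL_DEFAULTS = {
    "runtimeKinds": ["npm", "python-pip", "http", "mcp-server", "a2a-protocol"],
    "requirementKinds": ["long-running", "network-egress", "git", "full-history-clone"],
}


def pool_provides(override=None):
    """Flatten what a pool declares; an override replaces the class defaults."""
    source = MODAL_DEFAULTS if override is None else override
    return set(source.get("runtimeKinds", [])) | set(source.get("requirementKinds", []))


def missing_capabilities(required, override=None):
    """Return the requirements the pool does not declare, sorted for stable output."""
    return sorted(set(required) - pool_provides(override))
```

With no override, `missing_capabilities(["git", "toolchain:go"])` returns `["toolchain:go"]`, so such an agent would not match a default Modal pool.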
Setting up a Modal pool
Step 1: Add Modal credentials
In Settings → Integrations, find Modal and enter your token in token_id:token_secret format. Alternatively, set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET in your platform environment.
To generate a Modal token:
# Install Modal CLI
pip install modal
# Create a token
modal token new
# Copy the token_id and token_secret from the output
Step 2: Build a custom image (if needed)
The default ghcr.io/renseiai/donmai-worker:latest is a Linux container with the donmai binary. For GPU workloads, build from a CUDA base:
FROM nvidia/cuda:12.1.0-base-ubuntu22.04
# Install donmai binary
ARG DONMAI_VERSION=v0.11.0
RUN apt-get update && apt-get install -y curl && \
    curl -fsSL "https://github.com/renseiai/donmai/releases/download/${DONMAI_VERSION}/donmai_linux_amd64" \
    -o /usr/local/bin/donmai && chmod +x /usr/local/bin/donmai
ENTRYPOINT ["donmai", "agent", "run"]
Step 3: Create a Modal capacity pool
Navigate to Settings → Execution → Capacity → New pool → Modal and configure the pool.
Pool configuration
| Config key | Type | Default | Description |
|---|---|---|---|
| image | string | ghcr.io/renseiai/donmai-worker:latest | Container image |
| gpu | string | - | GPU type (e.g. a10g, a100, h100). Omit for CPU-only. |
| entrypoint | string[] | ["donmai", "agent", "run"] | Sandbox entrypoint |
Example pool config - CPU
{
"image": "ghcr.io/renseiai/donmai-worker:latest"
}
Example pool config - GPU
{
"image": "registry.example.com/donmai-worker-cuda:v0.11.0",
"gpu": "a10g"
}
For GPU pools, also update runtime_provides to declare the gpu requirement:
{
"runtimeKinds": ["npm", "python-pip", "http", "mcp-server", "a2a-protocol"],
"requirementKinds": [
"long-running",
"network-egress",
"git",
"full-history-clone",
"gpu"
]
}
Supported GPU types
Modal offers several GPU tiers. Specify the type string in config.gpu:
| GPU | Modal type string | Use case |
|---|---|---|
| NVIDIA A10G | a10g | General inference, moderate training |
| NVIDIA A100 | a100 | Large model training/inference |
| NVIDIA H100 | h100 | Highest-performance training |
| NVIDIA T4 | t4 | Cost-effective inference |
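A pool config can be checked against the tables on this page before it is saved. The sketch below fills in the documented defaults and rejects GPU strings outside the four listed above. Treating that list as exhaustive is an assumption, and normalize_pool_config is a hypothetical helper rather than part of the platform.

```python
DEFAULT_IMAGE = "ghcr.io/renseiai/donmai-worker:latest"
DEFAULT_ENTRYPOINT = ["donmai", "agent", "run"]
KNOWN_GPU_TYPES = {"a10g", "a100", "h100", "t4"}  # assumed exhaustive for this sketch


def normalize_pool_config(config: dict) -> dict:
    """Apply documented defaults and validate the optional gpu type string."""
    gpu = config.get("gpu")
    if gpu is not None and gpu not in KNOWN_GPU_TYPES:
        raise ValueError(
            f"unknown gpu type {gpu!r}; expected one of {sorted(KNOWN_GPU_TYPES)}"
        )
    normalized = {
        "image": config.get("image", DEFAULT_IMAGE),
        "entrypoint": list(config.get("entrypoint", DEFAULT_ENTRYPOINT)),
    }
    if gpu is not None:
        normalized["gpu"] = gpu  # left out for CPU-only pools
    return normalized
```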
Resource configuration
Resources are set via spec.resources and pool defaults:
| Field | Default | Description |
|---|---|---|
| cpu | 2.0 | vCPU count (float) |
| memoryMB | 2048 | Memory in MiB |
| timeoutSec | 3600 | Sandbox timeout in seconds |
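How the table defaults combine with overrides can be sketched as a simple merge. The precedence used here (spec.resources over pool defaults over the built-in defaults) is an assumption, and SandboxResources and resolve_resources are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxResources:
    cpu: float = 2.0        # vCPU count
    memoryMB: int = 2048    # memory in MiB
    timeoutSec: int = 3600  # sandbox timeout in seconds


def resolve_resources(pool_defaults=None, spec_resources=None) -> SandboxResources:
    """Merge pool defaults and per-session values; later sources win."""
    merged = {**(pool_defaults or {}), **(spec_resources or {})}
    # Unknown keys raise TypeError, which surfaces typos early.
    return SandboxResources(**merged)
```

For example, `resolve_resources(spec_resources={"cpu": 4.0, "memoryMB": 8192})` describes a session with 4 vCPUs and 8 GiB of RAM while keeping the default one-hour timeout.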
# Session with 4 vCPUs, 8 GiB RAM, A100 GPU
# Set via project sandbox_config or workflow node resource spec
Session status mapping
| Modal status | Platform status |
|---|---|
| creating, pending | provisioning |
| running, ready | running |
| error, failed | failed |
| others | terminated |
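The mapping in the table amounts to a lookup with a fallback. A minimal sketch, assuming status strings are compared exactly as listed (the function name is hypothetical):

```python
_STATUS_MAP = {
    "creating": "provisioning",
    "pending": "provisioning",
    "running": "running",
    "ready": "running",
    "error": "failed",
    "failed": "failed",
}


def to_platform_status(modal_status: str) -> str:
    """Map a Modal status to a platform status; anything unlisted is terminated."""
    return _STATUS_MAP.get(modal_status, "terminated")
```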
Cost tracking
terminate emits a sandbox-seconds cost event via emitSandboxCostEvent. GPU hours are attributed per session in the factory cost breakdown dashboard.
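Because billing is wall-clock, the quantity behind a sandbox-seconds event is the elapsed time between provisioning and termination. The sketch below shows that arithmetic only. The event shape and field names are assumptions and do not describe the actual payload of emitSandboxCostEvent.

```python
from datetime import datetime, timedelta, timezone


def sandbox_cost_event(sandbox_id, started_at, terminated_at, gpu=None):
    """Build a hypothetical cost event from wall-clock start and end times."""
    seconds = max(0.0, (terminated_at - started_at).total_seconds())
    event = {"type": "sandbox-seconds", "sandboxId": sandbox_id, "seconds": seconds}
    if gpu:
        event["gpu"] = gpu
        event["gpuHours"] = seconds / 3600  # same wall-clock span, in hours
    return event


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
event = sandbox_cost_event("sb-1", start, start + timedelta(minutes=90), gpu="a100")
# event["seconds"] is 5400.0 and event["gpuHours"] is 1.5
```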
Toolchain requirements
Modal has no class-default toolchains. Agents requiring toolchain:go or toolchain:node will not be routed to a Modal pool unless you bake the toolchain into the image and declare it via runtime_provides on the pool.
Related pages
- Capacity Pools - pool management and runtime_provides overrides
- E2B - true pause/resume, lower idle cost
- Add a Provider - SandboxProvider interface