Rensei docs

Modal Sandbox

GPU support, 24h ceiling.

The Modal sandbox provider provisions agent sessions as Modal sandboxes via the Modal HTTP API. Modal is the only Rensei sandbox provider with first-class GPU billing, making it the right choice for agents that require GPU compute (model inference, image processing, ML training).

How Modal sessions work

When the platform dispatches a session to a Modal pool, ModalSandboxProvider.provision():

  1. Resolves the org's modal credential (format token_id:token_secret, or MODAL_TOKEN_ID + MODAL_TOKEN_SECRET env vars).
  2. Calls POST https://api.modal.com/v1/sandboxes with the image, entrypoint, env vars, CPU/memory/GPU, and timeout.
  3. Returns a SandboxHandle with the sandbox_id.
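
The first two steps can be sketched in Python. This is an illustration only, not the provider's actual code: the helper names (resolve_credential, build_sandbox_request) and the JSON field names in the request body are assumptions. Only the credential format, the env var names, and the endpoint URL come from this page.

```python
import os
from typing import Optional

MODAL_SANDBOXES_URL = "https://api.modal.com/v1/sandboxes"  # endpoint named in step 2


def resolve_credential(stored: Optional[str], env=None) -> tuple:
    """Return (token_id, token_secret) from a stored credential or env vars."""
    env = os.environ if env is None else env
    if stored:
        # Stored credentials use the token_id:token_secret format.
        token_id, sep, token_secret = stored.partition(":")
        if not sep or not token_id or not token_secret:
            raise ValueError("credential must look like token_id:token_secret")
        return token_id, token_secret
    # Fall back to the two environment variables.
    token_id = env.get("MODAL_TOKEN_ID")
    token_secret = env.get("MODAL_TOKEN_SECRET")
    if not token_id or not token_secret:
        raise ValueError("no Modal credential configured")
    return token_id, token_secret


def build_sandbox_request(image, entrypoint, env_vars, cpu, memory_mb, timeout_sec, gpu=None):
    """Assemble a request body. The field names are hypothetical."""
    body = {
        "image": image,
        "entrypoint": list(entrypoint),
        "env": dict(env_vars),
        "cpu": cpu,
        "memory_mb": memory_mb,
        "timeout_sec": timeout_sec,
    }
    if gpu is not None:  # leave the key out entirely for CPU-only sandboxes
        body["gpu"] = gpu
    return body
```

The sketch stops before the HTTP call on purpose: the authentication header and response shape are not described on this page.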

The sandbox entrypoint defaults to donmai agent run. The runner self-registers, polls for work, clones the repo, installs kits in-box, and runs the agent.

Capability profile

| Capability | Value |
| --- | --- |
| transportModel | dial-in (Modal provides a direct connection endpoint) |
| supportsFsSnapshot | true |
| supportsPauseResume | true (preview tier) |
| supportsCapacityQuery | false (FaaS-style, opaque capacity) |
| maxConcurrent | null (tier-gated by Modal) |
| maxSessionDurationSeconds | null (tier-dependent; plan for a 24h practical ceiling) |
| os | linux |
| arch | x86_64, arm64 |
| idleCostModel | metered (warm containers are billed for standby time) |
| billingModel | wall-clock |
| supportsGpu | true (first-class GPU billing) |
| supportsCustomNetworkPolicy | false |
| egressDefault | allow-all |

Modal bills for warm container standby even when the container is not actively executing. Sessions left in a warm state after the agent exits will continue to accrue cost until the platform calls terminate. Ensure your workflow terminates sessions promptly.

Substrate defaults

Modal class defaults cover the following kinds and do not include language toolchains:

  • Runtime kinds: npm, python-pip, http, mcp-server, a2a-protocol
  • Requirement kinds: long-running, network-egress, git, full-history-clone

To add toolchain:go or toolchain:node, bake them into your custom image and add them to the pool's runtime_provides override.
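
To see why an agent that needs a toolchain is not matched against the defaults, here is a minimal sketch of a set-based check. The matching logic and the function names are assumptions for illustration, and treating runtime kinds and requirement kinds as one flat set is a simplification. Only the kind names come from this page.

```python
# Class defaults listed above.
DEFAULT_RUNTIME_KINDS = {"npm", "python-pip", "http", "mcp-server", "a2a-protocol"}
DEFAULT_REQUIREMENT_KINDS = {"long-running", "network-egress", "git", "full-history-clone"}


def pool_provides(extra_kinds=()):
    """Kinds a pool advertises: class defaults plus any runtime_provides extras."""
    return DEFAULT_RUNTIME_KINDS | DEFAULT_REQUIREMENT_KINDS | set(extra_kinds)


def missing_kinds(agent_needs, extra_kinds=()):
    """Kinds the agent needs that the pool does not advertise, sorted."""
    return sorted(set(agent_needs) - pool_provides(extra_kinds))
```

Under these assumptions, missing_kinds(["git", "toolchain:go"]) reports toolchain:go as unmet, and the gap closes once toolchain:go is passed as an extra kind, which mirrors adding it to the pool's runtime_provides override.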

Setting up a Modal pool

Step 1: Add Modal credentials

In Settings → Integrations, find Modal and enter your token in token_id:token_secret format. Alternatively, set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET in your platform environment.

To generate a Modal token:

# Install Modal CLI
pip install modal

# Create a token
modal token new
# Copy the token_id and token_secret from the output

Step 2: Build a custom image (if needed)

The default image, ghcr.io/renseiai/donmai-worker:latest, is a Linux container image that includes the donmai binary. For GPU workloads, build from a CUDA base:

FROM nvidia/cuda:12.1.0-base-ubuntu22.04

# Install donmai binary
ARG DONMAI_VERSION=v0.11.0
RUN apt-get update && apt-get install -y curl && \
    curl -fsSL "https://github.com/renseiai/donmai/releases/download/${DONMAI_VERSION}/donmai_linux_amd64" \
    -o /usr/local/bin/donmai && chmod +x /usr/local/bin/donmai

ENTRYPOINT ["donmai", "agent", "run"]

Step 3: Create a Modal capacity pool

Navigate to Settings → Execution → Capacity → New pool → Modal and configure the pool.

Pool configuration

| Config key | Type | Default | Description |
| --- | --- | --- | --- |
| image | string | ghcr.io/renseiai/donmai-worker:latest | Container image |
| gpu | string | - | GPU type (e.g. a10g, a100, h100). Omit for CPU-only. |
| entrypoint | string[] | ["donmai", "agent", "run"] | Sandbox entrypoint |

Example pool config - CPU

{
  "image": "ghcr.io/renseiai/donmai-worker:latest"
}

Example pool config - GPU

{
  "image": "registry.example.com/donmai-worker-cuda:v0.11.0",
  "gpu": "a10g"
}

For GPU pools, also update runtime_provides to declare the gpu requirement:

{
  "runtimeKinds": ["npm", "python-pip", "http", "mcp-server", "a2a-protocol"],
  "requirementKinds": [
    "long-running",
    "network-egress",
    "git",
    "full-history-clone",
    "gpu"
  ]
}

Supported GPU types

Modal offers several GPU tiers. Specify the type string in config.gpu:

| GPU | Modal type string | Use case |
| --- | --- | --- |
| NVIDIA A10G | a10g | General inference, moderate training |
| NVIDIA A100 | a100 | Large model training/inference |
| NVIDIA H100 | h100 | Highest-performance training |
| NVIDIA T4 | t4 | Cost-effective inference |
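
A pool config can be checked against the type strings listed above before it is saved. The sketch below is a hypothetical client-side helper (build_pool_config is not a platform function). Whether the platform itself normalizes case or rejects unknown strings is not stated on this page, and Modal may accept types beyond the four listed here.

```python
# Type strings from the table above.
SUPPORTED_GPUS = {
    "a10g": "NVIDIA A10G",
    "a100": "NVIDIA A100",
    "h100": "NVIDIA H100",
    "t4": "NVIDIA T4",
}


def build_pool_config(image, gpu=None):
    """Return a pool config dict; omit 'gpu' for CPU-only pools."""
    config = {"image": image}
    if gpu is not None:
        key = gpu.strip().lower()  # assumption: type strings are lowercase
        if key not in SUPPORTED_GPUS:
            raise ValueError(
                f"unknown GPU type {gpu!r}; expected one of {sorted(SUPPORTED_GPUS)}"
            )
        config["gpu"] = key
    return config
```

Calling build_pool_config with only an image yields the CPU example config, and passing gpu="a10g" yields the shape of the GPU example.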

Resource configuration

Resources are set via spec.resources and pool defaults:

| Field | Default | Description |
| --- | --- | --- |
| cpu | 2.0 | vCPU count (float) |
| memoryMB | 2048 | Memory in MiB |
| timeoutSec | 3600 | Sandbox timeout in seconds |

# Session with 4 vCPUs, 8 GiB RAM, A100 GPU
# Set via project sandbox_config or workflow node resource spec
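
As a rough illustration of how per-session values might layer over the defaults, the sketch below merges an override dict onto the three documented fields. The merge order, the helper names, and the clamp to 24 hours are assumptions: the page does not say whether the platform clamps timeoutSec, only that 24h is a practical ceiling. The exact shape of sandbox_config is not shown here, and the GPU type is left out because this page documents it as a pool config key.

```python
# Defaults from the resource table.
DEFAULT_RESOURCES = {"cpu": 2.0, "memoryMB": 2048, "timeoutSec": 3600}
PRACTICAL_MAX_TIMEOUT_SEC = 24 * 60 * 60  # 24h practical ceiling


def gib_to_mib(gib):
    """Convert GiB to the MiB unit used by memoryMB."""
    return int(gib * 1024)


def resolve_resources(overrides=None):
    """Overlay per-session overrides on the defaults (hypothetical merge order)."""
    resources = {**DEFAULT_RESOURCES, **(overrides or {})}
    # Assumption: keep the timeout within the practical ceiling.
    resources["timeoutSec"] = min(resources["timeoutSec"], PRACTICAL_MAX_TIMEOUT_SEC)
    return resources
```

For the 4 vCPU, 8 GiB session mentioned above, resolve_resources({"cpu": 4.0, "memoryMB": gib_to_mib(8)}) gives cpu 4.0, memoryMB 8192, and the default timeoutSec of 3600.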

Session status mapping

| Modal status | Platform status |
| --- | --- |
| creating, pending | provisioning |
| running, ready | running |
| error, failed | failed |
| others | terminated |
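
The mapping amounts to a lookup with a default. A minimal sketch follows; the function name is hypothetical, and exact, case-sensitive matching is an assumption.

```python
# Modal status -> platform status, per the table above.
_STATUS_MAP = {
    "creating": "provisioning",
    "pending": "provisioning",
    "running": "running",
    "ready": "running",
    "error": "failed",
    "failed": "failed",
}


def to_platform_status(modal_status):
    """Map a Modal status string; anything unlisted becomes 'terminated'."""
    return _STATUS_MAP.get(modal_status, "terminated")
```

Note the consequence of the fallback: a status string the mapping has never seen is reported as terminated, not as an error.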

Cost tracking

terminate emits a sandbox-seconds cost event via emitSandboxCostEvent. GPU hours are attributed per session in the factory cost breakdown dashboard.
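
Because billing is wall-clock, the billable quantity is the elapsed time between provisioning and termination, including any warm standby. The arithmetic can be sketched as below. The function names and the gpu_count parameter are hypothetical, and the actual payload of emitSandboxCostEvent is not documented on this page.

```python
from datetime import datetime, timezone


def sandbox_seconds(provisioned_at, terminated_at):
    """Wall-clock seconds between provision and terminate, never negative."""
    elapsed = (terminated_at - provisioned_at).total_seconds()
    return max(0.0, elapsed)


def gpu_hours(seconds, gpu_count=1):
    """Convert sandbox-seconds to GPU hours (gpu_count is a hypothetical input)."""
    return seconds * gpu_count / 3600.0


# Example: a session that ran from 12:00 to 13:30 UTC.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 13, 30, tzinfo=timezone.utc)
seconds = sandbox_seconds(start, end)  # 5400.0
hours = gpu_hours(seconds)             # 1.5
```

The sooner terminate is called after the agent exits, the smaller this interval is.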

Toolchain requirements

Modal has no class-default toolchains. Agents requiring toolchain:go or toolchain:node will not be routed to a Modal pool unless you bake the toolchain into the image and declare it via runtime_provides on the pool.
