Kubernetes Sandbox

The Kubernetes sandbox provider provisions agent sessions as Kubernetes Jobs. Jobs are the right primitive for run-to-completion workloads: they track success and failure, support restart policies, and offer TTL-based cleanup so finished Jobs are removed automatically.

How Kubernetes sessions work

When the platform dispatches a session to a kubernetes pool, KubernetesSandboxProvider.provision():

Loads the kubeconfig from the pool's config (inline YAML, file path, or the default ServiceAccount / ~/.kube/config).
Creates a batch/v1 Job in the configured namespace.
Configures the Job's Pod to run the worker image (ghcr.io/renseiai/donmai-worker:latest by default) with the registration token and injected env vars.
Returns a SandboxHandle with the job name.

The Job runs with backoffLimit: 0 (no automatic retries) and restartPolicy: Never. TTL cleanup uses ttlSecondsAfterFinished so completed Jobs do not accumulate.

Capability profile

Capability	Value
`transportModel`	`either` (dial-in via `kubectl exec` or dial-out via token)
`supportsFsSnapshot`	false (PVC volume snapshots are possible but not wired up)
`supportsPauseResume`	false
`supportsCapacityQuery`	true (`kubectl top` + ResourceQuotas)
`maxConcurrent`	null (cluster-limited)
`maxSessionDurationSeconds`	null (no platform ceiling; `ttlSecondsAfterFinished` handles cleanup)
`os`	linux
`arch`	x86_64, arm64
`idleCostModel`	metered (reserved nodes accrue cost)
`billingModel`	fixed (cluster nodes already provisioned)
`supportsGpu`	false by default (enabling it requires GPU nodes with `nodeSelector` + tolerations)
`supportsCustomNetworkPolicy`	true (Kubernetes NetworkPolicy)
`egressDefault`	allow-all
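
As a rough illustration, the profile above can be written as a typed object and checked before a session is dispatched. The `CapabilityProfile` and `SessionNeeds` types and the `unmetNeeds` helper are hypothetical names for this sketch, not the platform's actual API. Only the field names and values come from the table.

```typescript
// Hypothetical type; field names and values are copied from the table above.
interface CapabilityProfile {
  transportModel: string;
  supportsFsSnapshot: boolean;
  supportsPauseResume: boolean;
  supportsCapacityQuery: boolean;
  maxConcurrent: number | null;
  maxSessionDurationSeconds: number | null;
  os: string[];
  arch: string[];
  idleCostModel: string;
  billingModel: string;
  supportsGpu: boolean;
  supportsCustomNetworkPolicy: boolean;
  egressDefault: string;
}

const kubernetesProfile: CapabilityProfile = {
  transportModel: "either",
  supportsFsSnapshot: false,
  supportsPauseResume: false,
  supportsCapacityQuery: true,
  maxConcurrent: null, // cluster-limited
  maxSessionDurationSeconds: null, // no platform ceiling
  os: ["linux"],
  arch: ["x86_64", "arm64"],
  idleCostModel: "metered",
  billingModel: "fixed",
  supportsGpu: false,
  supportsCustomNetworkPolicy: true,
  egressDefault: "allow-all",
};

// Hypothetical description of what a session needs from its sandbox.
interface SessionNeeds {
  gpu?: boolean;
  pauseResume?: boolean;
  fsSnapshot?: boolean;
  arch?: string;
}

// Returns the names of the needs this profile cannot satisfy (empty = compatible).
function unmetNeeds(profile: CapabilityProfile, needs: SessionNeeds): string[] {
  const unmet: string[] = [];
  if (needs.gpu && !profile.supportsGpu) unmet.push("gpu");
  if (needs.pauseResume && !profile.supportsPauseResume) unmet.push("pauseResume");
  if (needs.fsSnapshot && !profile.supportsFsSnapshot) unmet.push("fsSnapshot");
  if (needs.arch !== undefined && profile.arch.indexOf(needs.arch) === -1) unmet.push("arch");
  return unmet;
}
```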

Substrate defaults

Kubernetes pools satisfy the same substrate requirements as Docker pools:

Runtime kinds: native, npm, python-pip, http, mcp-server, a2a-protocol, workarea
Requirement kinds: persistent-storage, long-running, workarea, network-egress, git, full-history-clone, toolchain:go, toolchain:node

Note: host-binary is not a Kubernetes class default.
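
Substrate resolution against these defaults amounts to a subset check. The sketch below assumes the kinds are plain strings, and the `missingKinds` helper is a hypothetical name. The platform's real resolver may work differently.

```typescript
// Class defaults for Kubernetes pools, as listed above.
const K8S_RUNTIME_KINDS: readonly string[] = [
  "native", "npm", "python-pip", "http", "mcp-server", "a2a-protocol", "workarea",
];
const K8S_REQUIREMENT_KINDS: readonly string[] = [
  "persistent-storage", "long-running", "workarea", "network-egress",
  "git", "full-history-clone", "toolchain:go", "toolchain:node",
];

// Returns the requested kinds that the pool does not provide.
// An empty result means the pool satisfies the request.
function missingKinds(provided: readonly string[], requested: readonly string[]): string[] {
  return requested.filter((kind) => provided.indexOf(kind) === -1);
}
```

With these lists, a request for the `host-binary` runtime kind or the `gpu` requirement kind comes back as missing, while `git` and `toolchain:go` are satisfied.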

Prerequisites

A running Kubernetes cluster accessible from the platform host.
The rensei-workers namespace (or your configured namespace) must exist.
The worker image ghcr.io/renseiai/donmai-worker:latest must be pullable from the cluster. If the package is private on GitHub Container Registry, configure an imagePullSecret.
The platform host needs network access to the cluster API server.

Namespace setup

# Create the worker namespace
kubectl create namespace rensei-workers

# Optional: resource quota to cap concurrent resource usage
kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: rensei-worker-quota
  namespace: rensei-workers
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    pods: "20"
EOF
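
To see what this quota means in sessions, divide each quota dimension by the per-pod request and take the smallest result. The sketch below is an estimate only: the `NamespaceQuota` and `PodRequest` types are hypothetical, and it assumes every pod in the namespace uses the same request and that nothing else consumes the quota.

```typescript
// Parse a Kubernetes CPU quantity ("2", "1.0", "500m") into millicores.
function cpuToMillicores(quantity: string): number {
  const s = quantity.trim();
  const isMilli = s.slice(-1) === "m";
  const digits = isMilli ? s.slice(0, -1) : s;
  const value = Number(digits);
  if (digits === "" || !isFinite(value) || value < 0) {
    throw new Error(`invalid CPU quantity: ${quantity}`);
  }
  return Math.round(isMilli ? value : value * 1000);
}

// Hypothetical shapes for this sketch.
interface NamespaceQuota { cpu: string; memoryMiB: number; pods: number; }
interface PodRequest { cpu: string; memoryMB: number; }

// The tightest of the three limits decides how many pods fit.
function maxConcurrentPods(quota: NamespaceQuota, request: PodRequest): number {
  const byCpu = Math.floor(cpuToMillicores(quota.cpu) / cpuToMillicores(request.cpu));
  const byMemory = Math.floor(quota.memoryMiB / request.memoryMB);
  return Math.min(byCpu, byMemory, quota.pods);
}

// The quota above: 20 CPUs, 40Gi (40 * 1024 MiB), 20 pods.
const exampleQuota: NamespaceQuota = { cpu: "20", memoryMiB: 40 * 1024, pods: 20 };
```

With the default pod size (`cpu: "1.0"`, `memoryMB: 2048`) all three limits allow 20 pods. Doubling the pod size to `"2.0"` and `4096` halves that to 10, because CPU and memory run out before the pod count does.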

Pod labels

Every pod created by the Kubernetes provider is labelled:

rensei.io/project-id: <projectId>
rensei.io/org-id: <orgId>
rensei.io/managed-by: rensei-platform
app.kubernetes.io/component: worker

Find platform-managed pods:

kubectl get pods -n rensei-workers -l rensei.io/managed-by=rensei-platform
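
To narrow the search to one project or organization, combine several labels in a single selector. Kubernetes treats comma-separated equality terms as a logical AND. The `toLabelSelector` helper below is a hypothetical convenience for this sketch, not part of the platform.

```typescript
// Build an equality-based label selector such as "a=1,b=2".
// Keys are sorted so the output is stable regardless of insertion order.
function toLabelSelector(labels: Record<string, string>): string {
  return Object.keys(labels)
    .sort()
    .map((key) => `${key}=${labels[key]}`)
    .join(",");
}

// "proj-123" is a placeholder project id.
const projectSelector = toLabelSelector({
  "rensei.io/project-id": "proj-123",
  "rensei.io/managed-by": "rensei-platform",
});
```

The resulting string can be passed to `kubectl get pods -n rensei-workers -l "<selector>"`.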

Pool configuration

Create a Kubernetes pool at Settings → Execution → Capacity → New pool → Kubernetes.

Config key	Type	Default	Description
`kubeconfig`	string	-	Inline kubeconfig YAML
`kubeconfigPath`	string	-	Path to kubeconfig file on the platform host
`namespace`	string	`rensei-workers`	Namespace for Job creation
`image`	string	`ghcr.io/renseiai/donmai-worker:latest`	Worker container image
`serviceAccountName`	string	-	Service account for the Job pod
`memoryMB`	number	`2048`	Memory request and limit per pod in MiB
`cpu`	string	`"1.0"`	CPU request and limit (e.g. `"2.0"`, `"500m"`)

If neither kubeconfig nor kubeconfigPath is set, KubeConfig.loadFromDefault() is called, which uses the in-cluster ServiceAccount (when the platform runs inside Kubernetes) or ~/.kube/config.
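
The fallback order can be sketched as a small resolver. This is an illustration, not the provider's code: the `KubeconfigSource` type and `resolveKubeconfigSource` function are hypothetical. The precedence of `kubeconfig` over `kubeconfigPath` when both are set is an assumption, since the page does not say which one wins.

```typescript
// Hypothetical result type describing where the kubeconfig comes from.
type KubeconfigSource =
  | { kind: "inline"; yaml: string }
  | { kind: "file"; path: string }
  | { kind: "default" };

function resolveKubeconfigSource(cfg: {
  kubeconfig?: string;
  kubeconfigPath?: string;
}): KubeconfigSource {
  // Assumption: inline YAML takes precedence when both keys are set.
  if (cfg.kubeconfig !== undefined && cfg.kubeconfig.trim() !== "") {
    return { kind: "inline", yaml: cfg.kubeconfig };
  }
  if (cfg.kubeconfigPath !== undefined && cfg.kubeconfigPath.trim() !== "") {
    return { kind: "file", path: cfg.kubeconfigPath };
  }
  // Neither key set: in-cluster ServiceAccount or ~/.kube/config.
  return { kind: "default" };
}
```

If the provider is built on `@kubernetes/client-node`, as the `KubeConfig.loadFromDefault()` call suggests, the three cases would map to `loadFromString`, `loadFromFile`, and `loadFromDefault` on a `KubeConfig` instance.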

Example pool config

{
  "kubeconfig": "apiVersion: v1\nkind: Config\nclusters:\n- ...",
  "namespace": "rensei-workers",
  "image": "ghcr.io/renseiai/donmai-worker:v0.11.0",
  "serviceAccountName": "rensei-worker-sa",
  "memoryMB": 4096,
  "cpu": "2.0"
}
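
Keys omitted from the pool config fall back to the defaults in the table. A minimal sketch of that merge follows. The `KubernetesPoolConfig` type and `withPoolDefaults` function are hypothetical names, and the platform may validate or normalize values beyond what is shown here.

```typescript
// Config keys from the table above; every key is optional.
interface KubernetesPoolConfig {
  kubeconfig?: string;
  kubeconfigPath?: string;
  namespace?: string;
  image?: string;
  serviceAccountName?: string;
  memoryMB?: number;
  cpu?: string;
}

// Same keys, with the defaulted ones made required.
type ResolvedPoolConfig = KubernetesPoolConfig &
  Required<Pick<KubernetesPoolConfig, "namespace" | "image" | "memoryMB" | "cpu">>;

// Fill in the documented defaults for any key the caller left out.
function withPoolDefaults(cfg: KubernetesPoolConfig): ResolvedPoolConfig {
  return {
    ...cfg,
    namespace: cfg.namespace ?? "rensei-workers",
    image: cfg.image ?? "ghcr.io/renseiai/donmai-worker:latest",
    memoryMB: cfg.memoryMB ?? 2048,
    cpu: cfg.cpu ?? "1.0",
  };
}
```

The example config above pins the image to a version tag (`v0.11.0`) instead of relying on the `latest` default, which keeps sessions reproducible across worker releases.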

Job spec details

The provider generates a Job with these characteristics:

apiVersion: batch/v1
kind: Job
metadata:
  name: rensei-<projectId[0:8]>-<timestamp>
  namespace: rensei-workers
  labels:
    rensei.io/project-id: <projectId>
    rensei.io/managed-by: rensei-platform
    app.kubernetes.io/component: worker
spec:
  backoffLimit: 0        # no retries - session failures surface cleanly
  ttlSecondsAfterFinished: <timeoutSec>
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: ghcr.io/renseiai/donmai-worker:latest
          env: [...]     # registration token + project env
          resources:
            requests: { cpu: "1.0", memory: "2048Mi" }
            limits:   { cpu: "1.0", memory: "2048Mi" }
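
The same manifest can be expressed as a plain object built from the pool config. This is a sketch under stated assumptions, not the provider's implementation: the `JobInput` type and `buildJobManifest` function are hypothetical, the timestamp is assumed to be epoch milliseconds, and lowercasing the project id prefix is an added precaution because Kubernetes object names must be lowercase.

```typescript
// Hypothetical input for this sketch.
interface JobInput {
  projectId: string;
  namespace: string;
  image: string;
  cpu: string;
  memoryMB: number;
  timeoutSec: number;
  env: { name: string; value: string }[];
  nowMs: number; // assumed timestamp format: epoch milliseconds
}

function buildJobManifest(input: JobInput) {
  // rensei-<projectId[0:8]>-<timestamp>, lowercased as a precaution.
  const jobName = `rensei-${input.projectId.slice(0, 8).toLowerCase()}-${input.nowMs}`;
  // Requests equal limits, as in the spec above.
  const sizing = () => ({ cpu: input.cpu, memory: `${input.memoryMB}Mi` });
  return {
    apiVersion: "batch/v1",
    kind: "Job",
    metadata: {
      name: jobName,
      namespace: input.namespace,
      labels: {
        "rensei.io/project-id": input.projectId,
        "rensei.io/managed-by": "rensei-platform",
        "app.kubernetes.io/component": "worker",
      },
    },
    spec: {
      backoffLimit: 0, // no retries
      ttlSecondsAfterFinished: input.timeoutSec,
      template: {
        spec: {
          restartPolicy: "Never",
          containers: [
            {
              name: "worker",
              image: input.image,
              env: input.env,
              resources: { requests: sizing(), limits: sizing() },
            },
          ],
        },
      },
    },
  };
}
```

Setting requests equal to limits for both CPU and memory places the pod in the Guaranteed QoS class, which makes it the last candidate for eviction under node pressure.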

Log streaming

streamLogs finds the pod for the Job by label selector (job-name=<jobName>) and follows the worker container's logs using the Kubernetes pod log API. The log stream is available in the session detail view.
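
For reference, the two lookups map onto standard Kubernetes REST endpoints. The sketch below only builds the request paths. The `podListPath` and `podLogPath` helper names are hypothetical, and the provider most likely goes through a client library instead of assembling URLs by hand.

```typescript
// Path that lists the pods created by a Job, using the job-name label
// that the Job controller sets on its pods.
function podListPath(namespace: string, jobName: string): string {
  const labelSelector = encodeURIComponent(`job-name=${jobName}`);
  return `/api/v1/namespaces/${namespace}/pods?labelSelector=${labelSelector}`;
}

// Path that follows (streams) the logs of one container in a pod.
function podLogPath(namespace: string, podName: string, container: string = "worker"): string {
  return (
    `/api/v1/namespaces/${namespace}/pods/${podName}/log` +
    `?container=${encodeURIComponent(container)}&follow=true`
  );
}
```

The command-line equivalent is `kubectl logs -n rensei-workers -f job/<jobName> -c worker`.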

Network policy

Apply a NetworkPolicy to the rensei-workers namespace to restrict egress. The policy takes effect only if the cluster's network plugin enforces NetworkPolicy. Example that allows only outbound HTTPS:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: rensei-worker-egress
  namespace: rensei-workers
spec:
  podSelector:
    matchLabels:
      rensei.io/managed-by: rensei-platform
  policyTypes: [Egress]
  egress:
    - ports:
        - protocol: TCP
          port: 443

GPU sessions

To enable GPU sessions on a Kubernetes pool, add GPU nodes to the cluster with the NVIDIA device plugin and update the pool's runtime_provides override:

{
  "runtimeKinds": ["native", "npm", "python-pip", "http", "mcp-server", "a2a-protocol", "workarea"],
  "requirementKinds": [
    "persistent-storage", "long-running", "workarea", "network-egress",
    "git", "full-history-clone", "toolchain:go", "toolchain:node",
    "gpu"
  ]
}

You will also need to add GPU resource requests to containers, which requires modifying the Job spec via a pool config extension. Contact Rensei support for guidance on GPU-enabled Kubernetes pools.
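
For orientation, the Kubernetes fields involved look roughly like this. The shape of the pool config extension is not documented here, so the `gpuPodSpecPatch` function and its return shape are hypothetical. The `nvidia.com/gpu` resource name is the one advertised by the NVIDIA device plugin. The node label and the taint key are assumptions that depend on how your GPU nodes are labelled and tainted.

```typescript
// Sketch of the pod-level settings a GPU session needs.
// Not the platform's extension format; field placement is illustrative.
function gpuPodSpecPatch(gpuCount: number) {
  if (Math.floor(gpuCount) !== gpuCount || gpuCount < 1) {
    throw new Error("gpuCount must be a positive integer");
  }
  return {
    // Assumption: GPU nodes carry this label. Replace with your cluster's label.
    nodeSelector: { "example.com/gpu-node": "true" },
    // Assumption: GPU nodes are tainted with this key.
    tolerations: [{ key: "nvidia.com/gpu", operator: "Exists", effect: "NoSchedule" }],
    // GPUs are requested through limits; Kubernetes sets the request to match.
    containerResources: {
      limits: { "nvidia.com/gpu": String(gpuCount) },
    },
  };
}
```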

Capacity Pools - pool management and substrate resolution
Docker - simpler container provider for single-host setups
Add a Provider - SandboxProvider interface
