Rensei docs

Model Catalog & Routing

Scope cascade × workType × pool model routing.

The model catalog is the platform's registry of LLMs, their providers, pricing, and capabilities. Routing rules determine which model is dispatched when a workflow, agent, or session requests an LLM without specifying one explicitly.

The Model Catalog

Every model the platform can dispatch must have a catalog entry. Entries are immutable once published and keyed by (provider, modelId).

Catalog Entry Schema

interface ModelCatalogEntryDTO {
  id: string
  provider: ProviderId                 // 'claude', 'codex', 'gemini', etc.
  modelId: string                      // 'claude-opus-4-1', 'gpt-5', etc.
  displayName: string                  // Human label shown in dropdowns
  description: string | null
  deprecated: boolean                  // Soft deprecation flag
  replacedByCatalogId: string | null   // Link to successor entry
  efforts: EffortOption[]               // Reasoning effort tiers if supported
  authModes: AuthMode[]                 // 'byok' | 'metered' | 'shared' | 'host-session' | 'local'
  contextWindow: number | null          // Max input tokens
  capabilityTags: string[]              // ['long-context', 'reasoning', 'vision', ...]
  pricing: ModelPricing | null          // Input/output cost per M tokens, or flat rate
  createdAt: string
  updatedAt: string
}

Example Entry

{
  "id": "cat_claude_opus_4_1",
  "provider": "claude",
  "modelId": "claude-opus-4-1-20250805",
  "displayName": "Claude Opus 4.1",
  "description": "Latest Anthropic flagship for complex reasoning.",
  "deprecated": false,
  "replacedByCatalogId": null,
  "efforts": [
    { "value": "low", "label": "Fast" },
    { "value": "medium", "label": "Balanced", "providerParam": { "reasoningEffort": "medium" } },
    { "value": "high", "label": "Deep", "providerParam": { "reasoningEffort": "high" } }
  ],
  "authModes": ["byok", "metered", "shared", "host-session"],
  "contextWindow": 200000,
  "capabilityTags": ["long-context", "reasoning", "tool-use"],
  "pricing": {
    "inputPerMTokenCents": 3,
    "outputPerMTokenCents": 15
  }
}
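
To make the pricing fields concrete, here is a minimal sketch of how a per-request cost could be estimated from a catalog entry. It assumes both fields are cents per million tokens, as their names suggest, and it ignores the flat-rate case. The `estimateCostCents` helper and the token counts are illustrative and not part of the platform API.

```typescript
// Hypothetical helper: not part of the Rensei API.
// Shape assumed from the "pricing" object in the example entry.
interface ModelPricing {
  inputPerMTokenCents: number;
  outputPerMTokenCents: number;
}

function estimateCostCents(
  pricing: ModelPricing | null,
  inputTokens: number,
  outputTokens: number,
): number | null {
  if (pricing === null) return null; // entry carries no pricing
  const tokensPerUnit = 1_000_000; // prices are quoted per million tokens
  return (
    (inputTokens / tokensPerUnit) * pricing.inputPerMTokenCents +
    (outputTokens / tokensPerUnit) * pricing.outputPerMTokenCents
  );
}

// Values taken from the example entry; token counts are made up.
const examplePricing: ModelPricing = { inputPerMTokenCents: 3, outputPerMTokenCents: 15 };
const exampleCost = estimateCostCents(examplePricing, 2_000_000, 1_000_000);
console.log(exampleCost); // 21
```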

Managing the Catalog

Via CLI (operators only):

# List all entries
rensei catalog list

# Show one entry
rensei catalog show claude-opus-4-1

# Deprecate an entry (soft-deprecation; existing profiles still work)
rensei catalog deprecate claude-opus-4-1 --replaced-by cat_claude_opus_4_2

# Refresh catalog from seed data
rensei catalog sync

Via UI:

Go to Settings → Model Profiles → Model Catalog. Click Add Model to create a new entry. All fields are validated against the provider registry (e.g., auth modes must be supported by the provider).

Via API:

The model catalog management API (/api/admin/model-catalog) is operator-only. Operator-level catalog configuration is covered in the operator docs.


Profiles: The Dispatch Configuration

A profile is a resolved model specification with auth mode, provider config, and scope. It's what you actually use when dispatching a workflow or session. Think of it as a "model choice" you can name and version.

Profile Schema

interface ProfileDTO {
  id: string
  scope: 'system' | 'org' | 'project'       // Where it's defined
  orgId: string | null
  projectId: string | null
  name: string                                // e.g. "default", "fast", "reasoning"
  slug: string                                // e.g. "default", "fast", "reasoning"
  description: string | null
  provider: ProviderId                        // 'claude', 'codex', 'gemini', etc.
  modelCatalogId: string | null               // Link to catalog entry for pricing/capabilities
  effort: string | null                       // 'low' | 'medium' | 'high' if catalog supports it
  subAgent: SubAgentOverride | null           // Optional: override for sub-agent routing
  providerConfig: ProviderConfig              // Provider-specific options (context window, etc.)
  authMode: AuthMode                          // 'byok' | 'metered' | 'shared' | 'host-session' | 'local'
  credentialId: string | null                 // For BYOK: link to the stored API key
  archived: boolean
  createdAt: string
  updatedAt: string
}
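
The relationship between a profile and its catalog entry can be sketched as a validation step. This is only an illustration under stated assumptions: the trimmed `ProfileLike` and `CatalogEntryLike` types, the `validateProfile` function, and the specific rules (BYOK needs a credential, auth mode and effort must be offered by the entry) are inferred from the schema comments, not taken from the platform's actual validator.

```typescript
// Hypothetical validation sketch; types are trimmed copies of the DTOs above.
type AuthMode = 'byok' | 'metered' | 'shared' | 'host-session' | 'local';

interface CatalogEntryLike {
  id: string;
  efforts: { value: string }[];
  authModes: AuthMode[];
}

interface ProfileLike {
  modelCatalogId: string | null;
  effort: string | null;
  authMode: AuthMode;
  credentialId: string | null;
}

function validateProfile(profile: ProfileLike, entry: CatalogEntryLike | null): string[] {
  const problems: string[] = [];
  // Assumed rule: BYOK profiles must link a stored API key.
  if (profile.authMode === 'byok' && profile.credentialId === null) {
    problems.push('byok profiles need a credentialId');
  }
  // Custom model ID with no catalog link: nothing further to check against.
  if (entry === null) return problems;
  if (!entry.authModes.some((m) => m === profile.authMode)) {
    problems.push(`auth mode "${profile.authMode}" is not offered by ${entry.id}`);
  }
  if (profile.effort !== null && !entry.efforts.some((e) => e.value === profile.effort)) {
    problems.push(`effort "${profile.effort}" is not offered by ${entry.id}`);
  }
  return problems;
}

const opusEntry: CatalogEntryLike = {
  id: 'cat_claude_opus_4_1',
  efforts: [{ value: 'low' }, { value: 'medium' }, { value: 'high' }],
  authModes: ['byok', 'metered', 'shared', 'host-session'],
};

const okProblems = validateProfile(
  { modelCatalogId: opusEntry.id, effort: 'low', authMode: 'metered', credentialId: null },
  opusEntry,
);
const badProblems = validateProfile(
  { modelCatalogId: opusEntry.id, effort: 'max', authMode: 'local', credentialId: null },
  opusEntry,
);
console.log(okProblems.length, badProblems.length);
```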

Creating a Profile

Via UI:

  1. Settings → Model Profiles → New Profile.
  2. Choose a name (e.g. "fast-claude", "reasoning-gpt").
  3. Select provider (Anthropic, OpenAI, etc.).
  4. Select auth mode.
  5. If BYOK: select the API key credential.
  6. If local: enter the endpoint URL.
  7. Select a model from the catalog (optional; you can also paste a custom model ID).
  8. Set reasoning effort if the model supports it.
  9. Advanced: add provider-specific config (context window overrides, etc.).
  10. Click Create.

Via CLI:

rensei profile create \
  --name "fast-claude" \
  --provider claude \
  --model-id "claude-opus-4-1-20250805" \
  --auth-mode metered \
  --scope project \
  --project-id my-project

Via API:

curl -X POST -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "fast-claude",
    "provider": "claude",
    "modelId": "claude-opus-4-1-20250805",
    "authMode": "metered",
    "effort": "low",
    "scope": "project",
    "projectId": "my-project"
  }' \
  https://app.rensei.ai/api/org/projects/my-project/profiles

Routing: Scope Cascade and WorkType Overrides

When a workflow or agent requests an LLM without specifying a profile, the dispatcher resolves the best match using a cascade:

Resolution Order

  1. Explicit profile selection - If the workflow/session specifies profileId, use that.
  2. Node-level override - If an LLM node in a workflow sets modelId or effort, those values replace the corresponding defaults of the resolved profile.
  3. Project-scoped workType override - If a project has a work-type routing rule (e.g. "research → 'reasoning-gpt'"), use it.
  4. Project default - If the project has a default profile, use it.
  5. Org-scoped workType override - If the org has a work-type routing rule, use it.
  6. Org default - If the org has a default profile, use it.
  7. System default - The platform's hardcoded system default (Anthropic Claude with metered auth).
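
The resolution order above can be sketched as a single function. This is a simplified illustration, not the dispatcher's actual code: `ScopeRouting`, `DispatchRequest`, `resolveProfile`, and the `system-default` profile ID are hypothetical names. Step 2 (node-level override) is left out because it adjusts modelId or effort rather than selecting a profile.

```typescript
// Hypothetical sketch of the profile cascade (steps 1 and 3-7).
type WorkType = 'research' | 'development' | 'qa' | 'acceptance';

interface ScopeRouting {
  workTypeProfiles: Partial<Record<WorkType, string>>; // workType -> profileId
  defaultProfileId: string | null;
}

interface DispatchRequest {
  profileId?: string;
  workType?: WorkType;
}

interface Resolution {
  profileId: string;
  source: string; // which rule matched, useful for debugging
}

const SYSTEM_DEFAULT_PROFILE_ID = 'system-default'; // assumed placeholder ID

function resolveProfile(
  req: DispatchRequest,
  project: ScopeRouting | null,
  org: ScopeRouting | null,
): Resolution {
  // 1. Explicit selection always wins.
  if (req.profileId) return { profileId: req.profileId, source: 'explicit' };

  // 3-6. Project scope is consulted fully before org scope.
  const scopes: Array<[string, ScopeRouting | null]> = [
    ['project', project],
    ['org', org],
  ];
  for (const [name, scope] of scopes) {
    if (!scope) continue;
    const override = req.workType ? scope.workTypeProfiles[req.workType] : undefined;
    if (override) return { profileId: override, source: `${name} workType override` };
    if (scope.defaultProfileId) {
      return { profileId: scope.defaultProfileId, source: `${name} default` };
    }
  }

  // 7. Nothing matched: fall back to the system default.
  return { profileId: SYSTEM_DEFAULT_PROFILE_ID, source: 'system default' };
}

const projectRouting: ScopeRouting = {
  workTypeProfiles: { research: 'reasoning-gpt' },
  defaultProfileId: 'proj-default',
};
const orgRouting: ScopeRouting = {
  workTypeProfiles: { qa: 'org-qa' },
  defaultProfileId: 'org-default',
};

console.log(resolveProfile({ workType: 'research' }, projectRouting, orgRouting));
```

Note that under this ordering a project default (step 4) takes precedence over an org-level work-type rule (step 5), so a QA request in the example resolves to `proj-default`, not `org-qa`.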

Work-Type Routing

Work types represent the lifecycle stage of a request (research, development, QA, acceptance). Route different models to different work types to balance cost and quality.

Example routing policy:

# research → same model at low effort (cheaper, faster)
research:
  provider: claude
  modelId: claude-opus-4-1
  effort: low

# development → same model at balanced effort
development:
  provider: claude
  modelId: claude-opus-4-1
  effort: medium

# qa → same model at high effort (most expensive, deepest reasoning)
qa:
  provider: claude
  modelId: claude-opus-4-1
  effort: high
  providerConfig:
    contextWindow: 200000

# acceptance → human review (no LLM dispatch)
acceptance: null
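
One way to read the policy above is as a typed lookup in which a null route means "no LLM dispatch". The sketch below mirrors the YAML in TypeScript. The `RoutingPolicy`, `DispatchPlan`, and `planDispatch` names are hypothetical, and the real policy format may carry more fields.

```typescript
// Hypothetical types mirroring the YAML policy; names are illustrative.
type WorkType = 'research' | 'development' | 'qa' | 'acceptance';

interface WorkTypeRoute {
  provider: string;
  modelId: string;
  effort: 'low' | 'medium' | 'high';
  providerConfig?: { contextWindow?: number };
}

// null means the stage is handled by human review, with no LLM dispatch.
type RoutingPolicy = Record<WorkType, WorkTypeRoute | null>;

type DispatchPlan =
  | { kind: 'llm'; route: WorkTypeRoute }
  | { kind: 'human-review' };

const examplePolicy: RoutingPolicy = {
  research: { provider: 'claude', modelId: 'claude-opus-4-1', effort: 'low' },
  development: { provider: 'claude', modelId: 'claude-opus-4-1', effort: 'medium' },
  qa: {
    provider: 'claude',
    modelId: 'claude-opus-4-1',
    effort: 'high',
    providerConfig: { contextWindow: 200000 },
  },
  acceptance: null,
};

function planDispatch(policy: RoutingPolicy, workType: WorkType): DispatchPlan {
  const route = policy[workType];
  return route === null ? { kind: 'human-review' } : { kind: 'llm', route };
}

console.log(planDispatch(examplePolicy, 'acceptance').kind); // "human-review"
```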

Set work-type routing (UI):

  1. Settings → Model Profiles → Work-Type Routing.
  2. Select scope (org or project).
  3. For each work type, choose a default profile or model.
  4. Click Save.

Cedar Policy Enforcement

The Cedar policy engine intercepts profile resolution to enforce compliance rules. Example:

// No metered auth for regulated orgs.
// Cedar expresses denials with forbid; this assumes principal.org is a string attribute.
forbid (principal, action == Action::"agent:dispatch", resource)
when {
  ["regulated-org-1", "regulated-org-2"].contains(principal.org) &&
  resource.profile.authMode == "metered"
};

If a policy denies the profile, dispatch fails with a clear error message.


Provider-Specific Config

Each provider accepts its own configuration options via the providerConfig block:

Anthropic

{
  "anthropic": {
    "contextWindow": 200000,
    "cacheControl": true,
    "budget": { "maxInputTokens": 100000 }
  }
}

OpenAI (Codex)

{
  "openai": {
    "serviceTier": "auto",
    "endpoint": "https://api.openai.com/v1"
  }
}

Gemini

{
  "gemini": {
    "safetySettings": [
      { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE" }
    ]
  }
}

Consult each provider's page (for example, /docs/model-routing/providers/anthropic) for the full schema.


Deprecation & Migration

When a model reaches end-of-life, mark it as deprecated and point to a successor:

rensei catalog deprecate claude-3-sonnet-20240229 \
  --replaced-by cat_claude_opus_4_1

What happens:

  • The model is hidden from the "New Profile" dropdown.
  • Existing profiles still reference it (backward compatibility).
  • Workflows/agents that use it still work.
  • The UI shows a migration hint: "This model is deprecated. Consider switching to Claude Opus 4.1."
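
The migration hint relies on following replacedByCatalogId from a deprecated entry to its successor. A minimal sketch of that lookup is below, assuming successors can themselves be deprecated and chained. `resolveSuccessor` and the `cat_claude_3_sonnet` ID are hypothetical, and the cycle guard is a defensive assumption, not documented behavior.

```typescript
// Hypothetical sketch: follow replacedByCatalogId to the first live entry.
interface CatalogEntryRef {
  id: string;
  deprecated: boolean;
  replacedByCatalogId: string | null;
}

function resolveSuccessor(startId: string, entries: CatalogEntryRef[]): CatalogEntryRef | null {
  const byId: Record<string, CatalogEntryRef> = {};
  for (const e of entries) byId[e.id] = e;

  const visited: Record<string, boolean> = {};
  let current: CatalogEntryRef | undefined = byId[startId];
  while (current !== undefined && current.deprecated) {
    const nextId = current.replacedByCatalogId;
    if (nextId === null) break; // deprecated, but no successor recorded
    if (visited[current.id]) return null; // cycle in the successor links
    visited[current.id] = true;
    current = byId[nextId];
  }
  // undefined here means the start ID or a successor link was not found.
  return current !== undefined ? current : null;
}

const catalogRefs: CatalogEntryRef[] = [
  { id: 'cat_claude_3_sonnet', deprecated: true, replacedByCatalogId: 'cat_claude_opus_4_1' },
  { id: 'cat_claude_opus_4_1', deprecated: false, replacedByCatalogId: null },
];

const successor = resolveSuccessor('cat_claude_3_sonnet', catalogRefs);
console.log(successor ? successor.id : 'none'); // "cat_claude_opus_4_1"
```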

When you're ready to stop new profiles from referencing an old model, archive it:

rensei catalog archive claude-3-sonnet-20240229

Archiving blocks new profiles from referencing the model but does not break existing profiles. Archive only after users have had enough time to migrate.


Cost Insights

View cost by provider and pool from the CLI:

# Cost rollup over a rolling window: total, by-provider, by-pool
rensei capacity cost

# Widen the window (default 24h)
rensei capacity cost --window=7d
rensei capacity cost --window=30d

# Machine-readable
rensei capacity cost --json

Finer-grained slices (by model, by work type, by auth mode) live in the Factory Analytics cost breakdown on the platform UI and its metrics API.

Use this data to:

  • Optimize work-type routing (e.g., cheaper models for research stages).
  • Identify runaway models and deprecate them.
  • Forecast budget for upcoming deployments.

The OSS two-axis provider model

The execution layer underneath all of this is the open-source donmai runner, and its provider architecture is documented canonically on donmai.dev - read those pages for how a run is actually assembled:

  • Providers - the two-axis model - a run pairs a harness (the loop driver: Claude Code, Codex, OpenCode, Antigravity, Amp, or the in-box raw loop) with a model endpoint (the company serving the model: Anthropic, OpenAI, Google, or a local server).
  • Capability matrix - which harness × endpoint cells are valid. The matrix is computed from each side's declared transports and auth modes, never hand-authored.
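
As a rough illustration of what "computed, never hand-authored" can mean, the sketch below derives a matrix from each side's declarations. The rule used here (a cell is valid when harness and endpoint share at least one transport and at least one auth mode) is an assumption, and the transport names and sample declarations are made up. Consult donmai.dev for the real computation.

```typescript
// Hypothetical sketch; the validity rule and sample data are assumptions.
interface Declared {
  name: string;
  transports: string[];
  authModes: string[];
}

function intersects(a: string[], b: string[]): boolean {
  return a.some((x) => b.indexOf(x) !== -1);
}

// Returns a map keyed "harness:endpoint" -> whether the pairing is valid.
function computeMatrix(harnesses: Declared[], endpoints: Declared[]): Record<string, boolean> {
  const cells: Record<string, boolean> = {};
  for (const h of harnesses) {
    for (const e of endpoints) {
      cells[`${h.name}:${e.name}`] =
        intersects(h.transports, e.transports) && intersects(h.authModes, e.authModes);
    }
  }
  return cells;
}

// Made-up declarations, for illustration only.
const sampleHarnesses: Declared[] = [
  { name: 'claude-code', transports: ['http-api'], authModes: ['byok', 'host-session'] },
];
const sampleEndpoints: Declared[] = [
  { name: 'anthropic', transports: ['http-api'], authModes: ['byok', 'metered'] },
  { name: 'local-server', transports: ['local-socket'], authModes: ['local'] },
];

const sampleMatrix = computeMatrix(sampleHarnesses, sampleEndpoints);
console.log(sampleMatrix);
```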

The platform's catalog/profile/routing layer documented on this page sits on top of those cells: a catalog entry's provider + auth mode resolves to one cell of the OSS matrix at dispatch.

Google is one provider, two cells. The platform collapsed the former gemini-cli provider into gemini: key-based auth modes (byok / metered / shared) run API-direct, while local / host-session rewrite at dispatch to the Antigravity agy CLI harness under the user's own Google subscription. See Gemini provider for the full mapping.
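
The Gemini split described above amounts to a mapping from auth mode to execution path. A minimal sketch follows; `geminiExecutionFor` and the `GeminiExecution` labels are illustrative names and not platform identifiers.

```typescript
// Hypothetical sketch of the auth-mode split for the gemini provider.
type AuthMode = 'byok' | 'metered' | 'shared' | 'host-session' | 'local';
type GeminiExecution = 'api-direct' | 'antigravity-agy-cli';

function geminiExecutionFor(authMode: AuthMode): GeminiExecution {
  // Key-based modes call the API directly.
  const keyBased = authMode === 'byok' || authMode === 'metered' || authMode === 'shared';
  // local and host-session are rewritten to the Antigravity agy CLI harness.
  return keyBased ? 'api-direct' : 'antigravity-agy-cli';
}

console.log(geminiExecutionFor('metered')); // "api-direct"
console.log(geminiExecutionFor('host-session')); // "antigravity-agy-cli"
```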

Quick Reference

| Concept | Definition | Where to set |
|---|---|---|
| Catalog entry | Immutable registry of a model's capabilities and pricing | Settings → Model Catalog |
| Profile | Your named choice of model + auth mode + config | Settings → Model Profiles |
| Default profile | The profile used if no other routing rule matches | Settings → Model Profiles → Defaults |
| Work-type routing | Model selection by lifecycle stage (research → fast, QA → deep) | Settings → Model Profiles → Work-Type Routing |
| Provider config | Model-specific tuning (context window, safety, effort) | Profile editor → Advanced |
