Model Catalog & Routing
Scope cascade × workType × pool model routing.
The model catalog is the platform's registry of LLM models, their providers, pricing, and capabilities. Routing rules determine which model is dispatched when a workflow, agent, or session requests an LLM without specifying one explicitly.
The Model Catalog
Every model the platform can dispatch must have a catalog entry. Entries are immutable once published and keyed by (provider, modelId).
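The keying and immutability rules above can be illustrated with a small in-memory registry. This is a minimal sketch, not the platform's storage layer: `CatalogRegistry`, `CatalogEntryLite`, and `catalogKey` are hypothetical names, and the entry is trimmed to the three fields that matter for keying.

```typescript
// Hypothetical, trimmed-down entry; the full schema has more fields.
interface CatalogEntryLite {
  id: string;
  provider: string;
  modelId: string;
}

// Composite key built from the (provider, modelId) pair.
function catalogKey(provider: string, modelId: string): string {
  return JSON.stringify([provider, modelId]);
}

class CatalogRegistry {
  private readonly entries = new Map<string, Readonly<CatalogEntryLite>>();

  // Publishing freezes a copy of the entry and rejects a second entry for the same key.
  publish(entry: CatalogEntryLite): Readonly<CatalogEntryLite> {
    const key = catalogKey(entry.provider, entry.modelId);
    if (this.entries.has(key)) {
      throw new Error(`catalog entry already published for ${key}`);
    }
    const frozen = Object.freeze({ ...entry });
    this.entries.set(key, frozen);
    return frozen;
  }

  lookup(provider: string, modelId: string): Readonly<CatalogEntryLite> | undefined {
    return this.entries.get(catalogKey(provider, modelId));
  }
}
```

Publishing a second entry with the same provider and model ID throws, which is one way to keep the (provider, modelId) pair unique.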
Catalog Entry Schema
interface ModelCatalogEntryDTO {
  id: string
  provider: ProviderId // 'claude', 'codex', 'gemini', etc.
  modelId: string // 'claude-opus-4-1', 'gpt-5', etc.
  displayName: string // Human label shown in dropdowns
  description: string | null
  deprecated: boolean // Soft deprecation flag
  replacedByCatalogId: string | null // Link to successor entry
  efforts: EffortOption[] // Reasoning effort tiers if supported
  authModes: AuthMode[] // 'byok' | 'metered' | 'shared' | 'host-session' | 'local'
  contextWindow: number | null // Max input tokens
  capabilityTags: string[] // ['long-context', 'reasoning', 'vision', ...]
  pricing: ModelPricing | null // Input/output cost per M tokens, or flat rate
  createdAt: string
  updatedAt: string
}
Example Entry
{
  "id": "cat_claude_opus_4_1",
  "provider": "claude",
  "modelId": "claude-opus-4-1-20250514",
  "displayName": "Claude Opus 4.1",
  "description": "Latest Anthropic flagship for complex reasoning.",
  "deprecated": false,
  "replacedByCatalogId": null,
  "efforts": [
    { "value": "low", "label": "Fast" },
    { "value": "medium", "label": "Balanced", "providerParam": { "reasoningEffort": "medium" } },
    { "value": "high", "label": "Deep", "providerParam": { "reasoningEffort": "high" } }
  ],
  "authModes": ["byok", "metered", "shared", "host-session"],
  "contextWindow": 200000,
  "capabilityTags": ["long-context", "reasoning", "tool-use"],
  "pricing": {
    "inputPerMTokenCents": 3,
    "outputPerMTokenCents": 15
  }
}
Managing the Catalog
Via CLI (operators only):
# List all entries
rensei catalog list
# Show one entry
rensei catalog show claude-opus-4-1
# Deprecate an entry (soft-deprecation; existing profiles still work)
rensei catalog deprecate claude-opus-4-1 --replaced-by cat_claude_opus_4_2
# Refresh catalog from seed data
rensei catalog sync
Via UI:
Go to Settings → Model Profiles → Model Catalog. Click Add Model to create a new entry. All fields are validated against the provider registry (e.g., auth modes must be supported by the provider).
Via API:
The model catalog management API (/api/admin/model-catalog) is operator-only. Operator-level catalog configuration is covered in the operator docs.
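The validation rule mentioned for the UI (an entry's auth modes must be supported by its provider) can be sketched as a simple lookup. The registry contents below are assumptions pieced together from examples on this page, and `PROVIDER_AUTH_MODES` and `validateAuthModes` are hypothetical names rather than platform APIs.

```typescript
type AuthMode = 'byok' | 'metered' | 'shared' | 'host-session' | 'local';

// Hypothetical provider registry, for illustration only.
const PROVIDER_AUTH_MODES: Record<string, AuthMode[]> = {
  claude: ['byok', 'metered', 'shared', 'host-session'],
  gemini: ['byok', 'metered', 'shared', 'host-session', 'local'],
};

// Returns a list of human-readable problems; an empty list means the entry passes.
function validateAuthModes(provider: string, authModes: AuthMode[]): string[] {
  const supported = PROVIDER_AUTH_MODES[provider];
  if (!supported) {
    return [`unknown provider: ${provider}`];
  }
  return authModes
    .filter((mode) => !supported.includes(mode))
    .map((mode) => `auth mode '${mode}' is not supported by provider '${provider}'`);
}
```

Returning a list of problems instead of throwing on the first one lets a form show every invalid field at once.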
Profiles: The Dispatch Configuration
A profile is a resolved model specification with auth mode, provider config, and scope. It's what you actually use when dispatching a workflow or session. Think of it as a "model choice" you can name and version.
Profile Schema
interface ProfileDTO {
  id: string
  scope: 'system' | 'org' | 'project' // Where it's defined
  orgId: string | null
  projectId: string | null
  name: string // e.g. "default", "fast", "reasoning"
  slug: string // e.g. "default", "fast", "reasoning"
  description: string | null
  provider: ProviderId // 'claude', 'codex', 'gemini', etc.
  modelCatalogId: string | null // Link to catalog entry for pricing/capabilities
  effort: string | null // 'low' | 'medium' | 'high' if catalog supports it
  subAgent: SubAgentOverride | null // Optional: override for sub-agent routing
  providerConfig: ProviderConfig // Provider-specific options (context window, etc.)
  authMode: AuthMode // 'byok' | 'metered' | 'shared' | 'host-session' | 'local'
  credentialId: string | null // For BYOK: link to the stored API key
  archived: boolean
  createdAt: string
  updatedAt: string
}
Creating a Profile
Via UI:
- Settings → Model Profiles → New Profile.
- Choose a name (e.g. "fast-claude", "reasoning-gpt").
- Select provider (Anthropic, OpenAI, etc.).
- Select auth mode.
- If BYOK: select the API key credential.
- If local: enter the endpoint URL.
- Select a model from the catalog (optional; you can also paste a custom model ID).
- Set reasoning effort if the model supports it.
- Advanced: add provider-specific config (context window overrides, etc.).
- Click Create.
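The auth-mode rules in the steps above (BYOK needs a credential, local needs an endpoint URL) can be checked before a profile is submitted. This is a sketch under assumptions: `NewProfileInput`, `validateNewProfile`, and the `endpointUrl` field name are hypothetical, and the platform's actual validation may differ.

```typescript
type AuthMode = 'byok' | 'metered' | 'shared' | 'host-session' | 'local';

// Hypothetical shape of the form input; not the platform's request type.
interface NewProfileInput {
  name: string;
  provider: string;
  authMode: AuthMode;
  credentialId?: string | null; // stored API key, needed for BYOK
  endpointUrl?: string | null;  // assumed field name for the local endpoint
}

// Returns a list of problems; an empty list means the input looks complete.
function validateNewProfile(input: NewProfileInput): string[] {
  const errors: string[] = [];
  if (input.name.trim() === '') {
    errors.push('name is required');
  }
  if (input.authMode === 'byok' && !input.credentialId) {
    errors.push('BYOK profiles need an API key credential');
  }
  if (input.authMode === 'local') {
    if (!input.endpointUrl) {
      errors.push('local profiles need an endpoint URL');
    } else if (!/^https?:\/\//.test(input.endpointUrl)) {
      errors.push('endpoint URL must start with http:// or https://');
    }
  }
  return errors;
}
```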
Via CLI:
rensei profile create \
--name "fast-claude" \
--provider claude \
--model-id "claude-opus-4-1-20250514" \
--auth-mode metered \
--scope project \
--project-id my-project
Via API:
curl -X POST -H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "fast-claude",
"provider": "claude",
"modelId": "claude-opus-4-1-20250514",
"authMode": "metered",
"effort": "low",
"scope": "project",
"projectId": "my-project"
}' \
https://app.rensei.ai/api/org/projects/my-project/profiles
Routing: Scope Cascade and WorkType Overrides
When a workflow or agent requests an LLM without specifying a profile, the dispatcher resolves the best match using a cascade:
Resolution Order
- Explicit profile selection - If the workflow/session specifies profileId, use that.
- Node-level override - If an LLM node in a workflow sets modelId or effort, use it (overrides profile default).
- Project-scoped workType override - If a project has a work-type routing rule (e.g. "research → 'reasoning-gpt'"), use it.
- Project default - If the project has a default profile, use it.
- Org-scoped workType override - If the org has a work-type routing rule, use it.
- Org default - If the org has a default profile, use it.
- System default - The platform's hardcoded system default (Anthropic Claude with metered auth).
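The cascade above can be sketched as a pure function. This is an illustration under stated assumptions, not the dispatcher's implementation: all type and function names are hypothetical, `'system-default'` is a placeholder ID, and the node-level override is interpreted as replacing the chosen profile's `modelId`/`effort` rather than selecting a different profile.

```typescript
type WorkType = 'research' | 'development' | 'qa' | 'acceptance';

// Routing settings for one scope (project or org).
interface ScopeRouting {
  workTypeProfiles: Partial<Record<WorkType, string>>; // work type -> profile id
  defaultProfileId: string | null;
}

interface DispatchRequest {
  profileId?: string; // explicit profile selection
  workType?: WorkType;
  nodeOverride?: { modelId?: string; effort?: string }; // set on an LLM node
}

interface ResolvedDispatch {
  profileId: string;
  source: string; // which rule matched
  modelId?: string;
  effort?: string;
}

const SYSTEM_DEFAULT_PROFILE_ID = 'system-default'; // hypothetical id

function resolveDispatch(
  req: DispatchRequest,
  project: ScopeRouting | null,
  org: ScopeRouting | null,
): ResolvedDispatch {
  const base = pickProfile(req, project, org);
  // Node-level modelId/effort replace the chosen profile's defaults.
  return { ...base, ...req.nodeOverride };
}

function pickProfile(
  req: DispatchRequest,
  project: ScopeRouting | null,
  org: ScopeRouting | null,
): { profileId: string; source: string } {
  if (req.profileId) {
    return { profileId: req.profileId, source: 'explicit' };
  }
  // Project rules are checked before org rules; within a scope,
  // the work-type override is checked before the scope default.
  const scopes: Array<[string, ScopeRouting | null]> = [['project', project], ['org', org]];
  for (const [name, scope] of scopes) {
    if (!scope) continue;
    const byWorkType = req.workType ? scope.workTypeProfiles[req.workType] : undefined;
    if (byWorkType) {
      return { profileId: byWorkType, source: `${name} work-type override` };
    }
    if (scope.defaultProfileId) {
      return { profileId: scope.defaultProfileId, source: `${name} default` };
    }
  }
  return { profileId: SYSTEM_DEFAULT_PROFILE_ID, source: 'system default' };
}
```

Note that a project default takes precedence over an org-level work-type override, which follows directly from the order of the list.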
Work-Type Routing
Work types represent the lifecycle stage of a request (research, development, QA, acceptance). Route each work type to a model and effort level that fits it, balancing cost against quality.
Example routing policy:
# research → low effort (fastest, cheapest)
research:
  provider: claude
  modelId: claude-opus-4-1
  effort: low # Fast reasoning
# development → balanced effort
development:
  provider: claude
  modelId: claude-opus-4-1
  effort: medium
# qa → high effort (deepest reasoning, most expensive)
qa:
  provider: claude
  modelId: claude-opus-4-1
  effort: high
  providerConfig:
    contextWindow: 200000
# acceptance → human review (no LLM dispatch)
acceptance: null
Set work-type routing (UI):
- Settings → Model Profiles → Work-Type Routing.
- Select scope (org or project).
- For each work type, choose a default profile or model.
- Click Save.
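A routing policy like the example above maps each work type to either a model route or `null`, where `null` means no LLM is dispatched. The sketch below shows one way to consume such a policy; the type and function names are hypothetical and the policy values simply mirror the example.

```typescript
type WorkType = 'research' | 'development' | 'qa' | 'acceptance';
type Effort = 'low' | 'medium' | 'high';

interface WorkTypeRoute {
  provider: string;
  modelId: string;
  effort: Effort;
}

// null means "no LLM dispatch" (for example, human review).
type RoutingPolicy = Record<WorkType, WorkTypeRoute | null>;

const examplePolicy: RoutingPolicy = {
  research: { provider: 'claude', modelId: 'claude-opus-4-1', effort: 'low' },
  development: { provider: 'claude', modelId: 'claude-opus-4-1', effort: 'medium' },
  qa: { provider: 'claude', modelId: 'claude-opus-4-1', effort: 'high' },
  acceptance: null,
};

type RouteDecision =
  | { kind: 'dispatch'; route: WorkTypeRoute }
  | { kind: 'human-review' };

function decideRoute(policy: RoutingPolicy, workType: WorkType): RouteDecision {
  const route = policy[workType];
  return route === null ? { kind: 'human-review' } : { kind: 'dispatch', route };
}
```

Modelling the result as a tagged union forces callers to handle the no-dispatch case explicitly instead of checking for `null` in several places.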
Cedar Policy Enforcement
The Cedar policy engine intercepts profile resolution to enforce compliance rules. Example:
// No metered auth for regulated orgs.
// Cedar expresses a denial with `forbid`; the `Action` entity type name is assumed here.
forbid (principal, action == Action::"agent:dispatch", resource)
when {
  ["regulated-org-1", "regulated-org-2"].contains(principal.org) &&
  resource.profile.authMode == "metered"
};
If a policy denies the profile, dispatch fails with a clear error message.
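To make the effect of the rule concrete, here is a plain TypeScript mirror of the same check. It is not the Cedar engine and not how the platform evaluates policies; `checkDispatchPolicy`, `assertDispatchAllowed`, and the error wording are hypothetical.

```typescript
type AuthMode = 'byok' | 'metered' | 'shared' | 'host-session' | 'local';

// Org IDs taken from the example policy.
const REGULATED_ORGS = new Set(['regulated-org-1', 'regulated-org-2']);

interface PolicyDecision {
  allowed: boolean;
  reason?: string;
}

// Mirrors the example rule: regulated orgs may not dispatch with metered auth.
function checkDispatchPolicy(orgId: string, authMode: AuthMode): PolicyDecision {
  if (REGULATED_ORGS.has(orgId) && authMode === 'metered') {
    return { allowed: false, reason: `org '${orgId}' may not dispatch with metered auth` };
  }
  return { allowed: true };
}

// Turns a denial into an error that names the reason.
function assertDispatchAllowed(orgId: string, authMode: AuthMode): void {
  const decision = checkDispatchPolicy(orgId, authMode);
  if (!decision.allowed) {
    throw new Error(`dispatch denied by policy: ${decision.reason}`);
  }
}
```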
Provider-Specific Config
Each provider accepts its own configuration options via the providerConfig block:
Anthropic
{
  "anthropic": {
    "contextWindow": 200000,
    "cacheControl": true,
    "budget": { "maxInputTokens": 100000 }
  }
}
OpenAI (Codex)
{
  "openai": {
    "serviceTier": "auto",
    "endpoint": "https://api.openai.com/v1"
  }
}
Gemini
{
  "gemini": {
    "safetySettings": [
      { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE" }
    ]
  }
}
Consult each provider's page (for example, /docs/model-routing/providers/anthropic) for the full schema.
Deprecation & Migration
When a model reaches end-of-life, mark it as deprecated and point to a successor:
rensei catalog deprecate claude-3-sonnet-20240229 \
--replaced-by cat_claude_opus_4_1
What happens:
- The model is hidden from the "New Profile" dropdown.
- Existing profiles still reference it (backward compatibility).
- Workflows/agents that use it still work.
- The UI shows a migration hint: "This model is deprecated. Consider switching to Claude Opus 4.1."
When you're ready to block new references to an old model, archive it:
rensei catalog archive claude-3-sonnet-20240229
Archiving prevents new profiles from referencing the model but does not break existing profiles. Archive only after allowing sufficient migration time.
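The `replacedByCatalogId` link makes it possible to walk from a deprecated entry to its current successor and build the migration hint. The sketch below assumes successors can themselves be deprecated (forming a chain) and guards against cycles; `CatalogLink`, `resolveSuccessor`, and `migrationHint` are hypothetical names.

```typescript
interface CatalogLink {
  id: string;
  displayName: string;
  deprecated: boolean;
  replacedByCatalogId: string | null;
}

// Follows replacedByCatalogId until reaching an entry that is not deprecated
// or has no successor. Returns null on a dangling link or a cycle.
function resolveSuccessor(startId: string, byId: Map<string, CatalogLink>): CatalogLink | null {
  const seen = new Set<string>();
  let current = byId.get(startId) ?? null;
  while (current && current.deprecated && current.replacedByCatalogId) {
    if (seen.has(current.id)) return null; // cycle guard
    seen.add(current.id);
    current = byId.get(current.replacedByCatalogId) ?? null;
  }
  return current;
}

// Builds the hint text; returns null when the entry is unknown or not deprecated.
function migrationHint(id: string, byId: Map<string, CatalogLink>): string | null {
  const entry = byId.get(id);
  if (!entry || !entry.deprecated) return null;
  const successor = resolveSuccessor(id, byId);
  if (!successor || successor.deprecated) return 'This model is deprecated.';
  return `This model is deprecated. Consider switching to ${successor.displayName}.`;
}
```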
Cost Insights
View cost by provider and pool from the CLI:
# Cost rollup over a rolling window: total, by-provider, by-pool
rensei capacity cost
# Widen the window (default 24h)
rensei capacity cost --window=7d
rensei capacity cost --window=30d
# Machine-readable
rensei capacity cost --json
Finer-grained slices (by model, by work type, by auth mode) live in the Factory Analytics cost breakdown on the platform UI and its metrics API.
Use this data to:
- Optimize work-type routing (e.g., cheaper models for research stages).
- Identify runaway models and deprecate them.
- Forecast budget for upcoming deployments.
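For rough budget forecasts, per-token pricing from a catalog entry can be turned into a cost estimate. The sketch below assumes the per-million-token fields shown in the example catalog entry and ignores flat-rate pricing; the function names and the numbers in the usage note are illustrative, not real prices.

```typescript
interface ModelPricing {
  inputPerMTokenCents: number;
  outputPerMTokenCents: number;
}

interface RunEstimate {
  inputTokens: number;
  outputTokens: number;
}

// Cost in cents for one run: tokens are billed per million.
function estimateCostCents(pricing: ModelPricing, run: RunEstimate): number {
  return (
    (run.inputTokens / 1e6) * pricing.inputPerMTokenCents +
    (run.outputTokens / 1e6) * pricing.outputPerMTokenCents
  );
}

// Sums the estimate over a set of planned runs.
function forecastCents(pricing: ModelPricing, runs: RunEstimate[]): number {
  return runs.reduce((total, run) => total + estimateCostCents(pricing, run), 0);
}
```

With illustrative pricing of 300 cents per million input tokens and 1500 cents per million output tokens, a run with 2,000,000 input tokens and 1,000,000 output tokens comes to 2 × 300 + 1 × 1500 = 2100 cents.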
The OSS two-axis provider model
The execution layer underneath all of this is the open-source donmai runner, and its provider architecture is documented canonically on donmai.dev - read those pages for how a run is actually assembled:
- Providers - the two-axis model - a run pairs a harness (the loop driver: Claude Code, Codex, OpenCode, Antigravity, Amp, or the in-box rawloop) with a model endpoint (the company serving the model: Anthropic, OpenAI, Google, or a local server).
- Capability matrix - which harness × endpoint cells are valid. The matrix is computed from each side's declared transports and auth modes, never hand-authored.
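A matrix computed from declarations, as opposed to a hand-authored one, can be sketched as a set intersection. The validity rule used here (a cell is valid when harness and endpoint share at least one transport and at least one auth mode) is an assumption made for illustration; the canonical rule is defined in the donmai documentation, and all names below are hypothetical.

```typescript
// Declarations for one side of the matrix (a harness or a model endpoint).
interface MatrixSide {
  name: string;
  transports: string[];
  authModes: string[];
}

function intersects(a: string[], b: string[]): boolean {
  return a.some((item) => b.includes(item));
}

// Assumed rule: both a transport and an auth mode must be shared.
function isValidCell(harness: MatrixSide, endpoint: MatrixSide): boolean {
  return (
    intersects(harness.transports, endpoint.transports) &&
    intersects(harness.authModes, endpoint.authModes)
  );
}

// Maps each harness name to the endpoint names it can be paired with.
function computeMatrix(harnesses: MatrixSide[], endpoints: MatrixSide[]): Record<string, string[]> {
  const matrix: Record<string, string[]> = {};
  for (const harness of harnesses) {
    matrix[harness.name] = endpoints
      .filter((endpoint) => isValidCell(harness, endpoint))
      .map((endpoint) => endpoint.name);
  }
  return matrix;
}
```

Because the matrix is derived, adding a transport or auth mode to one declaration updates every affected cell without editing a table by hand.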
The platform's catalog/profile/routing layer documented on this page sits on
top of those cells: a catalog entry's provider + auth mode resolves to one
cell of the OSS matrix at dispatch.
Google is one provider, two cells. The platform collapsed the former
gemini-cli provider into gemini: key-based auth modes (byok / metered /
shared) run API-direct, while local / host-session rewrite at dispatch to
the Antigravity agy CLI harness under the user's own Google subscription. See
Gemini provider for the full mapping.
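The auth-mode split described above can be written as a small mapping function. This is a sketch of the rule as stated in the paragraph, not the platform's dispatch code; `GeminiPath`, its two labels, and `geminiPathFor` are hypothetical names.

```typescript
type AuthMode = 'byok' | 'metered' | 'shared' | 'host-session' | 'local';

// Hypothetical labels for the two execution paths.
type GeminiPath = 'api-direct' | 'antigravity-agy-cli';

function geminiPathFor(authMode: AuthMode): GeminiPath {
  switch (authMode) {
    // Key-based auth modes run API-direct.
    case 'byok':
    case 'metered':
    case 'shared':
      return 'api-direct';
    // Subscription-based modes are rewritten to the Antigravity agy CLI harness.
    case 'local':
    case 'host-session':
      return 'antigravity-agy-cli';
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a mode is added.
      const unreachable: never = authMode;
      throw new Error(`unknown auth mode: ${String(unreachable)}`);
    }
  }
}
```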
Quick Reference
| Concept | Definition | Where to set |
|---|---|---|
| Catalog entry | Immutable registry record of a model's capabilities and pricing | Settings → Model Profiles → Model Catalog |
| Profile | Your named choice of model + auth mode + config | Settings → Model Profiles |
| Default profile | The profile used if no other routing rule matches | Settings → Model Profiles → Defaults |
| Work-type routing | Model selection by lifecycle stage (research → fast, QA → deep) | Settings → Model Profiles → Work-Type Routing |
| Provider config | Provider-specific tuning (context window, safety settings, service tier) | Profile editor → Advanced |