Skill Evolution

Per-agent proficiency and fleet gaps.

The Skill Evolution view tracks how individual agents' competency in different domains improves over time, helping you identify high-performing agents, training gaps, and opportunities for cross-agent knowledge transfer.

What Is Skill Proficiency?

Skill proficiency is a confidence score (0-1) per agent per domain, derived from:

  1. Observation success rate - How often agents' learned patterns led to positive outcomes
  2. Feedback patterns - User corrections and upvotes in that domain
  3. Recency - Recent successes are weighted more heavily than old ones
  4. Corroboration - Domains where the agent agrees with peers rank higher

Formula (simplified):

proficiency[agent, domain] = success_rate × feedback_boost × recency_weight × corroboration_factor
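
As a rough illustration of the simplified formula, the sketch below multiplies the four factors and clamps the result to the 0-1 range. The `ProficiencyFactors` shape, the field names, and the final clamp are assumptions made for this example; the page does not specify how each factor is normalized or whether `feedback_boost` can exceed 1.

```typescript
// Hypothetical input shape: one value per factor in the simplified formula.
interface ProficiencyFactors {
  successRate: number;         // observation success rate
  feedbackBoost: number;       // derived from corrections and upvotes
  recencyWeight: number;       // higher when successes are recent
  corroborationFactor: number; // higher when peers agree
}

// Keep the final score inside the documented 0-1 range (an assumption here).
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

function proficiency(f: ProficiencyFactors): number {
  return clamp01(
    f.successRate * f.feedbackBoost * f.recencyWeight * f.corroborationFactor
  );
}

// Strong success rate, but only moderate peer agreement: roughly 0.36.
const score = proficiency({
  successRate: 0.9,
  feedbackBoost: 0.8,
  recencyWeight: 1.0,
  corroborationFactor: 0.5,
});
```

Because the score is a product, a single weak factor pulls the whole value down, which is worth keeping in mind when reading low scores.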

Viewing Skill Evolution

The Skill Evolution panel appears on the Memory Dashboard (row 4.5) and shows:

  • Agent-domain matrix - Each agent's proficiency in key domains (architecture, security, performance, etc.)
  • Trend lines - How proficiency has evolved over your selected time range (7d, 30d, 90d)
  • Gaps - Domains where no agent has high proficiency (fleet-wide opportunity)
  • Strengths - Domains where multiple agents excel

Common Patterns

Ramp-Up Curve

agent-research proficiency over 30 days:
architecture  ▁▂▃▄▅▆▇▇▆▆  ← Learning phase, then plateau
performance   ▁▁▁▂▂▂▃▃▃▃  ← Slower adoption

Interpretation: The agent learns architecture quickly early in the window, then plateaus. Performance proficiency grows more slowly, possibly because the agent has had less exposure to performance work.

Steady High Performance

agent-qa security proficiency:
                █████████▓  ← Consistently high

Interpretation: Agent is reliably skilled; good candidate for peer mentoring or QA-critical decisions.

Declining Proficiency

agent-ops reliability over 90 days:
         ▇▇▆▆▅▄▃▂▁         ← Steady decline

Interpretation: Agent may be disabled, out of scope, or receiving contradictory feedback. Investigate.
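
One way to flag this pattern programmatically is to fit a straight line to an agent's weekly history and check the slope. The sketch below is a minimal example, not the dashboard's own logic: `HistoryBucket` mirrors the `history` entries described in the API Reference on this page, while `proficiencySlope`, `isDeclining`, and the -0.02 per week threshold are hypothetical. It assumes the buckets are sorted oldest to newest.

```typescript
// Mirrors the `history` entries returned when agentId + workType are supplied.
interface HistoryBucket {
  bucketStart: string;   // ISO-8601 week start
  proficiency: number;
  totalSessions: number;
}

// Ordinary least-squares slope of proficiency against bucket index,
// i.e. the average change in proficiency per weekly bucket.
function proficiencySlope(history: HistoryBucket[]): number {
  const n = history.length;
  if (n < 2) return 0; // not enough points to estimate a trend
  const meanX = (n - 1) / 2;
  const meanY = history.reduce((sum, b) => sum + b.proficiency, 0) / n;
  let num = 0;
  let den = 0;
  history.forEach((b, i) => {
    num += (i - meanX) * (b.proficiency - meanY);
    den += (i - meanX) * (i - meanX);
  });
  return num / den;
}

// Hypothetical threshold: flag a drop faster than 0.02 per week.
function isDeclining(history: HistoryBucket[], threshold = -0.02): boolean {
  return proficiencySlope(history) < threshold;
}
```

A slope-based check is less sensitive to a single bad week than comparing only the first and last buckets.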

Fleet Gap

compliance domain across all agents:
agent-research  ▂
agent-qa        ▁
agent-ops       ▃
agent-dev       ▁
                    (no agent >0.4)  ← Fleet is weak in compliance

Interpretation: Organization needs to invest in compliance learning; consider a specialized compliance-focused agent.

Interpreting Low Proficiency

Low scores can mean:

  1. Limited exposure - Agent hasn't encountered many problems in that domain
  2. Learning difficulty - Domain is complex; agent needs better prompts or training data
  3. Conflicting feedback - Contradictory observations lower confidence
  4. Domain mismatch - Agent's skills don't naturally align with the domain

Action: Review the agent's recent observations in that domain; check feedback patterns.

Fleet Gaps

Domains where no agent has proficiency >0.5 are fleet-wide gaps. Common ones:

  • Compliance/BFSI (banking, financial services, and insurance) - Requires specialized legal/regulatory knowledge; difficult to learn from code alone
  • DevOps - Narrow exposure; fewer code changes in deployment infrastructure
  • Architecture - Long-term patterns; requires years of history to learn well
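
The gap rule above (no agent above 0.5 in a domain) can be sketched as a small helper. The `DomainScore` record and `findFleetGaps` function are hypothetical and exist only for illustration; the real `AgentSkillProfile` and `SkillGapReport` types may be shaped differently.

```typescript
// Hypothetical flattened record: one proficiency score per agent per domain.
interface DomainScore {
  agentId: string;
  domain: string;
  proficiency: number; // 0-1
}

// Returns the domains in which no agent scores above the threshold.
function findFleetGaps(scores: DomainScore[], threshold = 0.5): string[] {
  // Track the best score seen so far for each domain.
  const best = new Map<string, number>();
  for (const s of scores) {
    best.set(s.domain, Math.max(best.get(s.domain) ?? 0, s.proficiency));
  }
  return Array.from(best.entries())
    .filter(([, max]) => max <= threshold)
    .map(([domain]) => domain)
    .sort();
}
```

Note that a domain with no scores at all never appears in the input, so it would not be reported by this sketch; a fuller version would start from the list of known domains.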

Mitigation strategies:

  1. Hire a specialist agent - Add an agent with deep domain knowledge (e.g., compliance officer, DevOps engineer)
  2. Inject observations manually - Seed high-confidence observations from experts
  3. Cross-project transfer - Borrow learned patterns from other organizations' memory systems (if authorized)
  4. Upstream training - Improve agent prompts/instructions to encourage learning in weak domains

API Reference

Endpoint: GET /api/memory/analytics/skill-evolution

Query Parameters:

  • workType (optional) - Filter profiles and gaps to a specific work type; omit for fleet-wide gap report
  • agentId (optional) - When combined with workType, also returns that agent's weekly proficiency history
  • windowSize (optional) - Rolling-window session count for proficiency calculation (default: 50, max: 1000)

Response:

{
  workType: string | null
  windowSize: number
  profiles: AgentSkillProfile[]      // per-agent proficiency for the given workType
  gaps: SkillGapReport[]             // fleet-wide skill gaps (all work types when workType omitted)
  history?: Array<{                  // present only when agentId + workType are both supplied
    bucketStart: string              // ISO-8601 week start
    proficiency: number
    totalSessions: number
  }>
  agentProfile?: AgentSkillProfile   // single-agent profile (when agentId supplied)
}

See src/lib/a2a/skill-evolution.ts for the AgentSkillProfile and SkillGapReport type definitions.

Example:

curl -X GET "https://api.rensei.ai/api/memory/analytics/skill-evolution?workType=feature" \
  -H "Authorization: Bearer rsk_..."
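
The same request can be made from TypeScript. This is a minimal sketch using the standard `fetch` and `URLSearchParams` APIs; the helper names (`buildSkillEvolutionUrl`, `fetchSkillEvolution`), the `SkillEvolutionQuery` shape, and the lower bound of 1 on `windowSize` are assumptions. The profile and gap types are left as `unknown` because their definitions are not shown on this page.

```typescript
// Hypothetical query object covering the documented parameters.
interface SkillEvolutionQuery {
  workType?: string;
  agentId?: string;
  windowSize?: number; // documented default 50, max 1000
}

// Response fields as documented; profile types are left opaque here.
interface SkillEvolutionResponse {
  workType: string | null;
  windowSize: number;
  profiles: unknown[];
  gaps: unknown[];
  history?: { bucketStart: string; proficiency: number; totalSessions: number }[];
  agentProfile?: unknown;
}

const BASE_URL = "https://api.rensei.ai/api/memory/analytics/skill-evolution";

function buildSkillEvolutionUrl(q: SkillEvolutionQuery = {}): string {
  const params = new URLSearchParams();
  if (q.workType) params.set("workType", q.workType);
  if (q.agentId) params.set("agentId", q.agentId);
  if (q.windowSize !== undefined) {
    // Upper bound comes from the docs; the lower bound of 1 is an assumption.
    if (!Number.isInteger(q.windowSize) || q.windowSize < 1 || q.windowSize > 1000) {
      throw new RangeError("windowSize must be an integer between 1 and 1000");
    }
    params.set("windowSize", String(q.windowSize));
  }
  const qs = params.toString();
  return qs ? `${BASE_URL}?${qs}` : BASE_URL;
}

async function fetchSkillEvolution(
  apiKey: string,
  q: SkillEvolutionQuery = {}
): Promise<SkillEvolutionResponse> {
  const res = await fetch(buildSkillEvolutionUrl(q), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) {
    throw new Error(`skill-evolution request failed with status ${res.status}`);
  }
  return (await res.json()) as SkillEvolutionResponse;
}
```

To get weekly history for one agent, pass both `agentId` and `workType`, since `history` is present only when both are supplied.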

Domains

Standard domains recognized by the system:

  • architecture - System design, patterns, dependencies
  • security - Vulnerabilities, safe practices, compliance
  • performance - Optimization, scalability, bottlenecks
  • reliability - Error handling, recovery, monitoring
  • devops - Deployment, CI/CD, infrastructure
  • testing - Test coverage, mocking, test strategies
  • documentation - Code comments, API docs, onboarding
  • compliance - Legal/regulatory requirements, audit trails
  • coding-pattern - Style, idioms, conventions

Custom domains can be added via observation metadata.

Best Practices

  1. Monitor monthly - Skill evolution is a longer-term signal; weekly checks may be noisy
  2. Prioritize fleet gaps - Invest in closing critical gaps (especially security, compliance)
  3. Celebrate high performers - Share insights from top agents across your team
  4. Investigate declining agents - Quick intervention can restore a declining agent
  5. Cross-reference with A2A registry - Use Agent Registry to understand agent role and scope

Rate Limits

The skill-evolution API enforces a 100 req/min quota per organization. Skill calculations are cached for 1 hour.
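
Since results are cached server-side for an hour, polling more often is unlikely to return fresher data and still counts against the quota. One option is a small client-side cache with a matching time-to-live. The `TtlCache` class below is a hypothetical sketch, not part of any Rensei SDK; the injectable clock exists only to make it easy to test.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Minimal in-memory cache whose entries expire after a fixed time-to-live.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number = ONE_HOUR_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Keying the cache by the full request URL keeps responses for different `workType`, `agentId`, and `windowSize` combinations separate.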
