Skill Evolution

The Skill Evolution view tracks how individual agents' competency in different domains improves over time, helping you identify high-performing agents, training gaps, and opportunities for cross-agent knowledge transfer.

What Is Skill Proficiency?

Skill proficiency is a confidence score (0-1) per agent per domain, derived from:

Observation success rate - How often agents' learned patterns led to positive outcomes
Feedback patterns - User corrections and upvotes in that domain
Recency - Recent successes are weighted more heavily than old ones
Corroboration - Agreement with peer agents in a domain raises the agent's score for that domain

Formula (simplified):

proficiency[agent, domain] = success_rate × feedback_boost × recency_weight × corroboration_factor
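
As a rough illustration of the simplified formula, the four factors can be multiplied and the result clamped to the documented 0-1 range. The source does not give the range of each factor or say whether clamping is applied, so the `ProficiencyFactors` shape, the factor ranges, and the `clamp01` helper below are all assumptions made for this sketch.

```typescript
// Hypothetical input shape: the formula names these four factors,
// but their exact ranges are not documented. Ranges below are assumptions.
interface ProficiencyFactors {
  successRate: number;         // assumed to lie in [0, 1]
  feedbackBoost: number;       // assumed multiplier around 1.0
  recencyWeight: number;       // assumed to lie in [0, 1]
  corroborationFactor: number; // assumed multiplier around 1.0
}

// Keeps the score inside the documented 0-1 range (the clamp is an assumption).
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Multiplies the four factors, as in the simplified formula.
function proficiency(f: ProficiencyFactors): number {
  const raw =
    f.successRate * f.feedbackBoost * f.recencyWeight * f.corroborationFactor;
  return clamp01(raw);
}

const exampleScore = proficiency({
  successRate: 0.8,
  feedbackBoost: 1.1,
  recencyWeight: 0.9,
  corroborationFactor: 1.0,
});
console.log(exampleScore.toFixed(3)); // prints "0.792"
```

Because the factors multiply, a low value in any one of them pulls the whole score down, which is consistent with conflicting feedback or stale activity showing up as low proficiency.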

Viewing Skill Evolution

The Skill Evolution panel appears on the Memory Dashboard (row 4.5) and shows:

Agent-domain matrix - Each agent's proficiency in key domains (architecture, security, performance, etc.)
Trend lines - How proficiency has evolved over your selected time range (7d, 30d, 90d)
Gaps - Domains where no agent has high proficiency (fleet-wide opportunity)
Strengths - Domains where multiple agents excel

Common Patterns

Ramp-Up Curve

agent-research proficiency over 30 days:
architecture  ▁▂▃▄▅▆▇▇▆▆  ← Learning phase, then plateau
performance   ▁▁▁▂▂▂▃▃▃▃  ← Slower adoption

Interpretation: The agent learns architecture quickly at first, then plateaus. Performance proficiency grows more slowly, possibly because the agent has had less exposure to that domain.
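
The bar strings used in these examples can be approximated from raw proficiency values. The sketch below maps each 0-1 score onto one of eight Unicode block characters. How the dashboard actually renders its trend lines is not documented here, so the `sparkline` function and its bucketing are assumptions for illustration only.

```typescript
// Eight block characters, lowest to highest.
const BARS = ["▁", "▂", "▃", "▄", "▅", "▆", "▇", "█"];

// Renders a series of 0-1 scores as a one-line bar string.
// Values outside [0, 1] are clamped first.
function sparkline(values: number[]): string {
  return values
    .map((v) => {
      const clamped = Math.min(1, Math.max(0, v));
      // Split [0, 1] into eight equal buckets; a score of 1.0 maps to the top bar.
      const index = Math.min(BARS.length - 1, Math.floor(clamped * BARS.length));
      return BARS[index] ?? "";
    })
    .join("");
}

console.log(sparkline([0, 0.5, 1])); // prints "▁▅█"
```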

Steady High Performance

agent-qa security proficiency:
                █████████▓  ← Consistently high

Interpretation: Agent is reliably skilled; good candidate for peer mentoring or QA-critical decisions.

Declining Proficiency

agent-ops reliability over 90 days:
         ▇▇▆▆▅▄▃▂▁         ← Steady decline

Interpretation: Agent may be disabled, out of scope, or receiving contradictory feedback. Investigate.
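
One way to flag a pattern like this automatically is to fit a straight line through the proficiency series (for example, the `proficiency` values of the weekly `history` buckets returned by the API) and look at the sign of the slope. This is a minimal sketch, not the dashboard's own method; the `classifyTrend` name and the `tolerance` of 0.01 per bucket are hypothetical.

```typescript
type Trend = "rising" | "flat" | "declining";

// Least-squares slope of the values against their index (0, 1, 2, ...).
function slope(values: number[]): number {
  const n = values.length;
  if (n < 2) return 0; // a single point has no trend
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((sum, y) => sum + y, 0) / n;
  let numerator = 0;
  let denominator = 0;
  values.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}

// Labels a series by its slope; `tolerance` (assumed) is the change
// per bucket below which the series is treated as flat.
function classifyTrend(values: number[], tolerance = 0.01): Trend {
  const s = slope(values);
  if (s > tolerance) return "rising";
  if (s < -tolerance) return "declining";
  return "flat";
}

console.log(classifyTrend([0.9, 0.8, 0.7, 0.6, 0.5])); // prints "declining"
```

A slope-based check ignores the shape of the curve, so a ramp-up followed by a plateau still reads as "rising" over the full window. Running it over the most recent buckets only gives a more current signal.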

Fleet Gap

compliance domain across all agents:
agent-research  ▂
agent-qa        ▁
agent-ops       ▃
agent-dev       ▁
                    (no agent >0.4)  ← Fleet is weak in compliance

Interpretation: Organization needs to invest in compliance learning; consider a specialized compliance-focused agent.

Interpreting Low Proficiency

Low scores can mean:

Limited exposure - Agent hasn't encountered many problems in that domain
Learning difficulty - Domain is complex; agent needs better prompts or training data
Conflicting feedback - Contradictory observations lower confidence
Domain mismatch - Agent's skills don't naturally align with the domain

Action: Review the agent's recent observations in that domain; check feedback patterns.

Fleet Gaps

Domains where no agent has proficiency >0.5 are fleet-wide gaps. Common ones:

Compliance/BFSI - Requires specialized legal/regulatory knowledge; difficult to learn from code alone
DevOps - Narrow exposure; fewer code changes in deployment infrastructure
Architecture - Long-term patterns; requires years of history to learn well
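
The gap rule above (no agent above 0.5 in a domain) can be sketched over an agent-domain matrix as follows. The `ProficiencyMatrix` shape and the `findFleetGaps` helper are hypothetical and are not the `SkillGapReport` type used by the API. The sample scores are made up for illustration.

```typescript
// Hypothetical shape: agent id -> domain -> proficiency score (0-1).
type ProficiencyMatrix = Record<string, Record<string, number>>;

// Returns the domains in which no agent scores above `threshold`.
function findFleetGaps(matrix: ProficiencyMatrix, threshold = 0.5): string[] {
  const best: Record<string, number> = {};
  for (const agent of Object.keys(matrix)) {
    const domains = matrix[agent] ?? {};
    for (const domain of Object.keys(domains)) {
      const score = domains[domain] ?? 0;
      // Track the highest score any agent has reached in this domain.
      best[domain] = Math.max(best[domain] ?? 0, score);
    }
  }
  return Object.keys(best)
    .filter((domain) => (best[domain] ?? 0) <= threshold)
    .sort();
}

// Illustrative scores only.
const sampleFleet: ProficiencyMatrix = {
  "agent-research": { compliance: 0.25, architecture: 0.8 },
  "agent-qa": { compliance: 0.1, security: 0.9 },
  "agent-ops": { compliance: 0.35, reliability: 0.6 },
  "agent-dev": { compliance: 0.1 },
};
console.log(findFleetGaps(sampleFleet)); // prints [ 'compliance' ]
```

Domains that no agent has any score for do not appear in the matrix at all, so they would need to be checked separately against the list of standard domains.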

Mitigation strategies:

Hire a specialist agent - Add an agent with deep domain knowledge (e.g., compliance officer, DevOps engineer)
Inject observations manually - Seed high-confidence observations from experts
Cross-project transfer - Borrow learned patterns from other organizations' memory systems (if authorized)
Upstream training - Improve agent prompts/instructions to encourage learning in weak domains

API Reference

Endpoint: GET /api/memory/analytics/skill-evolution

Query Parameters:

workType (optional) - Filter profiles and gaps to a specific work type; omit for fleet-wide gap report
agentId (optional) - When combined with workType, also returns that agent's weekly proficiency history
windowSize (optional) - Rolling-window session count for proficiency calculation (default: 50, max: 1000)

Response:

{
  workType: string | null
  windowSize: number
  profiles: AgentSkillProfile[]      // per-agent proficiency for the given workType
  gaps: SkillGapReport[]             // fleet-wide skill gaps (all work types when workType omitted)
  history?: Array<{                  // present only when agentId + workType are both supplied
    bucketStart: string              // ISO-8601 week start
    proficiency: number
    totalSessions: number
  }>
  agentProfile?: AgentSkillProfile   // single-agent profile (when agentId supplied)
}

See src/lib/a2a/skill-evolution.ts for the AgentSkillProfile and SkillGapReport type definitions.

Example:

curl -X GET "https://api.rensei.ai/api/memory/analytics/skill-evolution?workType=feature" \
  -H "Authorization: Bearer rsk_..."
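
The same request can be made from TypeScript. The sketch below builds the query string from the documented parameters and sends it with the standard `fetch` API (available in browsers and in Node 18 or later). The `SkillEvolutionQuery` type and both function names are hypothetical, the lower bound of 1 and the integer check on `windowSize` are assumptions, and the response is left as `unknown` because the full type definitions live in `src/lib/a2a/skill-evolution.ts`.

```typescript
// Hypothetical query type mirroring the documented parameters.
interface SkillEvolutionQuery {
  workType?: string;
  agentId?: string;
  windowSize?: number; // documented default: 50, max: 1000
}

// Builds the request URL; only parameters that are set are included.
function buildSkillEvolutionUrl(
  baseUrl: string,
  query: SkillEvolutionQuery = {},
): string {
  const url = new URL("/api/memory/analytics/skill-evolution", baseUrl);
  if (query.workType !== undefined) url.searchParams.set("workType", query.workType);
  if (query.agentId !== undefined) url.searchParams.set("agentId", query.agentId);
  if (query.windowSize !== undefined) {
    // The max of 1000 is documented; the minimum of 1 is an assumption.
    if (!Number.isInteger(query.windowSize) || query.windowSize < 1 || query.windowSize > 1000) {
      throw new RangeError("windowSize must be an integer between 1 and 1000");
    }
    url.searchParams.set("windowSize", String(query.windowSize));
  }
  return url.toString();
}

// Sends the request with a bearer token and returns the parsed JSON body.
async function fetchSkillEvolution(
  baseUrl: string,
  apiKey: string,
  query: SkillEvolutionQuery = {},
): Promise<unknown> {
  const response = await fetch(buildSkillEvolutionUrl(baseUrl, query), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`skill-evolution request failed with status ${response.status}`);
  }
  return response.json();
}
```

To receive the weekly `history` array, pass both `agentId` and `workType`, for example `fetchSkillEvolution("https://api.rensei.ai", apiKey, { workType: "feature", agentId: "agent-qa" })`, where `agent-qa` is an illustrative agent id.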

Domains

Standard domains recognized by the system:

architecture - System design, patterns, dependencies
security - Vulnerabilities, safe practices, compliance
performance - Optimization, scalability, bottlenecks
reliability - Error handling, recovery, monitoring
devops - Deployment, CI/CD, infrastructure
testing - Test coverage, mocking, test strategies
documentation - Code comments, API docs, onboarding
compliance - Legal/regulatory requirements, audit trails
coding-pattern - Style, idioms, conventions

Custom domains can be added via observation metadata.

Best Practices

Monitor monthly - Skill evolution is a longer-term signal; weekly checks may be noisy
Prioritize fleet gaps - Invest in closing critical gaps (especially security, compliance)
Celebrate high performers - Share insights from top agents across your team
Investigate declining agents - Quick intervention can restore a declining agent
Cross-reference with A2A registry - Use Agent Registry to understand agent role and scope

Related

Memory Dashboard - Broader memory health context
A2A Agent Registry - Detailed agent health and invocation stats
Top Observations - High-confidence knowledge by source
Agent Routing Intelligence - Thompson Sampling arm selection (uses skill proficiency)

Rate Limits

The skill-evolution API enforces a 100 req/min quota per organization. Skill calculations are cached for 1 hour.
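
Since results are cached for an hour, polling more often than that mostly spends quota without returning fresher data. A small client-side cache with a matching time-to-live is one way to stay well under the limit. The `TtlCache` class below is a hypothetical helper, not part of any Rensei SDK, and the injectable `now` function exists only to make the expiry logic easy to test.

```typescript
// Minimal time-to-live cache. T can be a plain value or a Promise,
// so an in-flight request can be shared between callers.
class TtlCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value if it has not expired; otherwise calls `load`
  // and stores the result for `ttlMs` milliseconds.
  getOrLoad(key: string, load: () => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Drops an entry, e.g. after a failed request whose Promise was cached.
  evict(key: string): void {
    this.entries.delete(key);
  }
}

const ONE_HOUR_MS = 60 * 60 * 1000; // matches the documented 1-hour cache
const skillEvolutionCache = new TtlCache<Promise<unknown>>(ONE_HOUR_MS);
```

Keying the cache by the full request URL keeps responses for different `workType`, `agentId`, and `windowSize` combinations separate. When caching Promises, call `evict` on failure so a rejected request is not served for the rest of the hour.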
