Skill Evolution
Per-agent proficiency and fleet gaps.
The Skill Evolution view tracks how individual agents' competency in different domains improves over time, helping you identify high-performing agents, training gaps, and opportunities for cross-agent knowledge transfer.
What Is Skill Proficiency?
Skill proficiency is a confidence score (0-1) per agent per domain, derived from:
- Observation success rate - How often agents' learned patterns led to positive outcomes
- Feedback patterns - User corrections and upvotes in that domain
- Recency - Recent successes are weighted more heavily than old ones
- Corroboration - Domains where the agent agrees with peers rank higher
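As a rough illustration, the four inputs can be treated as multiplicative factors, as the simplified formula in this section does. The sketch below is not the actual implementation: the ProficiencyFactors shape, the field names, and the clamping step are assumptions made for the example.

```typescript
// Hypothetical input shape. Field names mirror the four factors listed in
// this section; they are not taken from the real codebase.
interface ProficiencyFactors {
  successRate: number;         // fraction of observations with positive outcomes (0-1)
  feedbackBoost: number;       // multiplier from corrections/upvotes (assumed ~1.0 when neutral)
  recencyWeight: number;       // assumed 0-1, higher when successes are recent
  corroborationFactor: number; // assumed 0-1, higher when peers agree
}

// Multiply the factors, then clamp to the documented 0-1 range.
// Clamping is an assumption: a boost above 1.0 could otherwise overshoot.
function proficiency(f: ProficiencyFactors): number {
  const raw =
    f.successRate * f.feedbackBoost * f.recencyWeight * f.corroborationFactor;
  return Math.min(1, Math.max(0, raw));
}

// Illustrative numbers only.
const score = proficiency({
  successRate: 0.8,
  feedbackBoost: 1.1,
  recencyWeight: 0.9,
  corroborationFactor: 0.95,
});
console.log(score.toFixed(3));
```

Because the factors multiply, one weak factor (for example, a low recency weight) pulls the whole score down even when the success rate is high.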
Formula (simplified):
proficiency[agent, domain] = success_rate × feedback_boost × recency_weight × corroboration_factor
Viewing Skill Evolution
The Skill Evolution panel appears on the Memory Dashboard (row 4.5) and shows:
- Agent-domain matrix - Each agent's proficiency in key domains (architecture, security, performance, etc.)
- Trend lines - How proficiency has evolved over your selected time range (7d, 30d, 90d)
- Gaps - Domains where no agent has high proficiency (fleet-wide opportunity)
- Strengths - Domains where multiple agents excel
Common Patterns
Ramp-Up Curve
agent-research proficiency over 30 days:
architecture ▁▂▃▄▅▆▇▇▆▆ ← Learning phase, then plateau
performance ▁▁▁▂▂▂▃▃▃▃ ← Slower adoption
Interpretation: Agent is actively learning architecture early on, then specializing. Performance learning lags (possibly less exposure).
Steady High Performance
agent-qa security proficiency:
█████████▓ ← Consistently high
Interpretation: Agent is reliably skilled; good candidate for peer mentoring or QA-critical decisions.
Declining Proficiency
agent-ops reliability over 90 days:
▇▇▆▆▅▄▃▂▁ ← Steady decline
Interpretation: Agent may be disabled, out of scope, or receiving contradictory feedback. Investigate.
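The ramp-up, steady, and declining shapes above can also be told apart numerically. The sketch below fits a least-squares slope to a proficiency series and labels it. The function names and the 0.01-per-bucket threshold are illustrative assumptions, not values used by the dashboard.

```typescript
type Trend = "rising" | "declining" | "steady";

// Least-squares slope of proficiency against bucket index (0, 1, 2, ...).
function slope(series: number[]): number {
  const n = series.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = series.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  series.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) * (x - meanX);
  });
  return num / den;
}

// The threshold is a made-up cutoff for the example; tune it to your data.
function classifyTrend(series: number[], threshold = 0.01): Trend {
  const s = slope(series);
  if (s > threshold) return "rising";
  if (s < -threshold) return "declining";
  return "steady";
}

// Illustrative series, not real agent data.
console.log(classifyTrend([0.9, 0.85, 0.7, 0.6, 0.45, 0.3]));
```

A slope-based label ignores the shape within the window, so a ramp-up that has already plateaued may read as "steady" over a short range and "rising" over a longer one.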
Fleet Gap
compliance domain across all agents:
agent-research ▂
agent-qa ▁
agent-ops ▃
agent-dev ▁
(no agent >0.4) ← Fleet is weak in compliance
Interpretation: Organization needs to invest in compliance learning; consider a specialized compliance-focused agent.
Interpreting Low Proficiency
Low scores can mean:
- Limited exposure - Agent hasn't encountered many problems in that domain
- Learning difficulty - Domain is complex; agent needs better prompts or training data
- Conflicting feedback - Contradictory observations lower confidence
- Domain mismatch - Agent's skills don't naturally align with the domain
Action: Review the agent's recent observations in that domain; check feedback patterns.
Fleet Gaps
Domains where no agent has proficiency >0.5 are fleet-wide gaps. Common ones:
- Compliance/BFSI - Requires specialized legal/regulatory knowledge; difficult to learn from code alone
- DevOps - Narrow exposure; fewer code changes in deployment infrastructure
- Architecture - Long-term patterns; requires years of history to learn well
Mitigation strategies:
- Hire a specialist agent - Add an agent with deep domain knowledge (e.g., compliance officer, DevOps engineer)
- Inject observations manually - Seed high-confidence observations from experts
- Cross-project transfer - Borrow learned patterns from other organizations' memory systems (if authorized)
- Upstream training - Improve agent prompts/instructions to encourage learning in weak domains
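The gap definition in this section (no agent above 0.5 in a domain) is simple to check against an agent-by-domain matrix. The sketch below assumes a hypothetical ProficiencyMatrix shape and made-up scores; it is not the shape returned by the API.

```typescript
// Hypothetical shape: agent -> domain -> proficiency (0-1).
type ProficiencyMatrix = Record<string, Record<string, number>>;

// Returns the domains in which no agent scores above the threshold.
function findFleetGaps(matrix: ProficiencyMatrix, threshold = 0.5): string[] {
  const best: Record<string, number> = {};
  for (const agent of Object.keys(matrix)) {
    for (const domain of Object.keys(matrix[agent])) {
      // Track the highest score any agent has reached in this domain.
      best[domain] = Math.max(best[domain] ?? 0, matrix[agent][domain]);
    }
  }
  return Object.keys(best)
    .filter((domain) => best[domain] <= threshold)
    .sort();
}

// Illustrative scores only.
const fleet: ProficiencyMatrix = {
  "agent-research": { compliance: 0.2, architecture: 0.8 },
  "agent-qa": { compliance: 0.1, security: 0.9 },
  "agent-ops": { compliance: 0.3, reliability: 0.4 },
};
console.log(findFleetGaps(fleet));
```

Note that a domain covered by only one agent (reliability in this example) is reported as a gap as soon as that single agent is at or below the threshold.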
API Reference
Endpoint: GET /api/memory/analytics/skill-evolution
Query Parameters:
- workType (optional) - Filter profiles and gaps to a specific work type; omit for a fleet-wide gap report
- agentId (optional) - When combined with workType, also returns that agent's weekly proficiency history
- windowSize (optional) - Rolling-window session count for the proficiency calculation (default: 50, max: 1000)
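A small helper can assemble the query string from these parameters and reject an out-of-range windowSize before the request is sent. This is a sketch: buildSkillEvolutionUrl and SkillEvolutionQuery are hypothetical names, and the lower bound of 1 is an assumption (only the maximum of 1000 is documented).

```typescript
// Hypothetical client-side type for the three documented query parameters.
interface SkillEvolutionQuery {
  workType?: string;
  agentId?: string;
  windowSize?: number;
}

function buildSkillEvolutionUrl(
  baseUrl: string,
  query: SkillEvolutionQuery = {}
): string {
  const params: string[] = [];
  if (query.workType !== undefined) {
    params.push("workType=" + encodeURIComponent(query.workType));
  }
  if (query.agentId !== undefined) {
    params.push("agentId=" + encodeURIComponent(query.agentId));
  }
  if (query.windowSize !== undefined) {
    const w = query.windowSize;
    // Max 1000 is documented; the minimum of 1 is assumed for this sketch.
    if (!Number.isInteger(w) || w < 1 || w > 1000) {
      throw new RangeError("windowSize must be an integer between 1 and 1000");
    }
    params.push("windowSize=" + w);
  }
  // Strip trailing slashes so the path is not doubled up.
  const path =
    baseUrl.replace(/\/+$/, "") + "/api/memory/analytics/skill-evolution";
  return params.length > 0 ? path + "?" + params.join("&") : path;
}

console.log(buildSkillEvolutionUrl("https://api.rensei.ai", { workType: "feature" }));
```

Omitting windowSize leaves the server default (50) in effect, so the helper only sends it when the caller sets it explicitly.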
Response:
{
  workType: string | null
  windowSize: number
  profiles: AgentSkillProfile[]      // per-agent proficiency for the given workType
  gaps: SkillGapReport[]             // fleet-wide skill gaps (all work types when workType omitted)
  history?: Array<{                  // present only when agentId + workType are both supplied
    bucketStart: string              // ISO-8601 week start
    proficiency: number
    totalSessions: number
  }>
  agentProfile?: AgentSkillProfile   // single-agent profile (when agentId supplied)
}
See src/lib/a2a/skill-evolution.ts for the AgentSkillProfile and SkillGapReport type definitions.
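Since history is optional, client code has to handle its absence. The sketch below types the response as documented and computes the change in proficiency between the earliest and latest weekly buckets. AgentSkillProfile and SkillGapReport are stubbed as unknown because their definitions are not reproduced here; proficiencyChange is a hypothetical helper, and it assumes all bucketStart values use the same ISO-8601 format so that string comparison orders them chronologically.

```typescript
// Placeholders; the real definitions live in src/lib/a2a/skill-evolution.ts.
type AgentSkillProfile = unknown;
type SkillGapReport = unknown;

interface HistoryBucket {
  bucketStart: string; // ISO-8601 week start
  proficiency: number;
  totalSessions: number;
}

interface SkillEvolutionResponse {
  workType: string | null;
  windowSize: number;
  profiles: AgentSkillProfile[];
  gaps: SkillGapReport[];
  history?: HistoryBucket[];
  agentProfile?: AgentSkillProfile;
}

// Latest minus earliest proficiency, or null when there is no usable history.
function proficiencyChange(resp: SkillEvolutionResponse): number | null {
  const buckets = resp.history;
  if (!buckets || buckets.length < 2) return null;
  const sorted = buckets
    .slice()
    .sort((a, b) =>
      a.bucketStart < b.bucketStart ? -1 : a.bucketStart > b.bucketStart ? 1 : 0
    );
  return sorted[sorted.length - 1].proficiency - sorted[0].proficiency;
}

// Illustrative payload, not real API output.
const sampleResponse: SkillEvolutionResponse = {
  workType: "feature",
  windowSize: 50,
  profiles: [],
  gaps: [],
  history: [
    { bucketStart: "2024-01-08", proficiency: 0.6, totalSessions: 12 },
    { bucketStart: "2024-01-01", proficiency: 0.4, totalSessions: 9 },
    { bucketStart: "2024-01-15", proficiency: 0.7, totalSessions: 15 },
  ],
};
console.log(proficiencyChange(sampleResponse));
```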
Example:
curl -X GET "https://api.rensei.ai/api/memory/analytics/skill-evolution?workType=feature" \
  -H "Authorization: Bearer rsk_..."
Domains
Standard domains recognized by the system:
- architecture - System design, patterns, dependencies
- security - Vulnerabilities, safe practices, compliance
- performance - Optimization, scalability, bottlenecks
- reliability - Error handling, recovery, monitoring
- devops - Deployment, CI/CD, infrastructure
- testing - Test coverage, mocking, test strategies
- documentation - Code comments, API docs, onboarding
- compliance - Legal/regulatory requirements, audit trails
- coding-pattern - Style, idioms, conventions
Custom domains can be added via observation metadata.
Best Practices
- Monitor monthly - Skill evolution is a longer-term signal; weekly checks may be noisy
- Prioritize fleet gaps - Invest in closing critical gaps (especially security, compliance)
- Celebrate high performers - Share insights from top agents across your team
- Investigate declining agents - Quick intervention can restore a declining agent
- Cross-reference with A2A registry - Use Agent Registry to understand agent role and scope
Related Pages
- Memory Dashboard - Broader memory health context
- A2A Agent Registry - Detailed agent health and invocation stats
- Top Observations - High-confidence knowledge by source
- Agent Routing Intelligence - Thompson Sampling arm selection (uses skill proficiency)
Rate Limits
The skill-evolution API enforces a 100 req/min quota per organization. Skill calculations are cached for 1 hour.
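Given the one-hour server-side cache, repeating an identical query within the hour presumably returns the same result while still counting against the quota. A client-side cache with a matching time-to-live is one way to avoid that. The TtlCache class below is a hypothetical sketch, not part of any SDK; the clock is injectable so the expiry logic can be exercised without waiting.

```typescript
// Minimal time-to-live cache keyed by string (e.g. the request's query string).
class TtlCache<T> {
  private entries: Record<string, { value: T; expiresAt: number }> = {};
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if missing or expired.
  get(key: string): T | undefined {
    const entry = this.entries[key];
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      delete this.entries[key];
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries[key] = { value, expiresAt: this.now() + this.ttlMs };
  }
}

const ONE_HOUR_MS = 60 * 60 * 1000; // mirrors the documented 1-hour cache
let fakeNow = 0;                    // fake clock for the demonstration
const responseCache = new TtlCache<string>(ONE_HOUR_MS, () => fakeNow);
responseCache.set("workType=feature", "cached-body");
console.log(responseCache.get("workType=feature"));
```

In real use, drop the fake clock and check the cache before each request, storing the parsed response under the query string on a miss.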