Feedback Retrieval

Feedback-weighted retrieval reorders observation candidates at retrieval time based on accumulated session outcomes. Observations that consistently appear in successful sessions are promoted; those that frequently appear in failed sessions are demoted. The adjustment is applied on top of vector-similarity scores, so semantic relevance still decides which observations enter the initial candidate set.

The weighted score formula

weightedScore = relevanceScore × feedbackWeight

Where:

relevanceScore - cosine similarity score from the pgvector store (or the keyword match score); higher means more relevant
feedbackWeight - the exponential moving average (EMA) weight stored in observations.weight (starts at 1.0, updated after each session outcome via feedback.ts)

When feedback_weighting_disabled: true is set in the org's features JSON, the formula reduces to weightedScore = relevanceScore (weights are ignored for ordering but still logged).
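
As a worked illustration of the formula, the sketch below scores two candidates with weighting on and off. The weightedScore helper and the sample numbers are hypothetical and chosen only to show the arithmetic; they are not taken from retrieval-feedback.ts.

```typescript
// Hypothetical helper mirroring the formula above; not the real implementation.
function weightedScore(
  relevanceScore: number,
  feedbackWeight: number,
  weightingEnabled: boolean,
): number {
  // With weighting disabled the weight is ignored and relevance passes through.
  return weightingEnabled ? relevanceScore * feedbackWeight : relevanceScore
}

// Illustrative values only.
const obsA = { relevanceScore: 0.75, feedbackWeight: 0.5 } // demoted by feedback
const obsB = { relevanceScore: 0.5, feedbackWeight: 1.5 } // promoted by feedback

const aWeighted = weightedScore(obsA.relevanceScore, obsA.feedbackWeight, true) // 0.375
const bWeighted = weightedScore(obsB.relevanceScore, obsB.feedbackWeight, true) // 0.75
const aUnweighted = weightedScore(obsA.relevanceScore, obsA.feedbackWeight, false) // 0.75
```

With weighting enabled, obsB outranks obsA even though obsA has the higher raw relevance; with weighting disabled, the original relevance order is kept.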

applyFeedbackRanking

The core function lives in retrieval-feedback.ts and is called inside buildSessionMemoryBlock:

import { applyFeedbackRanking } from '@/lib/memory/retrieval-feedback'

const candidates = rawObservations.map((obs) => ({
  id: obs.id,
  relevanceScore: obs.score,
  feedbackWeight: obs.weight,
}))

const ranked = applyFeedbackRanking(candidates, { weightingEnabled: true })
// ranked[0] has the highest weightedScore
// Each item also carries unweightedRank and weightedRank for A/B comparison
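
To make the two ranks concrete, here is a minimal sketch of what a function like applyFeedbackRanking could do internally. The name rankSketch, the CandidateForRanking shape (inferred from the mapping above), and the tie-breaking behaviour are assumptions, not the actual code in retrieval-feedback.ts.

```typescript
// Shapes inferred from this page; the real types may carry more fields.
interface CandidateForRanking {
  id: string
  relevanceScore: number
  feedbackWeight: number
}

interface AdjustedCandidate extends CandidateForRanking {
  weightedScore: number
  unweightedRank: number
  weightedRank: number
}

// Hypothetical re-implementation for illustration only.
function rankSketch(
  candidates: CandidateForRanking[],
  opts: { weightingEnabled: boolean },
): AdjustedCandidate[] {
  // Rank by pure relevance first so each candidate remembers its baseline position.
  const byRelevance = [...candidates].sort((x, y) => y.relevanceScore - x.relevanceScore)
  const baseline = new Map<string, number>()
  byRelevance.forEach((c, i) => baseline.set(c.id, i))

  // Apply the weight (or pass relevance through) and sort descending.
  const scored = candidates.map((c) => ({
    ...c,
    weightedScore: opts.weightingEnabled ? c.relevanceScore * c.feedbackWeight : c.relevanceScore,
  }))
  scored.sort((x, y) => y.weightedScore - x.weightedScore)

  return scored.map((c, i) => ({
    ...c,
    unweightedRank: baseline.get(c.id) ?? i,
    weightedRank: i,
  }))
}

const demo = rankSketch(
  [
    { id: 'obs_a', relevanceScore: 0.75, feedbackWeight: 0.5 },
    { id: 'obs_b', relevanceScore: 0.5, feedbackWeight: 1.5 },
  ],
  { weightingEnabled: true },
)
// demo[0] is obs_b: unweightedRank 1, weightedRank 0
```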

AdjustedCandidate

interface AdjustedCandidate extends CandidateForRanking {
  weightedScore: number
  /** 0-based rank in pure-relevance ordering (weights ignored). */
  unweightedRank: number
  /** 0-based rank in feedback-weighted ordering. */
  weightedRank: number
}

The difference between unweightedRank and weightedRank is what the Feedback Impact analytics view visualises as "rank delta."
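
A rank delta can be derived from the two fields with a one-line helper. The sign convention below (positive means the observation moved up under weighting) is an assumption; the analytics view may define it the other way round.

```typescript
interface RankPair {
  unweightedRank: number
  weightedRank: number
}

// Hypothetical helper. Assumed convention: positive = promoted by feedback weighting.
function rankDelta(c: RankPair): number {
  return c.unweightedRank - c.weightedRank
}

const promoted = rankDelta({ unweightedRank: 3, weightedRank: 1 }) // 2 (moved up two places)
const demoted = rankDelta({ unweightedRank: 0, weightedRank: 4 }) // -4 (moved down four places)
```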

A/B logging

Every retrieval with a sessionId writes one row per candidate to retrieval_ab_logs. This enables the Feedback Impact sub-view to compare pre-weighting vs post-weighting position for each retrieved observation:

await logRetrievalAB({
  sessionId: 'ses_abc',
  orgId: 'org_123',
  projectId: 'proj_xyz',
  queryText: 'fix null pointer in auth middleware',
  candidates: ranked,
  weightingEnabled: true,
})

logRetrievalAB is non-fatal - failures are swallowed with a console.warn. Callers pass skipAbLog: true to the injection builder when they do not want the log overhead (e.g. explicit recall-tool calls).
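
The non-fatal behaviour and the skipAbLog opt-out can be pictured with the sketch below. Both logNonFatal and maybeLogAb are hypothetical wrappers written for illustration, and writeRows stands in for whatever database insert the real logRetrievalAB performs.

```typescript
// Hypothetical wrapper: runs the write and reports success without ever throwing.
async function logNonFatal(writeRows: () => Promise<void>): Promise<boolean> {
  try {
    await writeRows()
    return true
  } catch (err) {
    // Swallow the failure so a logging problem never blocks retrieval.
    console.warn('[retrieval-ab] log write failed', err)
    return false
  }
}

// Hypothetical caller-side guard: log only when a session exists and logging was not skipped.
async function maybeLogAb(
  opts: { sessionId?: string; skipAbLog?: boolean },
  writeRows: () => Promise<void>,
): Promise<boolean> {
  if (!opts.sessionId || opts.skipAbLog) return false
  return logNonFatal(writeRows)
}
```

Returning a boolean rather than throwing keeps the retrieval path unaffected by log failures while still letting a caller observe whether the write happened.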

Columns in retrieval_ab_logs:

Column	Description
`session_id`	The session that triggered the retrieval
`org_id`, `project_id`	Tenant scope
`query_text`	The search query
`observation_id`	UUID of the candidate observation
`relevance_score`	Raw relevance score (vector similarity or keyword match)
`feedback_weight`	`observations.weight` at retrieval time
`weighted_score`	`relevance_score × feedback_weight`
`unweighted_rank`	0-based rank without feedback adjustment
`weighted_rank`	0-based rank with feedback adjustment
`weighting_enabled`	Whether the org had weighting active
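
The mapping from ranked candidates to rows can be sketched as follows. The RetrievalAbLogRow type and the toAbLogRows helper are hypothetical; they only restate the column table in code and say nothing about how the real insert is issued.

```typescript
// Shape inferred from the AdjustedCandidate interface on this page.
interface AdjustedCandidate {
  id: string
  relevanceScore: number
  feedbackWeight: number
  weightedScore: number
  unweightedRank: number
  weightedRank: number
}

// Hypothetical row type: one property per retrieval_ab_logs column.
interface RetrievalAbLogRow {
  session_id: string
  org_id: string
  project_id: string
  query_text: string
  observation_id: string
  relevance_score: number
  feedback_weight: number
  weighted_score: number
  unweighted_rank: number
  weighted_rank: number
  weighting_enabled: boolean
}

interface LogParams {
  sessionId: string
  orgId: string
  projectId: string
  queryText: string
  candidates: AdjustedCandidate[]
  weightingEnabled: boolean
}

// One row per candidate; the session-level fields repeat on every row.
function toAbLogRows(p: LogParams): RetrievalAbLogRow[] {
  return p.candidates.map((c) => ({
    session_id: p.sessionId,
    org_id: p.orgId,
    project_id: p.projectId,
    query_text: p.queryText,
    observation_id: c.id,
    relevance_score: c.relevanceScore,
    feedback_weight: c.feedbackWeight,
    weighted_score: c.weightedScore,
    unweighted_rank: c.unweightedRank,
    weighted_rank: c.weightedRank,
    weighting_enabled: p.weightingEnabled,
  }))
}
```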

Org-level toggle

import { isFeedbackWeightingEnabled } from '@/lib/memory/retrieval-feedback'

const enabled = await isFeedbackWeightingEnabled(orgId)

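The toggle reads feedback_weighting_disabled from the org's resolved feature flags. Because the flag is phrased as "disabled", the result is inverted: isFeedbackWeightingEnabled returns true (weighting enabled) when the flag is absent or the orgId is undefined, and false only when the flag is set to true.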

To disable weighting for an org, set the feature flag via the admin panel or directly in organizations.features:

{
  "feedback_weighting_disabled": true
}
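
The inverted flag logic can be sketched as a pure function over an already-loaded features object. weightingEnabledFrom is a hypothetical helper; the real isFeedbackWeightingEnabled is async and resolves the flags from the orgId itself.

```typescript
type OrgFeatures = Record<string, unknown>

// Hypothetical helper: weighting is on unless the flag is explicitly true.
function weightingEnabledFrom(features: OrgFeatures | undefined): boolean {
  return features?.['feedback_weighting_disabled'] !== true
}

const noOrg = weightingEnabledFrom(undefined) // true: no org, default to enabled
const noFlag = weightingEnabledFrom({}) // true: flag absent
const disabled = weightingEnabledFrom({ feedback_weighting_disabled: true }) // false
```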

Integration point

applyFeedbackRanking is wired inside buildSessionMemoryBlock in context-injection.ts. The retrieval flow is:

1. Retrieve up to 20 raw observation candidates from the store (semantic or keyword).
2. Map them to CandidateForRanking[] using the observation's weight column as feedbackWeight.
3. Call applyFeedbackRanking(candidates, { weightingEnabled }) to produce AdjustedCandidate[] sorted by weightedScore descending.
4. Re-order ScoredObservation[] to match the new ranked order, replacing score with weightedScore.
5. Pass the re-ordered observations to formatObservationsBlock for token-budget trimming.
6. Log candidates to retrieval_ab_logs asynchronously.
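
The re-ordering step, where ScoredObservation[] is aligned with the ranked candidates and score is replaced by weightedScore, might look like the sketch below. The ScoredObservation shape (in particular the content field) and the reorderByRank helper are assumptions for illustration, not code from context-injection.ts.

```typescript
// Assumed shape; the real ScoredObservation likely has more fields.
interface ScoredObservation {
  id: string
  score: number
  content: string
}

interface RankedRef {
  id: string
  weightedScore: number
}

// Hypothetical helper: emit observations in ranked order, carrying the weighted score.
function reorderByRank(
  observations: ScoredObservation[],
  ranked: RankedRef[],
): ScoredObservation[] {
  const byId = new Map<string, ScoredObservation>()
  for (const obs of observations) byId.set(obs.id, obs)

  const out: ScoredObservation[] = []
  for (const r of ranked) {
    const obs = byId.get(r.id)
    // Skip ranked ids with no matching observation instead of throwing.
    if (obs) out.push({ ...obs, score: r.weightedScore })
  }
  return out
}

const reordered = reorderByRank(
  [
    { id: 'obs_a', score: 0.75, content: 'auth middleware null check' },
    { id: 'obs_b', score: 0.5, content: 'token refresh fix' },
  ],
  [
    { id: 'obs_b', weightedScore: 0.75 },
    { id: 'obs_a', weightedScore: 0.375 },
  ],
)
// reordered is [obs_b (score 0.75), obs_a (score 0.375)]
```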

The full live ordering priority is:

1. Vector similarity (pgvector cosine)
2. Feedback weight multiplication (× feedbackWeight)

The corroboration boost (log₂(1 + corroborationScore) × 0.1) is designed to slot between the two, but its module is not wired into this path yet.
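
For a sense of scale, the boost formula can be evaluated directly. The corroborationBoost helper below is hypothetical and only restates the expression from the text; how the boost would be combined with relevanceScore is not specified here, so the sketch does not attempt it.

```typescript
// Hypothetical helper restating the formula: log2(1 + corroborationScore) × 0.1
function corroborationBoost(corroborationScore: number): number {
  return Math.log2(1 + corroborationScore) * 0.1
}

const none = corroborationBoost(0) // 0: no corroboration, no boost
const three = corroborationBoost(3) // about 0.2, since log2(4) = 2
const fifteen = corroborationBoost(15) // about 0.4, since log2(16) = 4
```

The logarithm means each doubling of (1 + corroborationScore) adds the same 0.1 increment, so heavily corroborated observations gain only modestly over lightly corroborated ones.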

Related

Retrieval Ranking - corroboration-boost module (built, not yet wired into this path)
Context Injection - the primary call site for this module
Feedback Retention Audit - EMA weight updates that drive feedbackWeight
Memory Diagnostics - aggressive 2x downweight for misleading observations
Feedback Impact - analytics view over retrieval_ab_logs