In-Session Injection
Layer-6 verb-bus injection.
In-session injection delivers memory observations to an agent while a session is running, not just at its start. The InSessionMemoryInjector subscribes to the Layer-6 verb bus and, on each tool call, retrieves contextually relevant observations and injects them into the agent's context mid-flight.
In-session injection is additive to start-of-session context injection. Both paths can run simultaneously. Observations injected mid-session are deduplicated against those already delivered at session start.
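The deduplication rule can be pictured as a per-session set of observation IDs that have already been delivered. The sketch below is illustrative only: `SessionSeenSet` and its method names are hypothetical and are not taken from the injector's source.

```typescript
// Hypothetical per-session "seen" set; names are illustrative, not the library's API.
class SessionSeenSet {
  private readonly seenBySession = new Map<string, Set<string>>()

  // Record observation IDs already delivered (at session start or mid-session).
  markDelivered(sessionId: string, observationIds: string[]): void {
    const seen = this.seenBySession.get(sessionId) ?? new Set<string>()
    for (const id of observationIds) seen.add(id)
    this.seenBySession.set(sessionId, seen)
  }

  // Return only the IDs that have not been delivered to this session yet.
  filterUnseen(sessionId: string, observationIds: string[]): string[] {
    const seen = this.seenBySession.get(sessionId)
    return seen ? observationIds.filter((id) => !seen.has(id)) : [...observationIds]
  }

  // Drop all state for a finished session.
  forget(sessionId: string): void {
    this.seenBySession.delete(sessionId)
  }
}

const seenSet = new SessionSeenSet()
seenSet.markDelivered('sess_abc', ['obs_1', 'obs_2']) // delivered at session start
const freshIds = seenSet.filterUnseen('sess_abc', ['obs_2', 'obs_3']) // mid-session candidates
seenSet.markDelivered('sess_abc', freshIds)
```

Keeping the set keyed by session is what makes a per-session cleanup call (such as the injector's forgetSession) cheap: one map entry is dropped.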
Production wiring
In the deployed platform the injector is wired by memory-injection-subscriber.ts (a lazy module-singleton):
- It subscribes the injector to the global hook bus on pre-tool-use/post-tool-use provider hook events (the daemon bridges these cross-process), translating each into the VerbBusEvent shape below.
- injectMessage is implemented as an enqueue onto the durable session_memory_injects queue - the session's owning worker pulls the block and applies it via the daemon's handle.Inject on its next lock-refresh beat. The Go daemon does not expose a live mid-stream injectMessage, so durability-by-queue is the production delivery path.
- Delivery is gated per session by the project's runtimeInjectEnabled memory sub-toggle (the per-project tenant authority - there is no process-global env gate). When it is off, retrieval/matching still runs but nothing is pushed.
- The session must be running/claimed with an agent attached; self-dispatched stages that never run an agent are skipped.
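The delivery gates in the list above can be summarised as a small decision function. This is a sketch under stated assumptions: the `SessionStatus` values other than running and claimed, the `DeliveryGateInput` shape, and the `reason` strings are all hypothetical and only serve to make the three conditions explicit.

```typescript
// Hypothetical shapes for illustration; the real session and project records differ.
type SessionStatus = 'queued' | 'running' | 'claimed' | 'completed'

interface DeliveryGateInput {
  sessionStatus: SessionStatus
  hasAgentAttached: boolean
  runtimeInjectEnabled: boolean // per-project memory sub-toggle
}

type DeliveryDecision =
  | { push: true }
  | { push: false; reason: 'session-not-active' | 'no-agent' | 'toggle-off' }

// Decide whether a matched block may be pushed. Retrieval and matching are
// assumed to have run already; this gate only controls delivery.
function decideDelivery(input: DeliveryGateInput): DeliveryDecision {
  const active = input.sessionStatus === 'running' || input.sessionStatus === 'claimed'
  if (!active) return { push: false, reason: 'session-not-active' }
  if (!input.hasAgentAttached) return { push: false, reason: 'no-agent' }
  if (!input.runtimeInjectEnabled) return { push: false, reason: 'toggle-off' }
  return { push: true }
}
```

The order in which the three checks run is an assumption; the source only states that all three must hold before anything is pushed.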
The rest of this page documents the injector as a library - the configuration, outcome, and fallback semantics below all apply to the production wiring above.
How it works
Latency budget
The injector enforces a hard 100 ms budget across the entire lookup, ranking, and Cedar filtering pipeline. If the budget is exceeded, handleVerbEvent returns { outcome: 'budget-exceeded' } and the agent's verb is never blocked, so in-session memory stays optional, non-disruptive context.
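One common way to enforce a budget like this is to race the pipeline against a timer. The sketch below assumes that approach; `withLatencyBudget` and `BudgetResult` are hypothetical names, and the injector's actual mechanism may differ.

```typescript
// Hypothetical helper: resolves with the work's value, or with
// 'budget-exceeded' if the work has not settled within budgetMs.
type BudgetResult<T> =
  | { outcome: 'ok'; value: T }
  | { outcome: 'budget-exceeded' }

function withLatencyBudget<T>(work: Promise<T>, budgetMs: number): Promise<BudgetResult<T>> {
  return new Promise<BudgetResult<T>>((resolve, reject) => {
    // If the timer fires first, report the overrun instead of waiting.
    const timer = setTimeout(() => resolve({ outcome: 'budget-exceeded' }), budgetMs)
    work.then(
      (value) => {
        clearTimeout(timer)
        resolve({ outcome: 'ok', value })
      },
      (err) => {
        clearTimeout(timer)
        reject(err)
      },
    )
  })
}
```

Note that the race only stops the caller from waiting; the underlying lookup keeps running unless it is cancelled separately.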
const DEFAULT_IN_SESSION_CONFIG = {
latencyBudgetMs: 100,
minRelevanceScore: 0.4,
budgetTokens: 200,
maxSuggestionsPerEvent: 3,
skipTools: ['TodoWrite', 'BashOutput'],
}
Configuration
interface InSessionMemoryConfig {
enabled: boolean
disabledForAgents?: string[] // per-agent opt-out
latencyBudgetMs?: number // default 100ms
minRelevanceScore?: number // default 0.4
budgetTokens?: number // default 200
workType?: string // for budget lookup, default 'feature'
maxSuggestionsPerEvent?: number // default 3
skipTools?: string[] // default ['TodoWrite', 'BashOutput']
}
skipTools prevents high-frequency tools from overwhelming the agent with repeated suggestions. Add any write-heavy utility tools that your agents use constantly.
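To make the defaults and the two opt-out lists concrete, the sketch below resolves a partial config against the documented defaults and runs a pre-check for an incoming tool call. `resolveConfig`, `preflight`, and the check order (disabled before skipped) are assumptions for illustration, not the library's internals.

```typescript
interface InSessionMemoryConfig {
  enabled: boolean
  disabledForAgents?: string[]
  latencyBudgetMs?: number
  minRelevanceScore?: number
  budgetTokens?: number
  workType?: string
  maxSuggestionsPerEvent?: number
  skipTools?: string[]
}

type ResolvedConfig = Required<InSessionMemoryConfig>

// Fill every optional field with the default documented above.
function resolveConfig(c: InSessionMemoryConfig): ResolvedConfig {
  return {
    enabled: c.enabled,
    disabledForAgents: c.disabledForAgents ?? [],
    latencyBudgetMs: c.latencyBudgetMs ?? 100,
    minRelevanceScore: c.minRelevanceScore ?? 0.4,
    budgetTokens: c.budgetTokens ?? 200,
    workType: c.workType ?? 'feature',
    maxSuggestionsPerEvent: c.maxSuggestionsPerEvent ?? 3,
    skipTools: c.skipTools ?? ['TodoWrite', 'BashOutput'],
  }
}

// Hypothetical pre-check that runs before any retrieval work.
function preflight(config: ResolvedConfig, agentId: string, tool: string): 'disabled' | 'skipped' | 'proceed' {
  if (!config.enabled || config.disabledForAgents.includes(agentId)) return 'disabled'
  if (config.skipTools.includes(tool)) return 'skipped'
  return 'proceed'
}

const resolvedConfig = resolveConfig({ enabled: true, latencyBudgetMs: 80 })
const todoWriteCheck = preflight(resolvedConfig, 'agent_dev', 'TodoWrite')
```

Because a supplied skipTools array replaces the default rather than extending it in this sketch, a caller adding tools would need to repeat 'TodoWrite' and 'BashOutput'; whether the real injector merges or replaces is not stated in the source.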
Creating an injector
import {
createInSessionMemoryInjector,
} from '@/lib/memory/in-session-injection'
const injector = createInSessionMemoryInjector({
store: myMemoryStore,
capability: {
supportsMessageInjection: true,
injectMessage: async (sessionId, text) => {
await myProvider.sendSystemMessage(sessionId, text)
},
},
config: {
latencyBudgetMs: 80,
maxSuggestionsPerEvent: 2,
},
})
// Subscribe to the verb bus
const sub = injector.subscribeToVerbBus(myVerbBus)
// Later, when the session ends:
injector.forgetSession(sessionId)
sub.unsubscribe()
VerbBusEvent shape
interface VerbBusEvent {
phase: 'pre-verb' | 'post-verb'
sessionId: string
orgId: string
projectId?: string
agentId: string
tool: string // e.g. 'Edit', 'Read', 'Bash'
paths?: string[] // file paths the tool is touching
query?: string // free-form query if available
emittedAt: number // wall-clock ms
memoryScope?: 'project' | 'org' | 'session'
memoryNamespace?: string
}
Injection outcomes
handleVerbEvent returns an InjectionResult:
| outcome | Meaning |
|---|---|
| injected | Block delivered directly to the agent via injectMessage |
| queued | Provider lacks supportsMessageInjection; block queued for next turn |
| no-match | No relevant observations found after filtering |
| skipped | Tool in skipTools list |
| budget-exceeded | Lookup did not complete within latencyBudgetMs |
| disabled | Injector disabled globally or for this agent |
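A caller that records metrics per outcome can switch over the six values exhaustively. In the sketch below, `InjectionResultLike` models only the outcome field (the full InjectionResult shape is not shown in this page), and the four metric buckets are hypothetical.

```typescript
type InjectionOutcome =
  | 'injected'
  | 'queued'
  | 'no-match'
  | 'skipped'
  | 'budget-exceeded'
  | 'disabled'

// Only the field documented in the table; other fields are omitted.
interface InjectionResultLike {
  outcome: InjectionOutcome
}

type MetricBucket = 'delivered' | 'deferred' | 'nothing-to-do' | 'degraded'

// Map each outcome to a coarse bucket. The never-typed default makes the
// compiler flag this function if a new outcome is added and not handled.
function classifyOutcome(result: InjectionResultLike): MetricBucket {
  switch (result.outcome) {
    case 'injected':
      return 'delivered'
    case 'queued':
      return 'deferred'
    case 'no-match':
    case 'skipped':
    case 'disabled':
      return 'nothing-to-do'
    case 'budget-exceeded':
      return 'degraded'
    default: {
      const unreachable: never = result.outcome
      throw new Error(`Unhandled outcome: ${String(unreachable)}`)
    }
  }
}
```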
Next-turn fallback
When the execution provider does not support live message injection (supportsMessageInjection: false), observations are queued as PendingInjection entries. The session orchestrator drains this queue at the start of each new agent turn:
const pending = injector.drainPendingForNextTurn(sessionId)
for (const p of pending) {
systemMessages.push(p.block)
}
Action-aware ranking
The injector issues two targeted queries per verb event: one anchored on the focal file path, one on the verb's query field (if present). Results are merged and then boosted for path overlap:
// boostByPathOverlap: +0.2 additive, capped at 1.0
// Checks metadata.paths[] first, falls back to content body substring match
const boosted = boostByPathOverlap(scored, event.paths[0])
After path boosting, standard feedback-weighted ranking is applied with weightingEnabled: true.
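The boost rule described in the comments above (+0.2, capped at 1.0, metadata paths first, content substring as fallback) could look roughly like this. `ScoredObservation` and `boostByPathOverlapSketch` are hypothetical; in particular, exact string equality on metadata.paths and falling back to the content match whenever the metadata does not match are assumptions.

```typescript
// Hypothetical scored-observation shape, trimmed to what the boost needs.
interface ScoredObservation {
  id: string
  content: string
  score: number // relevance in [0, 1]
  metadata?: { paths?: string[] }
}

const PATH_BOOST = 0.2
const MAX_SCORE = 1.0

function boostByPathOverlapSketch(
  scored: ScoredObservation[],
  focalPath: string | undefined,
): ScoredObservation[] {
  // Without a focal path there is nothing to compare against.
  if (!focalPath) return scored
  return scored.map((obs) => {
    // Prefer structured metadata; fall back to a substring match on the body.
    const inMetadata = obs.metadata?.paths?.includes(focalPath) ?? false
    const overlaps = inMetadata || obs.content.includes(focalPath)
    return overlaps ? { ...obs, score: Math.min(MAX_SCORE, obs.score + PATH_BOOST) } : obs
  })
}

const boostedSample = boostByPathOverlapSketch(
  [
    { id: 'obs_a', content: 'Auth notes', score: 0.5, metadata: { paths: ['src/auth.ts'] } },
    { id: 'obs_b', content: 'See src/auth.ts for the middleware', score: 0.9 },
    { id: 'obs_c', content: 'Unrelated', score: 0.6 },
  ],
  'src/auth.ts',
)
```

The function returns new objects for boosted entries instead of mutating the input, which keeps the pre-boost scores available for debugging.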
Memory scope and namespace
Injection can be scoped beyond the default {orgId, projectId} retrieval scope:
| memoryScope | Effect |
|---|---|
| 'project' | Default - retrieves from {orgId, projectId} |
| 'org' | Widens to all projects under the org |
| 'session' | Restricts to observations stamped with the current sessionId in metadata.sessionId |
memoryNamespace (when set) further restricts to observations whose metadata.namespace matches exactly.
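The scope and namespace rules can be read as a single predicate over an observation. The sketch below is an in-memory illustration with hypothetical `StoredObservation` and `ScopeRequest` shapes; a real store would translate the same rules into a query filter. It assumes that 'session' scope still stays inside the org and does not additionally require a project match.

```typescript
// Hypothetical shapes, trimmed to the fields the scope rules mention.
interface StoredObservation {
  orgId: string
  projectId?: string
  metadata: { sessionId?: string; namespace?: string }
}

interface ScopeRequest {
  orgId: string
  projectId?: string
  sessionId: string
  memoryScope?: 'project' | 'org' | 'session'
  memoryNamespace?: string
}

function isInScope(obs: StoredObservation, req: ScopeRequest): boolean {
  // Every scope stays inside the requesting org.
  if (obs.orgId !== req.orgId) return false
  // A namespace, when set, must match exactly.
  if (req.memoryNamespace !== undefined && obs.metadata.namespace !== req.memoryNamespace) {
    return false
  }
  switch (req.memoryScope) {
    case 'org':
      return true
    case 'session':
      return obs.metadata.sessionId === req.sessionId
    default:
      // 'project' is the default scope: {orgId, projectId}.
      return obs.projectId === req.projectId
  }
}
```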
Graph triplet integration
When a graph.matcher is provided, the injector also runs a graph query and filters the resulting triplets through the Cedar PEP (enforceGraphAccess) before including them in the block:
createInSessionMemoryInjector({
store: myMemoryStore,
capability: myCapability,
graph: {
matcher: myGraphMatcher,
identity: { orgId: 'org_123', agentId: 'agent_dev', projectId: 'proj_xyz' },
},
})
Cross-org triplets require Cedar permit on both the subject and object node. Failure in the PEP defaults to deny (fail closed).
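Fail-closed filtering means that an error during the policy check is treated as a deny, never as a permit. The sketch below shows that pattern with a plain callback standing in for the policy decision; `GraphTriplet`, `NodePermitCheck`, and `filterTripletsFailClosed` are hypothetical and do not reflect the signature of enforceGraphAccess. For simplicity it checks both ends of every triplet, not only cross-org ones.

```typescript
// Hypothetical triplet shape: subject and object are node IDs.
interface GraphTriplet {
  subject: string
  predicate: string
  object: string
}

// Stand-in for a policy decision; a real PEP would consult Cedar.
type NodePermitCheck = (nodeId: string) => boolean

function filterTripletsFailClosed(
  triplets: GraphTriplet[],
  isPermitted: NodePermitCheck,
): GraphTriplet[] {
  return triplets.filter((t) => {
    try {
      // Both ends must be permitted for the triplet to pass.
      return isPermitted(t.subject) && isPermitted(t.object)
    } catch {
      // Any failure inside the check is treated as a deny.
      return false
    }
  })
}

const permittedNodes = new Set(['node_a', 'node_b'])
const samplePermitCheck: NodePermitCheck = (nodeId) => {
  if (nodeId === 'node_broken') throw new Error('policy engine unavailable')
  return permittedNodes.has(nodeId)
}
const visibleTriplets = filterTripletsFailClosed(
  [
    { subject: 'node_a', predicate: 'depends-on', object: 'node_b' },
    { subject: 'node_a', predicate: 'depends-on', object: 'node_c' },
    { subject: 'node_broken', predicate: 'owns', object: 'node_b' },
  ],
  samplePermitCheck,
)
```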
Durable inject queue (session-injects.ts)
At the library level the verb-bus injector is best-effort: if live injection is unavailable and the in-process pending queue never drains, observations can be lost across a worker restart. The durable inject queue backed by the session_memory_injects table closes that gap - and in production the injector's injectMessage capability is implemented as an enqueue onto this queue (see Production wiring).
The durable queue (session-injects.ts) is the production delivery rail for both memory paths today: the start-of-session block (context injection) and the in-session injector's matched blocks are each enqueued here and applied by the owning worker via handle.Inject.
Lifecycle
The queue follows a three-phase enqueue → claim → ack cycle:
- Enqueue is idempotent on (session_id, content_hash) - the same observation block cannot be enqueued twice for the same session. Duplicate calls return null without inserting.
- Claim keeps at most one inject in flight per session. A delivered but unacked row is re-delivered on every heartbeat until the worker echoes its deliveryId back.
- Ack marks the row as consumed, unblocking the next enqueued entry.
The lock-refresh gate ensures that a worker that lost session ownership never receives an inject - claim is tied to the current lock holder.
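The lifecycle above can be modelled in memory to make the three guarantees visible: idempotent enqueue, one in-flight row per session that is re-delivered until acked, and claims limited to the lock holder. Everything in this sketch is hypothetical (`InMemoryInjectQueue`, the row shape, the FNV-1a stand-in for content_hash, the boolean lock flag); the real queue is backed by the session_memory_injects table.

```typescript
interface InjectRow {
  deliveryId: string
  sessionId: string
  text: string
  contentHash: string
  acked: boolean
}

// Stand-in hash (32-bit FNV-1a); the real content_hash algorithm is not specified here.
function contentHashOf(text: string): string {
  let h = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h.toString(16)
}

class InMemoryInjectQueue {
  private readonly rows: InjectRow[] = []
  private nextId = 1

  // Idempotent on (sessionId, contentHash): duplicates return null.
  enqueue(sessionId: string, text: string): InjectRow | null {
    const contentHash = contentHashOf(text)
    const duplicate = this.rows.some(
      (r) => r.sessionId === sessionId && r.contentHash === contentHash,
    )
    if (duplicate) return null
    const row: InjectRow = {
      deliveryId: `inj_${this.nextId++}`,
      sessionId,
      text,
      contentHash,
      acked: false,
    }
    this.rows.push(row)
    return row
  }

  // Returns the oldest unacked row, so the same row is re-delivered until
  // it is acked. Callers that do not hold the session lock get nothing.
  claim(sessionId: string, callerHoldsLock: boolean): InjectRow | null {
    if (!callerHoldsLock) return null
    return this.rows.find((r) => r.sessionId === sessionId && !r.acked) ?? null
  }

  // Marks the row consumed; the next claim moves on to the next entry.
  ack(sessionId: string, deliveryId: string): boolean {
    const row = this.rows.find(
      (r) => r.sessionId === sessionId && r.deliveryId === deliveryId && !r.acked,
    )
    if (!row) return false
    row.acked = true
    return true
  }
}
```

Because claim returns the same row until ack succeeds, the receiving side has to tolerate seeing a block more than once, which is the usual cost of at-least-once delivery.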
API
import {
enqueueMemoryInject,
claimPendingInjectForSession,
ackMemoryInject,
} from '@/lib/memory/session-injects'
// Enqueue a memory block for durable delivery
const row = await enqueueMemoryInject({
orgId: 'org_123',
sessionId: 'sess_abc',
agentId: 'agent_sdlc_dev', // optional
text: '## Relevant Observations\n- [obs_x] Auth middleware now requires...',
observationIds: ['obs_x'], // optional provenance
})
// row is null when the same text was already enqueued for this session
// On the lock-refresh path: claim the pending inject (if any)
const inject = await claimPendingInjectForSession('sess_abc')
if (inject) {
// send inject.text to the agent; track inject.deliveryId for ack
await sendToAgent(inject.text)
await ackMemoryInject('sess_abc', inject.deliveryId)
}
Choosing between verb-bus and durable queue
| Criterion | Verb-bus injector | Durable queue |
|---|---|---|
| Delivery guarantee | Best-effort | At-least-once (re-delivered until acked) |
| Latency | Near-real-time (path-aware) | Next heartbeat |
| Provider requirement | supportsMessageInjection preferred | None (heartbeat-based) |
| Dedup scope | In-process session seen set | Database - cross-replica safe |
| Use case | Contextual mid-session hints | Critical observations, BFSI sessions |
Related pages
- Context Injection - start-of-session injection (complementary path)
- Feedback Retrieval - the applyFeedbackRanking step used here
- Observation Store - the IMemoryStore backing both paths
- Memory Diagnostics - session outcomes that drive the weight updates behind the retrieved observations