Knowledge Graph Store
pgvector knowledge graph.
Infrastructure - dormant in production. The knowledge graph store is built and schema-migrated, but extraction is gated behind DB availability and LLM credential checks. The platform-native arch read path (src/lib/arch/query.ts) reads patterns, conventions, decisions, and deviations directly from graph_nodes. However, drift assessment (the writer that emits deviation nodes) is not yet populating the store continuously. Treat this as a substrate you can observe and extend, not a production-ready feature.
This page covers the Rensei platform's storage layer. The platform's architectural-intelligence read path reads this store directly (no SDK) - see the Arch Query Layer. The OSS architectural-intelligence contracts and domain model are documented at donmai.dev/docs/architectural-intelligence; intelligence itself is platform-implemented (ADR-2026-06-07), so the platform does not consume the OSS SDK against this store.
The knowledge graph store (PgGraphStore) is a PostgreSQL + pgvector substrate that persists architectural entities as a labeled property graph. Nodes represent code concepts; edges carry typed relationships. The store drives context injection, hybrid search, and the feedback learning loop.
Architecture
Tables
graph_nodes
Nodes represent discrete architectural entities. Key columns:
| Column | Type | Notes |
|---|---|---|
| id | uuid | UUID5 deterministic from (type, name) - same entity always same id |
| name | text | Display name |
| type | text | One of Service, Module, API, Database, Decision, Pattern, Person, Config, Dependency, deviation |
| description | text? | LLM-generated or author-supplied |
| embedding_vector | vector? | pgvector column, HNSW-indexed |
| feedback_weight | float | EMA-updated quality signal (default 0.5) |
| importance_weight | float | Extraction-time importance (default 0.5) |
| org_id | uuid | Tenant boundary - RLS enforced via app.current_org_id GUC |
| project_id | uuid | Sub-tenant scope |
| source_observation_id | uuid? | Back-link to the observation that produced this node |
| source_session_id | uuid? | Session provenance |
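The deterministic id can be illustrated with a standard version-5 UUID. This is a minimal sketch only: the namespace constant (the RFC 4122 DNS namespace, used here as a placeholder) and the `type:name` key encoding are assumptions, and `uuidV5` and `nodeId` are hypothetical helper names, not the store's own functions.

```typescript
import { createHash } from "node:crypto";

// Assumption: placeholder namespace (the RFC 4122 DNS namespace).
// The namespace the store actually uses is not documented on this page.
const EXAMPLE_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

// Version-5 UUID: SHA-1 over the namespace bytes followed by the name bytes,
// truncated to 16 bytes, with the version and variant bits overwritten.
function uuidV5(name: string, namespace: string): string {
  const nsBytes = Buffer.from(namespace.replace(/-/g, ""), "hex");
  const hash = createHash("sha1").update(nsBytes).update(name, "utf8").digest();
  const bytes = Buffer.from(hash.subarray(0, 16));
  bytes.writeUInt8((bytes.readUInt8(6) & 0x0f) | 0x50, 6); // version 5
  bytes.writeUInt8((bytes.readUInt8(8) & 0x3f) | 0x80, 8); // RFC 4122 variant
  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}

// Hypothetical key encoding: "type:name". The same (type, name) pair always
// yields the same id, so re-extracting an entity upserts instead of duplicating.
function nodeId(type: string, name: string): string {
  return uuidV5(`${type}:${name}`, EXAMPLE_NAMESPACE);
}
```

Because the id is a pure function of (type, name), two extraction runs that see the same entity converge on one row, while the same name under a different type produces a different node.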
graph_edges
Edges are typed relationships between nodes. Primary key is (source_id, target_id, relationship_name).
| Column | Type | Notes |
|---|---|---|
| weight | float | Structural weight (default 1.0) |
| feedback_weight | float | EMA-updated (default 0.5) |
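The feedback_weight columns on nodes and edges are described as EMA-updated. The sketch below shows a generic exponential moving average step. The smoothing factor `ALPHA`, the [0, 1] signal range, and the function name `emaUpdate` are all assumptions for illustration; the values the platform actually uses are described on the Graph Feedback page, not here.

```typescript
// Documented default for feedback_weight.
const DEFAULT_FEEDBACK_WEIGHT = 0.5;

// Hypothetical smoothing factor; the real value is not stated on this page.
const ALPHA = 0.2;

// One EMA step: blend the new signal into the current weight.
// A larger alpha reacts faster to new feedback; a smaller alpha is smoother.
function emaUpdate(current: number, signal: number, alpha: number = ALPHA): number {
  if (alpha <= 0 || alpha > 1) {
    throw new RangeError("alpha must be in (0, 1]");
  }
  return alpha * signal + (1 - alpha) * current;
}

// Example: a fresh edge receives two positive signals (assumed to be 1.0).
let weight = DEFAULT_FEEDBACK_WEIGHT;
weight = emaUpdate(weight, 1.0);
weight = emaUpdate(weight, 1.0);
```

With this update rule, repeated positive signals move the weight toward 1 without ever overshooting it, and a single outlier signal only shifts the weight by a fraction of the gap.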
graph_triplet_embeddings
Dedicated pgvector table for triplet-level semantic search. Stores the embedding of "sourceName → relationshipName → targetName". Migration 0057_triplet_embeddings_pgvector.sql backfills any older inline JSONB vectors.
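As a rough illustration of what gets embedded and how two embeddings are compared, the sketch below builds the triplet text and computes cosine similarity in plain TypeScript. The helper names are hypothetical, and the exact separator and whitespace the store uses are assumed from the description above. pgvector's cosine distance operator (`<=>`) returns 1 minus this similarity, which is why a lower distance means a closer match.

```typescript
// Hypothetical helper: the text form of a triplet that is sent to the embedder.
function tripletText(sourceName: string, relationshipName: string, targetName: string): string {
  return `${sourceName} → ${relationshipName} → ${targetName}`;
}

// Cosine similarity of two equal-length, non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance as pgvector defines it: 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

A duplicate check with a similarity threshold t is equivalent to a distance check against 1 - t, so either form can be used as long as the comparison direction is flipped.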
IGraphStore API
Both PgGraphStore (production) and InMemoryGraphStore (test double) implement IGraphStore:
interface IGraphStore {
  upsertNode(node: GraphNodeInput): Promise<GraphNodeRow>
  upsertEdge(edge: GraphEdgeInput): Promise<GraphEdgeRow>
  upsertNodes(nodes: GraphNodeInput[]): Promise<GraphNodeRow[]>
  upsertEdges(edges: GraphEdgeInput[]): Promise<GraphEdgeRow[]>
  getNode(id: string): Promise<GraphNodeRow | null>
  getEdges(filter: GetEdgesFilter): Promise<GraphEdgeRow[]>

  // pgvector cosine similarity (threshold default 0.92)
  findFuzzyDuplicates(
    name: string,
    embedding: number[],
    threshold?: number,
    scope?: { orgId: string; projectId: string },
  ): Promise<FuzzyDuplicateResult[]>

  // Top-K vector search (lower distance = more relevant)
  searchByEmbedding(
    queryEmbedding: number[],
    topK: number,
    scope: { orgId: string; projectId?: string },
  ): Promise<EmbeddingSearchResult[]>

  deleteNode(id: string): Promise<void>

  // Triplet-level embeddings
  upsertTripletEmbedding(input: TripletEmbeddingInput): Promise<void>
  findFuzzyTripletDuplicates(
    embedding: number[],
    threshold: number,
    scope: { orgId: string; projectId: string },
    topK?: number,
  ): Promise<TripletSearchResult[]>

  // Extraction-cron skip-already-extracted helper
  observationIdsWithNodes(observationIds: string[]): Promise<Set<string>>
}

Tenant isolation
PgGraphStore is instantiated unscoped and then bound to a tenant via .withScope({ orgId, projectId }). Every method runs inside a scoped transaction that sets app.current_org_id (and app.current_project_id) via set_config(..., true) (transaction-local GUC). The production Neon role is BYPASSRLS, so the RLS policies on graph_* tables are inert at runtime - the explicit WHERE org_id = $1 filter in each read method is the authoritative tenant boundary. The GUC + RLS serve as belt-and-suspenders for future role changes and test fixtures that run under NOBYPASSRLS roles.
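The scoped-transaction pattern described above can be sketched as follows. This is not PgGraphStore's implementation: `Queryable` is a hypothetical stand-in for the real database client, and `withTenantTransaction` and `listNodeNames` are illustrative names. Only the `set_config(..., true)` call and the explicit `WHERE org_id = $1` filter are taken from the description.

```typescript
// Hypothetical minimal client interface; stands in for the real database client.
interface Queryable {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

interface TenantScope {
  orgId: string;
  projectId: string;
}

// Runs `work` inside a transaction with the tenant GUCs set transaction-locally.
async function withTenantTransaction<T>(
  client: Queryable,
  scope: TenantScope,
  work: (tx: Queryable) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    // Third argument `true` = is_local: the value is dropped at COMMIT or ROLLBACK,
    // so it cannot leak to the next user of a pooled connection.
    await client.query("SELECT set_config('app.current_org_id', $1, true)", [scope.orgId]);
    await client.query("SELECT set_config('app.current_project_id', $1, true)", [scope.projectId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}

// Reads still filter explicitly. Under a BYPASSRLS role the RLS policies do not
// apply, so this WHERE clause is what actually enforces the tenant boundary.
async function listNodeNames(client: Queryable, scope: TenantScope): Promise<unknown> {
  return withTenantTransaction(client, scope, (tx) =>
    tx.query("SELECT name FROM graph_nodes WHERE org_id = $1", [scope.orgId]),
  );
}
```

Keeping both mechanisms means a query that forgets its filter is still caught in any environment that runs under a role without BYPASSRLS, such as test fixtures.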
const store = new PgGraphStore(db)
const scoped = store.withScope({ orgId: 'org_...', projectId: 'proj_...' })
await scoped.upsertNode({ ... })

Per-project enablement
The graph engine is gated per project through the agent-memory control plane (project_memory_config), not by an org-wide flag. A project gets graph context, recall, and MCP graph tools only when its memory is enabled and graph is on for that project. The helper is isGraphEnabledForProject(orgId, projectId).
"Graph on" uses OR semantics: it is true when the project has an explicit project_memory_config.graph_enabled = true, or the org's plan grants the kg_enabled feature flag (plan-assignable via planDefinitions).
The org-level graph_engine_enabled flag no longer exists. Enablement is the per-project gate above. Be aware of the tradeoff: a project can opt in explicitly via its toggle, but it cannot opt out while its plan grants kg_enabled - the plan entitlement wins under OR semantics.
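The gate logic above reduces to a small boolean expression. The sketch below is a pure function over already-resolved inputs, not the real `isGraphEnabledForProject(orgId, projectId)` helper, which looks these values up itself. The type `GraphGateInputs`, the function name `resolveGraphEnabled`, and the use of `null` for an unset toggle are assumptions for illustration.

```typescript
// Hypothetical, already-resolved inputs to the gate.
interface GraphGateInputs {
  memoryEnabled: boolean;              // the project's memory is enabled
  projectGraphEnabled: boolean | null; // project_memory_config.graph_enabled (null = not set)
  planGrantsKg: boolean;               // the org's plan grants the kg_enabled feature flag
}

// Graph features are available only when memory is enabled AND graph is on,
// where "graph on" is an OR of the explicit project toggle and the plan entitlement.
function resolveGraphEnabled(inputs: GraphGateInputs): boolean {
  const graphOn = inputs.projectGraphEnabled === true || inputs.planGrantsKg;
  return inputs.memoryEnabled && graphOn;
}

// Explicit opt-in on a plan without the entitlement: enabled.
const optedIn = resolveGraphEnabled({
  memoryEnabled: true,
  projectGraphEnabled: true,
  planGrantsKg: false,
});

// Attempted opt-out on a plan with the entitlement: still enabled,
// because the plan entitlement wins under OR semantics.
const optOutIgnored = resolveGraphEnabled({
  memoryEnabled: true,
  projectGraphEnabled: false,
  planGrantsKg: true,
});
```

The second case is the tradeoff called out above: with OR semantics, a false toggle carries no more weight than an unset one.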
Every live consumer gates the same way through isGraphEnabledForProject: the MCP graph tools, context injection at session start, recall, and the extraction cron (which enumerates KG-enabled (org, project) pairs).
The per-project toggle lives in the tenant Memory config (the project's performance/settings page) and is written through PUT /api/projects/[projectId]/memory-config.
Per-work-type recall - which agent work types receive graph-augmented context - is no longer static code. It is an operator-configured recall matrix (with per-org overrides) managed in the memory admin console (/admin/memory, operator-only).
Migrations
| Migration | Purpose |
|---|---|
| 0048_daffy_talkback.sql | Initial graph_nodes, graph_edges tables + RLS policies |
| 0049_nappy_killmonger.sql | graph_feedback_history table + session/node/org-project indexes |
| 0057_triplet_embeddings_pgvector.sql | graph_triplet_embeddings table + HNSW index; backfills inline JSONB vectors |
Related pages
- Extraction Pipeline - how observations become nodes
- Entity Resolution - UUID5 dedup + fuzzy merge
- Hybrid Search - vector + k-hop traversal
- Context Injection - triplets into agent prompts
- Graph Feedback - EMA weight updates
- MCP Tools - agent-callable graph tools