Knowledge Graph Store

Infrastructure - dormant in production. The knowledge graph store is built and schema-migrated, but extraction is gated behind DB availability and LLM credential checks. The platform-native arch read path (src/lib/arch/query.ts) reads this store directly - patterns, conventions, decisions, and deviations from graph_nodes - but drift assessment (the writer that emits deviation nodes) is not yet continuously populating it. Treat this as a substrate you can observe and extend, not a production-ready feature.

This page covers the Rensei platform's storage layer. The platform's architectural-intelligence read path reads this store directly (no SDK) - see the Arch Query Layer. The OSS architectural-intelligence contracts and domain model are documented at donmai.dev/docs/architectural-intelligence; intelligence itself is platform-implemented (ADR-2026-06-07), so the platform does not consume the OSS SDK against this store.

The knowledge graph store (PgGraphStore) is a PostgreSQL + pgvector substrate that persists architectural entities as a labeled property graph. Nodes represent code concepts; edges carry typed relationships. The store drives context injection, hybrid search, and the feedback learning loop.

Architecture

Tables

`graph_nodes`

Nodes represent discrete architectural entities. Key columns:

Column	Type	Notes
`id`	`uuid`	Deterministic UUIDv5 derived from `(type, name)`, so the same entity always maps to the same id
`name`	`text`	Display name
`type`	`text`	One of `Service`, `Module`, `API`, `Database`, `Decision`, `Pattern`, `Person`, `Config`, `Dependency`, `deviation`
`description`	`text?`	LLM-generated or author-supplied
`embedding_vector`	`vector?`	pgvector column, HNSW-indexed
`feedback_weight`	`float`	EMA-updated quality signal (default `0.5`)
`importance_weight`	`float`	Extraction-time importance (default `0.5`)
`org_id`	`uuid`	Tenant boundary - RLS enforced via `app.current_org_id` GUC
`project_id`	`uuid`	Sub-tenant scope
`source_observation_id`	`uuid?`	Back-link to the observation that produced this node
`source_session_id`	`uuid?`	Session provenance
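
To illustrate why a UUIDv5 `id` is stable across extractions, here is a minimal sketch of deriving an id from `(type, name)`. The namespace constant and the `"<type>:<name>"` key encoding are assumptions made for this example; the values the platform actually uses are not documented on this page.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: the RFC 4122 DNS namespace stands in for whatever namespace
// the platform really uses.
const ASSUMED_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

/** Compute an RFC 4122 version-5 (SHA-1, name-based) UUID. */
function uuidv5(namespace: string, name: string): string {
  const nsBytes = Buffer.from(namespace.replace(/-/g, ""), "hex");
  const hash = createHash("sha1").update(nsBytes).update(name, "utf8").digest();
  const bytes = Buffer.from(hash.subarray(0, 16)); // keep the first 128 bits
  bytes[6] = (bytes[6] & 0x0f) | 0x50; // set version to 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // set the RFC 4122 variant
  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}

// ASSUMPTION: hypothetical helper and key encoding ("<type>:<name>").
function nodeId(type: string, name: string): string {
  return uuidv5(ASSUMED_NAMESPACE, `${type}:${name}`);
}
```

Because the id is a pure function of its inputs, re-extracting the same entity produces the same id, which lets an upsert converge on a single row instead of creating a duplicate.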

`graph_edges`

Edges are typed relationships between nodes. The primary key is the composite (source_id, target_id, relationship_name).

Column	Type	Notes
`weight`	`float`	Structural weight (default `1.0`)
`feedback_weight`	`float`	EMA-updated (default `0.5`)
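
The `feedback_weight` columns on nodes and edges are described as EMA-updated. As a rough sketch of what an exponential moving average update looks like, assuming a smoothing factor `alpha` and a feedback signal in the range 0 to 1 (both are assumptions; the real parameters are not given on this page):

```typescript
/**
 * One exponential-moving-average step.
 * ASSUMPTION: `alpha` (0.2 here) and the [0, 1] signal range are
 * illustrative choices, not documented platform values.
 */
function emaUpdate(current: number, signal: number, alpha = 0.2): number {
  if (alpha <= 0 || alpha > 1) {
    throw new RangeError("alpha must be in (0, 1]");
  }
  // Blend the new signal into the running weight.
  const next = (1 - alpha) * current + alpha * signal;
  // Clamp so the weight stays a valid score.
  return Math.min(1, Math.max(0, next));
}

// Starting from the documented default of 0.5, repeated positive feedback
// pulls the weight toward 1 without ever jumping there in one step.
let weight = 0.5;
for (let i = 0; i < 3; i++) {
  weight = emaUpdate(weight, 1);
}
```

A smaller `alpha` makes the weight more stable against a single noisy signal; a larger one makes it react faster.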

`graph_triplet_embeddings`

Dedicated pgvector table for triplet-level semantic search. Stores the embedding of "sourceName → relationshipName → targetName". Migration 0057_triplet_embeddings_pgvector.sql backfills any older inline JSONB vectors.
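
The text that gets embedded for a triplet can be sketched as a simple join of the three names. The `Triplet` type and `tripletText` helper are hypothetical names for this example; only the `"sourceName → relationshipName → targetName"` shape comes from the description above.

```typescript
// Hypothetical shape; field names mirror the description above.
interface Triplet {
  sourceName: string;
  relationshipName: string;
  targetName: string;
}

/** Build the string that would be passed to the embedding model. */
function tripletText(t: Triplet): string {
  return `${t.sourceName} → ${t.relationshipName} → ${t.targetName}`;
}

const text = tripletText({
  sourceName: "auth-service",
  relationshipName: "DEPENDS_ON",
  targetName: "postgres",
});
```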

`IGraphStore` API

Both PgGraphStore (production) and InMemoryGraphStore (test double) implement IGraphStore:

interface IGraphStore {
  upsertNode(node: GraphNodeInput): Promise<GraphNodeRow>
  upsertEdge(edge: GraphEdgeInput): Promise<GraphEdgeRow>
  upsertNodes(nodes: GraphNodeInput[]): Promise<GraphNodeRow[]>
  upsertEdges(edges: GraphEdgeInput[]): Promise<GraphEdgeRow[]>
  getNode(id: string): Promise<GraphNodeRow | null>
  getEdges(filter: GetEdgesFilter): Promise<GraphEdgeRow[]>

  // pgvector cosine similarity (threshold default 0.92)
  findFuzzyDuplicates(
    name: string,
    embedding: number[],
    threshold?: number,
    scope?: { orgId: string; projectId: string },
  ): Promise<FuzzyDuplicateResult[]>

  // Top-K vector search (lower distance = more relevant)
  searchByEmbedding(
    queryEmbedding: number[],
    topK: number,
    scope: { orgId: string; projectId?: string },
  ): Promise<EmbeddingSearchResult[]>

  deleteNode(id: string): Promise<void>

  // Triplet-level embeddings
  upsertTripletEmbedding(input: TripletEmbeddingInput): Promise<void>
  findFuzzyTripletDuplicates(
    embedding: number[],
    threshold: number,
    scope: { orgId: string; projectId: string },
    topK?: number,
  ): Promise<TripletSearchResult[]>

  // Extraction-cron skip-already-extracted helper
  observationIdsWithNodes(observationIds: string[]): Promise<Set<string>>
}
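
The two vector methods differ in how they read the same metric: `findFuzzyDuplicates` keeps candidates at or above a similarity threshold, while `searchByEmbedding` ranks by distance and keeps the closest `topK`. The in-memory sketch below illustrates that difference. It assumes cosine distance is `1 - cosine similarity` (as with pgvector's `<=>` operator) and uses hypothetical helper and type names; it is not the `InMemoryGraphStore` implementation.

```typescript
// Hypothetical candidate shape for this sketch.
interface Candidate {
  id: string;
  name: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

/** Threshold filter: keep candidates with similarity >= threshold. */
function fuzzyDuplicates(query: number[], candidates: Candidate[], threshold = 0.92) {
  return candidates
    .map((c) => ({ id: c.id, name: c.name, similarity: cosineSimilarity(query, c.embedding) }))
    .filter((r) => r.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity);
}

/** Top-K ranking: lower distance means more relevant. */
function topKByDistance(query: number[], candidates: Candidate[], topK: number) {
  return candidates
    .map((c) => ({ id: c.id, distance: 1 - cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, topK);
}

const sample: Candidate[] = [
  { id: "a", name: "AuthService", embedding: [1, 0] },
  { id: "b", name: "auth-service", embedding: [0.9, 0.1] },
  { id: "c", name: "BillingDB", embedding: [0, 1] },
];
const duplicates = fuzzyDuplicates([1, 0], sample);
const nearest = topKByDistance([1, 0], sample, 1);
```

A threshold filter can return nothing when no candidate is close enough, which suits deduplication; a top-K ranking always returns up to `topK` rows, which suits retrieval.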

Tenant isolation

PgGraphStore is instantiated unscoped and then bound to a tenant via .withScope({ orgId, projectId }). Every method runs inside a scoped transaction that sets app.current_org_id (and app.current_project_id) via set_config(..., true), which makes the setting transaction-local. The production Neon role has BYPASSRLS, so the RLS policies on the graph_* tables are inert at runtime; the explicit WHERE org_id = $1 filter in each read method is the authoritative tenant boundary. The GUC and RLS policies act as belt-and-suspenders for future role changes and for test fixtures that run under NOBYPASSRLS roles.

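// Construct the store unscoped, then bind it to one tenant before any call.
const store = new PgGraphStore(db)
const scoped = store.withScope({ orgId: 'org_...', projectId: 'proj_...' })
// Each call on the scoped store runs in its own tenant-scoped transaction.
await scoped.upsertNode({ ... })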
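
The scoped-transaction pattern described above can be sketched roughly as follows. `SqlClient`, `Scope`, and `withTenantScope` are hypothetical names for this example, not the store's actual internals; `set_config(name, value, is_local)` is the standard PostgreSQL function, and passing `true` as the third argument limits the setting to the current transaction.

```typescript
// Hypothetical minimal client interface, so the sketch has no driver dependency.
interface SqlClient {
  query(text: string, params?: unknown[]): Promise<{ rows: unknown[] }>;
}

interface Scope {
  orgId: string;
  projectId?: string;
}

/** Run `fn` inside a transaction with the tenant GUCs set transaction-locally. */
async function withTenantScope<T>(
  client: SqlClient,
  scope: Scope,
  fn: (c: SqlClient) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    await client.query("SELECT set_config('app.current_org_id', $1, true)", [scope.orgId]);
    if (scope.projectId !== undefined) {
      await client.query("SELECT set_config('app.current_project_id', $1, true)", [
        scope.projectId,
      ]);
    }
    const result = await fn(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK"); // the GUCs are discarded with the transaction
    throw err;
  }
}

/** Reads still carry the explicit org filter, since RLS may be bypassed. */
async function listNodeNames(client: SqlClient, scope: Scope): Promise<unknown[]> {
  return withTenantScope(client, scope, async (c) => {
    const { rows } = await c.query("SELECT name FROM graph_nodes WHERE org_id = $1", [
      scope.orgId,
    ]);
    return rows;
  });
}
```

Keeping the `WHERE org_id = $1` filter in the query itself means tenant isolation does not depend on which database role the connection happens to use.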

Per-project enablement

The graph engine is gated per project through the agent-memory control plane (project_memory_config), not by an org-wide flag. A project gets graph context, recall, and MCP graph tools only when its memory is enabled and graph is on for that project. The helper is isGraphEnabledForProject(orgId, projectId).

"Graph on" uses OR semantics: it is true when the project has an explicit project_memory_config.graph_enabled = true, or the org's plan grants the kg_enabled feature flag (plan-assignable via planDefinitions).

The org-level graph_engine_enabled flag no longer exists. Enablement is the per-project gate above. Be aware of the tradeoff: a project can opt in explicitly via its toggle, but it cannot opt out while its plan grants kg_enabled - the plan entitlement wins under OR semantics.
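
The gate described above reduces to a small boolean expression. The sketch below is a pure function over already-loaded inputs, with hypothetical type and function names; the real `isGraphEnabledForProject(orgId, projectId)` takes ids and presumably resolves these values itself.

```typescript
// Hypothetical, already-resolved inputs for the gate.
interface ProjectGraphInputs {
  memoryEnabled: boolean; // the project's memory is enabled
  graphEnabled: boolean | null; // project_memory_config.graph_enabled; null = not set
  planGrantsKgEnabled: boolean; // the org's plan grants the kg_enabled feature flag
}

function resolveGraphEnabled(inputs: ProjectGraphInputs): boolean {
  // "Graph on" uses OR semantics: explicit toggle OR plan entitlement.
  const graphOn = inputs.graphEnabled === true || inputs.planGrantsKgEnabled;
  // Graph features additionally require memory to be enabled.
  return inputs.memoryEnabled && graphOn;
}

// The opt-out tradeoff: an explicit `false` does not override the plan.
const cannotOptOut = resolveGraphEnabled({
  memoryEnabled: true,
  graphEnabled: false,
  planGrantsKgEnabled: true,
});
```

Here `cannotOptOut` is `true`: under OR semantics the plan entitlement alone turns graph on, regardless of the project toggle.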

Every live consumer gates the same way through isGraphEnabledForProject: the MCP graph tools, context injection at session start, recall, and the extraction cron (which enumerates KG-enabled (org, project) pairs).

The per-project toggle lives in the tenant Memory config (the project's performance/settings page) and is written through PUT /api/projects/[projectId]/memory-config.

Per-work-type recall - which agent work types receive graph-augmented context - is no longer static code. It is an operator-configured recall matrix (with per-org overrides) managed in the memory admin console (/admin/memory, operator-only).
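
One plausible way to resolve a recall matrix with per-org overrides is "override first, then operator default". The sketch below assumes a flat work-type-to-boolean matrix, made-up work-type names, and a default of `false` for unknown work types; the actual matrix shape and fallback behavior are not documented on this page.

```typescript
// ASSUMPTION: a flat map from work type to "receives graph-augmented context".
type RecallMatrix = { [workType: string]: boolean | undefined };

function resolveRecall(
  workType: string,
  operatorDefaults: RecallMatrix,
  orgOverrides: RecallMatrix = {},
): boolean {
  // A per-org override wins when present; otherwise use the operator default.
  // ASSUMPTION: work types missing from both maps get no graph recall.
  return orgOverrides[workType] ?? operatorDefaults[workType] ?? false;
}

// Hypothetical work-type names, for illustration only.
const defaults: RecallMatrix = { development: true, qa: false };
const overrides: RecallMatrix = { qa: true };

const qaRecall = resolveRecall("qa", defaults, overrides);
const devRecall = resolveRecall("development", defaults, overrides);
const unknownRecall = resolveRecall("research", defaults, overrides);
```

Using `??` rather than `||` matters here: an explicit `false` override must be honored instead of falling through to the default.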

Migrations

Migration	Purpose
`0048_daffy_talkback.sql`	Initial `graph_nodes`, `graph_edges` tables + RLS policies
`0049_nappy_killmonger.sql`	`graph_feedback_history` table + session/node/org-project indexes
`0057_triplet_embeddings_pgvector.sql`	`graph_triplet_embeddings` table + HNSW index; backfills inline JSONB vectors

Extraction Pipeline - how observations become nodes
Entity Resolution - UUID5 dedup + fuzzy merge
Hybrid Search - vector + k-hop traversal
Context Injection - triplets into agent prompts
Graph Feedback - EMA weight updates
MCP Tools - agent-callable graph tools
