AST Extraction

AST-driven dependency extraction converts TypeScript and JavaScript source files into knowledge graph nodes and edges. When an agent writes a file_operation observation, this strategy parses the source with the TypeScript compiler API and emits a Module node for the source file, one Module node per unique imported specifier, and a depends_on edge from the source file to each imported module.

This is the highest-precision extraction strategy: its output is deterministic, it makes no LLM calls, and it covers static imports, type-only imports and re-exports, dynamic import(), and CommonJS require().

How it works

Import kinds

Each import is tagged with an ImportKind that describes how the specifier is referenced:

| Kind | TypeScript syntax | Example |
| --- | --- | --- |
| value | Regular import, namespace import | import { foo } from 'bar', import * as N from 'bar' |
| type | Type-only import or re-export | import type { T } from 'bar', export type { T } from 'bar' |
| dynamic | Dynamic import() expression | const m = await import('./foo') |
| side-effect | Import with no clause | import './styles.css' |
| require | CJS require() or import x = require() | const x = require('bar'), import x = require('bar') |
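
The kinds above could be assigned by walking the AST with the TypeScript compiler API. The following is a minimal sketch, not the strategy's actual code: collectImports and FoundImport are hypothetical names, it requires the typescript npm package, and it assumes that a value re-export such as export { x } from 'bar' counts as value. Inline type modifiers (import { type T } from 'bar') are not treated specially here.

```typescript
import * as ts from 'typescript'

// Kind labels taken from the Import kinds table.
type ImportKind = 'value' | 'type' | 'dynamic' | 'side-effect' | 'require'

interface FoundImport {
  specifier: string
  kind: ImportKind
}

// Hypothetical helper: records every string-literal module specifier in the file.
function collectImports(source: string, fileName = 'example.ts'): FoundImport[] {
  const sourceFile = ts.createSourceFile(fileName, source, ts.ScriptTarget.Latest, true)
  const found: FoundImport[] = []

  const visit = (node: ts.Node): void => {
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      // No import clause means a bare `import './x'`.
      const clause = node.importClause
      const kind: ImportKind = !clause ? 'side-effect' : clause.isTypeOnly ? 'type' : 'value'
      found.push({ specifier: node.moduleSpecifier.text, kind })
    } else if (
      ts.isExportDeclaration(node) &&
      node.moduleSpecifier !== undefined &&
      ts.isStringLiteral(node.moduleSpecifier)
    ) {
      // Assumption: non-type re-exports are treated as 'value'.
      found.push({ specifier: node.moduleSpecifier.text, kind: node.isTypeOnly ? 'type' : 'value' })
    } else if (
      ts.isImportEqualsDeclaration(node) &&
      ts.isExternalModuleReference(node.moduleReference) &&
      ts.isStringLiteral(node.moduleReference.expression)
    ) {
      // `import x = require('bar')`
      found.push({ specifier: node.moduleReference.expression.text, kind: 'require' })
    } else if (ts.isCallExpression(node) && node.arguments.length === 1) {
      const arg = node.arguments[0]
      if (ts.isStringLiteral(arg)) {
        if (node.expression.kind === ts.SyntaxKind.ImportKeyword) {
          // `import('./foo')`
          found.push({ specifier: arg.text, kind: 'dynamic' })
        } else if (ts.isIdentifier(node.expression) && node.expression.text === 'require') {
          // `require('bar')`
          found.push({ specifier: arg.text, kind: 'require' })
        }
      }
    }
    // Keep descending so imports nested in functions or blocks are found too.
    ts.forEachChild(node, visit)
  }

  visit(sourceFile)
  return found
}
```

Only string-literal specifiers are recorded in this sketch; a computed argument such as require(name) is skipped because it cannot be resolved statically.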

When the same specifier appears under multiple kinds (for example, a module is imported with both import type and a regular import in different statements), the highest-priority kind wins: value > require > dynamic > side-effect > type. Because type has the lowest priority, a specifier that is needed at runtime anywhere in the file is never recorded as type-only, even in mixed-style codebases.
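
The priority rule can be sketched as a small merge step. This is an illustration only: mergeKinds and KIND_PRIORITY are hypothetical names, and the input is assumed to be a flat list of (specifier, kind) pairs collected from the file.

```typescript
type ImportKind = 'value' | 'type' | 'dynamic' | 'side-effect' | 'require'

// Lower index means higher priority: value > require > dynamic > side-effect > type.
const KIND_PRIORITY: readonly ImportKind[] = ['value', 'require', 'dynamic', 'side-effect', 'type']

// Hypothetical helper: collapse repeated specifiers to their highest-priority kind.
function mergeKinds(
  imports: Array<{ specifier: string; kind: ImportKind }>,
): Record<string, ImportKind> {
  const merged: Record<string, ImportKind> = {}
  for (const { specifier, kind } of imports) {
    const current = merged[specifier]
    if (current === undefined || KIND_PRIORITY.indexOf(kind) < KIND_PRIORITY.indexOf(current)) {
      merged[specifier] = kind
    }
  }
  return merged
}

const mergedExample = mergeKinds([
  { specifier: 'bar', kind: 'type' },
  { specifier: 'bar', kind: 'value' },
  { specifier: './styles.css', kind: 'side-effect' },
])
console.log(mergedExample['bar']) // value
```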

Node and edge shape

The strategy emits a Module node for the source file and one Module node per unique imported specifier, with a depends_on edge from source to each dependency:

{
  "nodes": [
    {
      "id": "module-src-lib-auth-ts",
      "name": "src/lib/auth.ts",
      "type": "Module",
      "description": "Module at src/lib/auth.ts"
    },
    {
      "id": "module-next-server",
      "name": "next/server",
      "type": "Module",
      "description": "Imported module: next/server (kind: value)"
    },
    {
      "id": "module---lib-db",
      "name": "@/lib/db",
      "type": "Module",
      "description": "Imported module: @/lib/db (kind: value)"
    }
  ],
  "edges": [
    {
      "sourceNodeId": "module-src-lib-auth-ts",
      "targetNodeId": "module-next-server",
      "relationshipName": "depends_on"
    },
    {
      "sourceNodeId": "module-src-lib-auth-ts",
      "targetNodeId": "module---lib-db",
      "relationshipName": "depends_on"
    }
  ]
}

The import kind is embedded in the dependency node's description field. The relationshipName is always depends_on - this preserves backward compatibility with graph queries that rely on that edge type.
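
A graph of this shape could be assembled as in the sketch below. The slug rule in moduleId is an assumption inferred from the sample IDs above (each non-alphanumeric character becomes a hyphen); the real implementation may normalize differently. moduleId and buildGraph are hypothetical names.

```typescript
interface GraphNode {
  id: string
  name: string
  type: 'Module'
  description: string
}

interface GraphEdge {
  sourceNodeId: string
  targetNodeId: string
  relationshipName: 'depends_on'
}

// Assumed slug rule: '@/lib/db' -> 'module---lib-db', 'next/server' -> 'module-next-server'.
function moduleId(name: string): string {
  return `module-${name.replace(/[^a-zA-Z0-9]/g, '-')}`
}

// `deps` maps each unique specifier to its already-merged import kind.
function buildGraph(
  filePath: string,
  deps: Record<string, string>,
): { nodes: GraphNode[]; edges: GraphEdge[] } {
  const sourceId = moduleId(filePath)
  const nodes: GraphNode[] = [
    { id: sourceId, name: filePath, type: 'Module', description: `Module at ${filePath}` },
  ]
  const edges: GraphEdge[] = []

  for (const specifier of Object.keys(deps)) {
    const id = moduleId(specifier)
    nodes.push({
      id,
      name: specifier,
      type: 'Module',
      description: `Imported module: ${specifier} (kind: ${deps[specifier]})`,
    })
    // Edges always point from the source file to the dependency.
    edges.push({ sourceNodeId: sourceId, targetNodeId: id, relationshipName: 'depends_on' })
  }
  return { nodes, edges }
}
```

Calling buildGraph('src/lib/auth.ts', { 'next/server': 'value', '@/lib/db': 'value' }) reproduces the three nodes and two edges shown in the JSON sample.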

File extension handling

The parser picks the correct TypeScript ScriptKind based on the file extension in observation.metadata.filePath:

| Extension | ScriptKind |
| --- | --- |
| .tsx | TSX |
| .jsx | JSX |
| .js, .mjs, .cjs | JS |
| .json | JSON |
| everything else (.ts, etc.) | TS |

If no filePath is available in the observation metadata, a synthetic path file-<id> is used and the parser defaults to TS mode.
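
The mapping and the fallback path can be sketched as follows. To stay dependency-free, this sketch returns the ScriptKind name as a string; real code would presumably return the corresponding ts.ScriptKind member. scriptKindFor and effectivePath are hypothetical names, and matching the extension case-sensitively is an assumption.

```typescript
type ScriptKindName = 'TSX' | 'JSX' | 'JS' | 'JSON' | 'TS'

// Hypothetical helper mirroring the extension table; anything unrecognized is parsed as TS.
function scriptKindFor(filePath: string | undefined): ScriptKindName {
  if (!filePath) return 'TS'
  // Take the final ".ext" of the last path segment, if there is one.
  const match = /\.[^./\\]+$/.exec(filePath)
  const ext = match ? match[0] : ''
  switch (ext) {
    case '.tsx':
      return 'TSX'
    case '.jsx':
      return 'JSX'
    case '.js':
    case '.mjs':
    case '.cjs':
      return 'JS'
    case '.json':
      return 'JSON'
    default:
      return 'TS'
  }
}

// Falls back to the synthetic path file-<id> when the observation carries no filePath.
function effectivePath(observation: { id: string; metadata?: { filePath?: string } }): string {
  const filePath = observation.metadata ? observation.metadata.filePath : undefined
  return filePath ? filePath : `file-${observation.id}`
}
```

Because the synthetic path has no extension, scriptKindFor(effectivePath({ id: 'obs_1' })) falls through to TS, which matches the default described above.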

Using extractFromFileOperation

The function is exported from strategies.ts and called by the extraction pipeline's strategy dispatcher:

import { extractFromFileOperation } from '@/lib/graph/extraction/strategies'

const graph = extractFromFileOperation({
  id: 'obs_123',
  type: 'file_operation',
  content: `
    import { NextRequest } from 'next/server'
    import type { Db } from '@/lib/db'
    import './polyfill'
    const Redis = require('ioredis')
  `,
  metadata: { filePath: 'src/lib/handler.ts' },
})

// graph.nodes: [ { id: 'module-src-lib-handler-ts', ... }, { id: 'module-next-server', ... }, ... ]
// graph.edges: [ { sourceNodeId: 'module-src-lib-handler-ts', targetNodeId: 'module-next-server', ... }, ... ]

Parse failures log a warning and return { nodes: [], edges: [] }. The pipeline degrades gracefully: a failed extraction contributes an empty graph and does not block other observations.
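
One way to get this behavior is to wrap the parsing step in a guard that converts any thrown error into an empty graph. The sketch below is an assumption about structure, not the actual implementation; extractSafely is a hypothetical name, and the extract callback stands in for the real parsing logic.

```typescript
interface Graph {
  nodes: unknown[]
  edges: unknown[]
}

// Hypothetical wrapper: runs `extract` and degrades to an empty graph on failure.
function extractSafely(
  extract: () => Graph,
  warn: (message: string) => void = console.warn,
): Graph {
  try {
    return extract()
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err)
    warn(`AST extraction failed: ${reason}`)
    return { nodes: [], edges: [] }
  }
}
```

Callers can then treat the result uniformly, since an empty graph adds nothing to the knowledge graph.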

Other extraction strategies

The runStrategy dispatcher in strategies.ts routes observations to the appropriate strategy by type:

| Observation type | Strategy | LLM call? |
| --- | --- | --- |
| file_operation | extractFromFileOperation (AST) | No |
| decision | extractFromDecision | Yes |
| error | extractFromError | Yes |
| session_summary | extractFromSessionSummary | Yes |
| explicit | extractFromExplicit | Yes |
| (unknown) | Falls through to extractFromExplicit | Yes |

The AST strategy is the only deterministic strategy and the only one that makes no LLM call. All others issue an LLM inference call against the configured judge model and require LLM_API_KEY to be set.
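
The routing described above amounts to a lookup with a fallback. The sketch below uses stub strategies that only report which function would run; the real strategies parse the AST or call an LLM and may well be asynchronous. runStrategySketch, STRATEGIES, and stub are hypothetical names.

```typescript
interface Observation {
  id: string
  type: string
  content: string
}

interface Graph {
  nodes: unknown[]
  edges: unknown[]
}

type Strategy = (observation: Observation) => Graph

// Stub for illustration only: returns a single node naming the strategy that was chosen.
const stub = (label: string): Strategy => () => ({ nodes: [{ id: label }], edges: [] })

const STRATEGIES: Record<string, Strategy> = {
  file_operation: stub('extractFromFileOperation'),
  decision: stub('extractFromDecision'),
  error: stub('extractFromError'),
  session_summary: stub('extractFromSessionSummary'),
  explicit: stub('extractFromExplicit'),
}

// Unknown observation types fall through to the explicit strategy.
function runStrategySketch(observation: Observation): Graph {
  const known = Object.prototype.hasOwnProperty.call(STRATEGIES, observation.type)
  const strategy = known ? STRATEGIES[observation.type] : STRATEGIES['explicit']
  return strategy(observation)
}
```

The own-property check keeps inherited keys such as toString from being mistaken for a registered strategy.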

LLM strategy prompts

For the LLM-backed strategies, the prompt instructs the model to return a JSON graph with a fixed schema:

{
  "nodes": [
    { "id": "<slug>", "name": "<name>", "type": "<Service|Module|API|Database|Decision|Pattern|Person|Config|Dependency>", "description": "<text>" }
  ],
  "edges": [
    { "sourceNodeId": "<id>", "targetNodeId": "<id>", "relationshipName": "<snake_case>" }
  ]
}

Each LLM strategy specializes the prompt for its observation type - decision extracts Decision + Person nodes with decided_by edges; error extracts anti-pattern Pattern nodes with workaround_for edges.
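
Because the model's reply is free text, it may help to check it against the schema before using it. The following validator is a sketch under stated assumptions: validateGraph is a hypothetical name, the allowed node types are copied from the schema above, and the check that every edge references an existing node is an extra sanity check that the source does not describe.

```typescript
const NODE_TYPES = [
  'Service', 'Module', 'API', 'Database', 'Decision',
  'Pattern', 'Person', 'Config', 'Dependency',
]
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/

interface LlmNode { id: string; name: string; type: string; description: string }
interface LlmEdge { sourceNodeId: string; targetNodeId: string; relationshipName: string }

// Returns a list of problems; an empty list means the reply matches the schema.
function validateGraph(raw: string): string[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return ['response is not valid JSON']
  }
  if (typeof parsed !== 'object' || parsed === null) return ['response is not a JSON object']

  const { nodes, edges } = parsed as { nodes?: LlmNode[]; edges?: LlmEdge[] }
  if (!Array.isArray(nodes) || !Array.isArray(edges)) {
    return ['"nodes" and "edges" must both be arrays']
  }

  const problems: string[] = []
  const ids: Record<string, true> = {}
  for (const node of nodes) {
    ids[node.id] = true
    if (NODE_TYPES.indexOf(node.type) === -1) {
      problems.push(`node ${node.id}: unknown type ${node.type}`)
    }
  }
  for (const edge of edges) {
    if (!SNAKE_CASE.test(edge.relationshipName)) {
      problems.push(`relationshipName is not snake_case: ${edge.relationshipName}`)
    }
    // Assumption: dangling edges are rejected (not stated in the source).
    if (!ids[edge.sourceNodeId] || !ids[edge.targetNodeId]) {
      problems.push(`edge ${edge.sourceNodeId} -> ${edge.targetNodeId}: unknown node id`)
    }
  }
  return problems
}
```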
