Tech Stack Fingerprinting
Tech-stack fingerprinting produces a normalized set of component IDs from a project's manifest files. Two fingerprints are compared by Jaccard similarity to decide whether the projects are eligible for cross-project knowledge transfer.
This is the mechanism behind Rensei's cross-project learning: before the platform copies architectural observations from one project to another, it checks whether both projects share enough stack overlap to make that knowledge relevant.
How fingerprinting works
Component ID format
Each recognized dependency becomes one or two component IDs in the fingerprint set:
language:typescript
language:go@1.22
framework:nextjs
framework:nextjs@16
lib:react
lib:react@19
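tool:tailwind

The version-qualified form (framework:nextjs@16) is emitted in addition to the bare form when a major version can be extracted from the manifest's version range string ("^16.1.0" → major 16). The Go language version is the exception: it keeps the major.minor form taken from the go directive (language:go@1.22).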
The fingerprinter only emits components it can confidently identify. It never inspects file contents beyond manifests, never executes code, and has zero external dependencies.
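The bare and version-qualified forms described above can be sketched as follows. This is a minimal illustration, not the platform's code: the names ComponentKind, extractMajor, and componentIds are hypothetical, and the regular expression is only one plausible way to read a leading major version from a range string.

```typescript
// Hypothetical helper names; the real fingerprinter's internals are not shown in this page.
type ComponentKind = 'language' | 'framework' | 'lib' | 'tool'

// Read the leading major version from a range such as "^16.1.0" or "~> 7.0".
// Returns undefined when the range has no leading number ("latest", "*", "workspace:*").
function extractMajor(range: string): string | undefined {
  const match = /^[\s^~<>=v]*(\d+)/.exec(range)
  return match ? match[1] : undefined
}

// Always emit the bare ID; add the version-qualified ID only when a major was found.
function componentIds(kind: ComponentKind, name: string, range?: string): string[] {
  const bare = `${kind}:${name}`
  const major = range === undefined ? undefined : extractMajor(range)
  return major === undefined ? [bare] : [bare, `${bare}@${major}`]
}
```

Under these assumptions, componentIds('framework', 'nextjs', '^16.1.0') yields framework:nextjs and framework:nextjs@16, while a range such as "latest" yields only the bare ID.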
Supported manifest files
| Manifest | Language detected | Sample components emitted |
|---|---|---|
| package.json | language:javascript, language:typescript (if a typescript dependency is present) | framework:nextjs, lib:react, tool:tailwind, lib:drizzle |
| go.mod | language:go, language:go@<major>.<minor> | framework:gin, framework:echo, lib:postgres-client, lib:redis-client |
| pyproject.toml | language:python | framework:django, framework:fastapi, lib:sqlalchemy, tool:pytest |
| requirements.txt | language:python | Same as pyproject.toml |
| Cargo.toml | language:rust | framework:axum, framework:actix-web, lib:tokio, lib:serde, lib:sqlx |
| Gemfile | language:ruby | framework:rails, framework:sinatra, lib:postgres-client, tool:rspec |
Multiple manifests can be passed simultaneously for polyglot projects - the fingerprint is the union of all recognized components across all parsed manifests.
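As a rough sketch of how one manifest parser and the cross-manifest union might fit together, the example below handles package.json only. Everything in it is an assumption made for illustration: the function names, the three-entry lookup table (taken from the examples in this page, not from the real curated list), and the choice to read devDependencies as well as dependencies. Version-qualified IDs are left out for brevity.

```typescript
// Illustrative subset only; the real curated list is not shown in this page.
const KNOWN_NPM = new Map<string, string>([
  ['next', 'framework:nextjs'],
  ['react', 'lib:react'],
  ['drizzle-orm', 'lib:drizzle'],
])

// Hypothetical parser: returns the component IDs recognized in one package.json string.
function fingerprintPackageJson(raw: string): Set<string> {
  const out = new Set<string>()
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return out // malformed manifest contributes nothing
  }
  if (typeof parsed !== 'object' || parsed === null) return out

  // Assumption: both dependency sections count toward the fingerprint.
  const pkg = parsed as { dependencies?: unknown; devDependencies?: unknown }
  const deps: Record<string, unknown> = {}
  for (const section of [pkg.dependencies, pkg.devDependencies]) {
    if (typeof section === 'object' && section !== null) Object.assign(deps, section)
  }

  const names = Object.keys(deps)
  out.add('language:javascript')
  if (names.indexOf('typescript') !== -1) out.add('language:typescript')
  for (const name of names) {
    const id = KNOWN_NPM.get(name)
    if (id !== undefined) out.add(id) // unrecognized packages are ignored
  }
  return out
}

// A polyglot fingerprint is the union of the per-manifest sets.
function unionFingerprints(...parts: Set<string>[]): Set<string> {
  const all = new Set<string>()
  for (const part of parts) part.forEach((id) => all.add(id))
  return all
}
```

Because each parser returns a plain set, adding another manifest type only means producing one more set to pass to the union.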
API
fingerprintProject(files)
import { fingerprintProject } from '@/lib/memory/tech-stack'

const fp = fingerprintProject({
  packageJson: '{"dependencies":{"next":"^16.0.0","react":"^19.0.0","typescript":"^5.0.0"}}',
  goMod: 'module example.com/app\ngo 1.22\nrequire github.com/gin-gonic/gin v1.9.1',
})
// fp.components: Set {
//   'language:javascript',
//   'language:typescript',
//   'framework:nextjs',
//   'framework:nextjs@16',
//   'lib:react',
//   'lib:react@19',
//   'language:go',
//   'language:go@1.22',
//   'framework:gin',
// }

stackSimilarity(a, b)
import { fingerprintProject, stackSimilarity } from '@/lib/memory/tech-stack'
const a = fingerprintProject({ packageJson: '{"dependencies":{"next":"^16","react":"^19"}}' })
const b = fingerprintProject({ packageJson: '{"dependencies":{"next":"^16","react":"^19","drizzle-orm":"^0.45"}}' })
const sim = stackSimilarity(a, b)
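// a: language:javascript, framework:nextjs, framework:nextjs@16, lib:react, lib:react@19 (5 IDs)
// b: the same 5 IDs plus lib:drizzle (6 IDs)
// sim = 5 / 6 ≈ 0.83, assuming no version-qualified ID is emitted for drizzle-orm
// The exact value depends on the emitted component list

Returns a Jaccard coefficient:

|A ∩ B| / |A ∪ B|

Returns 0 when both fingerprints are empty (avoids divide-by-zero).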
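The coefficient can be computed without materializing the union, since |A ∪ B| = |A| + |B| − |A ∩ B|. The sketch below shows the arithmetic only; the function name jaccard is hypothetical and this is not necessarily how stackSimilarity is implemented.

```typescript
// Hypothetical standalone version of the similarity arithmetic.
function jaccard(a: ReadonlySet<string>, b: ReadonlySet<string>): number {
  // Two empty fingerprints have an empty union; return 0 instead of dividing by zero.
  if (a.size === 0 && b.size === 0) return 0

  // Count the IDs present in both sets.
  let shared = 0
  a.forEach((id) => {
    if (b.has(id)) shared += 1
  })

  // |A ∪ B| = |A| + |B| - |A ∩ B|
  return shared / (a.size + b.size - shared)
}
```

For a five-ID set and a six-ID set that contains all five, this gives 5 / 6 ≈ 0.83. Two disjoint non-empty sets give 0, and two identical non-empty sets give 1.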
isTransferCandidate(a, b, threshold?)
import { isTransferCandidate } from '@/lib/memory/tech-stack'
const eligible = isTransferCandidate(fpProjectA, fpProjectB)
// Default threshold: 0.7 (≥ 70% Jaccard overlap)
// Eligibility is necessary but NOT sufficient - Cedar policy must still authorize the transfer.

The default threshold of 0.7 (70% overlap) is the documented design target. Passing a threshold overrides it per call.
Types
interface TechStackFingerprint {
  components: Set<string> // Deterministic, sorted on iteration
}

interface ProjectManifestFiles {
  packageJson?: string // Raw string content - not a parsed object
  goMod?: string
  pyprojectToml?: string
  cargoToml?: string
  requirementsTxt?: string
  gemfile?: string
}

Design notes
The fingerprinter is intentionally conservative:
- High precision over recall. It only recognizes a curated list of high-signal frameworks and runtimes. Transitive dependencies are not captured, because Jaccard similarity degrades badly when the sets are dominated by noise.
- Fail-open on parse errors. Malformed manifests contribute nothing rather than throwing. Partial fingerprints are better than pipeline crashes.
- No I/O, no side effects. The caller supplies raw strings. fingerprintProject is a pure function.
- Cedar policy is the enforcement gate. isTransferCandidate is a necessary precondition, not a sufficient one. The cross-project transfer Cedar policy is evaluated separately before any observations are copied.
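The ordering of the two checks in the last note can be sketched as below. All names here (PolicyDecision, TransferCheck, shouldTransfer, authorize) are hypothetical, and the authorize callback is only a stand-in for whatever evaluates the Cedar policy; the real evaluation is not described in this page.

```typescript
// Hypothetical stand-in for the outcome of a policy evaluation.
type PolicyDecision = 'allow' | 'deny'

interface TransferCheck {
  similarity: number // Jaccard coefficient of the two fingerprints
  threshold?: number // per-call override; defaults to 0.7
  authorize: () => PolicyDecision // placeholder for the Cedar policy check
}

// Stack overlap is checked first; the policy is consulted only for eligible pairs,
// and its decision is final.
function shouldTransfer({ similarity, threshold = 0.7, authorize }: TransferCheck): boolean {
  if (similarity < threshold) return false // not a transfer candidate
  return authorize() === 'allow'
}
```

With these assumptions, a pair at 0.83 similarity is still rejected when the policy denies the transfer, and a pair at 0.5 is rejected without the policy being consulted at all.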
Related pages
- Cross-Project Knowledge Transfer - how fingerprints gate Cedar-authorized observation sharing
- Arch Query Layer - how architectural views are queried per project
- BFSI Data Classification - observation classification that controls what can be transferred