Pipeline internals

This page describes the engine's internal layout. Only the CLI surface (nci index, nci query, nci sql) is part of the public contract. Every module marked internal below is pub(crate) and may move between releases. It is documented so contributors and curious users can map “what the CLI does” to a real module.

Crate layout

Module        Surface    What it does
cli.rs        public     The clap parser. Every flag the user can type is here.
config.rs     public     NciConfigFile (the on-disk schema), PackageScope, merge order helpers.
scanner.rs    internal   Walks install roots and returns one row per discovered package install.
filter.rs     internal   Applies package_scope, packages.include, packages.exclude.
parser.rs     internal   Reads a .d.ts, emits parsed declarations and imports.
crawler.rs    internal   Walks the per-package module graph from each entry.
resolver.rs   internal   Resolves re-exports and dependency edges (nci-dep-v1) to terminal declarations.
graph.rs      internal   Owns the in-memory symbol graph between resolve and store.
dedupe.rs     internal   Identical-fold and overload-key merging; produces merge_provenance_json.
storage.rs    internal   Bulk SQLite writes, FTS sync, schema migrations, per-package cache keys.
pipeline.rs   internal   Orchestrates the modules above. nci index calls in here.
cache.rs      internal   Per-package cache so re-indexing only touches changed packages.

High-level data flow

  • scanner
  • filter
  • parser
  • crawler
  • resolver
  • dedupe
  • graph
  • storage
Parsing and crawling fan out per package; the resolver and dedupe stages run as the package's graph completes; storage commits in batches.
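
The fan-out and batched commit described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's code: SymbolRow, index_package, and run_pipeline are hypothetical names, the per-package work is a stub that stands in for parse, crawl, resolve, and dedupe, and a "commit" is modelled as pushing a batch onto a Vec rather than writing to SQLite.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one resolved, deduplicated symbol row.
#[derive(Debug, Clone, PartialEq)]
pub struct SymbolRow {
    pub package: String,
    pub name: String,
}

/// Hypothetical per-package work (parse, crawl, resolve, dedupe).
/// This stub simply returns two placeholder rows for the package.
fn index_package(package: &str) -> Vec<SymbolRow> {
    ["default", "version"]
        .iter()
        .map(|name| SymbolRow {
            package: package.to_string(),
            name: name.to_string(),
        })
        .collect()
}

/// Runs one worker per package and returns the batches the writer "committed".
pub fn run_pipeline(packages: &[&str], batch_size: usize) -> Vec<Vec<SymbolRow>> {
    assert!(batch_size > 0, "batch_size must be at least 1");
    let (tx, rx) = mpsc::channel::<Vec<SymbolRow>>();

    thread::scope(|scope| {
        // Writer: drains the channel and groups rows into fixed-size batches.
        let writer = scope.spawn(move || {
            let mut committed = Vec::new();
            let mut pending = Vec::new();
            for rows in rx {
                for row in rows {
                    pending.push(row);
                    if pending.len() == batch_size {
                        committed.push(std::mem::take(&mut pending));
                    }
                }
            }
            // Flush whatever is left once every sender has hung up.
            if !pending.is_empty() {
                committed.push(pending);
            }
            committed
        });

        // Fan out: each package is indexed on its own scoped thread.
        for &package in packages {
            let tx = tx.clone();
            scope.spawn(move || {
                tx.send(index_package(package)).expect("writer hung up");
            });
        }
        // Drop the original sender so the writer's loop can terminate.
        drop(tx);

        writer.join().expect("writer panicked")
    })
}
```

Because the workers finish in no fixed order, the rows inside each batch may be interleaved across packages; only the batch sizes are deterministic in this sketch.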

Why split it like this

  • Splitting parser from crawler keeps parsing pure — the crawler decides how far to walk based on --max-hops without re-parsing files.
  • Splitting resolver from dedupe lets merge_provenance_json carry exactly which mechanism merged a row, instead of mixing the two.
  • Splitting storage from everything else means the writer can run on its own thread; the rest of the pipeline does not block on disk.
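
As a minimal sketch of the first point, a hop-limited crawl can operate purely on already-parsed modules. The names here (ParsedModule, crawl) are hypothetical, and the sketch assumes that the hop limit counts import edges from the entry; the real semantics of --max-hops may differ.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical parser output: the import specifiers found in one module.
pub struct ParsedModule {
    pub imports: Vec<String>,
}

/// Breadth-first walk from `entry`, following imports at most `max_hops`
/// edges away. It only reads the `parsed` map and never parses a file.
pub fn crawl(
    parsed: &HashMap<String, ParsedModule>,
    entry: &str,
    max_hops: usize,
) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut order = Vec::new();
    let mut queue = VecDeque::new();

    if parsed.contains_key(entry) {
        seen.insert(entry.to_string());
        queue.push_back((entry.to_string(), 0usize));
    }

    while let Some((module, hops)) = queue.pop_front() {
        order.push(module.clone());
        // Stop expanding once the hop budget is spent.
        if hops == max_hops {
            continue;
        }
        for import in &parsed[&module].imports {
            // Skip imports the parser never saw and modules already queued.
            if parsed.contains_key(import) && seen.insert(import.clone()) {
                queue.push_back((import.clone(), hops + 1));
            }
        }
    }
    order
}
```

Changing the hop limit in this sketch only changes how much of the parsed map is visited, which is the property the parser/crawler split is meant to preserve.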

For the on-disk shape the storage stage produces, see SQLite schema. For how the resolver decides what is “the same” symbol, see Re-export resolution.