This page describes the engine's internal layout. Only the CLI surface (nci index, nci query, nci sql) is part of the public contract — everything below is pub(crate) and may move between releases. It is documented so contributors and curious users can map “what the CLI does” to a real module.
Crate layout
| Module | Surface | What it does |
| --- | --- | --- |
| cli.rs | public | The clap parser. Every flag the user can type is here. |
| config.rs | public | NciConfigFile (the on-disk schema), PackageScope, merge order helpers. |
| scanner.rs | internal | Walks install roots and returns one row per discovered package install. |
|  |  | Orchestrates the modules above. nci index calls in here. |
| cache.rs | internal | Per-package cache so re-indexing only touches changed packages. |
High-level data flow
scanner → filter → parser → crawler → resolver → dedupe → graph → storage
Parsing and crawling fan out per package; the resolver and dedupe stages run as the package's graph completes; storage commits in batches.
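The batched commit at the end of the flow can be pictured with a minimal sketch. Everything named here is hypothetical: the Row type, the run_batched_writer function, and the batch size are illustrations, not the engine's real storage API, and a counter stands in for a SQLite transaction.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical row type; the real storage rows are not described on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub package: String,
    pub symbol: String,
}

/// Sends `rows` to a dedicated writer thread, which groups them into batches
/// of at most `batch_size`. Returns the size of every batch it "committed",
/// so the batching behaviour is observable without a database.
pub fn run_batched_writer(rows: Vec<Row>, batch_size: usize) -> Vec<usize> {
    assert!(batch_size > 0, "batch_size must be positive");
    let (tx, rx) = mpsc::channel::<Row>();

    let writer = thread::spawn(move || {
        let mut committed = Vec::new();
        let mut batch: Vec<Row> = Vec::with_capacity(batch_size);
        // The loop ends once every sender has been dropped.
        for row in rx {
            batch.push(row);
            if batch.len() == batch_size {
                committed.push(batch.len()); // stand-in for one transaction
                batch.clear();
            }
        }
        // Flush whatever is left in the final, partial batch.
        if !batch.is_empty() {
            committed.push(batch.len());
        }
        committed
    });

    for row in rows {
        tx.send(row).expect("writer thread hung up");
    }
    drop(tx); // signal end of input
    writer.join().expect("writer thread panicked")
}
```

With five rows and a batch size of two, the writer in this sketch commits batches of 2, 2, and 1.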
Why split it like this
Splitting parser from crawler keeps parsing pure — the crawler decides how far to walk based on --max-hops without re-parsing files.
Splitting resolver from dedupe lets merge_provenance_json carry exactly which mechanism merged a row, instead of mixing the two.
Splitting storage from everything else means the writer can run on its own thread; the rest of the pipeline does not block on disk.
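To illustrate the parser/crawler split, here is a rough sketch of a hop-limited walk over already-parsed data. It assumes that --max-hops counts edges followed from an entry file, which this page does not spell out. The ParsedFiles type and the crawl function are hypothetical names, not the engine's API.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Hypothetical parser output: for each file, the files it references.
pub type ParsedFiles = HashMap<String, Vec<String>>;

/// Breadth-first walk from `entry`, following at most `max_hops` edges.
/// The crawler only reads `parsed`; it never parses a file itself.
pub fn crawl(parsed: &ParsedFiles, entry: &str, max_hops: usize) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::new();
    seen.insert(entry.to_string());
    queue.push_back((entry.to_string(), 0usize));

    while let Some((file, hops)) = queue.pop_front() {
        if hops == max_hops {
            continue; // hop budget spent: do not expand this file further
        }
        if let Some(targets) = parsed.get(&file) {
            for target in targets {
                // `insert` returns false for files already visited,
                // which also keeps cycles from looping forever.
                if seen.insert(target.clone()) {
                    queue.push_back((target.clone(), hops + 1));
                }
            }
        }
    }
    seen
}
```

Because the walk takes the parsed map by shared reference, changing the hop limit in this sketch changes how far the walk goes without touching the parse results.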
Index concurrency (concurrency.rs)
Before the per-package loop, the pipeline picks a concurrency plan:
Default: one package at a time; multi-core work happens inside that package (parallel file reads and symbol linking when the package is large enough).
--package-parallel: several packages at once when more than one package is indexed; per-package multi-core work is turned off; finished packages wait in a bounded queue for the single SQLite writer.
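The choice between the two plans can be sketched as a small pure function. This is an illustration under assumptions, not the contents of concurrency.rs: the ConcurrencyPlan enum, choose_plan, and the rule for the worker count are all hypothetical.

```rust
/// Hypothetical model of the two plans described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConcurrencyPlan {
    /// One package at a time; multi-core work may happen inside the package.
    OnePackageAtATime,
    /// Several packages at once; per-package multi-core work is turned off.
    PackageParallel { workers: usize },
}

impl ConcurrencyPlan {
    /// Whether work inside a single package may use several cores.
    pub fn intra_package_parallelism(&self) -> bool {
        matches!(self, ConcurrencyPlan::OnePackageAtATime)
    }
}

/// Picks a plan before the per-package loop starts.
/// Assumption: the worker count is capped by both the thread budget
/// and the number of packages, and is never zero.
pub fn choose_plan(
    package_parallel: bool,
    package_count: usize,
    available_threads: usize,
) -> ConcurrencyPlan {
    if package_parallel && package_count > 1 {
        let workers = available_threads.min(package_count).max(1);
        ConcurrencyPlan::PackageParallel { workers }
    } else {
        // Either the flag is off or there is only one package to index.
        ConcurrencyPlan::OnePackageAtATime
    }
}
```

In this sketch the flag has no effect when only one package is indexed, matching the "when more than one package is indexed" condition above.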
nci thread-budget --package-count N prints that plan without indexing. See Indexing · Concurrency for a plain-language summary and how to read timing lines.
For the on-disk shape the storage stage produces, see SQLite schema. For how the resolver decides what is “the same” symbol, see Re-export resolution.