What nci index actually does
Indexing is one fixed pipeline driven by nci index. Stage names below match the engine’s internal modules (scanner, filter, parser, crawler, resolver, dedupe, storage). The CLI is the public surface — the modules are documented for orientation, not as configuration knobs.
The shape
A grep over node_modules returns lines. NCI returns symbols — merged graph rows with signature, js_doc, kind_name, parent_symbol_id, source_package_name, is_internal, plus a stable id. That difference is what makes the index queryable in milliseconds and what lets agents call nci query without re-parsing.
Scanner — discover install roots
project_root plus each workspaces entry — every directory that contributes a node_modules.
One package row per discovered install: name, version, install path, source dir, manifest path.
The configuration that drives this stage is project_root, workspaces, and index_root_workspace in nci.config.json — overridable with -r, --skip-root-workspace, and --include-root-workspace.
Filter — apply package_scope
Every discovered install.
The packages whose names appear in the union of every consumer manifest’s dependencies / dev_dependencies — or every install when package_scope is "all_installed".
--package-scope dependencies,dev-dependencies matches package_scope: ["dependencies", "dev_dependencies"]. packages.include / packages.exclude globs filter on top.
Parser + Crawler — read .d.ts
Each filtered package’s entry .d.ts resolved from exports / types / typings, falling back to index.d.ts.
Parsed declarations + parsed imports for every reached file.
Parsing produces declarations; the crawler decides which other files to fetch. Splitting them lets --max-hops change without re-parsing already-loaded files.
Resolver — promote re-exports
The parsed declarations plus the per-package module graph.
Every public symbol promoted to its terminal declaration. import X from "./x"; export { X } becomes X, never a placeholder. Dependency edges follow nci-dep-v1.
Result10 → default; covers every common library, including barrel-heavy ones.
Dedupe + Storage — write to SQLite
Resolved declarations from every package.
Rows in packages, symbols, symbol_dependencies, plus FTS5 (symbols_fts). Each symbols row gets signature, kind_name, parent_symbol_id, and either fused or distinct overloads, recorded in merge_provenance_json.
Everything lands in one nci.sqlite file under the OS cache dir (override with NCI_CACHE_DIR). nci query and nci sql both read this file with no further parsing — that is why responses are sub-millisecond once the index exists.
Re-running
--dry-run walks scan + filter only and stops. Use it as a CI sanity gate: if the count drops, your package_scope or filters regressed.
With progress on (see Configuration), each indexed package prints in … — per-package wall time from start through persist (queue wait included). Use --index-timing-detail to add crawl / wait / save splits. The closing summary includes wall for the whole nci index run.
Packages whose index_cache_key still matches are skipped (no re-crawl). Pending symbol backfills run on the writer connection before those cache probes.
--package and --dependency-stub-package accept globs or package names; each flag is repeatable or comma-separated (-p a,b, -s zod,@scope/pkg). Combine them to skip walking into noisy SDKs — every consumer import of those packages collapses into an npm::… stub edge.
Concurrency (default)
Plain nci index works on one package at a time. While it is on that package, it can still use several CPU cores to read .d.ts files in parallel (when a discovery batch has at least two files) and to connect related symbols on large packages (the symbol-linking cutoff scales with your CPU count).
For large monorepos you can enable package-parallel indexing once in nci.config.json ("package_parallel": true) or per run with nci index --package-parallel (CLI flag wins when set). Scan dedupe keeps one row per name@version so cache and crawl stay aligned.
| What you run | Packages at once | Extra cores inside each package |
|---|---|---|
nci index (default) | 1 | Yes (on large packages) |
nci index --package-parallel or config package_parallel: true | Several (up to your CPU budget) | No |
Default indexing still uses a bounded save queue sized to your thread budget (e.g. 8 slots on an 8-core machine) so crawl can stay ahead of SQLite writes without unbounded memory. With --package-parallel, that queue is twice as deep (e.g. 16 vs 8) because several packages can finish at once.
--package-parallel applies when you pass that flag and you are indexing more than one package. A single-package run stays one-at-a-time even with the flag.
Batch crawl dedupe (default)
When nci index indexes several packages in one run, crawl does not walk into another indexed package’s install on a consumer dependency row. Those edges become versioned stub ids:
npm::<specifier>@<version>::<member>
Example: npm::@my-org/shared@1.2.0::SharedConfig. The provider package in the batch keeps the full parsed surface. Follow the stub with nci query evidence --package <specifier> --package-version <version> on that row.
Monorepo runs use this by default: workspace packages indexed together stub each other. You do not need @my-org/shared in dependency_stub_packages only because apps import it.
That is separate from dependency_stub_packages / -s, which emit unversioned npm::<specifier>::<member> stubs (no @<version> segment). Use manual stubs for named packages only on npm or when you want to skip crawling entirely. See Monorepo guide · Stub noisy dependencies.
Use --no-batch-index-crawl-dedupe to restore pre-dedupe crawl (larger consumer graphs, symbols inlined from sibling installs).
Preview before a long index — read the summary line first; the other fields are for advanced debugging.
On a typical 8-core machine, 104 packages with the default (no flag) stays one package at a time with multi-core work inside each large package. The same machine with 6 packages and --package-parallel runs several packages at once (up to your CPU budget) and turns off multi-core work inside each package — compare both commands above to see the plan before you index.