CLI Reference¶
Consist provides command-line tools to inspect, query, and compare runs and artifacts without writing Python code. Use it to answer “what ran, with what inputs, and what changed?” directly from your provenance database.
This is especially useful when you are SSH’d into a remote server or working in a headless environment: you can quickly explore runs, artifacts, and lineage from the shell without starting a Python session.
In particular, the consist shell command creates a persistent Tracker object linked to a database and allows you to query run inputs and outputs and artifact metadata, producing nicely formatted tables and summaries in the terminal.
Database Discovery¶
The CLI looks for the provenance database in this order:
1. Explicit --db-path argument
2. CONSIST_DB environment variable
3. provenance.duckdb in the current directory
4. Common subdirectories (./data/, ./db/, ./.consist/)
Artifact Path Resolution¶
Some artifacts are stored with relative URIs like ./outputs/.... For commands that load
or validate artifact files (preview, validate, and shell preview/schema_profile),
Consist resolves relative paths in this order:
- Explicit
--run-dir - Parent directory of
--db-path - Current working directory
_physical_run_dirfrom run metadata (only when--trust-dbis enabled)
This keeps default behavior safe while still supporting archived/moved run directories.
Examples:
# Typical case: db and outputs are colocated (auto-discovery via --db-path parent)
consist preview trip_table --db-path examples/runs/beam_core_demo/beam_core_demo_demo.duckdb
# Archived/moved run root: explicitly point to the new location
consist preview trip_table \
--db-path archives/beam_core_demo.duckdb \
--run-dir /data/archive/beam_core_demo
# Allow fallback to _physical_run_dir metadata from the DB
consist preview trip_table --db-path archives/beam_core_demo.duckdb --trust-db
Global Options¶
Shell completion helpers:
consist --install-completion # Install for current shell
consist --show-completion # Print completion script
Commands¶
consist db¶
Database maintenance and recovery commands. See the dedicated guide: DB Maintenance Guide.
consist db inspect
consist db doctor
consist db snapshot --out snapshot.duckdb
consist db rebuild --json-dir ./consist_runs --mode minimal
consist db compact
consist db export <run_id> --out shard.duckdb
consist db merge shard.duckdb --conflict error
consist db purge <run_id> --dry-run
consist db fix-status <run_id> completed --reason "manual correction"
Operational recipes:
# 1) Health checks (inspect + doctor)
consist db inspect --db-path ./provenance.duckdb
consist db doctor --db-path ./provenance.duckdb
# 2) Safe purge preview + execute (with optional cache pruning)
consist db purge RUN_ID --dry-run --db-path ./provenance.duckdb
consist db purge RUN_ID --delete-ingested-data --prune-cache --yes --db-path ./provenance.duckdb
# 3) Rebuild from JSON snapshots (minimal vs full)
consist db rebuild --json-dir ./consist_runs --mode minimal --db-path ./provenance.duckdb
consist db rebuild --json-dir ./consist_runs --mode full --db-path ./provenance.duckdb
# 4) Merge conflict handling (error|skip)
consist db merge shard.duckdb --conflict error --json --db-path ./provenance.duckdb
consist db merge shard.duckdb --conflict skip --db-path ./provenance.duckdb
Notes:
- Merge conflict mode values are error and skip.
- --prune-cache only applies when --delete-ingested-data is enabled.
consist runs¶
List recent runs with optional filters.
consist runs # Last 10 runs
consist runs --limit 20 # Last 20 runs
consist runs --model travel_demand # Filter by model name
consist runs --status completed # Filter by status
consist runs --tag simulation # Filter by tag
consist runs --json # JSON output for scripting
consist show¶
Display detailed information about a specific run.
Shows run metadata, configuration, status, duration, and any custom metadata fields.
consist artifacts¶
Inspect artifacts in two modes:
- Run mode: list input/output artifacts for one run.
- Query mode: search artifacts by indexed facet predicates.
consist artifacts <run_id>
# Query by artifact facet params (repeat --param)
consist artifacts --param beam.phys_sim_iteration=2
consist artifacts --param beam.year>=2030 --param beam.year<=2035
# Optional query filters
consist artifacts --param beam.phys_sim_iteration=2 --namespace beam
consist artifacts --param beam.phys_sim_iteration=2 --key-prefix linkstats
consist artifacts --param beam.phys_sim_iteration=2 --family-prefix linkstats_unmodified
consist artifacts --param beam.phys_sim_iteration=2 --limit 500
Query-mode options:
--param: facet predicate (key=value,key>=value,key<=value)--namespace: default namespace when predicates omit one--key-prefix: prefix filter on artifact key--family-prefix: prefix filter on indexedartifact_familyfacet--limit: maximum results (default100)
consist schema capture-file¶
Capture file schema metadata for an already-logged artifact so schema export
and shell schema_stub can generate SQLModel stubs.
# Capture by artifact key
consist schema capture-file --artifact-key trip_table
# Capture by UUID
consist schema capture-file --artifact-id 00000000-0000-0000-0000-000000000000
# Run-scoped key lookup + explicit path override
consist schema capture-file \
--artifact-key trip_table \
--run-id beam_2025_iter2 \
--path /data/beam/outputs/trip_table.parquet
Options:
--sample-rows N: rows to sample during inference (default1000)--if-changed: reuse a prior schema observation when the artifact hash is unchanged; this affects schema capture only, not artifact-row reuse--trust-db: allow metadata-based mount inference when resolving paths--db-path: explicit DB path
consist views create¶
Create a grouped hybrid view from schema identity + facet/run filters.
consist views create v_linkstats_all \
--schema-id <schema_hash> \
--namespace beam \
--param artifact_family=linkstats_unmodified_phys_sim_iter_parquet \
--param year=2018 \
--attach-facet artifact_family \
--attach-facet year \
--attach-facet phys_sim_iteration \
--driver parquet
Common options:
--mode hybrid|hot_only|cold_only--if-exists replace|error--missing-files warn|error|skip_silent--schema-compatible- run filters:
--run-id,--parent-run-id,--model,--status,--year,--iteration
consist lineage¶
Trace the full provenance chain for an artifact.
Displays a tree showing which runs and inputs produced the artifact.
consist scenarios¶
List all scenarios and their run counts.
consist scenario¶
Show all runs within a specific scenario.
consist search¶
Search runs by ID, model name, or scenario.
consist summary¶
Display database statistics: total runs, artifacts, date range, and runs per model.
consist preview¶
Preview tabular artifacts (CSV, Parquet) without loading full data.
consist preview <artifact_key>
consist preview <artifact_key> --rows 10
consist preview --hash a1b2c3d4
consist preview <artifact_key> --run-dir /path/to/run_root
consist preview <artifact_key> --trust-db # Allow metadata-based mount/run-dir inference
Hash lookup searches all artifacts by default. In shell workflows, use
schema_stub --run-id <run_id> --hash <prefix> when you want to narrow selection
to one run.
consist validate¶
Check that artifacts in the database exist on disk.
consist validate
consist validate --fix # Mark missing artifacts
consist validate --db-path examples/runs/beam_core_demo/beam_core_demo_demo.duckdb
consist validate --db-path archives/beam_core_demo.duckdb --run-dir /data/archive/beam_core_demo
consist validate --db-path archives/beam_core_demo.duckdb --trust-db
consist shell¶
Start an interactive session for exploring provenance.
consist shell
consist shell --db-path examples/runs/beam_core_demo/beam_core_demo_demo.duckdb
consist shell --run-dir /data/archive/beam_core_demo
consist shell --trust-db # Allow metadata-based mount/run-dir inference
Inside the shell:
(consist) runs --limit 5
(consist) show #1
(consist) artifacts #1
(consist) preview @1
(consist) schema_profile @1
(consist) schema capture-file @1
(consist) schema_stub --run-id abc123 --hash a1b2c3d4
(consist) preview --hash a1b2c3d4
(consist) context
(consist) scenarios
(consist) exit
Useful shell shortcuts:
#<n>refers to the nth cached run from the lastrunsoutput.@<n>refers to the nth cached artifact from the lastartifacts <run_id>output.schema capture-file @<n>andschema capture-file --artifact-ref @<n>reuse cached shell artifact refs and route them to--artifact-id.contextprints the active shell defaults fordb_path,run_dir,trust_db, and mount overrides.preview --hash <prefix>andschema_profile --hash <prefix>search all artifacts by hash prefix.schema_stub --hash <prefix> --run-id <run_id>narrows hash lookup to one run when you want deterministic selection.
Tips:
- Run
runsfirst to populate#<n>run shortcuts. - Run
artifacts <run_id>orartifacts #<n>first to populate@<n>artifact shortcuts.
Scripting with JSON Output¶
Use --json for machine-readable output: