Artifact¶
Artifact represents a tracked input or output file with stable provenance
metadata (run_id, hash, container_uri, driver, and custom metadata).
When to use Artifact directly¶
- You want to pass outputs from one step to another (inputs={...} mappings).
- You need a portable path via artifact.path rather than hard-coded file paths.
- You need metadata checks (artifact.get_meta(...), artifact.is_tabular, artifact.is_matrix) before loading.
Minimal runnable example¶
from pathlib import Path

import consist
from consist import Tracker

tracker = Tracker(run_dir="./runs", db_path="./provenance.duckdb")


def write_output() -> Path:
    out = consist.output_path("report", ext="txt")
    out.write_text("hello\n")
    return out


with consist.use_tracker(tracker):
    result = consist.run(fn=write_output, outputs=["report"])
    artifact = result.outputs["report"]
    print(artifact.key)
    print(artifact.path)
    print(artifact.path.read_text().strip())
See API Helpers for helper functions that return and consume
artifacts (consist.run, consist.ref, consist.refs, consist.load*).
Bases: SQLModel
Represents a physical data object in the Consist database.
This table stores canonical metadata for any file/dataset Consist tracks. It is
linked to runs via run_artifact_link to record whether an artifact was an
input or output. The run_id field records the producing run (if any) and
is often None for external inputs.
Artifacts are the core building blocks of provenance and caching. Each artifact has a unique identity, a virtualized location, and rich metadata, supporting both "hot" (ingested) and "cold" (file-based) data strategies.
Attributes:
- id (uuid.UUID): A unique identifier for the artifact.
- key (str): A semantic, human-readable name for the artifact (e.g., "households", "parcels").
- container_uri (str): A portable, virtualized Uniform Resource Identifier (URI) for the artifact's location (e.g., "inputs://land_use.csv").
- table_path (Optional[str]): Optional path inside a container (e.g., "/tables/households").
- array_path (Optional[str]): Optional path inside a container for array artifacts.
- driver (str): The name of the format handler used to read or write the artifact (e.g., "parquet", "csv", "zarr").
- hash (Optional[str]): SHA-256 content hash of the artifact's data, enabling content-addressable lookups and deduplication.
- run_id (Optional[str]): The ID of the run that generated this artifact. None when there is no producing run, as is often the case for external inputs.
- meta (Dict[str, Any]): A flexible JSON field for storing arbitrary metadata, such as schema signatures or data dimensions.
- created_at (datetime): The timestamp when the artifact was first logged.
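As an illustration of what a SHA-256 content hash buys for deduplication, the sketch below hashes two files with identical bytes and compares the digests. It uses only the Python standard library. The helper name sha256_of_file and the chunked-read strategy are assumptions made for this example; how Consist actually computes the hash field may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Hypothetical helper for illustration; not part of the Consist API.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Two files with identical bytes share a digest, so a hash-keyed index
# can treat them as the same content regardless of file name.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.csv"
    second = Path(tmp) / "b.csv"
    first.write_bytes(b"id,value\n1,10\n")
    second.write_bytes(b"id,value\n1,10\n")
    same_content = sha256_of_file(first) == sha256_of_file(second)
    digest_length = len(sha256_of_file(first))

print(same_content)
```

Because the digest depends only on the bytes, renaming or moving a file leaves it unchanged, which is what makes a content hash usable as a lookup key.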
abs_path (property, writable)¶
Runtime-only helper to access the absolute path of this artifact.
This property provides the resolved absolute file system path for the artifact. It is not persisted to the database but is crucial for local file operations and for chaining Consist runs within the same script or environment.
Returns:
| Type | Description |
|---|---|
| Optional[str] | The absolute file system path of the artifact, or None. |
path (property)¶
Resolve this artifact to a filesystem Path.
Uses the tracker when available to handle mount-aware URIs; otherwise falls back to the cached absolute path or the raw URI.
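To make the idea of a mount-aware URI concrete, here is a minimal sketch that maps a scheme such as inputs:// onto a local root directory and falls back to the raw string otherwise. The function resolve_container_uri and the mounts dictionary are hypothetical names invented for this example; the tracker's real resolution logic is not shown in this page and may work differently.

```python
from pathlib import Path
from typing import Mapping


def resolve_container_uri(uri: str, mounts: Mapping[str, str]) -> Path:
    """Map 'scheme://relative/path' onto a mounted root directory.

    Hypothetical helper: falls back to treating the URI as a plain path
    when it has no scheme or the scheme is not a known mount.
    """
    scheme, sep, remainder = uri.partition("://")
    if sep and scheme in mounts:
        return Path(mounts[scheme]) / remainder
    return Path(uri)


# Assumed mount table; the directory is illustrative only.
mounts = {"inputs": "/data/project/inputs"}

resolved = resolve_container_uri("inputs://land_use.csv", mounts)
fallback = resolve_container_uri("local/report.txt", mounts)
print(resolved)
print(fallback)
```

Keeping the scheme in the stored URI and resolving it at access time is what lets the same record work on machines where the data lives in different directories.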
is_matrix (property)¶
Indicates if the artifact represents a multi-dimensional array or matrix-like data.
This property helps in dispatching to appropriate data loaders or processing functions that handle array-based data structures, such as those typically found in scientific computing.
Returns:
| Type | Description |
|---|---|
| bool | True if the artifact's driver is associated with matrix-like data formats (e.g., Zarr, HDF5, NetCDF, OpenMatrix), False otherwise. |
is_tabular (property)¶
Indicates if the artifact represents tabular data (rows and columns).
This property assists in identifying artifacts that can be loaded and processed using tools designed for structured, record-based data, such as Pandas DataFrames.
Returns:
| Type | Description |
|---|---|
| bool | True if the artifact's driver is associated with tabular data formats (e.g., Parquet, CSV, SQL), False otherwise. |
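The is_tabular and is_matrix flags are meant for choosing a loader, which the following sketch illustrates with a stand-in class. ArtifactStub is not the Consist Artifact; it only mimics the key and driver attributes and the two properties. The driver names "parquet", "csv", and "zarr" come from the attribute list, while "sql", "h5", "netcdf", and "openmatrix" are guesses at how the other listed formats might be named.

```python
from dataclasses import dataclass

# Assumed driver names. "parquet", "csv", and "zarr" appear in the attribute
# docs; the rest are illustrative guesses for the formats named above.
TABULAR_DRIVERS = frozenset({"parquet", "csv", "sql"})
MATRIX_DRIVERS = frozenset({"zarr", "h5", "netcdf", "openmatrix"})


@dataclass
class ArtifactStub:
    """Stand-in for an artifact; not the Consist class."""

    key: str
    driver: str

    @property
    def is_tabular(self) -> bool:
        return self.driver in TABULAR_DRIVERS

    @property
    def is_matrix(self) -> bool:
        return self.driver in MATRIX_DRIVERS


def choose_loader(artifact: ArtifactStub) -> str:
    """Pick a loader family from the flags instead of the file extension."""
    if artifact.is_tabular:
        return "dataframe"
    if artifact.is_matrix:
        return "array"
    return "raw-file"


for stub in (
    ArtifactStub("households", "parquet"),
    ArtifactStub("skims", "zarr"),
    ArtifactStub("report", "txt"),
):
    print(stub.key, choose_loader(stub))
```

With a real artifact, the same branching on artifact.is_tabular and artifact.is_matrix would decide which loading function to call before any file is opened.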
created_at_iso (property)¶
get_meta(key, default=None)¶
Safely retrieves a value from the 'meta' dictionary.
Args:
- key (str): The key to look up in the metadata.
- default (Any, optional): The default value to return if the key is not found. Defaults to None.

Returns:
- Any: The value associated with the key, or the default value if the key is not present.
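The documented behavior matches a dictionary lookup with a fallback, which the sketch below reproduces on a stand-in class. MetaStub is hypothetical and only carries a meta dict; the body of get_meta here is an assumption based on the description above (equivalent to dict.get), and the "n_rows" and "units" keys are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class MetaStub:
    """Stand-in with a meta dict; not the Consist class."""

    meta: Dict[str, Any] = field(default_factory=dict)

    def get_meta(self, key: str, default: Any = None) -> Any:
        # Assumed to behave like dict.get: no KeyError on a missing key.
        return self.meta.get(key, default)


stub = MetaStub(meta={"n_rows": 1200})  # "n_rows" is an illustrative key

rows = stub.get_meta("n_rows")                      # key present
units = stub.get_meta("units", default="unknown")   # key absent, explicit default
missing = stub.get_meta("schema")                   # key absent, default is None
print(rows, units, missing)
```

Passing an explicit default is useful when a later step needs a concrete value and None would have to be special-cased.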