Views
View Registry
Registry for dynamic view classes. Accessing a view (e.g. registry.Person) automatically refreshes the underlying DuckDB SQL definition to include new files.
Use register(model, key=...) to add SQLModel schemas. Accessing the
attribute returns a dynamic SQLModel view class that can be queried via
select(...).
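The refresh-on-access behaviour described above can be pictured with a small, self-contained sketch. This is not Consist's implementation: the class `RefreshingRegistry`, the `build_view` callback, and the lower-cased default key are all hypothetical names and choices made for illustration.

```python
from typing import Callable, Dict, Optional


class RefreshingRegistry:
    """Hypothetical stand-in for a view registry that rebuilds a view on every access."""

    def __init__(self, build_view: Callable[[str], object]) -> None:
        # build_view is an assumed callback that (re)creates the view for a key.
        self._build_view = build_view
        self._keys: Dict[str, str] = {}

    def register(self, model: type, key: Optional[str] = None) -> None:
        # Assumption: the attribute name is the class name, and the key
        # defaults to the lower-cased class name when none is given.
        self._keys[model.__name__] = key or model.__name__.lower()

    def __getattr__(self, name: str) -> object:
        # __getattr__ only runs for names not found by normal lookup, so the
        # private attributes set in __init__ never reach this path.
        keys = self.__dict__.get("_keys", {})
        if name not in keys:
            raise AttributeError(name)
        # Rebuild on every access so newly added files would be picked up.
        return self._build_view(keys[name])
```

Because the rebuild happens inside `__getattr__`, callers never need an explicit refresh step; the trade-off is that every attribute access pays the cost of regenerating the view.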
register(model, key=None)
View Factory
A factory class that generates "Hybrid Views" in DuckDB. It serves as Consist's "Virtualizer" component.
Hybrid Views combine data from materialized tables (often ingested via dlt) with data directly from file-based artifacts (e.g., Parquet, CSV), providing a unified SQL interface to query both "hot" and "cold" data transparently. This approach is central to Consist's flexible data access strategy.
Attributes:
| Name | Type | Description |
|---|---|---|
| tracker | Tracker | An instance of the Consist |
create_view_from_model(model, key=None)
Creates both the SQL View and the Python SQLModel class for a given schema.
create_hybrid_view(view_name, concept_key, driver_filter=None, schema_model=None)
Creates or replaces a DuckDB SQL VIEW that combines "hot" and "cold" data for a given concept.
This method generates a "Hybrid View" which allows transparent querying across
different data storage types. It implements "View Optimization" by leveraging
DuckDB's capabilities for vectorized reads from files. The resulting view uses
UNION ALL BY NAME to gracefully handle "Schema Evolution" (different columns
across runs or data sources) by nulling out missing columns.
"Hot" data refers to records already materialized into a DuckDB table (e.g., via ingestion). "Cold" data refers to records still residing in file-based artifacts (e.g., Parquet, CSV). Identifiers are quoted for SQL safety; missing cold-file paths are skipped at view creation.
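As a rough illustration of the view shape described above, the sketch below assembles a `CREATE OR REPLACE VIEW` statement that joins one hot table and several cold Parquet files with `UNION ALL BY NAME`. It is a minimal sketch under stated assumptions, not the method's actual code: the helper names (`quote_ident`, `quote_literal`, `build_hybrid_view_sql`) are hypothetical, only Parquet files are handled, and the sketch returns `None` when nothing is selectable instead of creating an empty view.

```python
import os
from typing import Iterable, Optional


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_hybrid_view_sql(
    view_name: str,
    hot_table: Optional[str],
    cold_parquet_paths: Iterable[str],
) -> Optional[str]:
    """Return the view DDL, or None if there is no hot table and no cold file."""
    parts = []
    if hot_table is not None:
        # "Hot" rows: already materialized in a DuckDB table.
        parts.append(f"SELECT * FROM {quote_ident(hot_table)}")
    for path in cold_parquet_paths:
        if not os.path.exists(path):
            continue  # missing cold-file paths are skipped at view creation
        # "Cold" rows: read directly from the file by DuckDB.
        parts.append(f"SELECT * FROM read_parquet({quote_literal(path)})")
    if not parts:
        return None
    # BY NAME aligns columns by name and fills absent columns with NULL.
    body = "\nUNION ALL BY NAME\n".join(parts)
    return f"CREATE OR REPLACE VIEW {quote_ident(view_name)} AS\n{body}"
```

Matching columns by name instead of by position is what lets sources with different column sets coexist in one view.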
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| view_name | str | The name to assign to the newly created or replaced SQL view. This is the name you will use in your SQL queries to access the combined data. | required |
| concept_key | str | The semantic key identifying the data concept (e.g., "households", "transactions"). Artifacts and materialized tables matching this key will be included in the view. | required |
| driver_filter | Optional[List[str]] | An optional list of artifact drivers (e.g., "parquet", "csv") to include when querying "cold" data. If | None |
| schema_model | Type[SQLModel] | SQL table definition for underlying data | None |
Returns:
| Type | Description |
|---|---|
| bool | True if the view creation was attempted (even if the view ends up empty), False otherwise. |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If the |
create_grouped_hybrid_view(*, view_name, schema_id=None, schema_ids=None, schema_compatible=False, predicates=None, namespace=None, drivers=None, attach_facets=None, include_system_columns=True, mode='hybrid', if_exists='replace', missing_files='warn', run_id=None, parent_run_id=None, model=None, status=None, year=None, iteration=None)
Create a selector-driven hybrid view across many artifacts.
This method powers schema-family analysis views where artifacts may have
different keys but represent the same logical table. Selection is based
on a required schema selector (schema_id, or schema_ids as the alternative)
plus optional facet/run predicates.
The resulting SQL view can combine:
- hot rows from ingested global_tables.* relations,
- cold rows from files (currently parquet/csv readers),
- optional typed facet_* projection columns,
- optional Consist system columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| view_name | str | Name of the SQL view to create. | required |
| schema_id | Optional[str] | Primary selector for a single schema id. | None |
| schema_ids | Optional[List[str]] | Alternative selector for multiple schema ids. This is mainly used by higher-level model-class resolution in | None |
| schema_compatible | bool | If True, include artifacts observed with schema variants deemed compatible by field-name subset/superset matching. | False |
| predicates | Optional[List[Dict[str, Any]]] | Parsed ArtifactKV predicates (as produced by | None |
| namespace | Optional[str] | Default ArtifactKV namespace used when a predicate does not provide one explicitly. | None |
| drivers | Optional[List[str]] | Optional artifact-driver filter (e.g., | None |
| attach_facets | Optional[List[str]] | Facet key paths to expose as typed columns named | None |
| include_system_columns | bool | If True, include | True |
| mode | (hybrid, hot_only, cold_only) | Controls which storage tier(s) are included in the view. | "hybrid" |
| if_exists | (replace, error) | View creation behavior when | "replace" |
| missing_files | (warn, error, skip_silent) | Policy for selected cold artifacts whose files no longer exist. | "warn" |
| run_id | Optional[str] | Optional exact run-id filter. | None |
| parent_run_id | Optional[str] | Optional parent/scenario run-id filter. | None |
| model | Optional[str] | Optional run model-name filter. | None |
| status | Optional[str] | Optional run status filter. | None |
| year | Optional[int] | Optional run year filter. | None |
| iteration | Optional[int] | Optional run iteration filter. | None |
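The `schema_compatible` option is described as field-name subset/superset matching. One plausible reading of that rule is sketched below; the function name `is_schema_compatible` is hypothetical, and the sketch assumes only field names are compared (column types are ignored).

```python
from typing import Iterable


def is_schema_compatible(
    reference_fields: Iterable[str],
    candidate_fields: Iterable[str],
) -> bool:
    """True if the candidate's field names are a subset or superset of the reference's."""
    ref, cand = set(reference_fields), set(candidate_fields)
    # A variant that dropped columns (subset) or gained columns (superset)
    # counts as compatible; partially overlapping field sets do not.
    return cand <= ref or cand >= ref
```

Under this reading, a schema variant with one extra column is still selected, while a variant that both adds and removes columns is excluded.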
Returns:
| Type | Description |
|---|---|
| bool | |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If the tracker is not configured with a database/engine. |
| ValueError | If policy arguments have unsupported values, or |
| FileNotFoundError | If |
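To make the three `missing_files` policies concrete, here is a minimal sketch of how such a policy could be applied to a list of cold-file paths. The helper `filter_cold_paths` is hypothetical, and mapping the `"error"` policy to `FileNotFoundError` is an assumption, since the exact raise condition is not spelled out above.

```python
import os
import warnings
from typing import Iterable, List


def filter_cold_paths(paths: Iterable[str], missing_files: str = "warn") -> List[str]:
    """Keep existing paths; handle missing ones according to the policy."""
    if missing_files not in {"warn", "error", "skip_silent"}:
        raise ValueError(f"unsupported missing_files policy: {missing_files!r}")
    kept = []
    for path in paths:
        if os.path.exists(path):
            kept.append(path)
        elif missing_files == "error":
            # Assumption: the strict policy fails on the first missing file.
            raise FileNotFoundError(path)
        elif missing_files == "warn":
            warnings.warn(f"cold artifact file is missing: {path}")
        # "skip_silent": drop the path without any message.
    return kept
```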
Notes
- Empty selections still produce a valid typed empty view.
- Facet column types are inferred deterministically from indexed KV types: bool -> BOOLEAN, int -> BIGINT, float/int mix -> DOUBLE, otherwise VARCHAR.
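The facet type rule in the notes above can be written as a small pure function. This is a sketch, not the library's code: the function name `infer_facet_sql_type` and the string labels `"bool"`, `"int"`, `"float"` for indexed KV types are assumptions, as is treating a float-only facet as DOUBLE and an empty set of observations as VARCHAR.

```python
from typing import Iterable


def infer_facet_sql_type(kv_types: Iterable[str]) -> str:
    """Map the set of KV types observed for one facet to a DuckDB column type."""
    kinds = set(kv_types)
    if kinds == {"bool"}:
        return "BOOLEAN"
    if kinds == {"int"}:
        return "BIGINT"
    if kinds and kinds <= {"int", "float"}:
        # Float-only or mixed int/float values widen to DOUBLE.
        return "DOUBLE"
    # Strings, mixed kinds, or no observations fall back to VARCHAR.
    return "VARCHAR"
```

Because the result depends only on the set of observed types, not on their order, the same artifacts always yield the same column type.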