RunSet and Alignment
RunSet is a fluent multi-run analysis primitive built on top of tracker
query APIs. Use it when you need to:
- Partition runs by a field/facet (`split_by`)
- Filter runs by mixed run-field/facet predicates (`filter`)
- Keep latest runs globally or per group (`latest`)
- Align two run collections for 1:1 comparison (`align`)
Minimal example
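```python
from consist import RunSet, Tracker

# Open the tracker that backs both queries.
tracker = Tracker(run_dir="./runs", db_path="./provenance.duckdb")
# Build one tracker-backed RunSet per scenario.
baseline = RunSet.from_query(tracker, label="baseline", parent_id="base")
policy = RunSet.from_query(tracker, label="policy", parent_id="policy")
# Match runs 1:1 by year, then diff configs in the "beam" namespace.
pair = baseline.align(policy, on="year")
diffs = pair.config_diffs(namespace="beam")
```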
API reference
consist.runset.RunSet
dataclass
Ordered run collection with grouping and alignment helpers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runs` | `List[Run]` | Run records included in this collection. | required |
| `label` | `Optional[str]` | Optional descriptive label propagated to derived RunSets and | `None` |
Notes
Methods are non-destructive. Operations like filter, latest, and
split_by return new RunSet instances.
from_query(tracker, label=None, **filters)
classmethod
Build a tracker-backed RunSet from Tracker.find_runs filters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tracker` | `Tracker` | Tracker used to execute the query and resolve facet fields. | required |
| `label` | `Optional[str]` | Optional label for the returned RunSet. | `None` |
| `**filters` | `Any` | Keyword filters forwarded directly to `Tracker.find_runs`. | `{}` |

Returns:
| Type | Description |
|---|---|
| `RunSet` | Tracker-backed RunSet containing matching runs. |
from_runs(runs, label=None)
classmethod
Build a RunSet from an existing iterable of runs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runs` | `Iterable[Run]` | Source run objects. | required |
| `label` | `Optional[str]` | Optional label for the returned RunSet. | `None` |

Returns:
| Type | Description |
|---|---|
| `RunSet` | New RunSet containing the provided runs. Field-based helpers work on these sets, but facet-based helpers require a tracker-backed RunSet created with `from_query`. |
Notes
Use this constructor when you already have concrete Run objects and
only need field-based operations such as positional access or grouping on
built-in run attributes like `year` or `status`. If you need facet-aware
helpers such as `filter(scenario=...)` or `split_by("seed")`,
build the RunSet from a tracker-backed query instead so facet values can
be loaded from the provenance store.
split_by(field)
Partition runs into keyed sub-RunSets by field or facet value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `field` | `str` | Run field (for example | required |

Returns:
| Type | Description |
|---|---|
| `Dict[Any, RunSet]` | Ordered dict keyed by the resolved field value, sorted ascending. Missing values are grouped under |
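The grouping behavior described above can be sketched in plain Python. `DemoRun` and `split_by_field` are hypothetical stand-ins invented for illustration, not part of consist. The sketch groups on plain attributes only and does not cover facet lookups or missing values.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class DemoRun:
    """Hypothetical stand-in for a run record; not consist's Run class."""
    id: str
    year: int


def split_by_field(runs: List[DemoRun], field: str) -> Dict[Any, List[DemoRun]]:
    """Group runs by an attribute and return the groups in ascending key order."""
    groups: Dict[Any, List[DemoRun]] = {}
    for run in runs:
        groups.setdefault(getattr(run, field), []).append(run)
    # Dicts preserve insertion order, so inserting sorted keys gives an ordered result.
    return {key: groups[key] for key in sorted(groups)}


runs = [DemoRun("c", 2030), DemoRun("a", 2020), DemoRun("b", 2030)]
by_year = split_by_field(runs, "year")
print({year: [r.id for r in group] for year, group in by_year.items()})
# prints {2020: ['a'], 2030: ['c', 'b']}
```

Within each group the runs keep their original collection order; only the keys are sorted.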
filter(**field_values)
Filter runs by mixed run-field and facet predicates. Returns a new RunSet.
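A minimal sketch of keyword-based filtering, assuming each keyword argument is an equality test against a run attribute. The source does not spell out the predicate semantics, so that is an assumption, and `DemoRun` and `filter_runs` are hypothetical names rather than consist APIs.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class DemoRun:
    """Hypothetical stand-in for a run record; not consist's Run class."""
    id: str
    year: int
    status: str


def filter_runs(runs: List[DemoRun], **field_values: Any) -> List[DemoRun]:
    """Return a new list of runs matching every keyword (assumed equality match)."""
    return [
        run
        for run in runs
        if all(getattr(run, name, None) == wanted for name, wanted in field_values.items())
    ]


runs = [
    DemoRun("a", 2020, "completed"),
    DemoRun("b", 2030, "failed"),
    DemoRun("c", 2030, "completed"),
]
kept = filter_runs(runs, year=2030, status="completed")
print([r.id for r in kept])
# prints ['c']
```

The input list is left untouched, which mirrors the non-destructive behavior noted for RunSet operations.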
latest(group_by=None)
Keep the most recent run by created_at globally or per group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `group_by` | `Optional[List[str]]` | Grouping fields/facet keys. When omitted, returns a single-run RunSet containing the overall latest run. | `None` |

Returns:
| Type | Description |
|---|---|
| `RunSet` | New RunSet containing latest run(s) for each group. |
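The selection rule can be illustrated with a small plain-Python sketch. `DemoRun` and `latest_runs` are hypothetical stand-ins, and the ordering of the returned groups (first-seen order) is an assumption that the source does not confirm.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DemoRun:
    """Hypothetical stand-in for a run record; not consist's Run class."""
    id: str
    year: int
    created_at: datetime


def latest_runs(runs: List[DemoRun], group_by: Optional[List[str]] = None) -> List[DemoRun]:
    """Keep the newest run by created_at, globally or once per group."""
    if not runs:
        return []
    if not group_by:
        return [max(runs, key=lambda r: r.created_at)]
    newest: Dict[Tuple, DemoRun] = {}
    for run in runs:
        key = tuple(getattr(run, name) for name in group_by)
        if key not in newest or run.created_at > newest[key].created_at:
            newest[key] = run
    return list(newest.values())


runs = [
    DemoRun("a", 2020, datetime(2024, 1, 1)),
    DemoRun("b", 2020, datetime(2024, 3, 1)),
    DemoRun("c", 2030, datetime(2024, 2, 1)),
]
print([r.id for r in latest_runs(runs)])            # prints ['b']
print([r.id for r in latest_runs(runs, ["year"])])  # prints ['b', 'c']
```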
align(other, on)
Align two RunSets 1:1 on a shared field or facet key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `other` | `RunSet` | Comparison RunSet. | required |
| `on` | `str` | Alignment key. Can reference a Run field or facet key. | required |

Returns:
| Type | Description |
|---|---|
| `AlignedPair` | Pair object containing only keys present on both sides, in sorted order. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If either side has duplicate values for `on`. |
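The alignment contract (shared keys only, sorted order, duplicates rejected) can be sketched without consist. `DemoRun`, `index_by`, and `align_runs` are hypothetical names used only for this illustration, and the sketch aligns on plain attributes rather than facets.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class DemoRun:
    """Hypothetical stand-in for a run record; not consist's Run class."""
    id: str
    year: int


def index_by(runs: List[DemoRun], on: str) -> Dict[Any, DemoRun]:
    """Map each key value to its run, rejecting duplicate key values."""
    indexed: Dict[Any, DemoRun] = {}
    for run in runs:
        key = getattr(run, on)
        if key in indexed:
            raise ValueError(f"duplicate value {key!r} for {on!r}")
        indexed[key] = run
    return indexed


def align_runs(left: List[DemoRun], right: List[DemoRun], on: str) -> List[Tuple[Any, DemoRun, DemoRun]]:
    """Pair runs 1:1 on keys present on both sides, in sorted key order."""
    left_index = index_by(left, on)
    right_index = index_by(right, on)
    shared = sorted(left_index.keys() & right_index.keys())
    return [(key, left_index[key], right_index[key]) for key in shared]


baseline = [DemoRun("b-2030", 2030), DemoRun("b-2020", 2020), DemoRun("b-2040", 2040)]
policy = [DemoRun("p-2020", 2020), DemoRun("p-2030", 2030)]
aligned = align_runs(baseline, policy, on="year")
print([(key, l.id, r.id) for key, l, r in aligned])
# prints [(2020, 'b-2020', 'p-2020'), (2030, 'b-2030', 'p-2030')]
```

The 2040 baseline run is dropped because the policy side has no run for that year.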
to_frame()
Materialize a run summary DataFrame.
Returns:
| Type | Description |
|---|---|
| `DataFrame` | One row per run with base columns: |
__iter__()
Iterate over runs in collection order.
__len__()
Return the number of runs in the collection.
__getitem__(index)
Return run at positional index.
consist.runset.AlignedPair
dataclass
Two RunSets matched 1:1 along a shared field/facet dimension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `on` | `str` | Field/facet key used for alignment. | required |
| `left` | `RunSet` | Left-hand RunSet with keys ordered to match `keys`. | required |
| `right` | `RunSet` | Right-hand RunSet with keys ordered to match `keys`. | required |
| `keys` | `List[Any]` | Shared alignment key values present in both RunSets. | required |
pairs()
Iterate over matched (left_run, right_run) pairs.
Yields:
| Type | Description |
|---|---|
| `tuple[Run, Run]` | Pair of aligned runs. |
apply(fn)
Apply a pairwise function over aligned runs and concatenate results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `fn` | `Callable[[Run, Run, Any], DataFrame]` | Function called as | required |

Returns:
| Type | Description |
|---|---|
| `DataFrame` | Concatenated DataFrame with an added |

Raises:
| Type | Description |
|---|---|
| `TypeError` | If |
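The apply-and-concatenate pattern can be sketched with plain lists of row dicts standing in for DataFrames. The three-argument call follows the documented `Callable[[Run, Run, Any], DataFrame]` type. Treating the third argument as the alignment key, and tagging each row with that key under the `on` name, are assumptions. `DemoRun`, `apply_pairs`, and `duration_delta` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class DemoRun:
    """Hypothetical stand-in for a run record; not consist's Run class."""
    id: str
    duration_s: float


Row = Dict[str, Any]
Aligned = List[Tuple[Any, DemoRun, DemoRun]]


def apply_pairs(aligned: Aligned, fn: Callable[[DemoRun, DemoRun, Any], List[Row]], on: str) -> List[Row]:
    """Call fn once per aligned pair and concatenate the resulting rows."""
    rows: List[Row] = []
    for key, left_run, right_run in aligned:
        for row in fn(left_run, right_run, key):
            # Assumption: each row is tagged with the alignment key it came from.
            rows.append({on: key, **row})
    return rows


def duration_delta(left: DemoRun, right: DemoRun, key: Any) -> List[Row]:
    """Example pairwise function: one row with the runtime difference."""
    return [{"left_id": left.id, "right_id": right.id, "delta_s": right.duration_s - left.duration_s}]


aligned = [
    (2020, DemoRun("b-2020", 100.0), DemoRun("p-2020", 120.0)),
    (2030, DemoRun("b-2030", 90.0), DemoRun("p-2030", 85.0)),
]
table = apply_pairs(aligned, duration_delta, on="year")
print([(row["year"], row["delta_s"]) for row in table])
# prints [(2020, 20.0), (2030, -5.0)]
```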
config_diffs(namespace=None, prefix=None)
Compute config diffs for each aligned pair using Tracker.diff_runs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `namespace` | `Optional[str]` | Namespace passed to `Tracker.diff_runs`. | `None` |
| `prefix` | `Optional[str]` | Optional key prefix filter passed to `Tracker.diff_runs`. | `None` |

Returns:
| Type | Description |
|---|---|
| `DataFrame` | Columns: |

Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If neither RunSet is tracker-backed. |
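To make the idea of a per-pair config diff with a prefix filter concrete, here is a plain-Python sketch over flat config dicts. It does not call `Tracker.diff_runs`. The function name `diff_configs`, the row keys (`key`, `left`, `right`), and the sample config keys are all hypothetical and say nothing about the actual output columns.

```python
from typing import Any, Dict, List, Optional


def diff_configs(left: Dict[str, Any], right: Dict[str, Any], prefix: Optional[str] = None) -> List[Dict[str, Any]]:
    """Return one row per config key whose value differs between the two sides."""
    rows: List[Dict[str, Any]] = []
    for key in sorted(left.keys() | right.keys()):
        if prefix is not None and not key.startswith(prefix):
            continue  # Skip keys outside the requested prefix.
        left_value, right_value = left.get(key), right.get(key)
        if left_value != right_value:
            rows.append({"key": key, "left": left_value, "right": right_value})
    return rows


baseline_cfg = {"beam.sample": 0.1, "beam.iters": 10, "io.path": "/a"}
policy_cfg = {"beam.sample": 0.2, "beam.iters": 10, "io.path": "/b"}
print(diff_configs(baseline_cfg, policy_cfg, prefix="beam."))
# prints [{'key': 'beam.sample', 'left': 0.1, 'right': 0.2}]
```

Without a prefix, the same call would also report the `io.path` difference.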
to_frame()
Materialize aligned-pair summary rows.
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Columns: |