ncaa_eval.model.tracking module

Model run tracking: metadata, predictions, and persistence.

Defines ModelRun and Prediction Pydantic records for run metadata and game-level predictions, plus RunStore for local JSON/Parquet persistence under base_path/runs/<run_id>/.

class ncaa_eval.model.tracking.ModelRun(*, run_id: str, model_type: str, hyperparameters: dict[str, Any], timestamp: datetime.datetime = <factory>, git_hash: str, start_year: int, end_year: int, prediction_count: int)[source]

Bases: BaseModel

Metadata for a single model training run.

end_year: int
git_hash: str
hyperparameters: dict[str, Any]
model_config = {}

Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_type: str
prediction_count: int
run_id: str
start_year: int
timestamp: datetime
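As a rough illustration of the metadata a run.json file carries, the sketch below mimics ModelRun with a stdlib dataclass rather than the actual Pydantic BaseModel; the field values (run id, hyperparameters, git hash) are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class RunMetadata:
    """Stdlib stand-in mirroring ModelRun's fields."""

    run_id: str
    model_type: str
    hyperparameters: dict[str, Any]
    git_hash: str
    start_year: int
    end_year: int
    prediction_count: int
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


run = RunMetadata(
    run_id="xgb-2024-01",  # hypothetical run id
    model_type="xgboost",
    hyperparameters={"max_depth": 6},
    git_hash="abc1234",
    start_year=2010,
    end_year=2024,
    prediction_count=0,
)

# Serialize roughly the way run.json would store it (datetime as ISO 8601).
payload = json.dumps(asdict(run), default=lambda o: o.isoformat())
```

The real class additionally gets Pydantic's validation and its own JSON serialization; the dataclass here only shows the shape of the record.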
class ncaa_eval.model.tracking.Prediction(*, run_id: str, game_id: str, season: int, team_a_id: int, team_b_id: int, pred_win_prob: Annotated[float, Ge(ge=0.0), Le(le=1.0)])[source]

Bases: BaseModel

A single game-level probability prediction.

game_id: str
model_config = {}

Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pred_win_prob: Annotated[float, Field(ge=0.0, le=1.0)]
run_id: str
season: int
team_a_id: int
team_b_id: int
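The pred_win_prob bound (0.0 ≤ p ≤ 1.0) is enforced by Pydantic at construction time via the Field(ge=0.0, le=1.0) annotation. A minimal stdlib sketch of the equivalent check (the function name is illustrative, not part of the module):

```python
def validate_win_prob(p: float) -> float:
    """Reject probabilities outside [0.0, 1.0], as Field(ge=0.0, le=1.0) does."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"pred_win_prob must be in [0.0, 1.0], got {p}")
    return p


validate_win_prob(0.73)  # in range: returned unchanged
```

Constructing a Prediction with an out-of-range probability raises a Pydantic ValidationError rather than the plain ValueError shown here.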
class ncaa_eval.model.tracking.RunStore(base_path: Path)[source]

Bases: object

Persist and load model runs and predictions on the local filesystem.

Directory layout:

base_path/
  runs/
    <run_id>/
      run.json                    # ModelRun metadata
      predictions.parquet         # Prediction records (PyArrow)
      summary.parquet             # BacktestResult.summary (year × metrics)
      fold_predictions.parquet    # CV fold y_true/y_prob per year
      model/                      # Trained model artifacts
        model.ubj                 # XGBoost native format (XGBoost only)
        model.json                # Elo ratings (Elo only)
        config.json               # Model config
        feature_names.json        # Feature column names used during training
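The layout above maps to paths like the following pathlib sketch; the base directory and run id are invented, and the variable names simply mirror the tree.

```python
from pathlib import Path

base_path = Path("/tmp/ncaa_runs")  # illustrative base directory
run_id = "example-run"              # hypothetical run id

run_dir = base_path / "runs" / run_id
run_json = run_dir / "run.json"
predictions_parquet = run_dir / "predictions.parquet"
model_dir = run_dir / "model"
```

RunStore derives all of its file locations from base_path this way, so moving the base directory moves every run with it.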
list_runs() list[ModelRun][source]

Scan the runs directory and return all saved ModelRun records.

Scans the runs directory in sorted order, deserializes each run.json file via Pydantic, and returns a list of ModelRun objects (an empty list if the directory does not exist).
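A hedged sketch of the scan-and-deserialize pattern list_runs() describes, using plain json.loads in place of Pydantic validation; the helper name and demo directory contents are illustrative.

```python
import json
import tempfile
from pathlib import Path


def scan_runs(base_path: Path) -> list[dict]:
    """Read every runs/<run_id>/run.json in sorted order; empty list if absent."""
    runs_dir = base_path / "runs"
    if not runs_dir.exists():
        return []
    records = []
    for run_dir in sorted(runs_dir.iterdir()):
        run_json = run_dir / "run.json"
        if run_json.is_file():
            records.append(json.loads(run_json.read_text()))
    return records


# Demo on a throwaway directory holding two runs.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for rid in ("run-b", "run-a"):
        d = base / "runs" / rid
        d.mkdir(parents=True)
        (d / "run.json").write_text(json.dumps({"run_id": rid}))
    runs = scan_runs(base)  # sorted by directory name: run-a, then run-b
```

The sorted iteration is what makes list_runs() deterministic across calls.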

load_all_summaries() DataFrame[source]

Load metric summaries for all runs that have them.

Returns:

DataFrame with columns [run_id, year, log_loss, brier_score, roc_auc, ece, elapsed_seconds]. Empty DataFrame if no summaries.

load_feature_names(run_id: str) list[str] | None[source]

Load saved feature names for a run.

Parameters:

run_id – The run identifier.

Returns:

List of feature names or None if not saved.

load_fold_predictions(run_id: str) DataFrame | None[source]

Load fold-level predictions for a run.

Parameters:

run_id – The run identifier.

Returns:

DataFrame or None if no fold predictions exist (legacy run).

load_metrics(run_id: str) DataFrame | None[source]

Load backtest metric summary for a run.

Parameters:

run_id – The run identifier.

Returns:

Summary DataFrame or None if no summary exists (legacy run).

load_model(run_id: str) Model | StackedEnsemble | None[source]

Load a trained model from a run directory.

Delegates to StackedEnsemble.load() when model_type == "ensemble".

Parameters:

run_id – The run identifier.

Returns:

Model or StackedEnsemble instance, or None if no model directory exists (legacy run).

load_predictions(run_id: str) DataFrame[source]

Load predictions from Parquet as a DataFrame.

Raises:

FileNotFoundError – If the predictions Parquet file does not exist.

load_run(run_id: str) ModelRun[source]

Load run metadata from JSON.

Raises:

FileNotFoundError – If the run directory or run.json does not exist.

model_dir(run_id: str) Path[source]

Return the model directory path for a run, creating it if absent.

save_fold_predictions(run_id: str, fold_preds: DataFrame) None[source]

Persist fold-level predictions from walk-forward CV.

Parameters:
  • run_id – The run identifier.

  • fold_preds – DataFrame with columns [year, game_id, team_a_id, team_b_id, pred_win_prob, team_a_won].

Raises:

FileNotFoundError – If the run directory does not exist.

save_metrics(run_id: str, summary: DataFrame) None[source]

Persist backtest metric summary for a run.

Parameters:
  • run_id – The run identifier.

  • summary – BacktestResult.summary DataFrame (index=year, columns=[log_loss, brier_score, roc_auc, ece, elapsed_seconds]).

Raises:

FileNotFoundError – If the run directory does not exist.

save_model(run_id: str, model: Model | StackedEnsemble, *, feature_names: list[str] | None = None) None[source]

Persist a trained model alongside a run.

Parameters:
  • run_id – The run identifier.

  • model – A fitted model implementing save(path). Accepts both Model and StackedEnsemble instances.

  • feature_names – Feature column names used during training.

Raises:

FileNotFoundError – If the run directory does not exist.

save_run(run: ModelRun, predictions: list[Prediction]) None[source]

Write run metadata (JSON) and predictions (Parquet).

Creates the run directory, writes the ModelRun metadata as JSON, and writes the prediction records to Parquet using a predefined PyArrow schema; empty prediction lists are handled by constructing typed empty arrays.
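The empty-predictions edge case can be sketched without PyArrow: save_run() builds typed empty arrays against its Parquet schema so that a run with zero predictions still produces a file with every column, which the column-building step below imitates with plain lists (the column tuple matches the Prediction fields; the helper name is illustrative).

```python
from typing import Any

PREDICTION_COLUMNS = (
    "run_id",
    "game_id",
    "season",
    "team_a_id",
    "team_b_id",
    "pred_win_prob",
)


def build_columns(predictions: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row records into columnar form. An empty input still yields every
    column (as an empty list), mirroring the typed empty arrays save_run() builds."""
    return {col: [p[col] for p in predictions] for col in PREDICTION_COLUMNS}


cols = build_columns([])  # all six columns present, all empty
```

In the real method each column also carries its PyArrow type (string, int64, float64), so downstream readers see a consistent schema whether or not any rows were written.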