ncaa_eval.model.tracking module

Model run tracking: metadata, predictions, and persistence.

Defines ModelRun and Prediction Pydantic records for run metadata and game-level predictions, plus RunStore for local JSON/Parquet persistence under base_path/runs/<run_id>/.

class ncaa_eval.model.tracking.ModelRun(*, run_id: str, model_type: str, hyperparameters: dict[str, Any], timestamp: datetime.datetime = <factory>, git_hash: str, start_year: int, end_year: int, prediction_count: int)[source]

Bases: BaseModel

Metadata for a single model training run.

end_year: int
git_hash: str
hyperparameters: dict[str, Any]
model_config = {}

Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_type: str
prediction_count: int
run_id: str
start_year: int
timestamp: datetime
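As a rough illustration of the metadata a run.json file carries, the sketch below mimics ModelRun with a stdlib dataclass rather than the actual Pydantic BaseModel; the field values (run id, hyperparameters, git hash) are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class RunMetadata:
    """Stdlib stand-in mirroring ModelRun's fields."""

    run_id: str
    model_type: str
    hyperparameters: dict[str, Any]
    git_hash: str
    start_year: int
    end_year: int
    prediction_count: int
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


run = RunMetadata(
    run_id="xgb-2024-01",  # hypothetical run id
    model_type="xgboost",
    hyperparameters={"max_depth": 6},
    git_hash="abc1234",
    start_year=2010,
    end_year=2024,
    prediction_count=0,
)

# Serialize roughly the way run.json would store it (datetime as ISO 8601).
payload = json.dumps(asdict(run), default=lambda o: o.isoformat())
```

The real class additionally gets Pydantic's validation and its own JSON serialization; the dataclass here only shows the shape of the record.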
class ncaa_eval.model.tracking.Prediction(*, run_id: str, game_id: str, season: int, team_a_id: int, team_b_id: int, pred_win_prob: Annotated[float, Ge(ge=0.0), Le(le=1.0)])[source]

Bases: BaseModel

A single game-level probability prediction.

game_id: str
model_config = {}

Configuration for the model; should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pred_win_prob: Annotated[float, Field(ge=0.0, le=1.0)]
run_id: str
season: int
team_a_id: int
team_b_id: int
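The pred_win_prob bound (0.0 ≤ p ≤ 1.0) is enforced by Pydantic at construction time via the Field(ge=0.0, le=1.0) annotation. A minimal stdlib sketch of the equivalent check (the function name is illustrative, not part of the module):

```python
def validate_win_prob(p: float) -> float:
    """Reject probabilities outside [0.0, 1.0], as Field(ge=0.0, le=1.0) does."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"pred_win_prob must be in [0.0, 1.0], got {p}")
    return p


validate_win_prob(0.73)  # in range: returned unchanged
```

Constructing a Prediction with an out-of-range probability raises a Pydantic ValidationError rather than the plain ValueError shown here.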
class ncaa_eval.model.tracking.RunStore(base_path: Path)[source]

Bases: object

Persist and load model runs and predictions on the local filesystem.

Directory layout:

base_path/
  runs/
    <run_id>/
      run.json                    # ModelRun metadata
      predictions.parquet         # Prediction records (PyArrow)
      summary.parquet             # BacktestResult.summary (year × metrics)
      fold_predictions.parquet    # CV fold y_true/y_prob per year
      model/                      # Trained model artifacts
        model.ubj                 # XGBoost native format (XGBoost only)
        model.json                # Elo ratings (Elo only)
        config.json               # Model config
        feature_names.json        # Feature column names used during training
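The layout above maps to paths like the following pathlib sketch; the base directory and run id are invented, and the variable names simply mirror the tree.

```python
from pathlib import Path

base_path = Path("/tmp/ncaa_runs")  # illustrative base directory
run_id = "example-run"              # hypothetical run id

run_dir = base_path / "runs" / run_id
run_json = run_dir / "run.json"
predictions_parquet = run_dir / "predictions.parquet"
model_dir = run_dir / "model"
```

RunStore derives all of its file locations from base_path this way, so moving the base directory moves every run with it.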
list_runs() list[ModelRun][source]

Scan the runs directory and return all saved ModelRun records.

Scans the runs directory in sorted order, deserializes each run.json file via Pydantic, and returns a list of ModelRun objects (an empty list if the directory does not exist).
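A hedged sketch of the scan-and-deserialize pattern list_runs() describes, using plain json.loads in place of Pydantic validation; the helper name and demo directory contents are illustrative.

```python
import json
import tempfile
from pathlib import Path


def scan_runs(base_path: Path) -> list[dict]:
    """Read every runs/<run_id>/run.json in sorted order; empty list if absent."""
    runs_dir = base_path / "runs"
    if not runs_dir.exists():
        return []
    records = []
    for run_dir in sorted(runs_dir.iterdir()):
        run_json = run_dir / "run.json"
        if run_json.is_file():
            records.append(json.loads(run_json.read_text()))
    return records


# Demo on a throwaway directory holding two runs.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for rid in ("run-b", "run-a"):
        d = base / "runs" / rid
        d.mkdir(parents=True)
        (d / "run.json").write_text(json.dumps({"run_id": rid}))
    runs = scan_runs(base)  # sorted by directory name: run-a, then run-b
```

The sorted iteration is what makes list_runs() deterministic across calls.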

load_all_summaries() DataFrame[source]

Load metric summaries for all runs that have them.

Returns:

DataFrame with columns [run_id, year, log_loss, brier_score, roc_auc, ece, elapsed_seconds]. Empty DataFrame if no summaries.

load_feature_names(run_id: str) list[str] | None[source]

Load saved feature names for a run.

Parameters:

run_id – The run identifier.

Returns:

List of feature names or None if not saved.

load_fold_predictions(run_id: str) DataFrame | None[source]

Load fold-level predictions for a run.

Parameters:

run_id – The run identifier.

Returns:

DataFrame or None if no fold predictions exist (legacy run).

load_metrics(run_id: str) DataFrame | None[source]

Load backtest metric summary for a run.

Parameters:

run_id – The run identifier.

Returns:

Summary DataFrame or None if no summary exists (legacy run).

load_model(run_id: str) Model | StackedEnsemble | None[source]

Load a trained model from a run directory.

Delegates to StackedEnsemble.load() when model_type == "ensemble".

Parameters:

run_id – The run identifier.

Returns:

Model or StackedEnsemble instance, or None if no model directory exists (legacy run).

load_predictions(run_id: str) DataFrame[source]

Load predictions from Parquet as a DataFrame.

Raises:

FileNotFoundError – If the predictions Parquet file does not exist.

load_run(run_id: str) ModelRun[source]

Load run metadata from JSON.

Raises:

FileNotFoundError – If the run directory or run.json does not exist.

model_dir(run_id: str) Path[source]

Return the model directory path for a run, creating it if absent.

save_fold_predictions(run_id: str, fold_preds: DataFrame) None[source]

Persist fold-level predictions from walk-forward CV.

Parameters:
  • run_id – The run identifier.

  • fold_preds – DataFrame with columns [year, game_id, team_a_id, team_b_id, pred_win_prob, team_a_won].

Raises:

FileNotFoundError – If the run directory does not exist.

save_metrics(run_id: str, summary: DataFrame) None[source]

Persist backtest metric summary for a run.

Parameters:
  • run_id – The run identifier.

  • summary – BacktestResult.summary DataFrame (index=year, columns=[log_loss, brier_score, roc_auc, ece, elapsed_seconds]).

Raises:

FileNotFoundError – If the run directory does not exist.

save_model(run_id: str, model: Model | StackedEnsemble, *, feature_names: list[str] | None = None) None[source]

Persist a trained model alongside a run.

Parameters:
  • run_id – The run identifier.

  • model – A fitted model implementing save(path). Accepts both Model and StackedEnsemble instances.

  • feature_names – Feature column names used during training.

Raises:

FileNotFoundError – If the run directory does not exist.

save_run(run: ModelRun, predictions: list[Prediction]) None[source]

Write run metadata (JSON) and predictions (Parquet).

Creates the run directory, writes the ModelRun metadata as JSON, and writes the prediction records to Parquet using a predefined PyArrow schema; empty prediction lists are handled by constructing typed empty arrays.
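The empty-predictions edge case can be sketched without PyArrow: save_run() builds typed empty arrays against its Parquet schema so that a run with zero predictions still produces a file with every column, which the column-building step below imitates with plain lists (the column tuple matches the Prediction fields; the helper name is illustrative).

```python
from typing import Any

PREDICTION_COLUMNS = (
    "run_id",
    "game_id",
    "season",
    "team_a_id",
    "team_b_id",
    "pred_win_prob",
)


def build_columns(predictions: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row records into columnar form. An empty input still yields every
    column (as an empty list), mirroring the typed empty arrays save_run() builds."""
    return {col: [p[col] for p in predictions] for col in PREDICTION_COLUMNS}


cols = build_columns([])  # all six columns present, all empty
```

In the real method each column also carries its PyArrow type (string, int64, float64), so downstream readers see a consistent schema whether or not any rows were written.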