ncaa_eval.model package
Submodules
- ncaa_eval.model.base module
- ncaa_eval.model.elo module
  - EloModel
  - EloModelConfig
    - EloModelConfig.early_game_threshold
    - EloModelConfig.home_advantage_elo
    - EloModelConfig.initial_rating
    - EloModelConfig.k_early
    - EloModelConfig.k_regular
    - EloModelConfig.k_tournament
    - EloModelConfig.margin_exponent
    - EloModelConfig.max_margin
    - EloModelConfig.mean_reversion_fraction
    - EloModelConfig.model_config
    - EloModelConfig.model_name
- ncaa_eval.model.ensemble module
  - StackedEnsemble
    - StackedEnsemble.base_models
    - StackedEnsemble.contextual_features
    - StackedEnsemble.feature_config
    - StackedEnsemble.get_config()
    - StackedEnsemble.load()
    - StackedEnsemble.meta_column_order
    - StackedEnsemble.meta_learner
    - StackedEnsemble.predict_bracket()
    - StackedEnsemble.predict_proba()
    - StackedEnsemble.save()
  - StackedEnsembleConfig
- ncaa_eval.model.logistic_regression module
- ncaa_eval.model.registry module
- ncaa_eval.model.tracking module
  - ModelRun
  - Prediction
  - RunStore
    - RunStore.list_runs()
    - RunStore.load_all_summaries()
    - RunStore.load_feature_names()
    - RunStore.load_fold_predictions()
    - RunStore.load_metrics()
    - RunStore.load_model()
    - RunStore.load_predictions()
    - RunStore.load_run()
    - RunStore.model_dir()
    - RunStore.save_fold_predictions()
    - RunStore.save_metrics()
    - RunStore.save_model()
    - RunStore.save_run()
- ncaa_eval.model.xgboost_model module
  - XGBoostModel
  - XGBoostModelConfig
    - XGBoostModelConfig.colsample_bytree
    - XGBoostModelConfig.early_stopping_rounds
    - XGBoostModelConfig.learning_rate
    - XGBoostModelConfig.max_depth
    - XGBoostModelConfig.min_child_weight
    - XGBoostModelConfig.model_config
    - XGBoostModelConfig.model_name
    - XGBoostModelConfig.n_estimators
    - XGBoostModelConfig.reg_alpha
    - XGBoostModelConfig.reg_lambda
    - XGBoostModelConfig.scale_pos_weight
    - XGBoostModelConfig.subsample
    - XGBoostModelConfig.validation_fraction
Module contents
Model implementations module.
- class ncaa_eval.model.Model
Bases: ABC
Abstract base class for all NCAA prediction models.
Every model — stateful or stateless — must implement these five methods so that the training CLI, evaluation engine, and persistence layer can treat all models uniformly.
- feature_config: FeatureConfig
Declarative specification of which feature blocks the model expects. Set by the subclass __init__.
- abstractmethod fit(X: DataFrame, y: Series) -> None
Train the model on feature matrix X and labels y.
- abstractmethod get_config() -> ModelConfig
Return the Pydantic-validated configuration for this model.
- get_feature_importances() -> list[tuple[str, float]] | None
Return feature name/importance pairs, or None if unavailable.
The default returns None. Models that support feature importances (e.g. XGBoost) should override this method.
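The get_feature_importances() contract above can be sketched roughly as follows. This is a minimal illustration that uses a stand-in base class rather than the real ncaa_eval.model.Model; the names ToyBase, ToyOpaque, ToyLinear and top_features are hypothetical and do not come from the package.

```python
from __future__ import annotations

from abc import ABC


class ToyBase(ABC):
    """Hypothetical stand-in for Model; only the importance hook is modelled."""

    def get_feature_importances(self) -> list[tuple[str, float]] | None:
        # Mirrors the documented default: no importances available.
        return None


class ToyOpaque(ToyBase):
    """A model that keeps the default and therefore reports None."""


class ToyLinear(ToyBase):
    """A model that overrides the hook, here using absolute weights."""

    def __init__(self, weights: dict[str, float]) -> None:
        self.weights = weights

    def get_feature_importances(self) -> list[tuple[str, float]] | None:
        return [(name, abs(w)) for name, w in self.weights.items()]


def top_features(model: ToyBase, k: int = 3) -> list[str]:
    """Return the k most important feature names, or [] if unavailable."""
    pairs = model.get_feature_importances()
    if pairs is None:  # callers must handle the None case explicitly
        return []
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

Callers that treat every model uniformly would need a None check of this kind, since only some models override the default.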
- class ncaa_eval.model.ModelConfig(*, model_name: str, calibration_method: Literal['isotonic', 'sigmoid'] | None = None)
Bases: BaseModel
Base configuration shared by all model implementations.
Subclasses add model-specific hyperparameters as additional fields.
- calibration_method: CalibrationMethod | None
- model_config = {}
Pydantic configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- model_name: str
- exception ncaa_eval.model.ModelNotFoundError
Bases: KeyError
Raised when a requested model name is not in the registry.
- class ncaa_eval.model.StackedEnsemble(base_models: list[Model], meta_learner: Model, contextual_features: list[str] = <factory>, meta_column_order: list[str] = <factory>)
Bases: object
Stacked generalisation ensemble.
Holds a list of base Model instances and a stateless meta-learner. The ensemble’s feature_config is the union of all base models’ configs.
- base_models
Two or more trained (or to-be-trained) base models.
Type: list[Model]
- meta_learner
A stateless Model that learns to combine base model predictions with contextual features.
- contextual_features
Column names appended to the out-of-fold (OOF) predictions before meta-learner training (e.g. seed_diff).
Type: list[str]
- contextual_features: list[str]
- property feature_config: FeatureConfig
Return the union of all base model feature configs.
- get_config() -> StackedEnsembleConfig
Return a serialisable configuration record.
- classmethod load(path: Path) -> StackedEnsemble
Reconstruct a StackedEnsemble from a saved directory.
- meta_column_order: list[str]
- predict_bracket(data_dir: Path, season: int) -> DataFrame
Generate an n×n pairwise probability matrix for bracket prediction.
Discovers tournament-eligible teams, generates per-base-model pairwise predictions, assembles meta-input for all C(n,2) matchups, and returns a probability matrix suitable for the Monte Carlo bracket simulator.
- Parameters:
data_dir – Path to the local Parquet data store.
season – Target season year.
- Returns:
DataFrame of shape (n, n) with team_id index and columns. P[a, b] is the ensemble probability that team a beats team b. Diagonal is zero; P[a, b] + P[b, a] ≈ 1.
- Raises:
FileNotFoundError – If no season data exists for season.
ValueError – If any column in meta_column_order is missing from the assembled meta-input, or if a context feature array has unexpected length.
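The shape of the returned matrix can be sketched as follows. This is only an illustration of the stated invariants (zero diagonal, complementary off-diagonal entries, one evaluation per C(n,2) matchup). It uses a nested dict instead of a pandas DataFrame to stay dependency-free, and pairwise_matrix, toy_prob and the ratings table are hypothetical stand-ins for the ensemble's own logic.

```python
from itertools import combinations

# Hypothetical team strengths, used only to produce deterministic probabilities.
RATINGS = {1101: 1600.0, 1102: 1500.0, 1103: 1400.0}


def toy_prob(team_a, team_b):
    """Stand-in for the ensemble: logistic curve on a rating difference."""
    diff = RATINGS[team_b] - RATINGS[team_a]
    return 1.0 / (1.0 + 10.0 ** (diff / 400.0))


def pairwise_matrix(team_ids, win_prob):
    """Build P where P[a][b] is the probability that team a beats team b."""
    # Start from an all-zero n x n table so the diagonal stays zero.
    matrix = {a: {b: 0.0 for b in team_ids} for a in team_ids}
    # Evaluate each unordered matchup once and fill both directions.
    for a, b in combinations(team_ids, 2):
        p = win_prob(a, b)
        matrix[a][b] = p
        matrix[b][a] = 1.0 - p
    return matrix


P = pairwise_matrix(sorted(RATINGS), toy_prob)
```

In this sketch the two directions sum to exactly 1 by construction; the documented "≈ 1" suggests the real method may not enforce that exactly.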
- predict_proba(X: DataFrame) -> Series
Route features through base models and meta-learner.
Each base model generates predictions: stateless models receive X[base_model.feature_names_], while stateful models receive the full X. The base predictions and contextual features are then assembled into a meta-input DataFrame whose columns follow self.meta_column_order, and the meta-learner is called on it.
- Parameters:
X – Feature DataFrame with at least the columns required by each base model and all contextual features.
- Returns:
Series of ensemble win probabilities, indexed like X.
- Raises:
ValueError – If any column in meta_column_order is missing from the assembled meta-input.
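The routing described for predict_proba can be sketched roughly as below. This is a dependency-free approximation under several assumptions: X is a plain dict of columns rather than a DataFrame, a model counts as stateless when it exposes feature_names_, and the base prediction columns are named base_0, base_1, and so on. None of these details, nor the names ToyStateless, ToyStateful and assemble_meta_input, are confirmed by the package.

```python
from __future__ import annotations


class ToyStateless:
    """Hypothetical stateless base model: reads only its own feature columns."""

    feature_names_ = ["seed_diff"]

    def predict_proba(self, X: dict[str, list[float]]) -> list[float]:
        return [0.5 + 0.02 * d for d in X["seed_diff"]]  # toy rule


class ToyStateful:
    """Hypothetical stateful base model: needs the full X (e.g. team ids)."""

    def predict_proba(self, X: dict[str, list[float]]) -> list[float]:
        pairs = zip(X["team_a_id"], X["team_b_id"])
        return [0.6 if a < b else 0.4 for a, b in pairs]  # toy rule


def assemble_meta_input(X, base_models, contextual_features, meta_column_order):
    """Collect base predictions and context columns in a fixed column order."""
    columns = {}
    for i, model in enumerate(base_models):
        names = getattr(model, "feature_names_", None)
        # Assumption: stateless models get a column subset, stateful get all of X.
        view = X if names is None else {c: X[c] for c in names}
        columns[f"base_{i}"] = model.predict_proba(view)
    for name in contextual_features:
        columns[name] = X[name]
    missing = [c for c in meta_column_order if c not in columns]
    if missing:
        raise ValueError(f"missing meta-input columns: {missing}")
    # Reorder so the meta-learner always sees the same column layout.
    return {c: columns[c] for c in meta_column_order}
```

Fixing the column order before calling the meta-learner matters because a model trained on positional features would silently misread a reordered input.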
- class ncaa_eval.model.StatefulModel
Bases: Model
Template base for models that process games sequentially.
Concrete methods fit and predict_proba are provided as template methods. Subclasses implement the abstract hooks:
  - update(game): absorb a single game result
  - _predict_one(team_a_id, team_b_id): return P(team_a wins)
  - start_season(season): reset / prepare for a new season
  - get_state() / set_state(state): snapshot / restore ratings
- fit(X: DataFrame, y: Series) -> None
Reconstruct games from X/y and update sequentially.
Reconstructs Game objects from the feature matrix and labels, then iterates chronologically, calling start_season() on season boundaries and update() per game.
- abstractmethod get_state() -> dict[str, Any]
Return a serialisable snapshot of internal ratings.
- predict_matchup(team_a_id: int, team_b_id: int) -> float
Return P(team_a wins) for a single matchup.
Delegates to the _predict_one abstract hook.
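The template-method arrangement described for StatefulModel can be sketched as follows. This is a simplified stand-in, not the package's implementation: fit here takes a list of game records directly instead of reconstructing them from X and y, and ToyGame, ToyStatefulBase and ToyElo (with its k and initial parameters) are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyGame:
    """Hypothetical stand-in for the library's Game record."""
    season: int
    day: int
    team_a_id: int
    team_b_id: int
    team_a_won: bool


class ToyStatefulBase(ABC):
    """Template base: fit() is concrete, the hooks below are abstract."""

    def fit(self, games: list[ToyGame]) -> None:
        current = None
        # Iterate chronologically, signalling each season boundary.
        for game in sorted(games, key=lambda g: (g.season, g.day)):
            if game.season != current:
                self.start_season(game.season)
                current = game.season
            self.update(game)

    def predict_matchup(self, team_a_id: int, team_b_id: int) -> float:
        return self._predict_one(team_a_id, team_b_id)

    @abstractmethod
    def update(self, game: ToyGame) -> None: ...
    @abstractmethod
    def _predict_one(self, team_a_id: int, team_b_id: int) -> float: ...
    @abstractmethod
    def start_season(self, season: int) -> None: ...
    @abstractmethod
    def get_state(self) -> dict[str, Any]: ...
    @abstractmethod
    def set_state(self, state: dict[str, Any]) -> None: ...


class ToyElo(ToyStatefulBase):
    """Bare-bones Elo-style rating model implementing the five hooks."""

    def __init__(self, k: float = 20.0, initial: float = 1500.0) -> None:
        self.k, self.initial = k, initial
        self.ratings: dict[int, float] = {}
        self.seasons_seen: list[int] = []

    def start_season(self, season: int) -> None:
        self.seasons_seen.append(season)  # a real model might regress ratings here

    def _predict_one(self, team_a_id: int, team_b_id: int) -> float:
        ra = self.ratings.get(team_a_id, self.initial)
        rb = self.ratings.get(team_b_id, self.initial)
        return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

    def update(self, game: ToyGame) -> None:
        expected = self._predict_one(game.team_a_id, game.team_b_id)
        delta = self.k * ((1.0 if game.team_a_won else 0.0) - expected)
        ra = self.ratings.get(game.team_a_id, self.initial)
        rb = self.ratings.get(game.team_b_id, self.initial)
        self.ratings[game.team_a_id] = ra + delta
        self.ratings[game.team_b_id] = rb - delta

    def get_state(self) -> dict[str, Any]:
        return {"ratings": dict(self.ratings)}

    def set_state(self, state: dict[str, Any]) -> None:
        self.ratings = dict(state["ratings"])
```

Keeping the iteration order and season handling in the base class means each concrete model only has to define how a single game changes its state.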
- ncaa_eval.model.get_model(name: str) -> type[Model]
Return the model class registered under name.
Raises ModelNotFoundError if not found.
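A name-to-class registry with this lookup behaviour might look like the sketch below. The registration decorator toy_register, the toy_get_model function and the ToyModelNotFoundError class are hypothetical local stand-ins; only the documented behaviour (return the class, raise a KeyError subclass when the name is unknown) is taken from the reference above.

```python
from __future__ import annotations


class ToyModelNotFoundError(KeyError):
    """Stand-in for ModelNotFoundError, which is documented as a KeyError."""


_REGISTRY: dict[str, type] = {}


def toy_register(name: str):
    """Hypothetical decorator that records a class under a string name."""
    def decorator(cls: type) -> type:
        _REGISTRY[name] = cls
        return cls
    return decorator


def toy_get_model(name: str) -> type:
    """Return the class registered under name, or raise the custom error."""
    try:
        return _REGISTRY[name]
    except KeyError:
        # Re-raise as the more specific error without chaining the bare KeyError.
        raise ToyModelNotFoundError(name) from None


@toy_register("toy")
class ToyModel:
    """Placeholder model class used to exercise the registry."""
```

Because the error subclasses KeyError, existing code that catches KeyError around a lookup keeps working, while newer code can catch the more specific type.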