ncaa_eval.transform.elo module
Game-by-game Elo rating engine for NCAA basketball feature engineering.
Computes Elo ratings as a feature building block — the resulting per-team
ratings feed into models (XGBoost, etc.) as input features. This module does
NOT implement model-level train/predict/save interfaces; those
belong in Story 5.3.
Key design points:
update_game() returns the before-ratings, then mutates internal state, guaranteeing walk-forward temporal safety.
Variable K-factor: early-season → regular-season → tournament.
Margin-of-victory scaling with diminishing returns (Silver/SBCB formula).
Home-court adjustment subtracted from effective rating before computing expected outcome.
Season mean-reversion toward conference mean (or global mean as fallback).
- class ncaa_eval.transform.elo.EloConfig(initial_rating: float = 1500.0, k_early: float = 56.0, k_regular: float = 38.0, k_tournament: float = 47.5, early_game_threshold: int = 20, margin_exponent: float = 0.85, max_margin: int = 25, home_advantage_elo: float = 3.5, mean_reversion_fraction: float = 0.25)
Bases: object
Frozen configuration for the Elo feature engine.
All K-factor, margin scaling, home-court, and mean-reversion parameters are configurable with sensible defaults matching the Silver/SBCB model.
- early_game_threshold: int = 20
- home_advantage_elo: float = 3.5
- initial_rating: float = 1500.0
- k_early: float = 56.0
- k_regular: float = 38.0
- k_tournament: float = 47.5
- margin_exponent: float = 0.85
- max_margin: int = 25
- mean_reversion_fraction: float = 0.25
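The three K-factor fields and early_game_threshold suggest a simple selection rule. The sketch below is one plausible reading, not the module's confirmed logic: it assumes the threshold is compared against a per-team count of games played so far in the season, that the comparison is strict, and that tournament games take precedence over the early-season rule. The names KFactorSketch and select_k are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KFactorSketch:
    """Hypothetical stand-in holding only the K-related defaults of EloConfig."""

    k_early: float = 56.0
    k_regular: float = 38.0
    k_tournament: float = 47.5
    early_game_threshold: int = 20


def select_k(cfg: KFactorSketch, games_played: int, is_tournament: bool) -> float:
    """Pick a K-factor for one game.

    Assumptions (not confirmed by the source): tournament status wins over
    the early-season rule, and "early" means fewer than
    early_game_threshold games played so far.
    """
    if is_tournament:
        return cfg.k_tournament
    if games_played < cfg.early_game_threshold:
        return cfg.k_early
    return cfg.k_regular


cfg = KFactorSketch()
k_opening_night = select_k(cfg, games_played=0, is_tournament=False)
k_midseason = select_k(cfg, games_played=20, is_tournament=False)
k_march = select_k(cfg, games_played=30, is_tournament=True)
```

Under these assumptions a larger K early in the season lets ratings move quickly while little is known about a team, and the tournament K sits between the two regular-season values.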
- class ncaa_eval.transform.elo.EloFeatureEngine(config: EloConfig, conference_lookup: ConferenceLookup | None = None)
Bases: object
Game-by-game Elo rating engine.
- Parameters:
config – Frozen Elo configuration.
conference_lookup – Optional conference lookup for season mean-reversion. When None, mean-reversion falls back to the global mean.
- apply_season_mean_reversion(season: int) → None
Regress each team toward its conference mean (or global mean).
Groups all rated teams by conference via ConferenceLookup, computes each conference’s mean rating, then shifts every team’s rating a fraction mean_reversion_fraction of the way toward its conference mean. Teams with no conference entry fall back to the global mean; when no ConferenceLookup is provided, all teams use the global mean. This is a no-op when no prior ratings exist.
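The reversion step described above can be sketched with plain dictionaries. This is an illustrative approximation rather than the module's code: the function name revert_toward_conference_mean is hypothetical, a plain dict stands in for ConferenceLookup, and the global mean is assumed to be taken over all rated teams.

```python
from collections import defaultdict


def revert_toward_conference_mean(
    ratings: dict[int, float],
    conference_of: dict[int, str],
    fraction: float,
) -> dict[int, float]:
    """Shift each rating `fraction` of the way toward its conference mean.

    Teams missing from `conference_of` use the global mean instead
    (assumed here to be the mean over all rated teams).
    """
    if not ratings:
        return {}  # no prior ratings: nothing to revert

    global_mean = sum(ratings.values()) / len(ratings)

    # Collect ratings per conference, skipping teams with no conference entry.
    by_conf: dict[str, list[float]] = defaultdict(list)
    for team_id, rating in ratings.items():
        conf = conference_of.get(team_id)
        if conf is not None:
            by_conf[conf].append(rating)
    conf_mean = {conf: sum(vals) / len(vals) for conf, vals in by_conf.items()}

    reverted: dict[int, float] = {}
    for team_id, rating in ratings.items():
        target = conf_mean.get(conference_of.get(team_id), global_mean)
        reverted[team_id] = rating + fraction * (target - rating)
    return reverted


# Teams 1 and 2 share a conference; team 3 has no conference entry.
ratings = {1: 1600.0, 2: 1400.0, 3: 1700.0}
result = revert_toward_conference_mean(ratings, {1: "acc", 2: "acc"}, 0.25)
```

With fraction 0.25, teams 1 and 2 each move a quarter of the way toward their shared mean of 1500, while team 3 moves a quarter of the way toward the global mean. Passing an empty conference mapping reproduces the global-mean fallback for every team.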
- static expected_score(rating_a: float, rating_b: float) → float
Logistic expected score for team A against team B.
expected = 1 / (1 + 10^((r_b − r_a) / 400))
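As a quick check of the formula, the following standalone function reproduces it directly. It is a re-implementation for illustration, not an import of the module's static method.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Logistic expected score for team A against team B (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


equal = expected_score(1500.0, 1500.0)     # evenly matched teams
favorite = expected_score(1700.0, 1500.0)  # A holds a 200-point edge
underdog = expected_score(1500.0, 1700.0)  # same matchup from B's side
```

Equal ratings give 0.5, a 200-point edge gives roughly 0.76, and the two sides of any matchup sum to 1.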
- get_rating(team_id: int) → float
Return current Elo rating for team_id (initial_rating if unseen).
- predict_matchup(team_a_id: int, team_b_id: int) → float
Return P(team_a wins) using the Elo expected-score formula.
- process_season(games: list[Game], season: int) → pd.DataFrame
Process all games for a season, returning before-ratings per game.
Calls start_new_season(season) if prior ratings exist (i.e., this is not the very first season).
- Parameters:
games – Games sorted in chronological order.
season – Season year.
- Returns:
DataFrame with columns [game_id, elo_w_before, elo_l_before].
- start_new_season(season: int) → None
Orchestrate season transition: mean-reversion then reset counts.
- update_game(w_team_id: int, l_team_id: int, w_score: int, l_score: int, loc: str, is_tournament: bool, *, num_ot: int = 0) → tuple[float, float]
Process one game and update ratings.
Snapshots before-ratings for feature use, applies home-court effective-rating adjustment to expected-score computation, computes the margin-of-victory multiplier and variable K-factor, then mutates internal rating state for both teams.
- Parameters:
w_team_id – Winner team ID.
l_team_id – Loser team ID.
w_score – Winner final score (raw).
l_score – Loser final score (raw).
loc – "H" (winner home), "A" (winner away), "N" (neutral).
is_tournament – Whether this is a tournament game.
num_ot – Number of overtime periods (used for margin rescaling).
- Returns:
Tuple of (elo_w_before, elo_l_before): the winner’s and loser’s ratings before this game’s update, suitable for use as walk-forward feature values.
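The snapshot-then-mutate pattern is what keeps the returned features free of look-ahead. The sketch below illustrates only that pattern under simplifying assumptions: the class WalkForwardEloSketch is hypothetical, it uses a single fixed K, and it deliberately omits the home-court adjustment, margin-of-victory multiplier, and overtime rescaling that the real update_game() applies.

```python
class WalkForwardEloSketch:
    """Simplified stand-in for the engine: fixed K, no home-court or margin terms."""

    def __init__(self, initial_rating: float = 1500.0, k: float = 38.0) -> None:
        self.initial_rating = initial_rating
        self.k = k
        self._ratings: dict[int, float] = {}

    def get_rating(self, team_id: int) -> float:
        # Unseen teams start at the initial rating.
        return self._ratings.get(team_id, self.initial_rating)

    def update_game(self, w_team_id: int, l_team_id: int) -> tuple[float, float]:
        # 1. Snapshot ratings BEFORE the game; these are the feature values.
        w_before = self.get_rating(w_team_id)
        l_before = self.get_rating(l_team_id)

        # 2. Expected score for the winner from the pre-game ratings.
        expected_w = 1.0 / (1.0 + 10.0 ** ((l_before - w_before) / 400.0))

        # 3. Only now mutate state; the winner's actual score is 1.
        delta = self.k * (1.0 - expected_w)
        self._ratings[w_team_id] = w_before + delta
        self._ratings[l_team_id] = l_before - delta

        return w_before, l_before


engine = WalkForwardEloSketch()
first = engine.update_game(101, 202)   # both teams unseen
second = engine.update_game(101, 202)  # reflects the first result only
```

The first call returns the initial ratings for both teams, and the second call returns ratings that incorporate only the first game, so the features for a game never include that game's own outcome.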