ncaa_eval.transform.elo module

Game-by-game Elo rating engine for NCAA basketball feature engineering.

Computes Elo ratings as a feature building block — the resulting per-team ratings feed into models (XGBoost, etc.) as input features. This module does NOT implement model-level train/predict/save interfaces; those belong in Story 5.3.

Key design points:

update_game() snapshots the pre-game ("before") ratings, mutates internal state, and returns the snapshot, so each game's features depend only on earlier games (walk-forward temporal safety).
Variable K-factor: early-season → regular-season → tournament.
Margin-of-victory scaling with diminishing returns (Silver/SBCB formula).
Home-court adjustment subtracted from effective rating before computing expected outcome.
Season mean-reversion toward conference mean (or global mean as fallback).

class ncaa_eval.transform.elo.EloConfig(initial_rating: float = 1500.0, k_early: float = 56.0, k_regular: float = 38.0, k_tournament: float = 47.5, early_game_threshold: int = 20, margin_exponent: float = 0.85, max_margin: int = 25, home_advantage_elo: float = 3.5, mean_reversion_fraction: float = 0.25)

Bases: object

Frozen configuration for the Elo feature engine.

All K-factor, margin scaling, home-court, and mean-reversion parameters are configurable with sensible defaults matching the Silver/SBCB model.

early_game_threshold: int = 20

home_advantage_elo: float = 3.5

initial_rating: float = 1500.0

k_early: float = 56.0

k_regular: float = 38.0

k_tournament: float = 47.5

margin_exponent: float = 0.85

max_margin: int = 25

mean_reversion_fraction: float = 0.25
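The three K-factor fields and early_game_threshold suggest a three-way schedule. The following is a minimal sketch of one plausible selection rule, not the module's actual logic: the KSettings class and select_k function are hypothetical names, and both the precedence of tournament games over early-season games and the strict "fewer than threshold" comparison are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KSettings:
    """Hypothetical stand-in holding only the K-related EloConfig defaults."""

    k_early: float = 56.0
    k_regular: float = 38.0
    k_tournament: float = 47.5
    early_game_threshold: int = 20


def select_k(cfg: KSettings, games_played: int, is_tournament: bool) -> float:
    """Pick a K-factor for one team in one game (assumed rule)."""
    # Assumption: tournament games always use the tournament K.
    if is_tournament:
        return cfg.k_tournament
    # Assumption: a team is "early season" until it has played
    # early_game_threshold games.
    if games_played < cfg.early_game_threshold:
        return cfg.k_early
    return cfg.k_regular


cfg = KSettings()
k_opening_night = select_k(cfg, games_played=0, is_tournament=False)
k_midseason = select_k(cfg, games_played=20, is_tournament=False)
k_march = select_k(cfg, games_played=30, is_tournament=True)
```

A larger early-season K lets ratings move quickly while little is known about a team, which is presumably why k_early is the largest of the three defaults.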

class ncaa_eval.transform.elo.EloFeatureEngine(config: EloConfig, conference_lookup: ConferenceLookup | None = None)

Bases: object

Game-by-game Elo rating engine.

Parameters:

config – Frozen Elo configuration.
conference_lookup – Optional conference lookup for season mean-reversion. When None, mean-reversion falls back to global mean.

apply_season_mean_reversion(season: int) → None

Regress each team toward its conference mean (or global mean).

Groups all rated teams by conference via ConferenceLookup, computes each conference’s mean rating, then shifts every team’s rating a fraction mean_reversion_fraction of the way toward its conference mean. Teams with no conference entry fall back to the global mean; when no ConferenceLookup is provided, all teams use the global mean. The call is a no-op when no prior ratings exist.
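The reversion step described above can be sketched as a standalone function. This is an illustration under stated assumptions rather than the engine's code: revert_toward_means is a hypothetical name, and a plain dict mapping team ID to conference name stands in for ConferenceLookup, whose real interface is not shown here.

```python
from collections import defaultdict


def revert_toward_means(
    ratings: dict[int, float],
    conference_of: dict[int, str],
    fraction: float = 0.25,
) -> dict[int, float]:
    """Shift each rating `fraction` of the way toward its target mean."""
    if not ratings:
        return {}  # no prior ratings: nothing to revert

    global_mean = sum(ratings.values()) / len(ratings)

    # Collect ratings per conference for teams that have a conference entry.
    by_conf: dict[str, list[float]] = defaultdict(list)
    for team_id, rating in ratings.items():
        conf = conference_of.get(team_id)
        if conf is not None:
            by_conf[conf].append(rating)
    conf_mean = {conf: sum(vals) / len(vals) for conf, vals in by_conf.items()}

    reverted: dict[int, float] = {}
    for team_id, rating in ratings.items():
        # Teams without a conference entry fall back to the global mean.
        target = conf_mean.get(conference_of.get(team_id), global_mean)
        reverted[team_id] = rating + fraction * (target - rating)
    return reverted


# Teams 1 and 2 share a conference; team 3 has no conference entry.
before = {1: 1600.0, 2: 1400.0, 3: 1700.0}
after = revert_toward_means(before, {1: "ACC", 2: "ACC"}, fraction=0.25)
```

With the default fraction of 0.25, team 1 moves a quarter of the way from 1600 toward its conference mean of 1500, while team 3 moves toward the mean of all three ratings.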

static expected_score(rating_a: float, rating_b: float) → float

Logistic expected score for team A against team B.

expected = 1 / (1 + 10^((r_b − r_a) / 400))
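As a quick check of the formula, it can be transcribed directly into a standalone function. The function below is a sketch that mirrors the documented expression; it is not imported from the module, and the example ratings are arbitrary.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Logistic expected score for team A against team B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


even = expected_score(1500.0, 1500.0)      # equal ratings
favorite = expected_score(1700.0, 1500.0)  # A rated 200 points higher
underdog = expected_score(1500.0, 1700.0)  # same matchup, seen from B's side
```

Equal ratings give 0.5, a 200-point edge gives roughly 0.76, and the two sides of any matchup sum to 1.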

get_all_ratings() → dict[int, float]

Return a copy of the current ratings dict.

get_game_counts() → dict[int, int]

Return a copy of the current game-counts dict.

get_rating(team_id: int) → float

Return current Elo rating for team_id (initial_rating if unseen).

has_ratings() → bool

Return True if at least one team has a rating.

predict_matchup(team_a_id: int, team_b_id: int) → float

Return P(team_a wins) using the Elo expected-score formula.

process_season(games: list[Game], season: int) → pd.DataFrame

Process all games for a season, returning before-ratings per game.

Calls start_new_season(season) if prior ratings exist (i.e., this is not the very first season).

Parameters:

games – Games sorted in chronological order.
season – Season year.

Returns:

DataFrame with columns [game_id, elo_w_before, elo_l_before].

reset_game_counts() → None

Reset per-team game counts for a new season (affects variable K).

set_game_counts(counts: dict[int, int]) → None

Replace all game counts with counts.

set_ratings(ratings: dict[int, float]) → None

Replace all ratings with ratings.

start_new_season(season: int) → None

Orchestrate season transition: mean-reversion then reset counts.

update_game(w_team_id: int, l_team_id: int, w_score: int, l_score: int, loc: str, is_tournament: bool, *, num_ot: int = 0) → tuple[float, float]

Process one game and update ratings.

Snapshots before-ratings for feature use, applies home-court effective-rating adjustment to expected-score computation, computes the margin-of-victory multiplier and variable K-factor, then mutates internal rating state for both teams.

Parameters:

w_team_id – Winner team ID.
l_team_id – Loser team ID.
w_score – Winner final score (raw).
l_score – Loser final score (raw).
loc – "H" (winner home), "A" (winner away), "N" (neutral).
is_tournament – Whether this is a tournament game.
num_ot – Number of overtime periods (used for margin rescaling).

Returns:

Tuple of (elo_w_before, elo_l_before) — the winner’s and loser’s ratings before this game’s update, suitable for use as walk-forward feature values.
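The snapshot-then-mutate contract can be illustrated with a deliberately simplified toy. ToyElo is a hypothetical class, not EloFeatureEngine: it assumes a fixed K of 38 and omits the margin-of-victory multiplier, home-court adjustment, variable K-factor, and overtime handling. It shows only why the returned values are safe to use as features.

```python
class ToyElo:
    """Hypothetical, stripped-down stand-in; not the real engine."""

    def __init__(self, initial_rating: float = 1500.0, k: float = 38.0) -> None:
        self.initial_rating = initial_rating
        self.k = k  # assumption: fixed K instead of the variable schedule
        self.ratings: dict[int, float] = {}

    def update_game(self, w_team_id: int, l_team_id: int) -> tuple[float, float]:
        # 1. Snapshot ratings as they stood before this game.
        w_before = self.ratings.get(w_team_id, self.initial_rating)
        l_before = self.ratings.get(l_team_id, self.initial_rating)

        # 2. Expected score for the winner, from the snapshot only.
        expected_w = 1.0 / (1.0 + 10.0 ** ((l_before - w_before) / 400.0))

        # 3. Mutate internal state (zero-sum update in this toy).
        delta = self.k * (1.0 - expected_w)
        self.ratings[w_team_id] = w_before + delta
        self.ratings[l_team_id] = l_before - delta

        # 4. Return the pre-game values, never the updated ones.
        return w_before, l_before


engine = ToyElo()
first = engine.update_game(w_team_id=1, l_team_id=2)
second = engine.update_game(w_team_id=1, l_team_id=2)
```

The first call returns the initial ratings for both teams, and the outcome of that game only becomes visible in the values returned by the second call, so a feature row never contains information from its own game.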