ncaa_eval.transform.feature_serving module
Declarative feature serving layer for NCAA basketball prediction.
Combines sequential, graph, batch-rating, ordinal, seed, and Elo feature building blocks into a temporally safe, matchup-level feature matrix.
- class ncaa_eval.transform.feature_serving.FeatureBlock(*values)
Bases: Enum
Individual feature building blocks that can be activated.
- BATCH_RATING = 'batch_rating'
- ELO = 'elo'
- GRAPH = 'graph'
- ORDINAL = 'ordinal'
- SEED = 'seed'
- SEQUENTIAL = 'sequential'
- class ncaa_eval.transform.feature_serving.FeatureConfig(sequential_windows: tuple[int, ...] = (5, 10, 20), ewma_alphas: tuple[float, ...] = (0.15, 0.2), graph_features_enabled: bool = True, batch_rating_types: tuple[BatchRatingType, ...] = ('srs', 'ridge', 'colley'), ordinal_systems: tuple[str, ...] | None = None, ordinal_composite: OrdinalCompositeMethod | None = 'simple_average', matchup_deltas: bool = True, gender_scope: GenderScope = 'M', dataset_scope: DatasetScope = 'kaggle', elo_enabled: bool = False, elo_config: EloConfig | None = None)
Bases: object
Declarative specification of which feature blocks and parameters to use.
- sequential_windows
Rolling window sizes for sequential features (e.g., (5, 10, 20)).
- Type:
tuple[int, ...]
- ewma_alphas
EWMA smoothing factors for sequential features (e.g., (0.15, 0.20)).
- Type:
tuple[float, ...]
- graph_features_enabled
Whether to compute graph centrality features (PageRank, etc.).
- Type:
bool
- batch_rating_types
Which batch rating systems to include ("srs", "ridge", "colley").
- Type:
tuple[BatchRatingType, ...]
- ordinal_systems
Massey ordinal systems to use; None means use coverage-gate defaults.
- Type:
tuple[str, ...] | None
- ordinal_composite
Composite method: "simple_average", "weighted", "pca", or None to disable.
- Type:
OrdinalCompositeMethod | None
- matchup_deltas
Whether to compute team_A − team_B deltas for matchup features.
- Type:
bool
- gender_scope
"M" for men’s, "W" for women’s.
- Type:
GenderScope
- dataset_scope
"kaggle" for Kaggle-only games, "all" for Kaggle + ESPN enrichment.
- Type:
DatasetScope
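As a rough illustration of what sequential_windows and ewma_alphas control, the sketch below computes a trailing rolling mean and an exponentially weighted moving average over a short series of per-game values. The helper names rolling_mean and ewma, and the choice to seed the EWMA with the first observation, are assumptions made for this example; they are not taken from ncaa_eval, whose internal conventions may differ.

```python
from collections import deque


def rolling_mean(values, window):
    """Trailing mean over the most recent `window` values (hypothetical helper).

    Early entries average over fewer than `window` values.
    """
    recent = deque(maxlen=window)  # automatically drops the oldest value
    out = []
    for value in values:
        recent.append(value)
        out.append(sum(recent) / len(recent))
    return out


def ewma(values, alpha):
    """Recursive EWMA: s_t = alpha * x_t + (1 - alpha) * s_{t-1} (hypothetical helper).

    Assumption: the first observation seeds the state.
    """
    out = []
    state = None
    for value in values:
        state = value if state is None else alpha * value + (1 - alpha) * state
        out.append(state)
    return out


# Made-up scoring margins for one team, in chronological order.
margins = [10.0, -4.0, 6.0, 2.0]
print(rolling_mean(margins, window=2))  # [10.0, 3.0, 1.0, 4.0]
print(ewma(margins, alpha=0.5))         # [10.0, 3.0, 4.5, 3.25]
```

A larger window or a smaller alpha gives a smoother series that reacts more slowly to recent games, which is presumably why the configuration accepts several of each.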
- active_blocks() → frozenset[FeatureBlock]
Return the set of feature blocks that are currently enabled.
Checks each configuration flag (sequential windows, graph enabled, batch rating types, ordinal composite, Elo enabled) and adds the corresponding FeatureBlock enum value to a set, with SEED always included.
- batch_rating_types: tuple[BatchRatingType, ...] = ('srs', 'ridge', 'colley')
- dataset_scope: DatasetScope = 'kaggle'
- elo_enabled: bool = False
- ewma_alphas: tuple[float, ...] = (0.15, 0.2)
- gender_scope: GenderScope = 'M'
- graph_features_enabled: bool = True
- matchup_deltas: bool = True
- ordinal_composite: OrdinalCompositeMethod | None = 'simple_average'
- ordinal_systems: tuple[str, ...] | None = None
- sequential_windows: tuple[int, ...] = (5, 10, 20)
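The flag-to-block mapping described for active_blocks() can be sketched with stand-in classes, so the example runs without ncaa_eval installed. SketchConfig is a hypothetical, trimmed-down substitute for FeatureConfig, and the exact enabling conditions (a non-empty tuple, a non-None composite) are assumptions inferred from the description, not the library's confirmed logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FeatureBlock(Enum):
    """Stand-in mirroring the documented enum values."""
    BATCH_RATING = "batch_rating"
    ELO = "elo"
    GRAPH = "graph"
    ORDINAL = "ordinal"
    SEED = "seed"
    SEQUENTIAL = "sequential"


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for FeatureConfig with only the relevant flags."""
    sequential_windows: tuple = (5, 10, 20)
    graph_features_enabled: bool = True
    batch_rating_types: tuple = ("srs", "ridge", "colley")
    ordinal_composite: Optional[str] = "simple_average"
    elo_enabled: bool = False

    def active_blocks(self) -> frozenset:
        blocks = {FeatureBlock.SEED}  # documented as always included
        if self.sequential_windows:   # assumption: non-empty tuple enables it
            blocks.add(FeatureBlock.SEQUENTIAL)
        if self.graph_features_enabled:
            blocks.add(FeatureBlock.GRAPH)
        if self.batch_rating_types:   # assumption: non-empty tuple enables it
            blocks.add(FeatureBlock.BATCH_RATING)
        if self.ordinal_composite is not None:  # assumption: None disables it
            blocks.add(FeatureBlock.ORDINAL)
        if self.elo_enabled:
            blocks.add(FeatureBlock.ELO)
        return frozenset(blocks)


print(sorted(block.value for block in SketchConfig().active_blocks()))
# ['batch_rating', 'graph', 'ordinal', 'seed', 'sequential']
```

Under these assumptions the defaults activate every block except ELO, which matches the documented default of elo_enabled=False.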
- class ncaa_eval.transform.feature_serving.StatefulFeatureServer(config: FeatureConfig, data_server: ChronologicalDataServer, *, seed_table: TourneySeedTable | None = None, ordinals_store: MasseyOrdinalsStore | None = None, elo_engine: EloFeatureEngine | None = None)
Bases: object
Combines feature building blocks into a single feature matrix.
Supports two consumption modes:
batch: compute all features for an entire season at once (suitable for stateless models like XGBoost).
stateful: iterate game by game, accumulating state incrementally (suitable for Elo-style models; placeholder until Story 4.8).
- Parameters:
config – Declarative specification of which feature blocks to activate.
data_server – Chronological data serving layer wrapping the Repository.
seed_table – Tournament seed lookup table (optional; needed for seed features).
ordinals_store – Massey ordinals store (optional; needed for ordinal features).
elo_engine – Elo feature engine (optional; needed when elo_enabled=True).
- serve_season_features(year: int, mode: Literal['batch', 'stateful'] = 'batch') → DataFrame
Build the feature matrix for a full season.
- Parameters:
year – Season year (e.g. 2023 for the 2022-23 season).
mode – "batch" or "stateful".
- Returns:
One row per game with metadata, feature deltas, and the target label.
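To make the shape of a returned row concrete, here is a minimal sketch of assembling one matchup row from per-team features as team_A − team_B deltas plus a target label. The function matchup_row, the column names (delta_*, team_a_won), and the input dictionaries are all hypothetical; the real DataFrame columns produced by serve_season_features are not specified here and may differ.

```python
def matchup_row(game, team_features):
    """Build one matchup-level row (hypothetical layout).

    game: dict with season, team_a, team_b, score_a, score_b.
    team_features: dict mapping team id -> {feature name: value}, assumed to
    contain only information available before the game was played.
    """
    feats_a = team_features[game["team_a"]]
    feats_b = team_features[game["team_b"]]

    # Metadata columns identifying the game.
    row = {
        "season": game["season"],
        "team_a": game["team_a"],
        "team_b": game["team_b"],
    }
    # One delta column per feature: team_A minus team_B.
    for name in feats_a:
        row[f"delta_{name}"] = feats_a[name] - feats_b[name]
    # Target label: 1 if team A won, else 0.
    row["team_a_won"] = int(game["score_a"] > game["score_b"])
    return row


# Made-up example values for illustration only.
features = {
    1101: {"srs": 12.5, "seed": 1},
    1202: {"srs": 8.0, "seed": 4},
}
game = {"season": 2023, "team_a": 1101, "team_b": 1202, "score_a": 71, "score_b": 64}
print(matchup_row(game, features))
```

Swapping team A and team B would flip the sign of every delta and invert the label, which is worth keeping in mind when deciding how team order is assigned.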