ncaa_eval.transform.feature_serving module

Declarative feature serving layer for NCAA basketball prediction.

Combines sequential, graph, batch-rating, ordinal, seed, and Elo feature building blocks into a temporally-safe, matchup-level feature matrix.

class ncaa_eval.transform.feature_serving.FeatureBlock(*values)

Bases: Enum

Individual feature building blocks that can be activated.

BATCH_RATING = 'batch_rating'
ELO = 'elo'
GRAPH = 'graph'
ORDINAL = 'ordinal'
SEED = 'seed'
SEQUENTIAL = 'sequential'

class ncaa_eval.transform.feature_serving.FeatureConfig(sequential_windows: tuple[int, ...] = (5, 10, 20), ewma_alphas: tuple[float, ...] = (0.15, 0.2), graph_features_enabled: bool = True, batch_rating_types: tuple[BatchRatingType, ...] = ('srs', 'ridge', 'colley'), ordinal_systems: tuple[str, ...] | None = None, ordinal_composite: OrdinalCompositeMethod | None = 'simple_average', matchup_deltas: bool = True, gender_scope: GenderScope = 'M', dataset_scope: DatasetScope = 'kaggle', elo_enabled: bool = False, elo_config: EloConfig | None = None)

Bases: object

Declarative specification of which feature blocks and parameters to use.

sequential_windows

Rolling window sizes for sequential features (e.g., (5, 10, 20)).

Type:

tuple[int, …]

ewma_alphas

EWMA smoothing factors for sequential features (e.g., (0.15, 0.20)).

Type:

tuple[float, …]
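
To make the two sequential parameters concrete, here is a minimal standard-library sketch of a rolling mean and an EWMA over one team's per-game values. It is not the module's implementation: the function names, the feature names, and the choice to exclude the current game from its own feature (one common way to stay temporally safe) are all assumptions for illustration.

```python
from collections import deque


def rolling_means(values, window):
    """Mean of up to `window` prior values, excluding the current game."""
    recent = deque(maxlen=window)
    out = []
    for value in values:
        # Emit the feature before the current value enters the window,
        # so each row only sees games that were already played.
        out.append(sum(recent) / len(recent) if recent else None)
        recent.append(value)
    return out


def ewma(values, alpha):
    """Exponentially weighted mean of prior values, excluding the current game."""
    out = []
    state = None
    for value in values:
        out.append(state)
        # Standard EWMA update: a larger alpha weights recent games more heavily.
        state = value if state is None else alpha * value + (1 - alpha) * state
    return out


# Hypothetical per-game scoring margins for one team, oldest first.
margins = [10, -4, 6, 2]

# One column per configured window and per configured alpha
# (hypothetical column names; the defaults mirror FeatureConfig).
features = {f"margin_roll_{w}": rolling_means(margins, w) for w in (5, 10, 20)}
features.update({f"margin_ewma_{a}": ewma(margins, a) for a in (0.15, 0.2)})
```

With the default configuration this yields three rolling columns and two EWMA columns per underlying statistic.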

graph_features_enabled

Whether to compute graph centrality features (PageRank, etc.).

Type:

bool
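
As a rough illustration of a graph centrality feature, the sketch below runs PageRank by power iteration on a small game graph. The edge direction (loser points to winner), the damping factor, and the function name are assumptions; the module may build or weight its graph differently, or delegate to a graph library.

```python
def pagerank(games, damping=0.85, iterations=50):
    """PageRank over (winner, loser) pairs; each loss is an edge loser -> winner."""
    teams = sorted({team for game in games for team in game})
    out_edges = {team: [] for team in teams}
    for winner, loser in games:
        out_edges[loser].append(winner)

    n = len(teams)
    rank = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        # Teams with no losses have no outgoing edges; spread their mass uniformly.
        dangling = sum(rank[t] for t in teams if not out_edges[t])
        new_rank = {t: (1 - damping) / n + damping * dangling / n for t in teams}
        for team in teams:
            for winner in out_edges[team]:
                new_rank[winner] += damping * rank[team] / len(out_edges[team])
        rank = new_rank
    return rank


# Hypothetical results: A beat B and C, B beat C.
ranks = pagerank([("A", "B"), ("A", "C"), ("B", "C")])
```

In this toy graph the undefeated team ends with the highest score and the winless team with the lowest.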

batch_rating_types

Which batch rating systems to include ("srs", "ridge", "colley").

Type:

tuple[BatchRatingType, …]

ordinal_systems

Massey ordinal systems to use; None means use coverage-gate defaults.

Type:

tuple[str, …] | None

ordinal_composite

Composite method: "simple_average", "weighted", "pca", or None to disable.

Type:

OrdinalCompositeMethod | None
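
For intuition, a "simple_average" composite can be read as the arithmetic mean of a team's rank across the selected systems. The sketch below assumes exactly that, and also assumes a team missing from a system is simply skipped for that system; the system codes, team IDs, and function name are hypothetical.

```python
def simple_average_composite(ranks_by_system):
    """Average each team's rank over the systems that actually rank it."""
    totals = {}
    counts = {}
    for system_ranks in ranks_by_system.values():
        for team_id, rank in system_ranks.items():
            totals[team_id] = totals.get(team_id, 0) + rank
            counts[team_id] = counts.get(team_id, 0) + 1
    return {team_id: totals[team_id] / counts[team_id] for team_id in totals}


# Hypothetical system codes and team IDs, for illustration only.
ranks = {
    "SYS_A": {1101: 3, 1102: 10},
    "SYS_B": {1101: 5, 1102: 8, 1103: 40},
}
composite = simple_average_composite(ranks)
```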

matchup_deltas

Whether to compute team_A − team_B deltas for matchup features.

Type:

bool
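
The delta described above is a per-feature subtraction. A minimal sketch, assuming each team's features arrive as a plain mapping and using a hypothetical `delta_` prefix for the output names:

```python
def matchup_deltas(team_a, team_b):
    """Return team_A minus team_B for every feature both teams have."""
    return {
        f"delta_{name}": team_a[name] - team_b[name]
        for name in team_a
        if name in team_b
    }


# Hypothetical per-team feature values.
team_a = {"srs": 12.5, "seed": 2}
team_b = {"srs": 7.0, "seed": 7}
row = matchup_deltas(team_a, team_b)
```

Swapping the two teams negates every delta, so the assignment of team_A and team_B has to be consistent with how the target label is defined.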

gender_scope

"M" for men’s, "W" for women’s.

Type:

GenderScope

dataset_scope

"kaggle" for Kaggle-only games, "all" for Kaggle + ESPN enrichment.

Type:

DatasetScope

active_blocks() → frozenset[FeatureBlock]

Return the set of feature blocks that are currently enabled.

Each configuration setting (sequential windows, graph features enabled, batch rating types, ordinal composite, Elo enabled) is checked in turn, and the corresponding FeatureBlock member is included when that setting is active. SEED is always included.
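
The logic described above can be sketched with stand-in types. `Block` and `MiniConfig` below are hypothetical mirrors of FeatureBlock and a subset of FeatureConfig, not the real classes, and treating an empty tuple or None as "disabled" is an assumption about how each setting is tested.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Block(Enum):
    """Stand-in for FeatureBlock."""
    BATCH_RATING = "batch_rating"
    ELO = "elo"
    GRAPH = "graph"
    ORDINAL = "ordinal"
    SEED = "seed"
    SEQUENTIAL = "sequential"


@dataclass(frozen=True)
class MiniConfig:
    """Stand-in mirroring a subset of the FeatureConfig defaults."""
    sequential_windows: tuple = (5, 10, 20)
    graph_features_enabled: bool = True
    batch_rating_types: tuple = ("srs", "ridge", "colley")
    ordinal_composite: Optional[str] = "simple_average"
    elo_enabled: bool = False

    def active_blocks(self):
        blocks = {Block.SEED}  # seed features are always included
        if self.sequential_windows:  # assumed: empty tuple disables the block
            blocks.add(Block.SEQUENTIAL)
        if self.graph_features_enabled:
            blocks.add(Block.GRAPH)
        if self.batch_rating_types:  # assumed: empty tuple disables the block
            blocks.add(Block.BATCH_RATING)
        if self.ordinal_composite is not None:
            blocks.add(Block.ORDINAL)
        if self.elo_enabled:
            blocks.add(Block.ELO)
        return frozenset(blocks)


default_blocks = MiniConfig().active_blocks()
```

Under these assumptions the default configuration activates every block except ELO.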

class ncaa_eval.transform.feature_serving.StatefulFeatureServer(config: FeatureConfig, data_server: ChronologicalDataServer, *, seed_table: TourneySeedTable | None = None, ordinals_store: MasseyOrdinalsStore | None = None, elo_engine: EloFeatureEngine | None = None)

Bases: object

Combines feature building blocks into a single feature matrix.

Supports two consumption modes:

  • batch — compute all features for an entire season at once (suitable for stateless models like XGBoost).

  • stateful — iterate game-by-game, accumulating state incrementally (suitable for Elo-style models; placeholder until Story 4.8).

Parameters:
  • config – Declarative specification of which feature blocks to activate.

  • data_server – Chronological data serving layer wrapping the Repository.

  • seed_table – Tournament seed lookup table (optional; needed for seed features).

  • ordinals_store – Massey ordinals store (optional; needed for ordinal features).

  • elo_engine – Elo feature engine (optional; needed when elo_enabled=True).

serve_season_features(year: int, mode: Literal['batch', 'stateful'] = 'batch') → DataFrame

Build the feature matrix for a full season.

Parameters:
  • year – Season year (e.g. 2023 for the 2022-23 season).

  • mode – "batch" or "stateful".

Returns:

One row per game with metadata, feature deltas, and the target label.
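
One possible way to consume this method is to build one feature matrix per season and split by year, so that a model is only trained on seasons before the one it is evaluated on. The helpers below are hypothetical and rely only on the documented `serve_season_features(year, mode=...)` call; `_StubServer` stands in for a real StatefulFeatureServer so the sketch runs without ncaa_eval installed, and it returns a list of dicts rather than a DataFrame.

```python
def serve_seasons(server, years, mode="batch"):
    """Call serve_season_features once per season, oldest first."""
    return {
        year: server.serve_season_features(year, mode=mode)
        for year in sorted(years)
    }


def walk_forward_split(frames_by_year, test_year):
    """Training data is every season strictly before test_year."""
    train = [frames_by_year[y] for y in sorted(frames_by_year) if y < test_year]
    return train, frames_by_year[test_year]


class _StubServer:
    """Stand-in for StatefulFeatureServer, for illustration only."""

    def serve_season_features(self, year, mode="batch"):
        return [{"season": year, "mode": mode}]


frames = serve_seasons(_StubServer(), [2023, 2021, 2022])
train, test = walk_forward_split(frames, test_year=2023)
```

With a real server, each value in `frames` would be the DataFrame described above, and the training list could be concatenated into a single matrix.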