ncaa_eval.evaluation.splitter module

Walk-forward cross-validation splitter with Leave-One-Tournament-Out folds.

Provides walk_forward_splits(), which partitions historical game data into train/test folds. Each fold uses one tournament year as the test set and all prior years as training data. The 2020 COVID season is a special case: the tournament was cancelled, so no test fold is yielded for 2020, but its regular-season data is still included in the training data of later folds.
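The year-level partitioning described above can be sketched without any game data. This is a minimal illustration, not the module's implementation: the helper name walk_forward_years and the NO_TOURNAMENT_YEARS constant are hypothetical, and the sketch assumes 2020 is the only season without a tournament.

```python
from typing import Iterator, List, Sequence, Tuple

# Assumption for this sketch: 2020 is the only season with no tournament.
NO_TOURNAMENT_YEARS = frozenset({2020})


def walk_forward_years(seasons: Sequence[int]) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_years, test_year) pairs in chronological order."""
    ordered = sorted(seasons)
    # The first season can never be a test year: it has no prior seasons.
    for i, test_year in enumerate(ordered[1:], start=1):
        if test_year in NO_TOURNAMENT_YEARS:
            # No test fold, but the season still appears in later training sets.
            continue
        yield ordered[:i], test_year


folds = list(walk_forward_years(range(2018, 2023)))
for train_years, test_year in folds:
    print(test_year, train_years)
```

In this sketch 2020 never appears as a test year, yet it is part of the training years for 2021 and 2022.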

class ncaa_eval.evaluation.splitter.CVFold(train: DataFrame, test: DataFrame, year: int)

Bases: object

A single cross-validation fold.

train

All games from seasons strictly before the test year.

Type: pandas.core.frame.DataFrame

test

Tournament games only from the test year.

Type: pandas.core.frame.DataFrame

year

The test season year.

Type: int

test: DataFrame

train: DataFrame

year: int
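To make the relationship between the three fields concrete, here is a small sketch of a fold built from a toy DataFrame. Everything in it is an assumption for illustration: the class CVFoldSketch merely mirrors the documented fields, the helper make_fold is hypothetical, and the column names season, is_tournament, and margin are invented rather than taken from the library's schema.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class CVFoldSketch:
    """Stand-in with the same three fields as the documented CVFold."""

    train: pd.DataFrame
    test: pd.DataFrame
    year: int


# Toy data; the column names are hypothetical, not the library's schema.
games = pd.DataFrame(
    {
        "season": [2018, 2018, 2019, 2019],
        "is_tournament": [False, True, False, True],
        "margin": [5, -3, 12, 7],
    }
)


def make_fold(games: pd.DataFrame, year: int) -> CVFoldSketch:
    # train: every game from seasons strictly before the test year
    train = games[games["season"] < year]
    # test: tournament games only, from the test year itself
    test = games[(games["season"] == year) & games["is_tournament"]]
    return CVFoldSketch(train=train, test=test, year=year)


fold = make_fold(games, 2019)
print(len(fold.train), len(fold.test), fold.year)
```

Note that regular-season games from the test year land in neither frame in this sketch, which follows the field descriptions above.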

ncaa_eval.evaluation.splitter.walk_forward_splits(seasons: Sequence[int], feature_server: StatefulFeatureServer, *, mode: Literal['batch', 'stateful'] = 'batch') → Iterator[CVFold]

Generate walk-forward CV folds with Leave-One-Tournament-Out splits.

Parameters:

seasons – Ordered sequence of season years to include (e.g., range(2008, 2026)). Must contain at least 2 seasons.
feature_server – Configured StatefulFeatureServer for building feature matrices.
mode – Feature serving mode: "batch" (stateless models) or "stateful" (sequential-update models like Elo).

Yields:

CVFold – For each eligible test year (skipping no-tournament years like 2020): train contains all games from seasons strictly before the test year; test contains only tournament games from the test year; year is the test season year.
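A typical way to consume the yielded folds is a loop that fits on train, scores on test, and records the result under year. The sketch below shows only that loop shape. The names evaluate_folds, fit_mean, and mean_abs_error are hypothetical, and the stand-in folds hold plain lists of margins instead of DataFrames so that the example has no dependencies.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Iterable


def evaluate_folds(
    folds: Iterable[Any],
    fit: Callable[[Any], Any],
    score: Callable[[Any, Any], float],
) -> Dict[int, float]:
    """Fit on each fold's train data and score on its test data."""
    results: Dict[int, float] = {}
    for fold in folds:
        model = fit(fold.train)
        results[fold.year] = score(model, fold.test)
    return results


# Stand-in folds: lists of point margins instead of DataFrames (assumption).
demo_folds = [
    SimpleNamespace(train=[4.0, 8.0], test=[6.0], year=2019),
    SimpleNamespace(train=[4.0, 8.0, 6.0, 2.0], test=[7.0, 3.0], year=2021),
]


def fit_mean(train):
    # Trivial baseline "model": the mean training margin.
    return sum(train) / len(train)


def mean_abs_error(prediction, test):
    return sum(abs(x - prediction) for x in test) / len(test)


scores = evaluate_folds(demo_folds, fit_mean, mean_abs_error)
print(scores)
```

With real folds, fit and score would operate on the train and test DataFrames; the per-year dictionary makes it easy to see whether performance drifts across tournament years.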

Raises:

ValueError – If seasons has fewer than 2 elements, or if mode is not "batch" or "stateful".
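The two error conditions can be sketched as an up-front argument check. This is an illustration of the documented behaviour under stated assumptions, not the library's code: the helper name validate_split_args and the error messages are invented.

```python
from typing import Sequence

VALID_MODES = ("batch", "stateful")


def validate_split_args(seasons: Sequence[int], mode: str = "batch") -> None:
    """Raise ValueError for the two documented failure cases."""
    if len(seasons) < 2:
        # A single season leaves no earlier season to train on.
        raise ValueError(f"need at least 2 seasons, got {len(seasons)}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")


validate_split_args(range(2008, 2026))  # valid arguments: no exception

try:
    validate_split_args([2024])
except ValueError as exc:
    print(f"rejected: {exc}")
```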