ncaa_eval.ingest.repository module

Repository pattern for NCAA basketball data storage.

Defines an abstract Repository interface and a concrete ParquetRepository implementation backed by Apache Parquet files. The abstraction keeps downstream code storage-agnostic: a SQLite implementation can be plugged in later (Story 5.5) without changing any business logic.

class ncaa_eval.ingest.repository.ParquetRepository(base_path: Path)

Bases: Repository

Repository implementation backed by Parquet files.

Directory layout:

{base_path}/
    teams.parquet
    seasons.parquet
    games/
        season={year}/
            data.parquet

get_games(season: int) → list[Game]

Load games for a single season from hive-partitioned Parquet.
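
As a rough illustration of the directory layout above, the file holding one season's games can be located with pathlib alone. The helper name season_partition_path is hypothetical and not part of ncaa_eval; it only mirrors the documented structure and says nothing about how ParquetRepository resolves paths internally.

```python
from pathlib import Path


def season_partition_path(base_path: Path, season: int) -> Path:
    """Return the documented location of one season's games file.

    Hypothetical helper: it mirrors the layout
    {base_path}/games/season={year}/data.parquet and nothing more.
    """
    return base_path / "games" / f"season={season}" / "data.parquet"


# Example: where the 2024 partition would live under a "data" directory.
path = season_partition_path(Path("data"), 2024)
print(path.as_posix())
```

The season={year} directory name follows the hive partitioning convention, in which the partition column and its value are encoded in the directory name.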

get_seasons() → list[Season]

Load all season records from the seasons Parquet file.

get_teams() → list[Team]

Load all teams from the teams Parquet file.

save_games(games: list[Game]) → None

Persist game records to hive-partitioned Parquet by season.
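
Writing by season implies that a mixed list of games is first split into one bucket per season, with each bucket destined for its own season={year} directory. A minimal sketch of that grouping step follows. GameStub and group_by_season are hypothetical names, and the sketch assumes each game carries a season attribute; the real Game model's fields are not shown in this reference.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GameStub:
    """Stand-in for Game; only the fields needed for grouping (assumed)."""

    game_id: str
    season: int


def group_by_season(games: list[GameStub]) -> dict[int, list[GameStub]]:
    """Bucket games by season so each bucket maps to one partition."""
    buckets: dict[int, list[GameStub]] = defaultdict(list)
    for game in games:
        buckets[game.season].append(game)
    return dict(buckets)


games = [GameStub("a", 2023), GameStub("b", 2024), GameStub("c", 2023)]
partitions = group_by_season(games)
print(sorted(partitions))  # season keys, one per partition directory
```

Only seasons present in the input produce a bucket, so partitions for other seasons would be left untouched by a write of this shape.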

save_seasons(seasons: list[Season]) → None

Persist season records to a Parquet file.

save_teams(teams: list[Team]) → None

Persist team records to a Parquet file.

class ncaa_eval.ingest.repository.Repository

Bases: ABC

Abstract base class for NCAA data persistence.

abstractmethod get_games(season: int) → list[Game]

Return all games for a given season year.

abstractmethod get_seasons() → list[Season]

Return all stored seasons.

abstractmethod get_teams() → list[Team]

Return all stored teams.

abstractmethod save_games(games: list[Game]) → None

Persist a collection of games (overwrite per season partition).

abstractmethod save_seasons(seasons: list[Season]) → None

Persist a collection of seasons (overwrite).

abstractmethod save_teams(teams: list[Team]) → None

Persist a collection of teams (overwrite).
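
To show how an alternative backend could satisfy this interface, the sketch below defines a self-contained stand-in for the games portion of the abstract class and an in-memory implementation of it. RepositorySketch, GameStub, InMemoryRepository, and count_games are all hypothetical names that are not part of ncaa_eval. The season attribute on games and the empty list returned for an unknown season are assumptions, not documented behavior.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class GameStub:
    """Stand-in for Game; the real model's fields are assumed."""

    game_id: str
    season: int


class RepositorySketch(ABC):
    """Stand-in for the games half of the abstract Repository."""

    @abstractmethod
    def get_games(self, season: int) -> list[GameStub]: ...

    @abstractmethod
    def save_games(self, games: list[GameStub]) -> None: ...


class InMemoryRepository(RepositorySketch):
    """Keeps games in a dict keyed by season instead of on disk."""

    def __init__(self) -> None:
        self._games: dict[int, list[GameStub]] = {}

    def get_games(self, season: int) -> list[GameStub]:
        # Assumption: an unknown season yields an empty list.
        return list(self._games.get(season, []))

    def save_games(self, games: list[GameStub]) -> None:
        # Overwrite per season: only seasons present in `games` are replaced.
        by_season: dict[int, list[GameStub]] = {}
        for game in games:
            by_season.setdefault(game.season, []).append(game)
        self._games.update(by_season)


def count_games(repo: RepositorySketch, season: int) -> int:
    """Downstream code that depends only on the abstract interface."""
    return len(repo.get_games(season))


repo = InMemoryRepository()
repo.save_games([GameStub("a", 2023), GameStub("b", 2024)])
repo.save_games([GameStub("c", 2023)])  # replaces 2023, leaves 2024 alone
print(count_games(repo, 2023), count_games(repo, 2024))
```

Because count_games accepts the abstract type, it would work unchanged with any other implementation of the same methods, which is the storage-agnostic property the module description refers to.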