fplx
¶
FPLX - Fantasy Premier League Time-Series Analysis & Squad Optimization
A production-ready Python library for:
- FPL player time-series data analysis
- News & injury signal integration
- Expected performance scoring
- Optimal 15-player squad and 11-player lineup selection
FPLModel
¶
FPLModel(
budget: float = 100.0,
horizon: int = 1,
formation: str = "auto",
config: Optional[dict] = None,
)
High-level interface for FPL analysis and squad optimization.
This is the main user-facing API. It orchestrates data loading, feature engineering, model fitting, and squad optimization.
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Maximum squad budget (default 100.0)
TYPE:
|
horizon
|
Prediction horizon in gameweeks (default 1)
TYPE:
|
formation
|
Desired formation, or "auto" for optimization
TYPE:
|
config
|
Custom configuration
TYPE:
|
Examples:
>>> from fplx import FPLModel
>>> model = FPLModel(budget=100, horizon=1)
>>> model.load_data()
>>> model.fit()
>>> squad = model.select_best_11()
>>> squad.summary()
Source code in fplx/api/interface.py
load_data
¶
Load player and fixture data.
| PARAMETER | DESCRIPTION |
|---|---|
source
|
Data source: 'api' or 'local'
TYPE:
|
path
|
Path to local data (if source is 'local')
TYPE:
|
Source code in fplx/api/interface.py
fit
¶
Fit the prediction model.
Uses the probabilistic inference pipeline (HMM + Kalman + Fusion) when model_type is 'inference'. Falls back to the original feature engineering pipeline for baseline/ML models.
Source code in fplx/api/interface.py
select_best_11
¶
select_best_11() -> FullSquad
Select the optimal 15-player squad and 11-player starting lineup.
| RETURNS | DESCRIPTION |
|---|---|
FullSquad
|
The optimized squad with lineup. |
Source code in fplx/api/interface.py
Matchweek
dataclass
¶
Matchweek(
gameweek: int,
date: datetime,
fixtures: list[dict],
team_difficulty: dict[str, float],
)
Represents a matchweek with global context.
| ATTRIBUTE | DESCRIPTION |
|---|---|
gameweek |
Gameweek number
TYPE:
|
date |
Date of the gameweek
TYPE:
|
fixtures |
List of fixtures
TYPE:
|
team_difficulty |
Team-level difficulty ratings
TYPE:
|
Player
dataclass
¶
Player(
id: int,
name: str,
team: str,
position: str,
price: float,
timeseries: DataFrame,
news: Optional[dict] = None,
)
Represents a Fantasy Premier League player.
| ATTRIBUTE | DESCRIPTION |
|---|---|
id |
Unique player identifier
TYPE:
|
name |
Player full name
TYPE:
|
team |
Current team
TYPE:
|
position |
Position (GK, DEF, MID, FWD)
TYPE:
|
price |
Current price in FPL
TYPE:
|
timeseries |
Historical stats (points, xG, minutes, etc.)
TYPE:
|
news |
Latest news/injury information
TYPE:
|
FullSquad
dataclass
¶
FullSquad(
squad_players: list[Player],
lineup: Squad,
bench: list[Player] = list(),
squad_cost: float = 0.0,
expected_points: float = 0.0,
)
Represents a 15-player FPL squad with a selected 11-player lineup.
The two-level FPL structure:
- Level 1: 15-player squad (2 GK, 5 DEF, 5 MID, 3 FWD) under budget.
- Level 2: 11-player starting lineup chosen from the squad each gameweek.
| ATTRIBUTE | DESCRIPTION |
|---|---|
squad_players |
All 15 squad members.
TYPE:
|
lineup |
The 11-player starting lineup (subset of squad_players).
TYPE:
|
bench |
The 4 bench players.
TYPE:
|
squad_cost |
Total cost of all 15 players.
TYPE:
|
expected_points |
Expected points for the starting 11.
TYPE:
|
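The Level 1 constraints described above can be checked with a few lines of Python. This is a minimal sketch, not the library's validator: `SketchPlayer` and `check_squad` are hypothetical names, and only the quotas (2 GK, 5 DEF, 5 MID, 3 FWD), the 15-player size, and the budget cap come from the description.

```python
from collections import Counter
from dataclasses import dataclass

# Quotas taken from the FullSquad description.
SQUAD_QUOTA = {"GK": 2, "DEF": 5, "MID": 5, "FWD": 3}


@dataclass
class SketchPlayer:
    """Hypothetical stand-in for fplx's Player (only the fields used here)."""
    name: str
    position: str
    price: float


def check_squad(players, budget=100.0):
    """Return a list of rule violations; an empty list means the squad is valid."""
    problems = []
    if len(players) != 15:
        problems.append(f"expected 15 players, got {len(players)}")
    counts = Counter(p.position for p in players)
    for position, quota in SQUAD_QUOTA.items():
        have = counts.get(position, 0)
        if have != quota:
            problems.append(f"{position}: expected {quota}, got {have}")
    cost = sum(p.price for p in players)
    if cost > budget:
        problems.append(f"cost {cost:.1f} exceeds budget {budget:.1f}")
    return problems
```

Returning a list of violations rather than raising makes it easy to report every broken rule at once when debugging an optimizer's output.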
summary
¶
Returns a formatted string summary of the full squad.
Source code in fplx/core/squad.py
Squad
dataclass
¶
Squad(
players: list[Player],
formation: str,
total_cost: float,
expected_points: float,
captain: Optional[Player] = None,
)
Represents an 11-player starting lineup.
| ATTRIBUTE | DESCRIPTION |
|---|---|
players |
Selected starters (exactly 11).
TYPE:
|
formation |
Formation string (e.g., "3-4-3").
TYPE:
|
total_cost |
Total cost of the starting 11.
TYPE:
|
expected_points |
Expected total points for the starting 11.
TYPE:
|
captain |
Captain selection (earns double points).
TYPE:
|
summary
¶
Returns a formatted string summary of the lineup.
Source code in fplx/core/squad.py
FPLDataLoader
¶
Load and manage FPL data from various sources (API, CSV, cache).
| PARAMETER | DESCRIPTION |
|---|---|
cache_dir
|
Directory to cache downloaded data
TYPE:
|
Source code in fplx/data/loaders.py
fetch_bootstrap_data
¶
Fetch main FPL data (players, teams, gameweeks).
| PARAMETER | DESCRIPTION |
|---|---|
force_refresh
|
Force refresh even if cached
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Dict
|
Bootstrap data containing players, teams, events |
Source code in fplx/data/loaders.py
load_players
¶
load_players(force_refresh: bool = False) -> list[Player]
Load all players with basic info.
| PARAMETER | DESCRIPTION |
|---|---|
force_refresh
|
Force refresh from API
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
List of Player objects |
Source code in fplx/data/loaders.py
load_player_history
¶
Load detailed historical data for a specific player.
| PARAMETER | DESCRIPTION |
|---|---|
player_id
|
Player ID
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Historical gameweek stats |
Source code in fplx/data/loaders.py
load_fixtures
¶
Load all fixtures.
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Fixtures data |
Source code in fplx/data/loaders.py
load_from_csv
¶
Load data from CSV file.
| PARAMETER | DESCRIPTION |
|---|---|
filepath
|
Path to CSV file
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Loaded data |
Source code in fplx/data/loaders.py
enrich_player_history
¶
Enrich players with full historical data.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
List of players to enrich
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
Players with enriched timeseries |
Source code in fplx/data/loaders.py
api
¶
API module.
FPLModel
¶
FPLModel(
budget: float = 100.0,
horizon: int = 1,
formation: str = "auto",
config: Optional[dict] = None,
)
High-level interface for FPL analysis and squad optimization.
This is the main user-facing API. It orchestrates data loading, feature engineering, model fitting, and squad optimization.
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Maximum squad budget (default 100.0)
TYPE:
|
horizon
|
Prediction horizon in gameweeks (default 1)
TYPE:
|
formation
|
Desired formation, or "auto" for optimization
TYPE:
|
config
|
Custom configuration
TYPE:
|
Examples:
>>> from fplx import FPLModel
>>> model = FPLModel(budget=100, horizon=1)
>>> model.load_data()
>>> model.fit()
>>> squad = model.select_best_11()
>>> squad.summary()
Source code in fplx/api/interface.py
load_data
¶
Load player and fixture data.
| PARAMETER | DESCRIPTION |
|---|---|
source
|
Data source: 'api' or 'local'
TYPE:
|
path
|
Path to local data (if source is 'local')
TYPE:
|
Source code in fplx/api/interface.py
fit
¶
Fit the prediction model.
Uses the probabilistic inference pipeline (HMM + Kalman + Fusion) when model_type is 'inference'. Falls back to the original feature engineering pipeline for baseline/ML models.
Source code in fplx/api/interface.py
select_best_11
¶
select_best_11() -> FullSquad
Select the optimal 15-player squad and 11-player starting lineup.
| RETURNS | DESCRIPTION |
|---|---|
FullSquad
|
The optimized squad with lineup. |
Source code in fplx/api/interface.py
interface
¶
High-level API interface for FPLX.
FPLModel
¶
FPLModel(
budget: float = 100.0,
horizon: int = 1,
formation: str = "auto",
config: Optional[dict] = None,
)
High-level interface for FPL analysis and squad optimization.
This is the main user-facing API. It orchestrates data loading, feature engineering, model fitting, and squad optimization.
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Maximum squad budget (default 100.0)
TYPE:
|
horizon
|
Prediction horizon in gameweeks (default 1)
TYPE:
|
formation
|
Desired formation, or "auto" for optimization
TYPE:
|
config
|
Custom configuration
TYPE:
|
Examples:
>>> from fplx import FPLModel
>>> model = FPLModel(budget=100, horizon=1)
>>> model.load_data()
>>> model.fit()
>>> squad = model.select_best_11()
>>> squad.summary()
Source code in fplx/api/interface.py
load_data
¶
Load player and fixture data.
| PARAMETER | DESCRIPTION |
|---|---|
source
|
Data source: 'api' or 'local'
TYPE:
|
path
|
Path to local data (if source is 'local')
TYPE:
|
Source code in fplx/api/interface.py
fit
¶
Fit the prediction model.
Uses the probabilistic inference pipeline (HMM + Kalman + Fusion) when model_type is 'inference'. Falls back to the original feature engineering pipeline for baseline/ML models.
Source code in fplx/api/interface.py
select_best_11
¶
select_best_11() -> FullSquad
Select the optimal 15-player squad and 11-player starting lineup.
| RETURNS | DESCRIPTION |
|---|---|
FullSquad
|
The optimized squad with lineup. |
Source code in fplx/api/interface.py
core
¶
Matchweek
dataclass
¶
Matchweek(
gameweek: int,
date: datetime,
fixtures: list[dict],
team_difficulty: dict[str, float],
)
Represents a matchweek with global context.
| ATTRIBUTE | DESCRIPTION |
|---|---|
gameweek |
Gameweek number
TYPE:
|
date |
Date of the gameweek
TYPE:
|
fixtures |
List of fixtures
TYPE:
|
team_difficulty |
Team-level difficulty ratings
TYPE:
|
Player
dataclass
¶
Player(
id: int,
name: str,
team: str,
position: str,
price: float,
timeseries: DataFrame,
news: Optional[dict] = None,
)
Represents a Fantasy Premier League player.
| ATTRIBUTE | DESCRIPTION |
|---|---|
id |
Unique player identifier
TYPE:
|
name |
Player full name
TYPE:
|
team |
Current team
TYPE:
|
position |
Position (GK, DEF, MID, FWD)
TYPE:
|
price |
Current price in FPL
TYPE:
|
timeseries |
Historical stats (points, xG, minutes, etc.)
TYPE:
|
news |
Latest news/injury information
TYPE:
|
FullSquad
dataclass
¶
FullSquad(
squad_players: list[Player],
lineup: Squad,
bench: list[Player] = list(),
squad_cost: float = 0.0,
expected_points: float = 0.0,
)
Represents a 15-player FPL squad with a selected 11-player lineup.
The two-level FPL structure:
- Level 1: 15-player squad (2 GK, 5 DEF, 5 MID, 3 FWD) under budget.
- Level 2: 11-player starting lineup chosen from the squad each gameweek.
| ATTRIBUTE | DESCRIPTION |
|---|---|
squad_players |
All 15 squad members.
TYPE:
|
lineup |
The 11-player starting lineup (subset of squad_players).
TYPE:
|
bench |
The 4 bench players.
TYPE:
|
squad_cost |
Total cost of all 15 players.
TYPE:
|
expected_points |
Expected points for the starting 11.
TYPE:
|
summary
¶
Returns a formatted string summary of the full squad.
Source code in fplx/core/squad.py
Squad
dataclass
¶
Squad(
players: list[Player],
formation: str,
total_cost: float,
expected_points: float,
captain: Optional[Player] = None,
)
Represents an 11-player starting lineup.
| ATTRIBUTE | DESCRIPTION |
|---|---|
players |
Selected starters (exactly 11).
TYPE:
|
formation |
Formation string (e.g., "3-4-3").
TYPE:
|
total_cost |
Total cost of the starting 11.
TYPE:
|
expected_points |
Expected total points for the starting 11.
TYPE:
|
captain |
Captain selection (earns double points).
TYPE:
|
summary
¶
Returns a formatted string summary of the lineup.
Source code in fplx/core/squad.py
matchweek
¶
Matchweek domain object.
Matchweek
dataclass
¶
Matchweek(
gameweek: int,
date: datetime,
fixtures: list[dict],
team_difficulty: dict[str, float],
)
Represents a matchweek with global context.
| ATTRIBUTE | DESCRIPTION |
|---|---|
gameweek |
Gameweek number
TYPE:
|
date |
Date of the gameweek
TYPE:
|
fixtures |
List of fixtures
TYPE:
|
team_difficulty |
Team-level difficulty ratings
TYPE:
|
player
¶
Player domain object.
Player
dataclass
¶
Player(
id: int,
name: str,
team: str,
position: str,
price: float,
timeseries: DataFrame,
news: Optional[dict] = None,
)
Represents a Fantasy Premier League player.
| ATTRIBUTE | DESCRIPTION |
|---|---|
id |
Unique player identifier
TYPE:
|
name |
Player full name
TYPE:
|
team |
Current team
TYPE:
|
position |
Position (GK, DEF, MID, FWD)
TYPE:
|
price |
Current price in FPL
TYPE:
|
timeseries |
Historical stats (points, xG, minutes, etc.)
TYPE:
|
news |
Latest news/injury information
TYPE:
|
squad
¶
Squad and FullSquad domain objects.
Squad
dataclass
¶
Squad(
players: list[Player],
formation: str,
total_cost: float,
expected_points: float,
captain: Optional[Player] = None,
)
Represents an 11-player starting lineup.
| ATTRIBUTE | DESCRIPTION |
|---|---|
players |
Selected starters (exactly 11).
TYPE:
|
formation |
Formation string (e.g., "3-4-3").
TYPE:
|
total_cost |
Total cost of the starting 11.
TYPE:
|
expected_points |
Expected total points for the starting 11.
TYPE:
|
captain |
Captain selection (earns double points).
TYPE:
|
summary
¶
Returns a formatted string summary of the lineup.
Source code in fplx/core/squad.py
FullSquad
dataclass
¶
FullSquad(
squad_players: list[Player],
lineup: Squad,
bench: list[Player] = list(),
squad_cost: float = 0.0,
expected_points: float = 0.0,
)
Represents a 15-player FPL squad with a selected 11-player lineup.
The two-level FPL structure:
- Level 1: 15-player squad (2 GK, 5 DEF, 5 MID, 3 FWD) under budget.
- Level 2: 11-player starting lineup chosen from the squad each gameweek.
| ATTRIBUTE | DESCRIPTION |
|---|---|
squad_players |
All 15 squad members.
TYPE:
|
lineup |
The 11-player starting lineup (subset of squad_players).
TYPE:
|
bench |
The 4 bench players.
TYPE:
|
squad_cost |
Total cost of all 15 players.
TYPE:
|
expected_points |
Expected points for the starting 11.
TYPE:
|
summary
¶
Returns a formatted string summary of the full squad.
Source code in fplx/core/squad.py
data
¶
Data loading and schema definitions.
FPLDataLoader
¶
Load and manage FPL data from various sources (API, CSV, cache).
| PARAMETER | DESCRIPTION |
|---|---|
cache_dir
|
Directory to cache downloaded data
TYPE:
|
Source code in fplx/data/loaders.py
fetch_bootstrap_data
¶
Fetch main FPL data (players, teams, gameweeks).
| PARAMETER | DESCRIPTION |
|---|---|
force_refresh
|
Force refresh even if cached
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Dict
|
Bootstrap data containing players, teams, events |
Source code in fplx/data/loaders.py
load_players
¶
load_players(force_refresh: bool = False) -> list[Player]
Load all players with basic info.
| PARAMETER | DESCRIPTION |
|---|---|
force_refresh
|
Force refresh from API
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
List of Player objects |
Source code in fplx/data/loaders.py
load_player_history
¶
Load detailed historical data for a specific player.
| PARAMETER | DESCRIPTION |
|---|---|
player_id
|
Player ID
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Historical gameweek stats |
Source code in fplx/data/loaders.py
load_fixtures
¶
Load all fixtures.
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Fixtures data |
Source code in fplx/data/loaders.py
load_from_csv
¶
Load data from CSV file.
| PARAMETER | DESCRIPTION |
|---|---|
filepath
|
Path to CSV file
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Loaded data |
Source code in fplx/data/loaders.py
enrich_player_history
¶
Enrich players with full historical data.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
List of players to enrich
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
Players with enriched timeseries |
Source code in fplx/data/loaders.py
VaastavLoader
¶
VaastavLoader(
season: str = "2023-24",
data_dir: Optional[str | Path] = None,
cache_dir: Optional[str | Path] = None,
)
Load historical FPL data from the vaastav dataset.
| PARAMETER | DESCRIPTION |
|---|---|
season
|
Season string, e.g. "2023-24".
TYPE:
|
data_dir
|
Path to a local clone. If None, fetches from GitHub.
TYPE:
|
cache_dir
|
Where to cache downloaded CSVs. Defaults to ~/.fplx/vaastav/.
TYPE:
|
Source code in fplx/data/vaastav_loader.py
load_merged_gw
¶
Load the merged gameweek file (all GWs, all players, one CSV).
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
One row per player-gameweek appearance. |
Source code in fplx/data/vaastav_loader.py
load_player_raw
¶
Load season-level player metadata.
load_gameweek
¶
build_player_objects
¶
build_player_objects(
up_to_gw: Optional[int] = None,
) -> list[Player]
Build Player objects with timeseries up to a given gameweek.
| PARAMETER | DESCRIPTION |
|---|---|
up_to_gw
|
Only include gameweeks 1..up_to_gw. If None, include all.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
|
Source code in fplx/data/vaastav_loader.py
get_actual_points
¶
Get actual points scored by each player in a specific gameweek.
For Double Gameweek players (two fixtures in the same round) the
points from both fixtures are summed, which is the correct FPL
score for that gameweek. The previous implementation used dict(zip(…))
which silently discarded the first fixture row when a player appeared
twice, underreporting DGW scores.
| RETURNS | DESCRIPTION |
|---|---|
dict[int, float]
|
{player_id: actual_points} (summed across fixtures for DGW players) |
Source code in fplx/data/vaastav_loader.py
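The difference between the two approaches mentioned for get_actual_points is easy to reproduce in plain Python. The rows below are made-up illustration data, not library output: player 7 has two fixture rows in one gameweek.

```python
from collections import defaultdict

# Hypothetical per-fixture rows: player 7 plays twice in the same gameweek.
rows = [
    {"player_id": 7, "points": 6.0},
    {"player_id": 7, "points": 8.0},
    {"player_id": 9, "points": 2.0},
]

ids = [r["player_id"] for r in rows]
pts = [r["points"] for r in rows]

# dict(zip(...)) keeps only the last value seen for a repeated key,
# so the first fixture of player 7 is silently lost.
overwritten = dict(zip(ids, pts))

# Summing per player keeps both fixtures.
summed = defaultdict(float)
for r in rows:
    summed[r["player_id"]] += r["points"]
summed = dict(summed)
```

Here `overwritten[7]` holds only the second fixture's 8.0, while `summed[7]` holds the full 14.0.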
get_fixture_info
¶
Get fixture context (opponent, home/away, xP) per player for a GW.
Source code in fplx/data/vaastav_loader.py
double_gameweek
¶
Double Gameweek (DGW) detection, timeseries aggregation, and prediction scaling.
A Double Gameweek occurs when a team plays two Premier League fixtures in the same FPL gameweek. From the perspective of the inference pipeline and optimizer, this has two distinct effects:
- Historical timeseries (training/inference input): The vaastav dataset stores each fixture as a separate row. A DGW player therefore has two rows sharing the same gameweek value. If not aggregated, the HMM will see them as two sequential timesteps with single-game-calibrated emissions, causing the model to misinterpret a large total (e.g. 14 pts from two good games) as a single "Star" observation when it is actually two "Good" observations.
- Forward prediction (next-GW forecast for ILP): When the upcoming gameweek is a DGW, a player plays twice. Their expected FPL points should be approximately 2× the single-game prediction (under independence), and their variance should also scale accordingly.
Usage
>>> from fplx.data.double_gameweek import (
...     detect_dgw_gameweeks,
...     aggregate_dgw_timeseries,
...     scale_predictions_for_dgw,
...     get_fixture_counts_from_bootstrap,
... )
detect_dgw_gameweeks
¶
Return a mapping of {gameweek: n_fixtures} for a single player's timeseries.
A gameweek with n_fixtures > 1 is a Double (or Triple) Gameweek.
| PARAMETER | DESCRIPTION |
|---|---|
timeseries
|
Per-fixture timeseries as returned by
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[int, int]
|
|
Examples:
>>> counts = detect_dgw_gameweeks(player.timeseries)
>>> dgw_gws = [gw for gw, n in counts.items() if n > 1]
Source code in fplx/data/double_gameweek.py
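The counting idea behind detect_dgw_gameweeks can be sketched without pandas. This is an illustration under an assumption: the input is a plain list of gameweek numbers (one per fixture row) rather than the DataFrame the library function accepts, and `count_fixtures_per_gameweek` is a hypothetical name.

```python
from collections import Counter


def count_fixtures_per_gameweek(gameweeks):
    """Map each gameweek number to how many fixture rows carry it."""
    return dict(Counter(gameweeks))


# One entry per fixture row; gameweek 2 appears twice, so it is a DGW.
gw_column = [1, 2, 2, 3]
counts = count_fixtures_per_gameweek(gw_column)
dgw_gws = [gw for gw, n in counts.items() if n > 1]
```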
aggregate_dgw_timeseries
¶
Collapse per-fixture rows into one normalised row per gameweek.
This is the single place where Double Gameweek handling lives. All downstream consumers (inference pipeline, enriched predictor, MV-HMM, Kalman Filter) always receive exactly one row per FPL decision period and never need to be aware of DGWs.
For a DGW gameweek (n_fixtures == 2):
- Additive stats (goals, minutes, bonus, …) are summed to reflect the total accumulated across both matches.
- Per-fixture normalisation is applied to points and to every additive stat that forms an inference feature. The normalised column is stored alongside the raw total:
    points       # raw total (used for scoring / oracle)
    points_norm  # per-fixture average (used by inference / HMM)
  The HMM emission distributions are calibrated on points_norm, so a DGW observation of 10 total points (points_norm = 5) is correctly interpreted as an "Average" game rather than misidentified as a "Star" event (8.5 pts single-game emission mean).
- Rate / expected stats (xG, xA, …) are averaged, since they already represent per-match rates.
- Context columns (price, opponent) take the last-fixture value.
For a single-fixture gameweek (n_fixtures == 1) the row is returned unchanged and points_norm == points.
| PARAMETER | DESCRIPTION |
|---|---|
timeseries
|
Raw per-fixture timeseries (may contain duplicate
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
One row per gameweek, sorted ascending by |
Source code in fplx/data/double_gameweek.py
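The aggregation rules above (sum additive stats, average rate stats, take the last context value, add a per-fixture normalised points column) can be sketched over plain dict rows. The column groupings and the name `aggregate_rows` are assumptions for illustration; the library operates on DataFrames and normalises more columns than the single `points_norm` shown here.

```python
from itertools import groupby

ADDITIVE = ("points", "minutes", "goals")  # summed across fixtures
RATES = ("xG",)                            # averaged across fixtures
CONTEXT = ("price", "opponent")            # last-fixture value


def aggregate_rows(rows):
    """Collapse per-fixture dict rows into one row per gameweek."""
    out = []
    # sorted() is stable, so fixtures keep their original order within a gameweek.
    ordered = sorted(rows, key=lambda r: r["gameweek"])
    for gw, group in groupby(ordered, key=lambda r: r["gameweek"]):
        fixtures = list(group)
        n = len(fixtures)
        row = {"gameweek": gw, "n_fixtures": n}
        for col in ADDITIVE:
            row[col] = sum(f[col] for f in fixtures)
        for col in RATES:
            row[col] = sum(f[col] for f in fixtures) / n
        for col in CONTEXT:
            row[col] = fixtures[-1][col]
        # Per-fixture average used by the inference components.
        row["points_norm"] = row["points"] / n
        out.append(row)
    return out
```

For a single-fixture gameweek `n` is 1, so every aggregate equals the original value and `points_norm == points`, matching the behaviour described above.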
scale_predictions_for_dgw
¶
scale_predictions_for_dgw(
expected_points: dict[int, float],
variances: dict[int, float],
downside_risks: dict[int, float],
fixture_counts: dict[int, int],
variance_mode: str = "additive",
) -> tuple[
dict[int, float], dict[int, float], dict[int, float]
]
Scale single-game predictions to account for a Double Gameweek.
For a player with n fixtures in the upcoming gameweek:
- Expected points: E[P_total] = n * E[P_single]
- Variance (additive, under independence): Var[P_total] = n * Var[P_single]
- Downside risk: DR_total = sqrt(n) * DR_single
This is exact under independence of the two match performances. The independence assumption is acceptable because FPL points in different matches are only weakly correlated (shared clean sheet probability for the same game counts for both defenders, but that is captured in the single-game variance estimate).
| PARAMETER | DESCRIPTION |
|---|---|
expected_points
|
Single-game expected points per player id.
TYPE:
|
variances
|
Single-game predictive variance per player id.
TYPE:
|
downside_risks
|
Single-game semi-deviation per player id.
TYPE:
|
fixture_counts
|
Number of upcoming fixtures per player id (1 for SGW, 2 for DGW). Players absent from this dict are assumed to have 1 fixture.
TYPE:
|
variance_mode
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
ep_scaled, var_scaled, dr_scaled : tuple of dicts
|
Scaled prediction dicts with the same keys as the inputs. |
Notes
Blank gameweek (BGW) players (n = 0) receive E[P] = 0,
Var[P] = 0.1, DR = 0. The optimizer will naturally exclude them
since their expected points are zero.
Examples:
>>> ep_scaled, var_scaled, dr_scaled = scale_predictions_for_dgw(
... expected_points, variances, downside_risks, fixture_counts
... )
Source code in fplx/data/double_gameweek.py
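The scaling rules for a single player can be written out directly. This is a sketch of the additive mode only, using the hypothetical name `scale_for_fixture_count`; the blank-gameweek values are the ones stated in the Notes. For reference, if two matches had correlation rho, the variance of the sum would be 2 * Var[P_single] * (1 + rho), so the additive rule corresponds to rho = 0.

```python
import math


def scale_for_fixture_count(ep, var, dr, n):
    """Scale single-game (mean, variance, downside risk) to n fixtures."""
    if n == 0:
        # Blank gameweek: values as given in the Notes for this function.
        return 0.0, 0.1, 0.0
    # Mean and variance grow linearly in n; downside risk grows like a
    # standard deviation, i.e. with sqrt(n).
    return n * ep, n * var, math.sqrt(n) * dr
```

Applying this per player id, with a default of one fixture for ids missing from the fixture-count mapping, mirrors the behaviour the parameter table describes.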
get_fixture_counts_from_bootstrap
¶
Derive per-player fixture counts for a gameweek from FPL bootstrap data.
Parses the fixtures list in the bootstrap-static response to count how
many fixtures each team plays in target_gw. Returns a player-level
mapping derived from each player's team id.
| PARAMETER | DESCRIPTION |
|---|---|
bootstrap
|
Full bootstrap-static API response containing
TYPE:
|
target_gw
|
The gameweek to inspect.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[int, int]
|
|
Source code in fplx/data/double_gameweek.py
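The two-step derivation (count fixtures per team, then map teams to players) might look like the following. The input layout is an assumption for illustration: a list of player dicts with `id` and `team`, and a list of fixture dicts with `event`, `team_h` and `team_a`. The function name `fixture_counts` is hypothetical.

```python
from collections import Counter


def fixture_counts(elements, fixtures, target_gw):
    """Count fixtures per team in target_gw, then map the count to each player."""
    per_team = Counter()
    for fx in fixtures:
        if fx.get("event") == target_gw:
            per_team[fx["team_h"]] += 1
            per_team[fx["team_a"]] += 1
    # Teams without a fixture in target_gw get 0 (a blank gameweek).
    return {el["id"]: per_team.get(el["team"], 0) for el in elements}
```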
get_fixture_counts_from_vaastav
¶
Derive per-player fixture counts for a historical gameweek from vaastav data.
Uses the merged_gw CSV to count how many rows each player has for
target_gw. This is the ground-truth fixture count for backtesting.
| PARAMETER | DESCRIPTION |
|---|---|
loader
|
An initialised loader instance.
TYPE:
|
target_gw
|
Gameweek to inspect.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[int, int]
|
|
Source code in fplx/data/double_gameweek.py
loaders
¶
Data loaders for FPL data sources.
FPLDataLoader
¶
Load and manage FPL data from various sources (API, CSV, cache).
| PARAMETER | DESCRIPTION |
|---|---|
cache_dir
|
Directory to cache downloaded data
TYPE:
|
Source code in fplx/data/loaders.py
fetch_bootstrap_data
¶
Fetch main FPL data (players, teams, gameweeks).
| PARAMETER | DESCRIPTION |
|---|---|
force_refresh
|
Force refresh even if cached
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Dict
|
Bootstrap data containing players, teams, events |
Source code in fplx/data/loaders.py
load_players
¶
load_players(force_refresh: bool = False) -> list[Player]
Load all players with basic info.
| PARAMETER | DESCRIPTION |
|---|---|
force_refresh
|
Force refresh from API
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
List of Player objects |
Source code in fplx/data/loaders.py
load_player_history
¶
Load detailed historical data for a specific player.
| PARAMETER | DESCRIPTION |
|---|---|
player_id
|
Player ID
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Historical gameweek stats |
Source code in fplx/data/loaders.py
load_fixtures
¶
Load all fixtures.
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Fixtures data |
Source code in fplx/data/loaders.py
load_from_csv
¶
Load data from CSV file.
| PARAMETER | DESCRIPTION |
|---|---|
filepath
|
Path to CSV file
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Loaded data |
Source code in fplx/data/loaders.py
enrich_player_history
¶
Enrich players with full historical data.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
List of players to enrich
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
Players with enriched timeseries |
Source code in fplx/data/loaders.py
news_collector
¶
News collection and per-gameweek persistence.
NewsSnapshot
¶
NewsSnapshot(
player_id: int,
gameweek: int,
news_text: str = "",
status: str = "a",
chance_this_round: Optional[float] = None,
chance_next_round: Optional[float] = None,
timestamp: str = "",
)
A single player's news state at a specific gameweek.
| ATTRIBUTE | DESCRIPTION |
|---|---|
player_id |
TYPE:
|
gameweek |
TYPE:
|
news_text |
Raw news string from FPL API.
TYPE:
|
status |
FPL status code: "a", "d", "i", "s", "u", "n".
TYPE:
|
chance_this_round |
Probability of playing this round (0-100 scale from API, stored as 0-1).
TYPE:
|
chance_next_round |
Probability of playing next round (0-1).
TYPE:
|
timestamp |
When the news was added (ISO format from API).
TYPE:
|
Source code in fplx/data/news_collector.py
to_news_signal_input
¶
Convert to the text format that NewsSignal.generate_signal() expects.
Combines the raw news text with status information to give the existing NewsParser richer input.
Source code in fplx/data/news_collector.py
NewsCollector
¶
Collects and persists player news snapshots per gameweek.
Usage (live):
    collector = NewsCollector(cache_dir="~/.fplx/news")
    collector.collect_from_bootstrap(bootstrap_data, gameweek=25)
    # Later, feed into inference:
    snapshots = collector.get_player_history(player_id=123)
Usage (backtest):
    collector = NewsCollector(cache_dir="~/.fplx/news")
    # Load all pre-collected snapshots
    for gw in range(1, 39):
        snapshots = collector.get_gameweek(gw)
        # inject into pipeline per player
| PARAMETER | DESCRIPTION |
|---|---|
cache_dir
|
Directory to persist snapshots as JSON.
TYPE:
|
Source code in fplx/data/news_collector.py
collect_from_bootstrap
¶
Extract news from a bootstrap-static API response.
This is the key method. Call it each gameweek with fresh API data.
| PARAMETER | DESCRIPTION |
|---|---|
bootstrap_data
|
Response from https://fantasy.premierleague.com/api/bootstrap-static/
TYPE:
|
gameweek
|
Current gameweek number.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
int
|
Number of players with active news. |
Source code in fplx/data/news_collector.py
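A rough sketch of the extraction step, assuming the bootstrap response lists players under `elements` with the fields `news`, `status`, `chance_of_playing_this_round`, `chance_of_playing_next_round` and `news_added`. The helper names, the plain-dict snapshots, and the rule for what counts as "active news" are illustrative assumptions; the library's own method returns a count and stores NewsSnapshot objects.

```python
def to_fraction(chance):
    """Convert the API's 0-100 chance (or None) to a 0-1 probability."""
    return None if chance is None else chance / 100.0


def extract_news(bootstrap_data, gameweek):
    """Build one plain-dict snapshot per player that has news or a non-'a' status."""
    snapshots = {}
    for el in bootstrap_data.get("elements", []):
        news = el.get("news") or ""
        status = el.get("status", "a")
        if not news and status == "a":
            continue  # assumed rule: skip fully available players without news
        snapshots[el["id"]] = {
            "player_id": el["id"],
            "gameweek": gameweek,
            "news_text": news,
            "status": status,
            "chance_this_round": to_fraction(el.get("chance_of_playing_this_round")),
            "chance_next_round": to_fraction(el.get("chance_of_playing_next_round")),
            "timestamp": el.get("news_added") or "",
        }
    return snapshots
```

The keys of each snapshot dict follow the NewsSnapshot attributes, so `len(extract_news(...))` plays the role of the returned player count.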
get_player_news
¶
get_player_news(
player_id: int, gameweek: int
) -> Optional[NewsSnapshot]
Get a specific player's news at a specific gameweek.
Source code in fplx/data/news_collector.py
get_player_history
¶
get_player_history(player_id: int) -> list[NewsSnapshot]
Get all news snapshots for a player across all collected gameweeks.
Returns list sorted by gameweek.
Source code in fplx/data/news_collector.py
get_gameweek
¶
get_gameweek(gameweek: int) -> dict[int, NewsSnapshot]
get_players_with_news
¶
get_players_with_news(gameweek: int) -> list[NewsSnapshot]
Get only players with non-trivial news at a gameweek.
Source code in fplx/data/news_collector.py
collect_season_from_api
¶
Collect news for all gameweeks in a season.
Requires calling the FPL API once per gameweek (the bootstrap-static endpoint only gives current-week news). For backtesting, you'd need to have cached the bootstrap data weekly during the season.
For a single-shot collection (current state only), just call collect_from_bootstrap() once with the current bootstrap data and the current gameweek number.
| PARAMETER | DESCRIPTION |
|---|---|
data_loader
|
Your existing data loader.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
int
|
Number of gameweeks collected. |
Source code in fplx/data/news_collector.py
schemas
¶
Data validation schemas for FPL data sources.
tft_dataset
¶
Dataset utilities for Temporal Fusion Transformer (TFT).
This module converts vaastav merged gameweek data into a global panel format
compatible with pytorch_forecasting.TimeSeriesDataSet.
build_tft_panel
¶
Build TFT panel dataframe from merged gameweek data.
Output schema includes:
- group_id: player identifier
- time_idx: gameweek index
- static categoricals: position, team
- known covariates: fixture_difficulty, is_home
- unknown covariates: xPts, mins_frac, news_sentiment, actual_points
Source code in fplx/data/tft_dataset.py
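To make the output schema concrete, the sketch below maps raw per-gameweek dict rows onto those column names. Everything about the input is an assumption: the raw keys (`player_id`, `gameweek`, `was_home`, `xP`, `minutes`, `total_points`, `fixture_difficulty`), the definition of mins_frac as minutes divided by 90 and capped at 1, and the default news_sentiment of 0.0. The name `build_panel_rows` is hypothetical and the library's own function works on DataFrames.

```python
def build_panel_rows(rows):
    """Map raw per-gameweek dict rows onto the panel schema listed above."""
    panel = []
    for r in rows:
        panel.append({
            "group_id": str(r["player_id"]),          # one series per player
            "time_idx": int(r["gameweek"]),           # integer time index
            "position": r["position"],                # static categorical
            "team": r["team"],                        # static categorical
            "fixture_difficulty": float(r["fixture_difficulty"]),  # known ahead of time
            "is_home": int(bool(r["was_home"])),      # known ahead of time
            "xPts": float(r["xP"]),                   # observed only after the match
            "mins_frac": min(r["minutes"] / 90.0, 1.0),
            "news_sentiment": float(r.get("news_sentiment", 0.0)),
            "actual_points": float(r["total_points"]),
        })
    # Sort so each player's series is contiguous and ordered in time.
    panel.sort(key=lambda p: (p["group_id"], p["time_idx"]))
    return panel
```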
make_tft_datasets
¶
make_tft_datasets(
panel_df: DataFrame,
training_cutoff: int,
encoder_length: int = 15,
prediction_length: int = 1,
)
Create TFT training and prediction datasets.
Requires optional dependency pytorch-forecasting.
Source code in fplx/data/tft_dataset.py
vaastav_loader
¶
Loader for the vaastav/Fantasy-Premier-League dataset.
Supports two modes:
1. Remote: fetch CSVs directly from GitHub (no clone needed).
2. Local: read from a cloned repo directory.
Usage (remote):
    loader = VaastavLoader(season="2023-24")
    players = loader.build_player_objects(up_to_gw=20)
Usage (local):
    loader = VaastavLoader(season="2023-24", data_dir="./Fantasy-Premier-League")
    players = loader.build_player_objects(up_to_gw=20)
Dataset: https://github.com/vaastav/Fantasy-Premier-League
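The remote/local switch can be pictured as a small path-resolution helper. This is a sketch under assumptions, not the loader's code: the function name `merged_gw_location` is hypothetical, and the `data/<season>/gws/merged_gw.csv` layout and the `master` branch are assumptions about the dataset repository that should be verified against the repo itself.

```python
from pathlib import Path

# Assumed raw-content base URL for the dataset repository.
RAW_BASE = "https://raw.githubusercontent.com/vaastav/Fantasy-Premier-League/master/data"


def merged_gw_location(season, data_dir=None):
    """Return a local path (if data_dir is given) or a raw GitHub URL for merged_gw.csv."""
    relative = f"{season}/gws/merged_gw.csv"
    if data_dir is not None:
        # Local mode: read from the cloned repository.
        return str(Path(data_dir) / "data" / relative)
    # Remote mode: fetch directly from GitHub.
    return f"{RAW_BASE}/{relative}"
```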
Double Gameweek handling
build_player_objects automatically calls
aggregate_dgw_timeseries on every player's raw timeseries before
constructing the Player object. This means all downstream consumers
(inference pipeline, MV-HMM, enriched predictor, Kalman Filter) always
receive exactly one row per FPL decision period.
For DGW gameweeks, the resulting row contains:
points – raw total (both fixtures summed, used for scoring / oracle)
points_norm – per-fixture average (used by inference components)
n_fixtures – number of fixtures played (1 for SGW, 2 for DGW)
The inference pipeline uses points_norm so that HMM emission distributions
remain calibrated on single-game-equivalent observations. The ILP objective
then scales back via scale_predictions_for_dgw to reflect the full DGW
opportunity.
VaastavLoader
¶
VaastavLoader(
season: str = "2023-24",
data_dir: Optional[str | Path] = None,
cache_dir: Optional[str | Path] = None,
)
Load historical FPL data from the vaastav dataset.
| PARAMETER | DESCRIPTION |
|---|---|
season
|
Season string, e.g. "2023-24".
TYPE:
|
data_dir
|
Path to a local clone. If None, fetches from GitHub.
TYPE:
|
cache_dir
|
Where to cache downloaded CSVs. Defaults to ~/.fplx/vaastav/.
TYPE:
|
Source code in fplx/data/vaastav_loader.py
load_merged_gw
¶
Load the merged gameweek file (all GWs, all players, one CSV).
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
One row per player-gameweek appearance. |
Source code in fplx/data/vaastav_loader.py
load_player_raw
¶
Load season-level player metadata.
load_gameweek
¶
build_player_objects
¶
build_player_objects(
up_to_gw: Optional[int] = None,
) -> list[Player]
Build Player objects with timeseries up to a given gameweek.
| PARAMETER | DESCRIPTION |
|---|---|
up_to_gw
|
Only include gameweeks 1..up_to_gw. If None, include all.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[Player]
|
|
Source code in fplx/data/vaastav_loader.py
get_actual_points
¶
Get actual points scored by each player in a specific gameweek.
For Double Gameweek players (two fixtures in the same round) the
points from both fixtures are summed, which is the correct FPL
score for that gameweek. The previous implementation used dict(zip(…))
which silently discarded the first fixture row when a player appeared
twice, underreporting DGW scores.
| RETURNS | DESCRIPTION |
|---|---|
dict[int, float]
|
{player_id: actual_points} (summed across fixtures for DGW players) |
Source code in fplx/data/vaastav_loader.py
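The dict(zip(…)) pitfall mentioned above is easy to reproduce with toy data. The player IDs and point values below are made up for illustration; the summing loop is one possible fix, not necessarily the one used in the library.

```python
from collections import defaultdict

player_ids = [7, 12, 7]      # player 7 appears twice (Double Gameweek)
points = [2.0, 5.0, 9.0]

# Naive mapping: the later row for player 7 overwrites the earlier one.
naive = dict(zip(player_ids, points))

# Summing mapping: both fixtures contribute to the gameweek score.
summed = defaultdict(float)
for pid, pts in zip(player_ids, points):
    summed[pid] += pts
summed = dict(summed)
```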
get_fixture_info
¶
Get fixture context (opponent, home/away, xP) per player for a GW.
Source code in fplx/data/vaastav_loader.py
evaluation
¶
Evaluation metrics for inference and optimization.
InferenceMetrics
dataclass
¶
InferenceMetrics(
predicted_means: list[float] = list(),
predicted_vars: list[float] = list(),
actuals: list[float] = list(),
model_predictions: dict[str, list[float]] = dict(),
)
Collects and computes inference evaluation metrics.
Usage:
    metrics = InferenceMetrics()
    for each player-gameweek:
        metrics.add(predicted_mean, predicted_var, actual_points)
    report = metrics.compute()
add
¶
add(
predicted_mean: float,
predicted_var: float,
actual: float,
model_preds: dict[str, float] | None = None,
)
Record a single prediction-actual pair.
Source code in fplx/evaluation/metrics.py
compute
¶
Compute all inference metrics.
Source code in fplx/evaluation/metrics.py
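The metrics returned by compute() are not listed in this section. As a rough sketch of what accuracy and calibration checks over (mean, variance, actual) triples can look like, the standalone function below computes MAE, RMSE, and the share of actuals falling inside a central predictive interval. The function name `summarize`, the returned keys, and the z = 1.96 interval are assumptions.

```python
import math


def summarize(predicted_means, predicted_vars, actuals, z=1.96):
    """Accuracy and interval-coverage summary (illustrative, not the library API)."""
    n = len(actuals)
    errors = [m - a for m, a in zip(predicted_means, actuals)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Calibration check: fraction of actuals within mean +/- z * std.
    inside = sum(
        1 for e, v in zip(errors, predicted_vars) if abs(e) <= z * math.sqrt(v)
    )
    return {"mae": mae, "rmse": rmse, "coverage": inside / n}


report = summarize(
    predicted_means=[4.0, 2.0, 6.0],
    predicted_vars=[4.0, 1.0, 9.0],
    actuals=[5.0, 2.0, 0.0],
)
```

If the predictive variances are well calibrated and roughly Gaussian, coverage at z = 1.96 should sit near 0.95 on a large sample.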
OptimizationMetrics
dataclass
¶
OptimizationMetrics(
strategy_points: dict[str, list[float]] = dict(),
oracle_points: list[float] = list(),
gameweeks: list[int] = list(),
)
Collects and computes optimization evaluation metrics.
Tracks actual points earned per gameweek under different strategies, and compares against oracle (hindsight-optimal).
Usage:
    metrics = OptimizationMetrics()
    for each gameweek:
        metrics.add_gameweek(gw, actual_points, oracle_points)
    report = metrics.compute()
add_gameweek
¶
Record actual points for one gameweek across strategies.
| PARAMETER | DESCRIPTION |
|---|---|
gw
|
Gameweek number.
TYPE:
|
strategy_results
|
{strategy_name: actual_points_earned}
TYPE:
|
oracle
|
Best possible points with hindsight.
TYPE:
|
Source code in fplx/evaluation/metrics.py
compute
¶
Compute optimization metrics for all strategies.
Source code in fplx/evaluation/metrics.py
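One way to compare strategies against the oracle is a relative optimality gap over the whole evaluation window. The sketch below assumes the `strategy_points` / `oracle_points` layout from the dataclass fields above; the function `optimality_gap` and its definition of the gap are assumptions, not the output of compute().

```python
def optimality_gap(strategy_points, oracle_points):
    """Relative shortfall versus the hindsight-optimal total (illustrative)."""
    total_oracle = sum(oracle_points)
    return {
        name: (total_oracle - sum(pts)) / total_oracle
        for name, pts in strategy_points.items()
    }


strategy_points = {"baseline": [50.0, 60.0], "inference": [55.0, 70.0]}
oracle_points = [100.0, 100.0]
gaps = optimality_gap(strategy_points, oracle_points)
```

A gap of 0 means the strategy matched the oracle; larger values mean more points were left on the table.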
metrics
¶
Metrics for evaluating inference accuracy and optimization quality.
Part I (18-662) metrics: prediction accuracy, calibration, ablation.
Part II (18-660) metrics: actual points, optimality gap, consistency.
InferenceMetrics
dataclass
¶
InferenceMetrics(
predicted_means: list[float] = list(),
predicted_vars: list[float] = list(),
actuals: list[float] = list(),
model_predictions: dict[str, list[float]] = dict(),
)
Collects and computes inference evaluation metrics.
Usage:
    metrics = InferenceMetrics()
    for each player-gameweek:
        metrics.add(predicted_mean, predicted_var, actual_points)
    report = metrics.compute()
add
¶
add(
predicted_mean: float,
predicted_var: float,
actual: float,
model_preds: dict[str, float] | None = None,
)
Record a single prediction-actual pair.
Source code in fplx/evaluation/metrics.py
compute
¶
Compute all inference metrics.
Source code in fplx/evaluation/metrics.py
OptimizationMetrics
dataclass
¶
OptimizationMetrics(
strategy_points: dict[str, list[float]] = dict(),
oracle_points: list[float] = list(),
gameweeks: list[int] = list(),
)
Collects and computes optimization evaluation metrics.
Tracks actual points earned per gameweek under different strategies, and compares against oracle (hindsight-optimal).
Usage:
    metrics = OptimizationMetrics()
    for each gameweek:
        metrics.add_gameweek(gw, actual_points, oracle_points)
    report = metrics.compute()
add_gameweek
¶
Record actual points for one gameweek across strategies.
| PARAMETER | DESCRIPTION |
|---|---|
gw
|
Gameweek number.
TYPE:
|
strategy_results
|
{strategy_name: actual_points_earned}
TYPE:
|
oracle
|
Best possible points with hindsight.
TYPE:
|
Source code in fplx/evaluation/metrics.py
compute
¶
Compute optimization metrics for all strategies.
Source code in fplx/evaluation/metrics.py
inference
¶
Probabilistic inference modules for FPLX.
HMMInference
¶
HMMInference(
transition_matrix: Optional[ndarray] = None,
emission_params: Optional[dict] = None,
initial_dist: Optional[ndarray] = None,
)
Hidden Markov Model for discrete player form states.
Supports dynamic transition matrix perturbation so that external signals (news, injuries) can shift state probabilities mid-sequence.
| PARAMETER | DESCRIPTION |
|---|---|
transition_matrix
|
transition_matrix[i,j] = P(S_{t+1}=j | S_t=i). Rows must sum to 1.
TYPE:
|
emission_params
|
{state_index: (mean, std)} for Gaussian emissions.
TYPE:
|
initial_dist
|
Prior over initial state.
TYPE:
|
Source code in fplx/inference/hmm.py
inject_news_perturbation
¶
Perturb transition matrix at a specific timestep based on news.
For each source state, the transition probability toward boosted target states is multiplied by the boost factor (scaled by confidence), then the row is renormalized.
| PARAMETER | DESCRIPTION |
|---|---|
timestep
|
The gameweek at which the perturbation applies.
TYPE:
|
state_boost
|
{target_state: multiplicative_boost}. E.g., {0: 10.0} means "10x more likely to transition to Injured."
TYPE:
|
confidence
|
Scales the perturbation. 0 = no effect, 1 = full effect.
TYPE:
|
Source code in fplx/inference/hmm.py
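The boost-and-renormalize step can be sketched as follows. How confidence scales the boost is not spelled out above; this sketch assumes an effective multiplier of 1 + confidence * (boost - 1), so that confidence 0 leaves the row unchanged and confidence 1 applies the full boost. The function name is hypothetical.

```python
def perturb_transition_matrix(transition_matrix, state_boost, confidence):
    """Boost transitions toward target states, then renormalize each row.

    Assumption: effective multiplier = 1 + confidence * (boost - 1).
    """
    perturbed = []
    for row in transition_matrix:
        new_row = list(row)
        for target, boost in state_boost.items():
            new_row[target] *= 1.0 + confidence * (boost - 1.0)
        total = sum(new_row)
        perturbed.append([p / total for p in new_row])
    return perturbed


# Toy 3-state matrix; state 0 plays the role of "Injured".
A = [
    [0.60, 0.30, 0.10],
    [0.10, 0.70, 0.20],
    [0.05, 0.25, 0.70],
]
A_news = perturb_transition_matrix(A, {0: 10.0}, confidence=1.0)
A_same = perturb_transition_matrix(A, {0: 10.0}, confidence=0.0)
```

Because every row is renormalized, raising the probability of one target state lowers the others proportionally.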
clear_perturbations
¶
forward
¶
Forward algorithm with dynamic transition matrices.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
forward_messages
|
Normalized forward messages. forward_messages[t] = P(S_t | y_1:t)
TYPE:
|
scale
|
Per-timestep normalization constants.
TYPE:
|
Source code in fplx/inference/hmm.py
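For reference, a normalized forward pass with Gaussian emissions looks roughly like this. It is a standalone sketch with a fixed transition matrix (no per-timestep perturbations) and made-up two-state parameters; it mirrors the documented return values but is not the library code.

```python
import math


def gaussian_pdf(y, mean, std):
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward(observations, transition_matrix, emission_params, initial_dist):
    """Return normalized messages P(S_t | y_1:t) and per-step scale factors."""
    n_states = len(initial_dist)
    messages, scales = [], []
    prior = list(initial_dist)
    for y in observations:
        # Update: weight the predicted belief by the emission likelihood.
        unnorm = [
            prior[s] * gaussian_pdf(y, *emission_params[s]) for s in range(n_states)
        ]
        c = sum(unnorm)
        alpha = [u / c for u in unnorm]
        messages.append(alpha)
        scales.append(c)
        # Predict: push the filtered belief through the transition matrix.
        prior = [
            sum(alpha[i] * transition_matrix[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return messages, scales


A = [[0.8, 0.2], [0.3, 0.7]]
emission_params = {0: (2.0, 1.0), 1: (8.0, 2.0)}  # {state: (mean, std)}
initial_dist = [0.5, 0.5]
observations = [2.0, 3.0, 9.0]
messages, scales = forward(observations, A, emission_params, initial_dist)
log_likelihood = sum(math.log(c) for c in scales)
```

The sum of the log scale factors gives the sequence log-likelihood, which is the quantity a convergence tolerance in EM would typically monitor.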
forward_backward
¶
Compute smoothed posteriors P(S_t | y_1:num_timesteps).
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
smoothed_posteriors
|
smoothed_posteriors[t, s] = P(S_t=s | y_1:num_timesteps)
TYPE:
|
Source code in fplx/inference/hmm.py
viterbi
¶
Most likely state sequence via Viterbi decoding.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
best_path
|
TYPE:
|
Source code in fplx/inference/hmm.py
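A log-space Viterbi decoder for the same kind of model can be sketched as below. The parameters are toy values, and the sketch assumes all transition and initial probabilities are strictly positive (otherwise `math.log` would fail).

```python
import math


def log_gauss(y, mean, std):
    z = (y - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def viterbi(observations, transition_matrix, emission_params, initial_dist):
    """Most likely state path under Gaussian emissions (illustrative sketch)."""
    n = len(initial_dist)
    delta = [
        math.log(initial_dist[s]) + log_gauss(observations[0], *emission_params[s])
        for s in range(n)
    ]
    backpointers = []
    for y in observations[1:]:
        new_delta, pointers = [], []
        for j in range(n):
            # Best predecessor state for landing in state j.
            best_i = max(
                range(n), key=lambda i: delta[i] + math.log(transition_matrix[i][j])
            )
            pointers.append(best_i)
            new_delta.append(
                delta[best_i]
                + math.log(transition_matrix[best_i][j])
                + log_gauss(y, *emission_params[j])
            )
        delta = new_delta
        backpointers.append(pointers)

    # Backtrack from the best final state.
    path = [max(range(n), key=lambda s: delta[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


A = [[0.8, 0.2], [0.3, 0.7]]
emission_params = {0: (2.0, 1.0), 1: (8.0, 2.0)}
best_path = viterbi([2.0, 3.0, 9.0, 8.0, 1.0], A, emission_params, [0.5, 0.5])
```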
predict_next
¶
Predict next timestep's points distribution.
Runs forward algorithm, then propagates one step ahead via the transition matrix.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
E[Y_{num_timesteps+1} | y_1:num_timesteps]
TYPE:
|
variance
|
Var[Y_{num_timesteps+1} | y_1:num_timesteps] (from law of total variance)
TYPE:
|
next_state_dist
|
P(S_{num_timesteps+1} | y_1:num_timesteps)
TYPE:
|
Source code in fplx/inference/hmm.py
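The one-step-ahead moments follow from propagating the last filtered belief through the transition matrix and applying the law of total variance. The sketch below starts from a given filtered belief instead of running the forward pass, and uses toy parameters.

```python
def predict_next(filtered_belief, transition_matrix, emission_params):
    """One-step-ahead mean, variance, and state distribution (illustrative)."""
    n = len(filtered_belief)
    next_state_dist = [
        sum(filtered_belief[i] * transition_matrix[i][j] for i in range(n))
        for j in range(n)
    ]
    means = [emission_params[s][0] for s in range(n)]
    variances = [emission_params[s][1] ** 2 for s in range(n)]

    expected = sum(p * m for p, m in zip(next_state_dist, means))
    # Law of total variance: E[Var(Y | S)] + Var(E[Y | S]).
    within = sum(p * v for p, v in zip(next_state_dist, variances))
    between = sum(p * (m - expected) ** 2 for p, m in zip(next_state_dist, means))
    return expected, within + between, next_state_dist


A = [[0.5, 0.5], [0.3, 0.7]]
emission_params = {0: (2.0, 1.0), 1: (8.0, 2.0)}  # {state: (mean, std)}
expected_points, variance, next_state_dist = predict_next([1.0, 0.0], A, emission_params)
```

Here the between-state term (9.0) dominates the within-state term (2.5): most of the predictive uncertainty comes from not knowing which state the player will be in.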
fit
¶
Learn transition matrix and emission parameters via Baum-Welch EM.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
Training sequence.
TYPE:
|
n_iter
|
Maximum EM iterations.
TYPE:
|
tol
|
Convergence tolerance on log-likelihood.
TYPE:
|
verbose
|
Print progress.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
self
|
|
Source code in fplx/inference/hmm.py
KalmanFilter
¶
KalmanFilter(
process_noise: float = 1.0,
observation_noise: float = 4.0,
initial_state_mean: float = 4.0,
initial_state_covariance: float = 2.0,
)
1D Kalman Filter for tracking latent point potential.
| PARAMETER | DESCRIPTION |
|---|---|
process_noise
|
Default process noise variance (form drift rate).
TYPE:
|
observation_noise
|
Default observation noise variance (weekly point noise).
TYPE:
|
initial_state_mean
|
Initial state estimate.
TYPE:
|
initial_state_covariance
|
Initial state uncertainty (variance).
TYPE:
|
Source code in fplx/inference/kalman.py
inject_process_shock
¶
Inflate process noise at a specific timestep.
Use when news indicates a sudden form change (injury, transfer). process_noise_t = default_process_noise * multiplier.
| PARAMETER | DESCRIPTION |
|---|---|
timestep
|
Gameweek index.
TYPE:
|
multiplier
|
Process noise multiplier (>1 = more uncertainty about form drift).
TYPE:
|
Source code in fplx/inference/kalman.py
inject_observation_noise
¶
Adjust observation noise at a specific timestep.
Use for fixture difficulty: harder opponents → less predictable points. observation_noise_t = default_observation_noise * factor.
| PARAMETER | DESCRIPTION |
|---|---|
timestep
|
Gameweek index.
TYPE:
|
factor
|
Observation noise factor (>1 = harder fixture, noisier observation).
TYPE:
|
Source code in fplx/inference/kalman.py
clear_overrides
¶
get_process_noise_override
¶
set_noise_overrides
¶
set_noise_overrides(
process_noise_overrides: dict[int, float],
observation_noise_overrides: dict[int, float],
)
Replace per-timestep noise overrides.
Source code in fplx/inference/kalman.py
copy_with_overrides
¶
copy_with_overrides(
max_timestep: Optional[int] = None,
) -> KalmanFilter
Create a parameter-identical filter with copied noise overrides.
| PARAMETER | DESCRIPTION |
|---|---|
max_timestep
|
If provided, only overrides for timesteps <= max_timestep are copied.
TYPE:
|
Source code in fplx/inference/kalman.py
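The `max_timestep` cut-off amounts to filtering the override dictionaries by key. A minimal sketch, assuming overrides are stored as `{timestep: value}` dicts as in set_noise_overrides (the helper name is hypothetical):

```python
def truncate_overrides(overrides, max_timestep=None):
    """Copy overrides, keeping only timesteps <= max_timestep when it is given."""
    if max_timestep is None:
        return dict(overrides)
    return {t: v for t, v in overrides.items() if t <= max_timestep}


process_noise_overrides = {5: 10.0, 12: 3.0, 20: 8.0}
up_to_12 = truncate_overrides(process_noise_overrides, max_timestep=12)
full_copy = truncate_overrides(process_noise_overrides)
```

This kind of truncation is presumably useful in backtests, where a filter evaluated at gameweek 12 should not see shocks injected for later gameweeks.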
filter
¶
Run Kalman filter on observations with per-timestep noise.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
filtered_state_means
|
Filtered state estimates (posterior mean).
TYPE:
|
filtered_state_covariances
|
Filtered state uncertainties (posterior variance).
TYPE:
|
Source code in fplx/inference/kalman.py
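A scalar random-walk Kalman filter with per-timestep noise scaling can be sketched as follows. The defaults mirror the constructor defaults above; the argument names `q_multipliers` and `r_factors`, and the choice to apply process noise before the first update, are assumptions.

```python
def kalman_filter(observations, q=1.0, r=4.0, x0=4.0, p0=2.0,
                  q_multipliers=None, r_factors=None):
    """1D Kalman filter with optional per-timestep noise scaling (illustrative)."""
    q_multipliers = q_multipliers or {}
    r_factors = r_factors or {}
    x, p = x0, p0
    means, variances = [], []
    for t, y in enumerate(observations):
        q_t = q * q_multipliers.get(t, 1.0)   # process shock at timestep t
        r_t = r * r_factors.get(t, 1.0)       # observation noise at timestep t
        # Predict: random walk, so the mean carries over and the variance grows.
        p_pred = p + q_t
        # Update: blend prediction and observation via the Kalman gain.
        gain = p_pred / (p_pred + r_t)
        x = x + gain * (y - x)
        p = (1.0 - gain) * p_pred
        means.append(x)
        variances.append(p)
    return means, variances


obs = [4.0, 4.0, 4.0, 10.0]
means_plain, _ = kalman_filter(obs)
means_shock, _ = kalman_filter(obs, q_multipliers={3: 10.0})
means_noisy, _ = kalman_filter(obs, r_factors={3: 5.0})
```

With a process shock at the last timestep the estimate moves further toward the 10-point haul; with inflated observation noise it moves less.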
predict_next
¶
Predict next observation with uncertainty.
Returns the predictive distribution for Y_{t+1} (the observation), not X_{t+1} (the latent state). This ensures consistency with the HMM predict_next which also returns observation-level variance.
Var[Y_{t+1}] = Var[X_{t+1}|y_{1:t}] + R = (P_t + Q) + R
Must call filter() first.
| RETURNS | DESCRIPTION |
|---|---|
predicted_mean
|
E[Y_{t+1} | y_{1:t}].
TYPE:
|
predicted_var
|
Var[Y_{t+1} | y_{1:t}] (observation-level, includes R).
TYPE:
|
Source code in fplx/inference/kalman.py
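The variance formula above can be checked with a few lines of arithmetic. This standalone function takes the last filtered mean and variance as inputs; the defaults for `q` and `r` mirror the constructor defaults and the function is a sketch, not the library method.

```python
def predict_next(filtered_mean, filtered_var, q=1.0, r=4.0):
    """Observation-level one-step forecast for a random-walk state (illustrative)."""
    predicted_mean = filtered_mean      # E[X_{t+1}] = E[X_t] under a random walk
    state_var = filtered_var + q        # Var[X_{t+1} | y_{1:t}] = P_t + Q
    predicted_var = state_var + r       # add observation noise R
    return predicted_mean, predicted_var


predicted_mean, predicted_var = predict_next(filtered_mean=5.0, filtered_var=1.5)
```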
smooth
¶
Run RTS smoother (backward pass after forward Kalman filter).
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
smoothed_state_means
|
Smoothed state estimates.
TYPE:
|
smoothed_state_covariances
|
Smoothed state uncertainties.
TYPE:
|
Source code in fplx/inference/kalman.py
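For a scalar random-walk model the RTS backward pass reduces to a short loop. The sketch below assumes a constant process noise `q` and takes already-filtered means and variances as input; the numbers are toy values.

```python
def rts_smooth(filtered_means, filtered_vars, q=1.0):
    """Rauch-Tung-Striebel smoother for a 1D random walk (illustrative)."""
    n = len(filtered_means)
    sm_means = list(filtered_means)
    sm_vars = list(filtered_vars)
    for t in range(n - 2, -1, -1):
        p_pred = filtered_vars[t] + q    # Var[X_{t+1} | y_{1:t}]
        c = filtered_vars[t] / p_pred    # smoother gain
        sm_means[t] = filtered_means[t] + c * (sm_means[t + 1] - filtered_means[t])
        sm_vars[t] = filtered_vars[t] + c * c * (sm_vars[t + 1] - p_pred)
    return sm_means, sm_vars


filtered_means = [4.0, 5.0, 7.0]
filtered_vars = [1.7, 1.6, 1.6]
sm_means, sm_vars = rts_smooth(filtered_means, filtered_vars)
```

The last smoothed estimate equals the last filtered one; earlier estimates are pulled toward later evidence and their variances shrink.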
MultivariateHMM
¶
MultivariateHMM(
position: str = "MID",
transition_matrix: Optional[ndarray] = None,
initial_dist: Optional[ndarray] = None,
)
Position-aware HMM with multivariate diagonal Gaussian emissions.
| PARAMETER | DESCRIPTION |
|---|---|
position
|
GK, DEF, MID, FWD. Determines feature set and default emissions.
TYPE:
|
Source code in fplx/inference/multivariate_hmm.py
inject_news_perturbation
¶
Perturb transition matrix at timestep (same API as scalar HMM).
Source code in fplx/inference/multivariate_hmm.py
forward
¶
Forward algorithm. observations: (T, D).
Source code in fplx/inference/multivariate_hmm.py
forward_backward
¶
Smoothed posteriors P(S_t | y_{1:T}).
Source code in fplx/inference/multivariate_hmm.py
viterbi
¶
Most likely state sequence.
Source code in fplx/inference/multivariate_hmm.py
predict_next_features
¶
Predict next gameweek's feature vector.
Returns mean, var (per feature), and state distribution.
Source code in fplx/inference/multivariate_hmm.py
one_step_point_predictions
¶
One-step-ahead point predictions for each historical timestep.
Returns array preds where preds[t] predicts points at timestep t, using information up to t-1 (preds[0] is NaN).
Source code in fplx/inference/multivariate_hmm.py
predict_next_points
¶
Convert predicted features → expected FPL points.
Uses FPL scoring rules applied to predicted feature rates.
Source code in fplx/inference/multivariate_hmm.py
fit
¶
Baum-Welch EM with MAP-style prior interpolation.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
Feature matrix with shape (T, D).
TYPE:
|
n_iter
|
Maximum EM iterations.
TYPE:
|
tol
|
Convergence tolerance on log-likelihood.
TYPE:
|
prior_weight
|
Weight on prior parameters in [0, 1]. Higher values increase regularization toward position-level default emissions/transitions.
TYPE:
|
Source code in fplx/inference/multivariate_hmm.py
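The exact update rule behind `prior_weight` is not shown here. One common reading of "prior interpolation" is a convex combination of the prior parameter and the EM (maximum-likelihood) estimate, sketched below with a hypothetical helper and made-up emission means.

```python
def interpolate_with_prior(prior_value, mle_value, prior_weight):
    """Convex blend of prior and data-driven estimate (assumed form)."""
    if not 0.0 <= prior_weight <= 1.0:
        raise ValueError("prior_weight must lie in [0, 1]")
    return prior_weight * prior_value + (1.0 - prior_weight) * mle_value


prior_means = [0.30, 0.15, 0.90]   # position-level defaults (made up)
mle_means = [0.70, 0.05, 1.00]     # per-player EM estimates (made up)
blended = [
    interpolate_with_prior(p, m, prior_weight=0.25)
    for p, m in zip(prior_means, mle_means)
]
```

With prior_weight = 0 this reduces to plain Baum-Welch; with prior_weight = 1 the parameters never leave their defaults.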
InferenceResult
dataclass
¶
InferenceResult(
filtered_beliefs: ndarray,
smoothed_beliefs: ndarray,
viterbi_path: ndarray,
hmm_predicted_mean: float = 0.0,
hmm_predicted_var: float = 0.0,
kalman_filtered: ndarray = (lambda: array([]))(),
kalman_uncertainty: ndarray = (lambda: array([]))(),
kf_predicted_mean: float = 0.0,
kf_predicted_var: float = 0.0,
fused_mean: ndarray = (lambda: array([]))(),
fused_var: ndarray = (lambda: array([]))(),
fusion_alpha: Optional[float] = None,
predicted_mean: float = 0.0,
predicted_var: float = 0.0,
)
Container for inference pipeline outputs.
PlayerInferencePipeline
¶
PlayerInferencePipeline(
hmm_params: Optional[dict] = None,
kf_params: Optional[dict] = None,
hmm_variance_floor: float = 1.0,
news_params: Optional[dict] = None,
fusion_mode: str = "precision",
fusion_params: Optional[dict] = None,
)
Orchestrates HMM + Kalman inference for a single player.
| PARAMETER | DESCRIPTION |
|---|---|
hmm_params
|
Override HMM parameters: transition_matrix, emission_params, initial_dist.
TYPE:
|
kf_params
|
Override Kalman parameters: Q, R, x0, P0.
TYPE:
|
Source code in fplx/inference/pipeline.py
ingest_observations
¶
Set the player's historical points sequence.
| PARAMETER | DESCRIPTION |
|---|---|
points
|
Weekly points history.
TYPE:
|
Source code in fplx/inference/pipeline.py
inject_news
¶
Inject a news signal into the inference at a specific gameweek.
Bridges from existing NewsSignal.generate_signal() output format.
| PARAMETER | DESCRIPTION |
|---|---|
news_signal
|
Output from NewsSignal.generate_signal(). Must contain: 'availability', 'minutes_risk', 'confidence'.
TYPE:
|
timestep
|
The gameweek index to apply the perturbation.
TYPE:
|
Source code in fplx/inference/pipeline.py
inject_fixture_difficulty
¶
Inject fixture difficulty into Kalman observation noise.
| PARAMETER | DESCRIPTION |
|---|---|
difficulty
|
Fixture difficulty score (1-5, from FixtureSignal).
TYPE:
|
timestep
|
The gameweek index.
TYPE:
|
Source code in fplx/inference/pipeline.py
run
¶
run() -> InferenceResult
Run full inference pipeline: HMM + Kalman + Fusion.
| RETURNS | DESCRIPTION |
|---|---|
InferenceResult
|
All inference outputs. |
Source code in fplx/inference/pipeline.py
predict_next
¶
Get the fused one-step-ahead forecast.
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
TYPE:
|
variance
|
TYPE:
|
Source code in fplx/inference/pipeline.py
learn_parameters
¶
Run Baum-Welch to learn HMM parameters from current observations.
Call this before run() if you want data-driven parameters.
Source code in fplx/inference/pipeline.py
batch_enriched_predict
¶
Run enriched prediction for all players. Returns ep, var, downside_risk dicts.
Source code in fplx/inference/enriched.py
compute_xpoints
¶
Compute per-GW expected points from ALL underlying components.
Source code in fplx/inference/enriched.py
enriched_predict
¶
Predict expected points with fixture awareness and semi-variance.
| PARAMETER | DESCRIPTION |
|---|---|
timeseries
|
TYPE:
|
position
|
TYPE:
|
alpha
|
EWMA decay.
TYPE:
|
lookback
|
Max recent GWs (increased from 10 to 15 for more data).
TYPE:
|
upcoming_fixture
|
{"was_home": bool, "opponent_team": int, "xP": float}
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
TYPE:
|
variance
|
TYPE:
|
downside_risk
|
TYPE:
|
Source code in fplx/inference/enriched.py
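The `alpha` and `lookback` parameters suggest an exponentially weighted average over the most recent gameweeks. The sketch below assumes that an observation `age` gameweeks old gets weight (1 - alpha) ** age; the function name, the default `alpha`, and this weighting convention are assumptions.

```python
def ewma_recent(values, alpha=0.3, lookback=15):
    """Exponentially weighted mean of the last `lookback` values (illustrative)."""
    recent = values[-lookback:]
    # Newest value has age 0 and weight 1; older values decay geometrically.
    weights = [(1.0 - alpha) ** age for age in range(len(recent) - 1, -1, -1)]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)


history = [2.0, 2.0, 2.0, 10.0]
ewma_all = ewma_recent(history, alpha=0.5)
ewma_last = ewma_recent(history, alpha=0.5, lookback=1)
```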
fuse_estimates
¶
fuse_estimates(
hmm_mean: float,
hmm_var: float,
kf_mean: float,
kf_var: float,
) -> tuple[float, float]
Fuse a single HMM estimate with a single Kalman estimate.
Uses inverse-variance weighting:
    fused_mean = (hmm_mean/hmm_var + kf_mean/kf_var) / (1/hmm_var + 1/kf_var)
    fused_var = 1 / (1/hmm_var + 1/kf_var)
| PARAMETER | DESCRIPTION |
|---|---|
hmm_mean
|
HMM expected points (from state posterior weighted emission means).
TYPE:
|
hmm_var
|
HMM variance (law of total variance over state posterior).
TYPE:
|
kf_mean
|
Kalman filtered point estimate.
TYPE:
|
kf_var
|
Kalman filtered uncertainty (posterior variance).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
fused_mean
|
TYPE:
|
fused_var
|
TYPE:
|
Source code in fplx/inference/fusion.py
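The formula above translates directly into code. This is a standalone re-implementation of the stated equations for illustration, without any guard against zero variances, and is not the library source.

```python
def fuse_estimates(hmm_mean, hmm_var, kf_mean, kf_var):
    """Inverse-variance (precision-weighted) fusion of two estimates."""
    precision = 1.0 / hmm_var + 1.0 / kf_var
    fused_mean = (hmm_mean / hmm_var + kf_mean / kf_var) / precision
    fused_var = 1.0 / precision
    return fused_mean, fused_var


fused_mean, fused_var = fuse_estimates(hmm_mean=6.0, hmm_var=4.0, kf_mean=4.0, kf_var=1.0)
```

The fused mean (4.4) sits closer to the Kalman estimate because its variance is four times smaller, and the fused variance (0.8) is below both inputs.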
fuse_sequences
¶
fuse_sequences(
hmm_gamma: ndarray,
kalman_x: ndarray,
kalman_P: ndarray,
emission_params: dict,
) -> tuple[ndarray, ndarray]
Fuse full sequences of HMM posteriors and Kalman estimates.
| PARAMETER | DESCRIPTION |
|---|---|
hmm_gamma
|
Smoothed state posteriors from HMM.
TYPE:
|
kalman_x
|
Kalman filtered estimates.
TYPE:
|
kalman_P
|
Kalman filtered uncertainties.
TYPE:
|
emission_params
|
{state_index: (mean, std)} from HMM.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
fused_mean
|
TYPE:
|
fused_var
|
TYPE:
|
Source code in fplx/inference/fusion.py
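For sequences, each timestep first needs HMM moments derived from the state posterior and the emission parameters, which are then fused with the Kalman estimate. The sketch below uses plain lists instead of ndarrays and assumes the same inverse-variance rule as fuse_estimates; it is illustrative only.

```python
def fuse_sequences(hmm_gamma, kalman_x, kalman_P, emission_params):
    """Per-timestep precision-weighted fusion (illustrative sketch)."""
    fused_mean, fused_var = [], []
    for gamma_t, x_t, p_t in zip(hmm_gamma, kalman_x, kalman_P):
        states = range(len(gamma_t))
        means = [emission_params[s][0] for s in states]
        stds = [emission_params[s][1] for s in states]
        # HMM moments: posterior-weighted mean and law of total variance.
        hmm_mean = sum(g * m for g, m in zip(gamma_t, means))
        hmm_var = sum(
            g * (sd ** 2 + (m - hmm_mean) ** 2)
            for g, m, sd in zip(gamma_t, means, stds)
        )
        precision = 1.0 / hmm_var + 1.0 / p_t
        fused_mean.append((hmm_mean / hmm_var + x_t / p_t) / precision)
        fused_var.append(1.0 / precision)
    return fused_mean, fused_var


emission_params = {0: (2.0, 1.0), 1: (8.0, 2.0)}   # {state: (mean, std)}
hmm_gamma = [[1.0, 0.0], [0.5, 0.5]]
fused_mean, fused_var = fuse_sequences(
    hmm_gamma, kalman_x=[2.0, 5.0], kalman_P=[1.0, 11.5], emission_params=emission_params
)
```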
build_feature_matrix
¶
Extract position-specific feature matrix from player timeseries.
| PARAMETER | DESCRIPTION |
|---|---|
timeseries
|
Player gameweek history from vaastav dataset.
TYPE:
|
position
|
GK, DEF, MID, or FWD.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
np.ndarray, shape (T, D) where D depends on position.
|
|
Source code in fplx/inference/multivariate_hmm.py
enriched
¶
Fixture-aware enriched prediction with semi-variance for downside risk.
Improvements over base enriched:
- Cards, own goals, penalties (negative pts previously unmodeled)
- Home/away adjustment from player history
- Opponent strength adjustment from player history
- Ensemble with FPL's xP when available
- Semi-variance: only penalize downside deviation below E[P]
- Longer lookback with exponential decay (more data, recency bias)
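To make the semi-variance idea concrete, the sketch below compares it with ordinary variance on a player whose history is dominated by one large haul. The helper name and the choice to divide by the full sample size are assumptions.

```python
def downside_semivariance(points, expected):
    """Mean squared shortfall below the expected value (illustrative)."""
    shortfalls = [min(p - expected, 0.0) for p in points]
    return sum(s * s for s in shortfalls) / len(points)


points = [2.0, 2.0, 2.0, 14.0]
expected = sum(points) / len(points)                       # 5.0
full_variance = sum((p - expected) ** 2 for p in points) / len(points)
semi_variance = downside_semivariance(points, expected)
```

The 14-point haul inflates the ordinary variance to 27.0 but contributes nothing to the semi-variance of 6.75, so upside surprises are not treated as risk.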
compute_xpoints
¶
Compute per-GW expected points from ALL underlying components.
Source code in fplx/inference/enriched.py
enriched_predict
¶
Predict expected points with fixture awareness and semi-variance.
| PARAMETER | DESCRIPTION |
|---|---|
timeseries
|
TYPE:
|
position
|
TYPE:
|
alpha
|
EWMA decay.
TYPE:
|
lookback
|
Max recent GWs (increased from 10 to 15 for more data).
TYPE:
|
upcoming_fixture
|
{"was_home": bool, "opponent_team": int, "xP": float}
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
TYPE:
|
variance
|
TYPE:
|
downside_risk
|
TYPE:
|
Source code in fplx/inference/enriched.py
batch_enriched_predict
¶
Run enriched prediction for all players. Returns ep, var, downside_risk dicts.
Source code in fplx/inference/enriched.py
fusion
¶
Fusion of HMM and Kalman Filter outputs.
Combines discrete state posteriors (HMM) with continuous estimates (Kalman) using inverse-variance weighting — optimal under Gaussian independence.
fuse_estimates
¶
fuse_estimates(
hmm_mean: float,
hmm_var: float,
kf_mean: float,
kf_var: float,
) -> tuple[float, float]
Fuse a single HMM estimate with a single Kalman estimate.
Uses inverse-variance weighting:
    fused_mean = (hmm_mean/hmm_var + kf_mean/kf_var) / (1/hmm_var + 1/kf_var)
    fused_var = 1 / (1/hmm_var + 1/kf_var)
| PARAMETER | DESCRIPTION |
|---|---|
hmm_mean
|
HMM expected points (from state posterior weighted emission means).
TYPE:
|
hmm_var
|
HMM variance (law of total variance over state posterior).
TYPE:
|
kf_mean
|
Kalman filtered point estimate.
TYPE:
|
kf_var
|
Kalman filtered uncertainty (posterior variance).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
fused_mean
|
TYPE:
|
fused_var
|
TYPE:
|
Source code in fplx/inference/fusion.py
fuse_sequences
¶
fuse_sequences(
hmm_gamma: ndarray,
kalman_x: ndarray,
kalman_P: ndarray,
emission_params: dict,
) -> tuple[ndarray, ndarray]
Fuse full sequences of HMM posteriors and Kalman estimates.
| PARAMETER | DESCRIPTION |
|---|---|
hmm_gamma
|
Smoothed state posteriors from HMM.
TYPE:
|
kalman_x
|
Kalman filtered estimates.
TYPE:
|
kalman_P
|
Kalman filtered uncertainties.
TYPE:
|
emission_params
|
{state_index: (mean, std)} from HMM.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
fused_mean
|
TYPE:
|
fused_var
|
TYPE:
|
Source code in fplx/inference/fusion.py
hmm
¶
Hidden Markov Model for player form state inference.
Implements:
- Forward algorithm (online filtering)
- Forward-Backward (offline smoothing)
- Viterbi decoding (most likely state sequence)
- Dynamic transition matrix perturbation (news signal injection)
- Baum-Welch parameter learning (EM)
- One-step-ahead prediction with uncertainty
HMMInference
¶
HMMInference(
transition_matrix: Optional[ndarray] = None,
emission_params: Optional[dict] = None,
initial_dist: Optional[ndarray] = None,
)
Hidden Markov Model for discrete player form states.
Supports dynamic transition matrix perturbation so that external signals (news, injuries) can shift state probabilities mid-sequence.
| PARAMETER | DESCRIPTION |
|---|---|
transition_matrix
|
transition_matrix[i,j] = P(S_{t+1}=j | S_t=i). Rows must sum to 1.
TYPE:
|
emission_params
|
{state_index: (mean, std)} for Gaussian emissions.
TYPE:
|
initial_dist
|
Prior over initial state.
TYPE:
|
Source code in fplx/inference/hmm.py
inject_news_perturbation
¶
Perturb transition matrix at a specific timestep based on news.
For each source state, the transition probability toward boosted target states is multiplied by the boost factor (scaled by confidence), then the row is renormalized.
| PARAMETER | DESCRIPTION |
|---|---|
timestep
|
The gameweek at which the perturbation applies.
TYPE:
|
state_boost
|
{target_state: multiplicative_boost}. E.g., {0: 10.0} means "10x more likely to transition to Injured."
TYPE:
|
confidence
|
Scales the perturbation. 0 = no effect, 1 = full effect.
TYPE:
|
Source code in fplx/inference/hmm.py
clear_perturbations
¶
forward
¶
Forward algorithm with dynamic transition matrices.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
forward_messages
|
Normalized forward messages. forward_messages[t] = P(S_t | y_1:t)
TYPE:
|
scale
|
Per-timestep normalization constants.
TYPE:
|
Source code in fplx/inference/hmm.py
forward_backward
¶
Compute smoothed posteriors P(S_t | y_1:num_timesteps).
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
smoothed_posteriors
|
smoothed_posteriors[t, s] = P(S_t=s | y_1:num_timesteps)
TYPE:
|
Source code in fplx/inference/hmm.py
viterbi
¶
Most likely state sequence via Viterbi decoding.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
best_path
|
TYPE:
|
Source code in fplx/inference/hmm.py
predict_next
¶
Predict next timestep's points distribution.
Runs forward algorithm, then propagates one step ahead via the transition matrix.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
E[Y_{num_timesteps+1} | y_1:num_timesteps]
TYPE:
|
variance
|
Var[Y_{num_timesteps+1} | y_1:num_timesteps] (from law of total variance)
TYPE:
|
next_state_dist
|
P(S_{num_timesteps+1} | y_1:num_timesteps)
TYPE:
|
Source code in fplx/inference/hmm.py
fit
¶
Learn transition matrix and emission parameters via Baum-Welch EM.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
Training sequence.
TYPE:
|
n_iter
|
Maximum EM iterations.
TYPE:
|
tol
|
Convergence tolerance on log-likelihood.
TYPE:
|
verbose
|
Print progress.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
self
|
|
Source code in fplx/inference/hmm.py
kalman
¶
Kalman Filter for continuous player point potential tracking.
State model: x_{t+1} = x_t + w_t, w_t ~ N(0, Q_t)
Observation: y_t = x_t + v_t, v_t ~ N(0, R_t)
Supports per-timestep noise overrides so that:
- News shocks (injury) → inflate Q_t (true form can jump suddenly)
- Fixture difficulty → inflate R_t (harder opponents → noisier observations)
KalmanFilter
¶
KalmanFilter(
process_noise: float = 1.0,
observation_noise: float = 4.0,
initial_state_mean: float = 4.0,
initial_state_covariance: float = 2.0,
)
1D Kalman Filter for tracking latent point potential.
| PARAMETER | DESCRIPTION |
|---|---|
process_noise
|
Default process noise variance (form drift rate).
TYPE:
|
observation_noise
|
Default observation noise variance (weekly point noise).
TYPE:
|
initial_state_mean
|
Initial state estimate.
TYPE:
|
initial_state_covariance
|
Initial state uncertainty (variance).
TYPE:
|
Source code in fplx/inference/kalman.py
inject_process_shock
¶
Inflate process noise at a specific timestep.
Use when news indicates a sudden form change (injury, transfer). process_noise_t = default_process_noise * multiplier.
| PARAMETER | DESCRIPTION |
|---|---|
timestep
|
Gameweek index.
TYPE:
|
multiplier
|
Process noise multiplier (>1 = more uncertainty about form drift).
TYPE:
|
Source code in fplx/inference/kalman.py
inject_observation_noise
¶
Adjust observation noise at a specific timestep.
Use for fixture difficulty: harder opponents → less predictable points. observation_noise_t = default_observation_noise * factor.
| PARAMETER | DESCRIPTION |
|---|---|
timestep
|
Gameweek index.
TYPE:
|
factor
|
Observation noise factor (>1 = harder fixture, noisier observation).
TYPE:
|
Source code in fplx/inference/kalman.py
clear_overrides
¶
get_process_noise_override
¶
set_noise_overrides
¶
set_noise_overrides(
process_noise_overrides: dict[int, float],
observation_noise_overrides: dict[int, float],
)
Replace per-timestep noise overrides.
Source code in fplx/inference/kalman.py
copy_with_overrides
¶
copy_with_overrides(
max_timestep: Optional[int] = None,
) -> KalmanFilter
Create a parameter-identical filter with copied noise overrides.
| PARAMETER | DESCRIPTION |
|---|---|
max_timestep
|
If provided, only overrides for timesteps <= max_timestep are copied.
TYPE:
|
Source code in fplx/inference/kalman.py
filter
¶
Run Kalman filter on observations with per-timestep noise.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
filtered_state_means
|
Filtered state estimates (posterior mean).
TYPE:
|
filtered_state_covariances
|
Filtered state uncertainties (posterior variance).
TYPE:
|
Source code in fplx/inference/kalman.py
predict_next
¶
Predict next observation with uncertainty.
Returns the predictive distribution for Y_{t+1} (the observation), not X_{t+1} (the latent state). This ensures consistency with the HMM predict_next which also returns observation-level variance.
Var[Y_{t+1}] = Var[X_{t+1}|y_{1:t}] + R = (P_t + Q) + R
Must call filter() first.
| RETURNS | DESCRIPTION |
|---|---|
predicted_mean
|
E[Y_{t+1} | y_{1:t}].
TYPE:
|
predicted_var
|
Var[Y_{t+1} | y_{1:t}] (observation-level, includes R).
TYPE:
|
Source code in fplx/inference/kalman.py
smooth
¶
Run RTS smoother (backward pass after forward Kalman filter).
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
smoothed_state_means
|
Smoothed state estimates.
TYPE:
|
smoothed_state_covariances
|
Smoothed state uncertainties.
TYPE:
|
Source code in fplx/inference/kalman.py
multivariate_hmm
¶
Position-aware multivariate-emission HMM for player form inference.
Uses position-specific feature vectors extracted from the full vaastav dataset:
GK: [saves/90, xGC/90, clean_sheet, bonus, mins_frac]
DEF: [xG, xA, xGC/90, clean_sheet, influence/100, bonus, mins_frac]
MID: [xG, xA, creativity/100, threat/100, bonus, mins_frac]
FWD: [xG, xA, threat/100, bonus, mins_frac]
Each state emits a multivariate Gaussian with diagonal covariance. Baum-Welch learns per-player emission parameters from their history.
The minutes_fraction feature (0 or ~1) lets the HMM identify the Injured state from the feature vector alone, without NLP news signals.
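With diagonal covariance the emission log-density is a sum of independent univariate Gaussian terms, which is why a single feature such as the minutes fraction can dominate the state assignment. The sketch below uses the FWD feature order listed above; the state means and variances are made-up numbers, not the library defaults.

```python
import math


def diag_gaussian_logpdf(x, means, variances):
    """Log-density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


# FWD features: [xG, xA, threat/100, bonus, mins_frac]
did_not_play = [0.0, 0.0, 0.0, 0.0, 0.0]

injured_means, injured_vars = [0.0, 0.0, 0.0, 0.0, 0.0], [0.05, 0.05, 0.05, 0.25, 0.01]
fit_means, fit_vars = [0.4, 0.2, 0.5, 0.5, 1.0], [0.10, 0.05, 0.10, 0.50, 0.01]

ll_injured = diag_gaussian_logpdf(did_not_play, injured_means, injured_vars)
ll_fit = diag_gaussian_logpdf(did_not_play, fit_means, fit_vars)
```

A row with zero minutes is far more likely under the hypothetical Injured state; the mins_frac term alone costs the fit state about 50 log-units.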
MultivariateHMM
¶
MultivariateHMM(
position: str = "MID",
transition_matrix: Optional[ndarray] = None,
initial_dist: Optional[ndarray] = None,
)
Position-aware HMM with multivariate diagonal Gaussian emissions.
| PARAMETER | DESCRIPTION |
|---|---|
position
|
GK, DEF, MID, FWD. Determines feature set and default emissions.
TYPE:
|
Source code in fplx/inference/multivariate_hmm.py
inject_news_perturbation
¶
Perturb transition matrix at timestep (same API as scalar HMM).
Source code in fplx/inference/multivariate_hmm.py
forward
¶
Forward algorithm. observations: (T, D).
Source code in fplx/inference/multivariate_hmm.py
forward_backward
¶
Smoothed posteriors P(S_t | y_{1:T}).
Source code in fplx/inference/multivariate_hmm.py
viterbi
¶
Most likely state sequence.
Source code in fplx/inference/multivariate_hmm.py
predict_next_features
¶
Predict next gameweek's feature vector.
Returns mean, var (per feature), and state distribution.
Source code in fplx/inference/multivariate_hmm.py
one_step_point_predictions
¶
One-step-ahead point predictions for each historical timestep.
Returns array preds where preds[t] predicts points at timestep t, using information up to t-1 (preds[0] is NaN).
Source code in fplx/inference/multivariate_hmm.py
predict_next_points
¶
Convert predicted features → expected FPL points.
Uses FPL scoring rules applied to predicted feature rates.
Source code in fplx/inference/multivariate_hmm.py
fit
¶
Baum-Welch EM with MAP-style prior interpolation.
| PARAMETER | DESCRIPTION |
|---|---|
observations
|
Feature matrix with shape (T, D).
TYPE:
|
n_iter
|
Maximum EM iterations.
TYPE:
|
tol
|
Convergence tolerance on log-likelihood.
TYPE:
|
prior_weight
|
Weight on prior parameters in [0, 1]. Higher values increase regularization toward position-level default emissions/transitions.
TYPE:
|
Source code in fplx/inference/multivariate_hmm.py
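One way to picture the role of prior_weight is as a convex combination of the EM estimate and the position-level default. The sketch below assumes simple linear interpolation; the actual update inside fit may differ, and `interpolate_with_prior` is a hypothetical helper.

```python
def interpolate_with_prior(estimate, prior, prior_weight):
    """Blend an EM estimate with a prior; prior_weight must lie in [0, 1]."""
    if not 0.0 <= prior_weight <= 1.0:
        raise ValueError("prior_weight must lie in [0, 1]")
    return [
        (1.0 - prior_weight) * e + prior_weight * p
        for e, p in zip(estimate, prior)
    ]


# One transition-matrix row learned by EM, and the default row it is pulled toward.
em_row = [0.6, 0.4]
prior_row = [0.9, 0.1]
blended = interpolate_with_prior(em_row, prior_row, prior_weight=0.25)
```

A convex combination of two probability rows is itself a probability row, so no renormalization is needed for transitions under this assumption.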
build_feature_matrix
¶
Extract position-specific feature matrix from player timeseries.
| PARAMETER | DESCRIPTION |
|---|---|
timeseries
|
Player gameweek history from vaastav dataset.
TYPE:
|
position
|
GK, DEF, MID, or FWD.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
ndarray
|
Feature matrix of shape (T, D), where D depends on position. |
Source code in fplx/inference/multivariate_hmm.py
pipeline
¶
Per-player inference pipeline orchestrator.
This is the single entry point that FPLModel.fit() calls for each player. It coordinates HMM, Kalman Filter, signal injection, and fusion.
Usage:
    pipeline = PlayerInferencePipeline()
    pipeline.ingest_observations(points_array)
    pipeline.inject_news("Player ruled out for 3 weeks", timestep=20)
    pipeline.inject_fixture_difficulty(difficulty=4.5, timestep=21)
    results = pipeline.run()
    ep_mean, ep_var = pipeline.predict_next()
InferenceResult
dataclass
¶
InferenceResult(
filtered_beliefs: ndarray,
smoothed_beliefs: ndarray,
viterbi_path: ndarray,
hmm_predicted_mean: float = 0.0,
hmm_predicted_var: float = 0.0,
kalman_filtered: ndarray = (lambda: array([]))(),
kalman_uncertainty: ndarray = (lambda: array([]))(),
kf_predicted_mean: float = 0.0,
kf_predicted_var: float = 0.0,
fused_mean: ndarray = (lambda: array([]))(),
fused_var: ndarray = (lambda: array([]))(),
fusion_alpha: Optional[float] = None,
predicted_mean: float = 0.0,
predicted_var: float = 0.0,
)
Container for inference pipeline outputs.
PlayerInferencePipeline
¶
PlayerInferencePipeline(
hmm_params: Optional[dict] = None,
kf_params: Optional[dict] = None,
hmm_variance_floor: float = 1.0,
news_params: Optional[dict] = None,
fusion_mode: str = "precision",
fusion_params: Optional[dict] = None,
)
Orchestrates HMM + Kalman inference for a single player.
| PARAMETER | DESCRIPTION |
|---|---|
hmm_params
|
Override HMM parameters: transition_matrix, emission_params, initial_dist.
TYPE:
|
kf_params
|
Override Kalman parameters: Q, R, x0, P0.
TYPE:
|
Source code in fplx/inference/pipeline.py
ingest_observations
¶
Set the player's historical points sequence.
| PARAMETER | DESCRIPTION |
|---|---|
points
|
Weekly points history.
TYPE:
|
Source code in fplx/inference/pipeline.py
inject_news
¶
Inject a news signal into the inference at a specific gameweek.
Bridges from existing NewsSignal.generate_signal() output format.
| PARAMETER | DESCRIPTION |
|---|---|
news_signal
|
Output from NewsSignal.generate_signal(). Must contain: 'availability', 'minutes_risk', 'confidence'.
TYPE:
|
timestep
|
The gameweek index to apply the perturbation.
TYPE:
|
Source code in fplx/inference/pipeline.py
inject_fixture_difficulty
¶
Inject fixture difficulty into Kalman observation noise.
| PARAMETER | DESCRIPTION |
|---|---|
difficulty
|
Fixture difficulty score (1-5, from FixtureSignal).
TYPE:
|
timestep
|
The gameweek index.
TYPE:
|
Source code in fplx/inference/pipeline.py
run
¶
run() -> InferenceResult
Run full inference pipeline: HMM + Kalman + Fusion.
| RETURNS | DESCRIPTION |
|---|---|
InferenceResult
|
All inference outputs. |
Source code in fplx/inference/pipeline.py
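The default fusion_mode is "precision". A common reading of that term is inverse-variance weighting of the HMM and Kalman forecasts, sketched below. This is an assumption about what the mode does, not a transcription of the fusion code, and `precision_fuse` is a hypothetical name.

```python
def precision_fuse(mean_a, var_a, mean_b, var_b):
    """Inverse-variance (precision) weighted combination of two estimates."""
    prec_a, prec_b = 1.0 / var_a, 1.0 / var_b
    fused_var = 1.0 / (prec_a + prec_b)
    fused_mean = fused_var * (prec_a * mean_a + prec_b * mean_b)
    return fused_mean, fused_var


# Illustrative numbers: a vague HMM forecast and a sharper Kalman forecast.
hmm_mean, hmm_var = 4.0, 4.0
kf_mean, kf_var = 6.0, 1.0
fused_mean, fused_var = precision_fuse(hmm_mean, hmm_var, kf_mean, kf_var)
```

Under this scheme the fused estimate sits closer to the more certain source, and the fused variance is smaller than either input variance.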
predict_next
¶
Get the fused one-step-ahead forecast.
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
TYPE:
|
variance
|
TYPE:
|
Source code in fplx/inference/pipeline.py
learn_parameters
¶
Run Baum-Welch to learn HMM parameters from current observations.
Call this before run() if you want data-driven parameters.
Source code in fplx/inference/pipeline.py
tft
¶
Temporal Fusion Transformer (TFT) inference adapter.
This module provides optional deep-learning inference for FPLX using
pytorch-forecasting.
TFTQuantilePredictions
dataclass
¶
Container for TFT quantile outputs for a single gameweek.
to_optimizer_inputs
¶
Map quantiles to objective mean and downside risk.
| RETURNS | DESCRIPTION |
|---|---|
expected_points
|
Uses q50 as robust expected value proxy.
TYPE:
|
downside_risk
|
Uses q50 - q10 as downside spread.
TYPE:
|
Source code in fplx/inference/tft.py
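The mapping described above is simple enough to sketch directly. The function below is a standalone illustration that takes per-player quantile dictionaries; the real method operates on the dataclass's own fields, whose names are not shown in this reference.

```python
def quantiles_to_optimizer_inputs(q10, q50):
    """Return (expected_points, downside_risk) keyed by player ID."""
    expected_points = dict(q50)  # median as a robust expected-value proxy
    downside_risk = {pid: q50[pid] - q10[pid] for pid in q50}  # q50 - q10 spread
    return expected_points, downside_risk


# Illustrative quantiles for two hypothetical player IDs.
q10 = {1: 1.5, 2: 2.0}
q50 = {1: 4.0, 2: 2.5}
expected_points, downside_risk = quantiles_to_optimizer_inputs(q10, q50)
```

Player 1 has the higher median but also the wider downside spread, which is the trade-off a risk-averse optimizer weighs.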
TFTForecaster
¶
TFTForecaster(
quantiles: tuple[float, float, float] = (0.1, 0.5, 0.9),
encoder_length: int = 15,
prediction_length: int = 1,
)
Wrapper around PyTorch Forecasting's TemporalFusionTransformer.
Source code in fplx/inference/tft.py
fit
¶
fit(
panel_df: DataFrame,
training_cutoff: int,
max_epochs: int = 20,
batch_size: int = 256,
learning_rate: float = 0.001,
hidden_size: int = 32,
attention_head_size: int = 4,
dropout: float = 0.1,
)
Train TFT on panel data.
Source code in fplx/inference/tft.py
load
¶
Load a trained TFT checkpoint.
predict_gameweek
¶
predict_gameweek(
panel_df: DataFrame,
target_gw: int,
batch_size: int = 256,
) -> TFTQuantilePredictions
Predict quantiles for one target gameweek across all players.
Source code in fplx/inference/tft.py
models
¶
Machine learning models for FPL prediction.
BaselineModel
¶
Bases: BaseModel
Baseline model using simple heuristics.
Methods:
- Rolling average of points
- Weighted recent form
- Form-based prediction
Initialize baseline model.
| PARAMETER | DESCRIPTION |
|---|---|
method
|
Prediction method: 'rolling_mean', 'ewma', 'last_value'
TYPE:
|
window
|
Window size for rolling calculations
TYPE:
|
Source code in fplx/models/baseline.py
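The three method names can be illustrated with a few lines of plain Python. This is a sketch of the general heuristics, assuming a span-style smoothing factor for 'ewma'; the library's exact windowing and smoothing choices are not documented here, and `predict_baseline` is a hypothetical function.

```python
def predict_baseline(points, method="rolling_mean", window=5):
    """Next-gameweek point forecast from a list of past weekly points."""
    if not points:
        return 0.0
    if method == "last_value":
        return float(points[-1])
    if method == "rolling_mean":
        recent = points[-window:]
        return sum(recent) / len(recent)
    if method == "ewma":
        alpha = 2.0 / (window + 1)  # assumed span-style smoothing factor
        smoothed = float(points[0])
        for p in points[1:]:
            smoothed = alpha * p + (1.0 - alpha) * smoothed
        return smoothed
    raise ValueError(f"unknown method: {method}")


history = [2, 6, 1, 9, 2]
rolling = predict_baseline(history, "rolling_mean", window=3)
last = predict_baseline(history, "last_value")
ewma = predict_baseline(history, "ewma", window=3)
```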
fit
¶
predict
¶
Predict next gameweek points for a player.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Player historical data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Predicted points |
Source code in fplx/models/baseline.py
batch_predict
¶
Predict for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
players_data
|
Dictionary mapping player ID to their data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary of predictions |
Source code in fplx/models/baseline.py
EnsembleModel
¶
Ensemble combining multiple models with weighted averaging.
| PARAMETER | DESCRIPTION |
|---|---|
models
|
List of model instances
TYPE:
|
weights
|
Weights for each model (must sum to 1)
TYPE:
|
Source code in fplx/models/ensemble.py
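Weighted averaging with weights that sum to 1 amounts to the following. The sketch works on already-computed per-model predictions and uses a hypothetical function name; the real class calls each model's predict method first.

```python
def ensemble_predict(predictions, weights):
    """Weighted average of per-model predictions; weights must sum to 1."""
    if len(predictions) != len(weights):
        raise ValueError("need one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * p for w, p in zip(weights, predictions))


# Three hypothetical model outputs for one player.
combined = ensemble_predict([4.0, 6.0, 5.0], [0.5, 0.25, 0.25])
```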
predict
¶
Ensemble prediction for a single player.
| PARAMETER | DESCRIPTION |
|---|---|
player_data
|
Player historical data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Ensemble prediction |
Source code in fplx/models/ensemble.py
batch_predict
¶
Ensemble predictions for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
players_data
|
Dictionary mapping player ID to their data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Dict[str, float]
|
Dictionary of ensemble predictions |
Source code in fplx/models/ensemble.py
RegressionModel
¶
RegressionModel(
model_type: str = "ridge",
initial_train_size: int = 10,
test_size: int = 1,
step: int = 1,
**model_kwargs
)
Bases: BaseModel
Machine learning regression model for FPL predictions.
Adapted from the MLSP project's regressor patterns.
| PARAMETER | DESCRIPTION |
|---|---|
model_type
|
Type of model: 'ridge', 'xgboost', 'lightgbm'
TYPE:
|
initial_train_size
|
Size of initial training window
TYPE:
|
test_size
|
Forecast horizon
TYPE:
|
step
|
Rolling window step size
TYPE:
|
Source code in fplx/models/regression.py
fit
¶
predict
¶
Generate predictions.
Source code in fplx/models/regression.py
fit_predict
¶
Fit model and generate predictions using rolling CV.
| PARAMETER | DESCRIPTION |
|---|---|
y
|
Target time series (points to predict)
TYPE:
|
X
|
Feature matrix
TYPE:
|
verbose
|
Print progress
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Series
|
Predictions aligned with test indices |
Source code in fplx/models/regression.py
predict_next
¶
Predict next value given features.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Feature matrix (single row for next gameweek)
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Predicted points |
Source code in fplx/models/regression.py
get_feature_importance
¶
Get feature importance (for tree-based models).
| PARAMETER | DESCRIPTION |
|---|---|
feature_names
|
Names of features
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Feature importance scores |
Source code in fplx/models/regression.py
evaluate
¶
Evaluate model performance.
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary of metrics |
Source code in fplx/models/regression.py
RollingCV
¶
Generates indices for rolling cross-validation splits.
This is adapted from the MLSP project for time-series validation.
| PARAMETER | DESCRIPTION |
|---|---|
initial_train_size
|
Size of the initial training set.
TYPE:
|
test_size
|
Size of the test set (forecast horizon).
TYPE:
|
step
|
Step size to move the training window forward.
TYPE:
|
Source code in fplx/models/rolling_cv.py
split
¶
Generate indices to split data into training and test sets.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Time series data.
TYPE:
|
| YIELDS | DESCRIPTION |
|---|---|
train_indices
|
The training set indices for that split.
TYPE:
|
test_indices
|
The testing set indices for that split.
TYPE:
|
Source code in fplx/models/rolling_cv.py
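A rolling split generator with these three parameters can be sketched as below. It assumes an expanding training window (all data up to the cutoff); whether RollingCV expands or slides the window is not stated in this reference, and `rolling_splits` is a hypothetical name.

```python
def rolling_splits(n_samples, initial_train_size, test_size=1, step=1):
    """Yield (train_indices, test_indices) for time-ordered data."""
    train_end = initial_train_size
    while train_end + test_size <= n_samples:
        train = list(range(train_end))  # assumed: expanding window
        test = list(range(train_end, train_end + test_size))
        yield train, test
        train_end += step


splits = list(rolling_splits(n_samples=6, initial_train_size=3))
```

Every test index lies strictly after every training index, which is the property that keeps future gameweeks out of the training data.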
baseline
¶
Baseline heuristic models for FPL prediction.
BaselineModel
¶
Bases: BaseModel
Baseline model using simple heuristics.
Methods:
- Rolling average of points
- Weighted recent form
- Form-based prediction
Initialize baseline model.
| PARAMETER | DESCRIPTION |
|---|---|
method
|
Prediction method: 'rolling_mean', 'ewma', 'last_value'
TYPE:
|
window
|
Window size for rolling calculations
TYPE:
|
Source code in fplx/models/baseline.py
fit
¶
predict
¶
Predict next gameweek points for a player.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Player historical data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Predicted points |
Source code in fplx/models/baseline.py
batch_predict
¶
Predict for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
players_data
|
Dictionary mapping player ID to their data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary of predictions |
Source code in fplx/models/baseline.py
FormBasedModel
¶
Bases: BaselineModel
Enhanced baseline using form indicators.
Source code in fplx/models/baseline.py
predict
¶
Predict based on form with adjustments.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Player historical data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Predicted points |
Source code in fplx/models/baseline.py
ensemble
¶
Ensemble models combining multiple predictors.
EnsembleModel
¶
Ensemble combining multiple models with weighted averaging.
| PARAMETER | DESCRIPTION |
|---|---|
models
|
List of model instances
TYPE:
|
weights
|
Weights for each model (must sum to 1)
TYPE:
|
Source code in fplx/models/ensemble.py
predict
¶
Ensemble prediction for a single player.
| PARAMETER | DESCRIPTION |
|---|---|
player_data
|
Player historical data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Ensemble prediction |
Source code in fplx/models/ensemble.py
batch_predict
¶
Ensemble predictions for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
players_data
|
Dictionary mapping player ID to their data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Dict[str, float]
|
Dictionary of ensemble predictions |
Source code in fplx/models/ensemble.py
AdaptiveEnsemble
¶
Bases: EnsembleModel
Adaptive ensemble that adjusts weights based on recent performance.
Source code in fplx/models/ensemble.py
update_weights
¶
Update weights based on recent errors.
Source code in fplx/models/ensemble.py
regression
¶
ML regression models for FPL prediction.
RegressionModel
¶
RegressionModel(
model_type: str = "ridge",
initial_train_size: int = 10,
test_size: int = 1,
step: int = 1,
**model_kwargs
)
Bases: BaseModel
Machine learning regression model for FPL predictions.
Adapted from the MLSP project's regressor patterns.
| PARAMETER | DESCRIPTION |
|---|---|
model_type
|
Type of model: 'ridge', 'xgboost', 'lightgbm'
TYPE:
|
initial_train_size
|
Size of initial training window
TYPE:
|
test_size
|
Forecast horizon
TYPE:
|
step
|
Rolling window step size
TYPE:
|
Source code in fplx/models/regression.py
fit
¶
predict
¶
Generate predictions.
Source code in fplx/models/regression.py
fit_predict
¶
Fit model and generate predictions using rolling CV.
| PARAMETER | DESCRIPTION |
|---|---|
y
|
Target time series (points to predict)
TYPE:
|
X
|
Feature matrix
TYPE:
|
verbose
|
Print progress
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Series
|
Predictions aligned with test indices |
Source code in fplx/models/regression.py
predict_next
¶
Predict next value given features.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Feature matrix (single row for next gameweek)
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Predicted points |
Source code in fplx/models/regression.py
get_feature_importance
¶
Get feature importance (for tree-based models).
| PARAMETER | DESCRIPTION |
|---|---|
feature_names
|
Names of features
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Feature importance scores |
Source code in fplx/models/regression.py
evaluate
¶
Evaluate model performance.
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary of metrics |
Source code in fplx/models/regression.py
rolling_cv
¶
Rolling cross-validation for time-series models.
RollingCV
¶
Generates indices for rolling cross-validation splits.
This is adapted from the MLSP project for time-series validation.
| PARAMETER | DESCRIPTION |
|---|---|
initial_train_size
|
Size of the initial training set.
TYPE:
|
test_size
|
Size of the test set (forecast horizon).
TYPE:
|
step
|
Step size to move the training window forward.
TYPE:
|
Source code in fplx/models/rolling_cv.py
split
¶
Generate indices to split data into training and test sets.
| PARAMETER | DESCRIPTION |
|---|---|
X
|
Time series data.
TYPE:
|
| YIELDS | DESCRIPTION |
|---|---|
train_indices
|
The training set indices for that split.
TYPE:
|
test_indices
|
The testing set indices for that split.
TYPE:
|
Source code in fplx/models/rolling_cv.py
selection
¶
Squad selection and optimization.
BudgetConstraint
¶
FormationConstraints
¶
Formation constraints for FPL squad.
Rules:
- Exactly 11 players
- 1 GK
- 3-5 DEF
- 2-5 MID
- 1-3 FWD
validate
classmethod
¶
validate(players: list[Player]) -> bool
Check if squad satisfies formation constraints.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
List of players in squad
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
bool
|
True if valid formation |
Source code in fplx/selection/constraints.py
get_valid_formations
classmethod
¶
Get list of valid formation strings.
| RETURNS | DESCRIPTION |
|---|---|
List[str]
|
Valid formations (e.g., "3-4-3", "4-3-3") |
Source code in fplx/selection/constraints.py
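The formation rules listed for this class determine the set of valid formation strings. The sketch below enumerates them and checks a lineup given only its position labels; it is an illustration of the rules rather than the class's own code, and both function names are hypothetical.

```python
from collections import Counter


def valid_formations():
    """All DEF-MID-FWD strings with 3-5 DEF, 2-5 MID, 1-3 FWD and 10 outfielders."""
    formations = []
    for n_def in range(3, 6):
        for n_mid in range(2, 6):
            n_fwd = 10 - n_def - n_mid  # one of the 11 slots is the goalkeeper
            if 1 <= n_fwd <= 3:
                formations.append(f"{n_def}-{n_mid}-{n_fwd}")
    return formations


def is_valid_lineup(positions):
    """positions: list of 'GK' / 'DEF' / 'MID' / 'FWD' labels for the starters."""
    counts = Counter(positions)
    return (
        len(positions) == 11
        and counts["GK"] == 1
        and 3 <= counts["DEF"] <= 5
        and 2 <= counts["MID"] <= 5
        and 1 <= counts["FWD"] <= 3
    )


formations = valid_formations()
lineup_442 = ["GK"] + ["DEF"] * 4 + ["MID"] * 4 + ["FWD"] * 2
```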
SquadQuotas
¶
Position quotas for the 15-player FPL squad.
Rules:
- 2 GK, 5 DEF, 5 MID, 3 FWD (exactly).
- Total = 15 players.
TeamDiversityConstraint
¶
LagrangianOptimizer
¶
LagrangianOptimizer(
budget: float = 100.0,
max_from_team: int = 3,
max_iter: int = 200,
tol: float = 0.01,
risk_aversion: float = 0.0,
)
Lagrangian relaxation for the FPL squad selection ILP.
Relaxes the budget constraint into the objective:
L(lambda) = max_{x in X} sum_i (mu_i - lambda * c_i) * x_i + lambda * B
where X encodes squad size, position quotas, and team caps. The inner maximization decomposes: for each position, select the top-k players by modified score (mu_i - lambda * c_i).
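The dual problem min_{lambda >= 0} L(lambda) is solved with a projected subgradient method (called subgradient ascent throughout this reference): lambda increases when the selected squad exceeds the budget and decreases otherwise.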
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Total budget (default 100.0).
TYPE:
|
max_from_team
|
Maximum players from same club.
TYPE:
|
max_iter
|
Maximum subgradient iterations.
TYPE:
|
tol
|
Convergence tolerance on duality gap.
TYPE:
|
risk_aversion
|
Mean-variance penalty (same as ILP).
TYPE:
|
Source code in fplx/selection/lagrangian.py
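The relaxation described for this class can be sketched on a toy problem. The version below keeps only position quotas (team caps and the risk term are omitted for brevity), represents players as plain tuples, and uses a diminishing step size step0 / iteration. All of these are simplifying assumptions, and `inner_max` and `solve_dual` are hypothetical names rather than the library's functions.

```python
def inner_max(players, quotas, lam):
    """Per position, take the top-k players by modified score mu - lam * cost."""
    chosen = []
    for pos, k in quotas.items():
        pool = [p for p in players if p[1] == pos]
        pool.sort(key=lambda p: p[3] - lam * p[2], reverse=True)
        chosen.extend(pool[:k])
    return chosen


def solve_dual(players, quotas, budget, max_iter=200, tol=0.01, step0=0.5):
    """players: list of (player_id, position, cost, expected_points)."""
    lam, best_dual, best_primal = 0.0, float("inf"), None
    converged, it = False, 0
    for it in range(1, max_iter + 1):
        chosen = inner_max(players, quotas, lam)
        cost = sum(p[2] for p in chosen)
        value = sum(p[3] for p in chosen)
        # L(lam) = sum_i (mu_i - lam * c_i) x_i + lam * B is an upper bound.
        best_dual = min(best_dual, value - lam * cost + lam * budget)
        # A relaxed solution that happens to fit the budget is a primal candidate.
        if cost <= budget and (best_primal is None or value > best_primal[0]):
            best_primal = (value, [p[0] for p in chosen])
        if best_primal is not None and best_dual - best_primal[0] <= tol:
            converged = True
            break
        # Projected subgradient step: raise lam when over budget, lower it otherwise.
        lam = max(0.0, lam + (step0 / it) * (cost - budget))
    return {
        "squad": best_primal[1] if best_primal else None,
        "primal_objective": best_primal[0] if best_primal else None,
        "dual_bound": best_dual,
        "converged": converged,
        "iterations": it,
    }


toy_players = [
    (1, "MID", 10.0, 8.0), (2, "MID", 6.0, 6.0),
    (3, "FWD", 9.0, 7.0), (4, "FWD", 5.0, 4.0),
]
result = solve_dual(toy_players, {"MID": 1, "FWD": 1}, budget=15.0)
```

On this toy instance the dual bound meets the best feasible squad (players 2 and 3), so the duality gap closes. On real player pools a positive gap can remain, which is what the duality_gap field of LagrangianResult reports.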
solve
¶
solve(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
best_known_primal: Optional[float] = None,
) -> LagrangianResult
Solve via Lagrangian relaxation with subgradient ascent.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
TYPE:
|
expected_points
|
TYPE:
|
expected_variance
|
TYPE:
|
best_known_primal
|
Best known primal objective (e.g., from ILP). Used for better step size computation.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
LagrangianResult
|
|
Source code in fplx/selection/lagrangian.py
LagrangianResult
dataclass
¶
LagrangianResult(
full_squad: Optional[FullSquad] = None,
primal_objective: float = 0.0,
dual_bound: float = 0.0,
duality_gap: float = 0.0,
n_iterations: int = 0,
converged: bool = False,
solve_time: float = 0.0,
dual_history: list[float] = list(),
primal_history: list[float] = list(),
lambda_history: list[float] = list(),
budget_slack_history: list[float] = list(),
)
Convergence diagnostics for the Lagrangian solver.
GreedyOptimizer
¶
Bases: BaseOptimizer
Greedy baseline: select best-value players per position.
Fast heuristic for comparison. Selects 15-player squad, then picks best 11 as lineup.
Source code in fplx/selection/optimizer.py
optimize
¶
optimize(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
formation: Optional[str] = None,
) -> FullSquad
Greedy squad + lineup selection.
Source code in fplx/selection/optimizer.py
OptimizationResult
dataclass
¶
OptimizationResult(
full_squad: FullSquad,
objective_value: float = 0.0,
solve_time: float = 0.0,
lp_objective: Optional[float] = None,
integrality_gap: Optional[float] = None,
shadow_prices: dict = dict(),
binding_constraints: list = list(),
)
Container for optimization outputs including duality analysis.
TwoLevelILPOptimizer
¶
Bases: BaseOptimizer
Two-level ILP: select 15-player squad then 11-player lineup jointly.
Supports risk-neutral and risk-averse (mean-variance) objectives. Also exposes LP relaxation for shadow price extraction.
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Maximum total squad budget (applied to 15 players).
TYPE:
|
max_from_team
|
Maximum players from same club.
TYPE:
|
risk_aversion
|
Lambda for mean-variance penalty. 0 = risk-neutral.
TYPE:
|
Source code in fplx/selection/optimizer.py
solve
¶
optimize
¶
optimize(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
downside_risk: Optional[dict[int, float]] = None,
formation: Optional[str] = None,
) -> FullSquad
Solve the two-level ILP.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
Available player pool.
TYPE:
|
expected_points
|
E[P_i] per player.
TYPE:
|
expected_variance
|
Var[P_i] per player.
TYPE:
|
downside_risk
|
Downside spread per player. If provided, risk penalty uses this directly (instead of sqrt(variance)).
TYPE:
|
formation
|
Not used (formation is optimized automatically).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
FullSquad
|
|
Source code in fplx/selection/optimizer.py
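The parameter descriptions suggest a per-player risk penalty of either the supplied downside spread or the square root of the variance, scaled by risk_aversion. A minimal sketch of such a risk-adjusted score follows. The exact objective used inside the ILP is not shown in this reference, so treat the formula and the function name as assumptions.

```python
import math


def risk_adjusted_scores(expected_points, risk_aversion=0.0,
                         expected_variance=None, downside_risk=None):
    """Score each player as mu - risk_aversion * penalty."""
    scores = {}
    for pid, mu in expected_points.items():
        if downside_risk is not None:
            penalty = downside_risk.get(pid, 0.0)  # used directly when provided
        elif expected_variance is not None:
            penalty = math.sqrt(max(expected_variance.get(pid, 0.0), 0.0))
        else:
            penalty = 0.0
        scores[pid] = mu - risk_aversion * penalty
    return scores


# Illustrative inputs for two hypothetical player IDs.
ep = {1: 6.0, 2: 5.5}
var = {1: 9.0, 2: 1.0}
neutral = risk_adjusted_scores(ep)
averse = risk_adjusted_scores(ep, risk_aversion=0.5, expected_variance=var)
```

With risk_aversion set to 0 the scores reduce to the expected points. With a positive value the steadier player 2 overtakes the more volatile player 1 in this example.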
solve_lp_relaxation
¶
solve_lp_relaxation(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
downside_risk: Optional[dict[int, float]] = None,
) -> OptimizationResult
Solve the LP relaxation and extract shadow prices.
| RETURNS | DESCRIPTION |
|---|---|
OptimizationResult
|
Contains LP objective, shadow prices, binding constraints. |
Source code in fplx/selection/optimizer.py
constraints
¶
Constraints for squad selection.
SquadQuotas
¶
Position quotas for the 15-player FPL squad.
Rules:
- 2 GK, 5 DEF, 5 MID, 3 FWD (exactly).
- Total = 15 players.
FormationConstraints
¶
Formation constraints for FPL squad.
Rules:
- Exactly 11 players
- 1 GK
- 3-5 DEF
- 2-5 MID
- 1-3 FWD
validate
classmethod
¶
validate(players: list[Player]) -> bool
Check if squad satisfies formation constraints.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
List of players in squad
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
bool
|
True if valid formation |
Source code in fplx/selection/constraints.py
get_valid_formations
classmethod
¶
Get list of valid formation strings.
| RETURNS | DESCRIPTION |
|---|---|
List[str]
|
Valid formations (e.g., "3-4-3", "4-3-3") |
Source code in fplx/selection/constraints.py
BudgetConstraint
¶
lagrangian
¶
Lagrangian dual decomposition for FPL squad selection.
Relaxes the budget constraint into the objective and solves via subgradient ascent. The inner problem decomposes into per-position sorting problems, each solvable in O(n log n).
This provides:
- A dual upper bound on the ILP optimum
- A near-optimal primal solution via rounding
- Convergence diagnostics for the 18-660 report
LagrangianResult
dataclass
¶
LagrangianResult(
full_squad: Optional[FullSquad] = None,
primal_objective: float = 0.0,
dual_bound: float = 0.0,
duality_gap: float = 0.0,
n_iterations: int = 0,
converged: bool = False,
solve_time: float = 0.0,
dual_history: list[float] = list(),
primal_history: list[float] = list(),
lambda_history: list[float] = list(),
budget_slack_history: list[float] = list(),
)
Convergence diagnostics for the Lagrangian solver.
LagrangianOptimizer
¶
LagrangianOptimizer(
budget: float = 100.0,
max_from_team: int = 3,
max_iter: int = 200,
tol: float = 0.01,
risk_aversion: float = 0.0,
)
Lagrangian relaxation for the FPL squad selection ILP.
Relaxes the budget constraint into the objective:
L(lambda) = max_{x in X} sum_i (mu_i - lambda * c_i) * x_i + lambda * B
where X encodes squad size, position quotas, and team caps. The inner maximization decomposes: for each position, select the top-k players by modified score (mu_i - lambda * c_i).
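The dual problem min_{lambda >= 0} L(lambda) is solved with a projected subgradient method (called subgradient ascent throughout this reference): lambda increases when the selected squad exceeds the budget and decreases otherwise.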
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Total budget (default 100.0).
TYPE:
|
max_from_team
|
Maximum players from same club.
TYPE:
|
max_iter
|
Maximum subgradient iterations.
TYPE:
|
tol
|
Convergence tolerance on duality gap.
TYPE:
|
risk_aversion
|
Mean-variance penalty (same as ILP).
TYPE:
|
Source code in fplx/selection/lagrangian.py
solve
¶
solve(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
best_known_primal: Optional[float] = None,
) -> LagrangianResult
Solve via Lagrangian relaxation with subgradient ascent.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
TYPE:
|
expected_points
|
TYPE:
|
expected_variance
|
TYPE:
|
best_known_primal
|
Best known primal objective (e.g., from ILP). Used for better step size computation.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
LagrangianResult
|
|
Source code in fplx/selection/lagrangian.py
optimizer
¶
Squad optimization: two-level ILP, mean-variance, LP relaxation.
OptimizationResult
dataclass
¶
OptimizationResult(
full_squad: FullSquad,
objective_value: float = 0.0,
solve_time: float = 0.0,
lp_objective: Optional[float] = None,
integrality_gap: Optional[float] = None,
shadow_prices: dict = dict(),
binding_constraints: list = list(),
)
Container for optimization outputs including duality analysis.
TwoLevelILPOptimizer
¶
Bases: BaseOptimizer
Two-level ILP: select 15-player squad then 11-player lineup jointly.
Supports risk-neutral and risk-averse (mean-variance) objectives. Also exposes LP relaxation for shadow price extraction.
| PARAMETER | DESCRIPTION |
|---|---|
budget
|
Maximum total squad budget (applied to 15 players).
TYPE:
|
max_from_team
|
Maximum players from same club.
TYPE:
|
risk_aversion
|
Lambda for mean-variance penalty. 0 = risk-neutral.
TYPE:
|
Source code in fplx/selection/optimizer.py
solve
¶
optimize
¶
optimize(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
downside_risk: Optional[dict[int, float]] = None,
formation: Optional[str] = None,
) -> FullSquad
Solve the two-level ILP.
| PARAMETER | DESCRIPTION |
|---|---|
players
|
Available player pool.
TYPE:
|
expected_points
|
E[P_i] per player.
TYPE:
|
expected_variance
|
Var[P_i] per player.
TYPE:
|
downside_risk
|
Downside spread per player. If provided, risk penalty uses this directly (instead of sqrt(variance)).
TYPE:
|
formation
|
Not used (formation is optimized automatically).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
FullSquad
|
|
Source code in fplx/selection/optimizer.py
solve_lp_relaxation
¶
solve_lp_relaxation(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
downside_risk: Optional[dict[int, float]] = None,
) -> OptimizationResult
Solve the LP relaxation and extract shadow prices.
| RETURNS | DESCRIPTION |
|---|---|
OptimizationResult
|
Contains LP objective, shadow prices, binding constraints. |
Source code in fplx/selection/optimizer.py
GreedyOptimizer
¶
Bases: BaseOptimizer
Greedy baseline: select best-value players per position.
Fast heuristic for comparison. Selects 15-player squad, then picks best 11 as lineup.
Source code in fplx/selection/optimizer.py
optimize
¶
optimize(
players: list[Player],
expected_points: dict[int, float],
expected_variance: Optional[dict[int, float]] = None,
formation: Optional[str] = None,
) -> FullSquad
Greedy squad + lineup selection.
Source code in fplx/selection/optimizer.py
signals
¶
Signal generation modules for player scoring.
FixtureSignal
¶
Bases: BaseSignal
Generate signals based on fixture difficulty and schedule.
Initialize with team difficulty ratings.
| PARAMETER | DESCRIPTION |
|---|---|
difficulty_ratings
|
Team strength ratings (1-5, higher = harder opponent)
TYPE:
|
Source code in fplx/signals/fixtures.py
generate_signal
¶
Generate fixture-based signal.
Source code in fplx/signals/fixtures.py
set_difficulty_ratings
¶
Set or update difficulty ratings.
| PARAMETER | DESCRIPTION |
|---|---|
ratings
|
Team strength ratings
TYPE:
|
compute_fixture_difficulty
¶
compute_fixture_difficulty(
team: str,
upcoming_opponents: list[str],
is_home: list[bool],
) -> float
Compute fixture difficulty score for upcoming games.
| PARAMETER | DESCRIPTION |
|---|---|
team
|
Player's team
TYPE:
|
upcoming_opponents
|
List of upcoming opponent teams
TYPE:
|
is_home
|
Whether each fixture is home
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Difficulty score (lower = easier fixtures) |
Source code in fplx/signals/fixtures.py
compute_fixture_advantage
¶
compute_fixture_advantage(
team: str,
upcoming_opponents: list[str],
is_home: list[bool],
) -> float
Compute fixture advantage (inverse of difficulty).
Higher score = easier fixtures = better for player.
| PARAMETER | DESCRIPTION |
|---|---|
team
|
Player's team
TYPE:
|
upcoming_opponents
|
List of upcoming opponent teams
TYPE:
|
is_home
|
Whether each fixture is home
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Advantage score (0-1, higher = better fixtures) |
Source code in fplx/signals/fixtures.py
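To make the relationship between the 1-5 difficulty scale and the 0-1 advantage score concrete, here is one possible mapping. The home bonus, the neutral default of 3.0, the linear rescaling, and the team codes are all assumptions made for illustration; the library's actual formulas are not given in this reference.

```python
def fixture_difficulty(ratings, opponents, is_home, home_bonus=0.5):
    """Mean opponent rating on a 1-5 scale, eased by an assumed home bonus."""
    if not opponents:
        return 3.0  # assumed neutral default when no fixtures are known
    total = 0.0
    for opponent, home in zip(opponents, is_home):
        rating = ratings.get(opponent, 3.0) - (home_bonus if home else 0.0)
        total += min(5.0, max(1.0, rating))  # keep each fixture on the 1-5 scale
    return total / len(opponents)


def fixture_advantage(difficulty):
    """Linearly rescale difficulty 1-5 to advantage 1-0 (higher = easier)."""
    return (5.0 - difficulty) / 4.0


# Hypothetical ratings: one hard away fixture, one easy home fixture.
ratings = {"MCI": 5.0, "SHU": 2.0}
difficulty = fixture_difficulty(ratings, ["MCI", "SHU"], [False, True])
advantage = fixture_advantage(difficulty)
```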
compute_fixture_congestion
¶
Compute fixture congestion (number of games in short period).
| PARAMETER | DESCRIPTION |
|---|---|
fixtures
|
Fixtures dataframe
TYPE:
|
team
|
Team name
TYPE:
|
days_window
|
Days to look ahead
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Congestion score (0-1, higher = more congested) |
Source code in fplx/signals/fixtures.py
batch_compute_advantages
¶
batch_compute_advantages(
players_teams: dict[str, str],
fixtures_data: dict[str, tuple],
) -> dict[str, float]
Compute fixture advantages for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
players_teams
|
Mapping of player ID to team
TYPE:
|
fixtures_data
|
Mapping of team to (opponents, is_home) tuples
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary of player fixture advantage scores |
Source code in fplx/signals/fixtures.py
NewsParser
¶
Parse and interpret FPL news text into structured signals.
parse_availability
¶
Parse availability from news text.
| PARAMETER | DESCRIPTION |
|---|---|
news_text
|
News text
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Availability score (0-1) |
Source code in fplx/signals/news.py
parse_minutes_risk
¶
Parse minutes risk from news text.
| PARAMETER | DESCRIPTION |
|---|---|
news_text
|
News text
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Minutes risk score (0-1, higher = more risk) |
Source code in fplx/signals/news.py
parse_confidence
¶
Estimate confidence in the parsed signal.
| PARAMETER | DESCRIPTION |
|---|---|
news_text
|
News text
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Confidence score (0-1) |
Source code in fplx/signals/news.py
NewsSignal
¶
Bases: BaseSignal
Generate structured news signals for players.
generate_signal
¶
Generate signal from news text.
| PARAMETER | DESCRIPTION |
|---|---|
news_text
|
News text
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary with availability, minutes_risk, confidence |
Source code in fplx/signals/news.py
batch_generate
¶
Generate signals for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
news_dict
|
Dictionary mapping player ID to news text
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, dict[str, float]]
|
Dictionary of player signals |
Source code in fplx/signals/news.py
StatsSignal
¶
Generate performance signals from statistical data.
Combines multiple statistical indicators into a unified score.
Initialize with custom weights for different stats.
| PARAMETER | DESCRIPTION |
|---|---|
weights
|
Weights for different statistics
TYPE:
|
Source code in fplx/signals/stats.py
compute_signal
¶
Compute aggregated signal score from player statistics.
| PARAMETER | DESCRIPTION |
|---|---|
player_data
|
Player historical data with engineered features
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
float
|
Aggregated signal score (0-100) |
Source code in fplx/signals/stats.py
batch_compute
¶
Compute signals for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
players_data
|
Dictionary mapping player ID/name to their data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, float]
|
Dictionary of player signals |
Source code in fplx/signals/stats.py
fixtures
¶
Fixture difficulty signals.
FixtureSignal
¶
Bases: BaseSignal
Generate signals based on fixture difficulty and schedule.
Initialize with team difficulty ratings.
| PARAMETER | DESCRIPTION |
|---|---|
| difficulty_ratings | Team strength ratings (1-5, higher = harder opponent) |
Source code in fplx/signals/fixtures.py
generate_signal
¶
Generate fixture-based signal.
Source code in fplx/signals/fixtures.py
set_difficulty_ratings
¶
Set or update difficulty ratings.
| PARAMETER | DESCRIPTION |
|---|---|
| ratings | Team strength ratings |
compute_fixture_difficulty
¶
compute_fixture_difficulty(
team: str,
upcoming_opponents: list[str],
is_home: list[bool],
) -> float
Compute fixture difficulty score for upcoming games.
| PARAMETER | DESCRIPTION |
|---|---|
| team | Player's team |
| upcoming_opponents | List of upcoming opponent teams |
| is_home | Whether each fixture is home |

| RETURNS | DESCRIPTION |
|---|---|
| float | Difficulty score (lower = easier fixtures) |
Source code in fplx/signals/fixtures.py
compute_fixture_advantage
¶
compute_fixture_advantage(
team: str,
upcoming_opponents: list[str],
is_home: list[bool],
) -> float
Compute fixture advantage (inverse of difficulty).
Higher score = easier fixtures = better for player.
| PARAMETER | DESCRIPTION |
|---|---|
| team | Player's team |
| upcoming_opponents | List of upcoming opponent teams |
| is_home | Whether each fixture is home |

| RETURNS | DESCRIPTION |
|---|---|
| float | Advantage score (0-1, higher = better fixtures) |
Source code in fplx/signals/fixtures.py
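The relationship between difficulty and advantage can be sketched as follows. This is only an illustration under stated assumptions, not the fplx implementation: the function names, the `home_bonus` adjustment, the neutral `default_rating`, and the linear mapping from the 1-5 scale onto 0-1 are all hypothetical, and the team codes are made up for the example.

```python
from typing import Dict, List


def fixture_difficulty_sketch(
    ratings: Dict[str, float],
    upcoming_opponents: List[str],
    is_home: List[bool],
    home_bonus: float = 0.5,      # assumed easing for home games, not from fplx
    default_rating: float = 3.0,  # assumed neutral rating for unknown teams
) -> float:
    """Mean opponent rating on the 1-5 scale (lower = easier fixtures)."""
    if not upcoming_opponents:
        return default_rating
    scores = []
    for opponent, home in zip(upcoming_opponents, is_home):
        rating = ratings.get(opponent, default_rating)
        scores.append(rating - home_bonus if home else rating)
    return sum(scores) / len(scores)


def fixture_advantage_sketch(
    ratings: Dict[str, float],
    upcoming_opponents: List[str],
    is_home: List[bool],
) -> float:
    """Invert difficulty onto 0-1 (higher = better fixtures), clipped to range."""
    difficulty = fixture_difficulty_sketch(ratings, upcoming_opponents, is_home)
    advantage = 1.0 - (difficulty - 1.0) / 4.0
    return min(1.0, max(0.0, advantage))


ratings = {"ARS": 5.0, "SHU": 1.0}  # illustrative ratings only
difficulty = fixture_difficulty_sketch(ratings, ["ARS", "SHU"], [False, False])
advantage = fixture_advantage_sketch(ratings, ["ARS", "SHU"], [False, False])
```

Clipping keeps the advantage inside the documented 0-1 range even when the assumed home adjustment pushes the mean rating below 1.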
compute_fixture_congestion
¶
Compute fixture congestion (number of games in a short period).
| PARAMETER | DESCRIPTION |
|---|---|
| fixtures | Fixtures dataframe |
| team | Team name |
| days_window | Days to look ahead |

| RETURNS | DESCRIPTION |
|---|---|
| float | Congestion score (0-1, higher = more congested) |
Source code in fplx/signals/fixtures.py
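One plausible way to turn a game count into a 0-1 congestion score is sketched below. The column names (`home_team`, `away_team`, `kickoff`), the `start` and `max_games` parameters, and the default window of 14 days are assumptions made for this example; the actual fixtures schema and normalization used by fplx are not shown in this reference.

```python
import pandas as pd


def fixture_congestion_sketch(fixtures, team, days_window=14, start=None, max_games=4):
    """Share of an assumed maximum number of games played within the window."""
    start = pd.Timestamp.now() if start is None else pd.Timestamp(start)
    end = start + pd.Timedelta(days=days_window)
    # Hypothetical schema: one row per fixture with both team names and a kickoff time.
    involves_team = (fixtures["home_team"] == team) | (fixtures["away_team"] == team)
    in_window = fixtures["kickoff"].between(start, end)
    games = int((involves_team & in_window).sum())
    return min(1.0, games / max_games)


fixtures = pd.DataFrame(
    {
        "home_team": ["ARS", "CHE", "ARS"],
        "away_team": ["CHE", "ARS", "LIV"],
        "kickoff": pd.to_datetime(["2024-01-02", "2024-01-06", "2024-03-01"]),
    }
)
score = fixture_congestion_sketch(fixtures, "ARS", days_window=14, start="2024-01-01")
```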
batch_compute_advantages
¶
batch_compute_advantages(
players_teams: dict[str, str],
fixtures_data: dict[str, tuple],
) -> dict[str, float]
Compute fixture advantages for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
| players_teams | Mapping of player ID to team |
| fixtures_data | Mapping of team to (opponents, is_home) tuples |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, float] | Dictionary of player fixture advantage scores |
Source code in fplx/signals/fixtures.py
news
¶
News and injury signal processing.
NewsParser
¶
Parse and interpret FPL news text into structured signals.
parse_availability
¶
Parse availability from news text.
| PARAMETER | DESCRIPTION |
|---|---|
| news_text | News text |

| RETURNS | DESCRIPTION |
|---|---|
| float | Availability score (0-1) |
Source code in fplx/signals/news.py
parse_minutes_risk
¶
Parse minutes risk from news text.
| PARAMETER | DESCRIPTION |
|---|---|
| news_text | News text |

| RETURNS | DESCRIPTION |
|---|---|
| float | Minutes risk score (0-1, higher = more risk) |
Source code in fplx/signals/news.py
parse_confidence
¶
Estimate confidence in the parsed signal.
| PARAMETER | DESCRIPTION |
|---|---|
| news_text | News text |

| RETURNS | DESCRIPTION |
|---|---|
| float | Confidence score (0-1) |
Source code in fplx/signals/news.py
NewsSignal
¶
Bases: BaseSignal
Generate structured news signals for players.
generate_signal
¶
Generate signal from news text.
| PARAMETER | DESCRIPTION |
|---|---|
| news_text | News text |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, float] | Dictionary with availability, minutes_risk, confidence |
Source code in fplx/signals/news.py
batch_generate
¶
Generate signals for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
| news_dict | Dictionary mapping player ID to news text |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, dict[str, float]] | Dictionary of player signals |
Source code in fplx/signals/news.py
stats
¶
Statistical performance signals.
StatsSignal
¶
Generate performance signals from statistical data.
Combines multiple statistical indicators into a unified score.
Initialize with custom weights for different stats.
| PARAMETER | DESCRIPTION |
|---|---|
| weights | Weights for different statistics |
Source code in fplx/signals/stats.py
compute_signal
¶
Compute aggregated signal score from player statistics.
| PARAMETER | DESCRIPTION |
|---|---|
| player_data | Player historical data with engineered features |

| RETURNS | DESCRIPTION |
|---|---|
| float | Aggregated signal score (0-100) |
Source code in fplx/signals/stats.py
batch_compute
¶
Compute signals for multiple players.
| PARAMETER | DESCRIPTION |
|---|---|
| players_data | Dictionary mapping player ID/name to their data |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, float] | Dictionary of player signals |
Source code in fplx/signals/stats.py
timeseries
¶
Time-series feature engineering and transformations.
FeatureEngineer
¶
Feature engineering pipeline for player time-series data.
| PARAMETER | DESCRIPTION |
|---|---|
config
|
Feature configuration dictionary
TYPE:
|
Source code in fplx/timeseries/features.py
fit_transform
¶
Apply all feature engineering transformations.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Input player timeseries data
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
Transformed data with engineered features |
Source code in fplx/timeseries/features.py
get_feature_names
¶
Get list of all generated feature names.
| PARAMETER | DESCRIPTION |
|---|---|
base_columns
|
Base column names
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[str]
|
Generated feature names |
Source code in fplx/timeseries/features.py
create_future_features
¶
Create features for future predictions.
This method extends the historical data by horizon periods,
applies the full feature engineering pipeline, and returns
the newly created future feature set.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Historical data
TYPE:
|
horizon
|
Number of future gameweeks to predict
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
DataFrame with features for future gameweeks |
Source code in fplx/timeseries/features.py
add_ewma_features
¶
add_ewma_features(
df: DataFrame,
columns: list[str],
alphas: list[float] = [0.3, 0.5, 0.7],
) -> DataFrame
Add exponentially weighted moving average features.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Input dataframe
TYPE:
|
columns
|
Columns to compute EWMA for
TYPE:
|
alphas
|
Smoothing factors (0 < alpha < 1)
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
DataFrame with EWMA features |
Source code in fplx/timeseries/transforms.py
add_lag_features
¶
Add lagged features to dataframe.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Input dataframe
TYPE:
|
columns
|
Columns to create lags for
TYPE:
|
lags
|
Lag periods
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
DataFrame with lagged features |
Source code in fplx/timeseries/transforms.py
add_rolling_features
¶
add_rolling_features(
df: DataFrame,
columns: list[str],
windows: list[int] = [3, 5, 10],
agg_funcs: list[str] = ["mean", "std"],
min_periods: int = 1,
) -> DataFrame
Add rolling window features to dataframe.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Input dataframe with time-series data
TYPE:
|
columns
|
Columns to compute rolling features for
TYPE:
|
windows
|
Window sizes for rolling computation
TYPE:
|
agg_funcs
|
Aggregation functions ('mean', 'std', 'min', 'max', 'sum')
TYPE:
|
min_periods
|
Minimum observations in window
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
DataFrame with added rolling features |
Source code in fplx/timeseries/transforms.py
add_trend_features
¶
Add trend features (slope) using linear regression.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Input dataframe
TYPE:
|
columns
|
Columns to compute trends for
TYPE:
|
windows
|
Window sizes for trend calculation
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
DataFrame
|
DataFrame with trend features |
Source code in fplx/timeseries/transforms.py
features
¶
Feature engineering pipeline for FPL time-series data.
FeatureEngineer
¶
Feature engineering pipeline for player time-series data.
| PARAMETER | DESCRIPTION |
|---|---|
| config | Feature configuration dictionary |
Source code in fplx/timeseries/features.py
fit_transform
¶
Apply all feature engineering transformations.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input player timeseries data |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | Transformed data with engineered features |
Source code in fplx/timeseries/features.py
get_feature_names
¶
Get list of all generated feature names.
| PARAMETER | DESCRIPTION |
|---|---|
| base_columns | Base column names |

| RETURNS | DESCRIPTION |
|---|---|
| list[str] | Generated feature names |
Source code in fplx/timeseries/features.py
create_future_features
¶
Create features for future predictions.
This method extends the historical data by horizon periods,
applies the full feature engineering pipeline, and returns
the newly created future feature set.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Historical data |
| horizon | Number of future gameweeks to predict |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with features for future gameweeks |
Source code in fplx/timeseries/features.py
transforms
¶
Time-series transformations for FPL data.
add_rolling_features
¶
add_rolling_features(
df: DataFrame,
columns: list[str],
windows: list[int] = [3, 5, 10],
agg_funcs: list[str] = ["mean", "std"],
min_periods: int = 1,
) -> DataFrame
Add rolling window features to dataframe.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe with time-series data |
| columns | Columns to compute rolling features for |
| windows | Window sizes for rolling computation |
| agg_funcs | Aggregation functions ('mean', 'std', 'min', 'max', 'sum') |
| min_periods | Minimum observations in window |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with added rolling features |
Source code in fplx/timeseries/transforms.py
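As a rough illustration of what rolling features of this kind look like, the sketch below builds them directly with pandas `rolling`. The helper name and the output column naming scheme (`{col}_rolling_{func}_{window}`) are assumptions for the example; the names fplx actually generates are not documented here.

```python
import pandas as pd


def rolling_features_sketch(
    df, columns, windows=(3, 5, 10), agg_funcs=("mean", "std"), min_periods=1
):
    """Hypothetical stand-in for add_rolling_features; column names are assumed."""
    out = df.copy()
    for col in columns:
        for window in windows:
            roller = df[col].rolling(window=window, min_periods=min_periods)
            for func in agg_funcs:
                # 'mean', 'std', 'min', 'max' and 'sum' are all Rolling methods.
                out[f"{col}_rolling_{func}_{window}"] = getattr(roller, func)()
    return out


points = pd.DataFrame({"points": [2, 6, 1, 9, 3]})
rolled = rolling_features_sketch(points, ["points"], windows=(3,), agg_funcs=("mean",))
```

With `min_periods=1`, early rows are averaged over whatever history exists instead of being left as NaN.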
add_lag_features
¶
Add lagged features to dataframe.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe |
| columns | Columns to create lags for |
| lags | Lag periods |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with lagged features |
Source code in fplx/timeseries/transforms.py
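Lag features are usually just shifted copies of a column. A minimal sketch with pandas `shift`, assuming a `{col}_lag_{lag}` naming scheme and default lags that are not confirmed by this reference:

```python
import pandas as pd


def lag_features_sketch(df, columns, lags=(1, 2)):
    """Hypothetical stand-in for add_lag_features; column names are assumed."""
    out = df.copy()
    for col in columns:
        for lag in lags:
            # Row t receives the value observed at row t - lag.
            out[f"{col}_lag_{lag}"] = df[col].shift(lag)
    return out


points = pd.DataFrame({"points": [2, 6, 1, 9]})
lagged = lag_features_sketch(points, ["points"], lags=(1,))
```

The first `lag` rows have no history to draw on and come back as NaN.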
add_ewma_features
¶
add_ewma_features(
df: DataFrame,
columns: list[str],
alphas: list[float] = [0.3, 0.5, 0.7],
) -> DataFrame
Add exponentially weighted moving average features.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe |
| columns | Columns to compute EWMA for |
| alphas | Smoothing factors (0 < alpha < 1) |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with EWMA features |
Source code in fplx/timeseries/transforms.py
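For reference, an exponentially weighted moving average with smoothing factor alpha follows the recursion y_t = alpha * x_t + (1 - alpha) * y_(t-1). The sketch below computes it with pandas `ewm`. The `adjust=False` setting and the `{col}_ewma_{alpha}` column names are assumptions; fplx may use different settings.

```python
import pandas as pd


def ewma_features_sketch(df, columns, alphas=(0.3, 0.5, 0.7)):
    """Hypothetical stand-in for add_ewma_features; column names are assumed."""
    out = df.copy()
    for col in columns:
        for alpha in alphas:
            # adjust=False gives the plain recursion, seeded with the first value.
            out[f"{col}_ewma_{alpha}"] = df[col].ewm(alpha=alpha, adjust=False).mean()
    return out


points = pd.DataFrame({"points": [2.0, 6.0, 4.0]})
smoothed = ewma_features_sketch(points, ["points"], alphas=(0.5,))
```

A larger alpha weights recent gameweeks more heavily, so the feature reacts faster to a change in form.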
add_trend_features
¶
Add trend features (slope) using linear regression.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe |
| columns | Columns to compute trends for |
| windows | Window sizes for trend calculation |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with trend features |
Source code in fplx/timeseries/transforms.py
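A rolling slope can be obtained by fitting a straight line to each window. The sketch below does this with `numpy.polyfit` inside pandas `rolling().apply`. The helper names, the `{col}_trend_{window}` column names, and the choice of `min_periods=2` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def _window_slope(values):
    """Least-squares slope of the window values against 0, 1, 2, ..."""
    x = np.arange(len(values))
    return float(np.polyfit(x, values, 1)[0])


def trend_features_sketch(df, columns, windows=(5,)):
    """Hypothetical stand-in for add_trend_features; column names are assumed."""
    out = df.copy()
    for col in columns:
        for window in windows:
            # At least two points are needed to define a slope.
            out[f"{col}_trend_{window}"] = (
                df[col].rolling(window, min_periods=2).apply(_window_slope, raw=True)
            )
    return out


points = pd.DataFrame({"points": [1.0, 3.0, 5.0, 7.0]})
trend = trend_features_sketch(points, ["points"], windows=(3,))
```

A positive slope indicates improving output over the window and a negative slope a decline.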
add_diff_features
¶
Add difference features (current - previous).
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe |
| columns | Columns to compute differences for |
| periods | Difference periods |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with difference features |
Source code in fplx/timeseries/transforms.py
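Difference features subtract an earlier observation from the current one. A minimal sketch with pandas `diff`, assuming a `{col}_diff_{period}` naming scheme and a default period of 1 that are not confirmed here:

```python
import pandas as pd


def diff_features_sketch(df, columns, periods=(1,)):
    """Hypothetical stand-in for add_diff_features; column names are assumed."""
    out = df.copy()
    for col in columns:
        for period in periods:
            # Value at row t minus value at row t - period.
            out[f"{col}_diff_{period}"] = df[col].diff(periods=period)
    return out


points = pd.DataFrame({"points": [2, 6, 1]})
diffed = diff_features_sketch(points, ["points"])
```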
add_consistency_features
¶
Add consistency measures (coefficient of variation).
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe |
| columns | Columns to measure consistency for |
| window | Window size |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | DataFrame with consistency features |
Source code in fplx/timeseries/transforms.py
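The coefficient of variation is the standard deviation divided by the mean, so lower values indicate steadier returns. The sketch below computes it over a rolling window. The `{col}_cv_{window}` column names, the default window, and the handling of a zero mean (returned as NaN) are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def consistency_features_sketch(df, columns, window=5):
    """Hypothetical stand-in for add_consistency_features; names are assumed."""
    out = df.copy()
    for col in columns:
        roller = df[col].rolling(window, min_periods=1)
        mean = roller.mean()
        std = roller.std()  # sample standard deviation (ddof=1)
        # Avoid dividing by zero when a player scored nothing in the window.
        out[f"{col}_cv_{window}"] = std / mean.replace(0, np.nan)
    return out


points = pd.DataFrame({"points": [2.0, 4.0, 6.0]})
consistency = consistency_features_sketch(points, ["points"], window=3)
```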
utils
¶
Utility modules.
Config
¶
Configuration manager for FPLX.
| PARAMETER | DESCRIPTION |
|---|---|
config
|
Configuration dictionary
TYPE:
|
Source code in fplx/utils/config.py
get
¶
Get configuration value.
| PARAMETER | DESCRIPTION |
|---|---|
key
|
Configuration key (supports nested keys with '.')
TYPE:
|
default
|
Default value if key not found
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Any
|
Configuration value |
Source code in fplx/utils/config.py
set
¶
Set configuration value.
| PARAMETER | DESCRIPTION |
|---|---|
key
|
Configuration key (supports nested keys with '.')
TYPE:
|
value
|
Value to set
TYPE:
|
Source code in fplx/utils/config.py
load_from_file
¶
Load configuration from JSON file.
| PARAMETER | DESCRIPTION |
|---|---|
filepath
|
Path to configuration file
TYPE:
|
Source code in fplx/utils/config.py
save_to_file
¶
Save configuration to JSON file.
| PARAMETER | DESCRIPTION |
|---|---|
filepath
|
Path to save configuration
TYPE:
|
to_dict
¶
validate_data
¶
Validate that dataframe has required columns.
| PARAMETER | DESCRIPTION |
|---|---|
df
|
Dataframe to validate
TYPE:
|
required_columns
|
Required column names
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
bool
|
True if valid |
Source code in fplx/utils/validation.py
config
¶
Configuration management.
Config
¶
Configuration manager for FPLX.
| PARAMETER | DESCRIPTION |
|---|---|
| config | Configuration dictionary |
Source code in fplx/utils/config.py
get
¶
Get configuration value.
| PARAMETER | DESCRIPTION |
|---|---|
| key | Configuration key (supports nested keys with '.') |
| default | Default value if key not found |

| RETURNS | DESCRIPTION |
|---|---|
| Any | Configuration value |
Source code in fplx/utils/config.py
set
¶
Set configuration value.
| PARAMETER | DESCRIPTION |
|---|---|
| key | Configuration key (supports nested keys with '.') |
| value | Value to set |
Source code in fplx/utils/config.py
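Dotted keys such as 'a.b' are commonly resolved by walking nested dictionaries one segment at a time. The sketch below shows that idea for both reading and writing. The function names and the example keys (`optimizer.budget`, `model.type`) are hypothetical and are not taken from the fplx configuration schema.

```python
from typing import Any


def get_nested(config: dict, key: str, default: Any = None) -> Any:
    """Look up 'a.b.c' in nested dicts, returning default if any part is missing."""
    node: Any = config
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def set_nested(config: dict, key: str, value: Any) -> None:
    """Set 'a.b.c', creating intermediate dicts as needed."""
    parts = key.split(".")
    node = config
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value


cfg = {"optimizer": {"budget": 100.0}}  # illustrative keys only
budget = get_nested(cfg, "optimizer.budget")
fallback = get_nested(cfg, "optimizer.missing", default=5)
set_nested(cfg, "model.type", "inference")
```

Note that this simple `set_nested` assumes every intermediate value is a dictionary; it does not guard against overwriting through a scalar.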
load_from_file
¶
Load configuration from JSON file.
| PARAMETER | DESCRIPTION |
|---|---|
| filepath | Path to configuration file |
Source code in fplx/utils/config.py
save_to_file
¶
Save configuration to JSON file.
| PARAMETER | DESCRIPTION |
|---|---|
| filepath | Path to save configuration |
to_dict
¶
validation
¶
Data validation utilities.
validate_data
¶
Validate that dataframe has required columns.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Dataframe to validate |
| required_columns | Required column names |

| RETURNS | DESCRIPTION |
|---|---|
| bool | True if valid |
Source code in fplx/utils/validation.py
check_data_quality
¶
Check data quality and report issues.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Data to check |
| max_missing_pct | Maximum acceptable percentage of missing values |

| RETURNS | DESCRIPTION |
|---|---|
| Dict[str, float] | Quality metrics |
Source code in fplx/utils/validation.py
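A quality report of this shape could be assembled from per-column missing-value rates, as in the sketch below. The metric names, the default threshold, and the interpretation of `max_missing_pct` as a fraction between 0 and 1 are assumptions; this reference does not list the metrics fplx actually returns.

```python
import pandas as pd


def check_data_quality_sketch(df, max_missing_pct=0.5):
    """Hypothetical quality report; metric names are illustrative only."""
    missing = df.isna().mean()  # fraction of missing values per column
    return {
        "n_rows": float(len(df)),
        "worst_missing_pct": float(missing.max()) if len(missing) else 0.0,
        "n_columns_over_limit": float((missing > max_missing_pct).sum()),
    }


history = pd.DataFrame(
    {"points": [1.0, None, 3.0, None], "minutes": [90, 90, 90, 90]}
)
report = check_data_quality_sketch(history, max_missing_pct=0.4)
```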
impute_missing
¶
Impute missing values.
| PARAMETER | DESCRIPTION |
|---|---|
| df | Data with missing values |
| strategy | Imputation strategy: 'mean', 'median', 'forward_fill', 'zero' |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | Data with imputed values |
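The four strategies listed above map naturally onto standard pandas operations. The sketch below is one possible mapping, not the fplx implementation; the function name and the decision to raise `ValueError` on an unknown strategy are assumptions.

```python
import pandas as pd


def impute_missing_sketch(df, strategy="mean"):
    """Hypothetical stand-in for impute_missing using plain pandas calls."""
    if strategy == "mean":
        return df.fillna(df.mean(numeric_only=True))
    if strategy == "median":
        return df.fillna(df.median(numeric_only=True))
    if strategy == "forward_fill":
        return df.ffill()  # carry the last observed value forward
    if strategy == "zero":
        return df.fillna(0)
    raise ValueError(f"Unknown imputation strategy: {strategy!r}")


minutes = pd.DataFrame({"minutes": [90.0, None, 60.0]})
by_mean = impute_missing_sketch(minutes, "mean")
by_ffill = impute_missing_sketch(minutes, "forward_fill")
by_zero = impute_missing_sketch(minutes, "zero")
```

For time-ordered player data, forward fill avoids using information from later gameweeks, which mean and median imputation over the full column do not.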