FPL AI Predictor

Full-stack ML application that predicts Fantasy Premier League player points and provides AI-driven gameweek strategy. An XGBoost model trained on per-gameweek rolling averages makes the primary predictions; an SGDRegressor handles incremental online updates.

Python 3.11, XGBoost, scikit-learn, SGDRegressor, FastAPI, pandas, numpy, joblib, React 19, Vite, Tailwind CSS v4, Recharts, pytest, GitHub Actions

Role

ML Engineer & Full-stack Developer

Team

Solo

Company/Organization

Personal Project

The Problem

•

FPL managers making weekly captain picks, transfers, and chip decisions rely on manual fixture analysis and community opinion — no data-driven...

•

The official FPL app shows raw stats but no next-gameweek point predictions, no fixture-adjusted recommendations, and no risk profiling for captain...

•

Predicting FPL points required per-gameweek feature engineering (rolling averages, form, FDR, home/away) from the official FPL API, which returns...

•

Incremental model updates after each gameweek — incorporating the latest results without full retraining — required an online learning approach...

•

Captain selection involves multi-factor reasoning: predicted points, fixture difficulty, home/away advantage, rotation risk, opponent defensive...

•

Chip timing (wildcard, bench boost, triple captain, free hit) depends on fixture swings, squad state, and transfer debt — no open-source FPL tool...

The Solution

•

Built a full ML pipeline from live data ingestion to interactive React dashboard with a multi-layer decision engine.

Data Pipeline (data_pipeline.py)

•

Fetches live FPL data from three official API endpoints: bootstrap-static (all players, teams, gameweek info), fixtures (full season schedule with...

•

Feature engineering per player per gameweek: rolling 3-GW and 5-GW average points, rolling average minutes, rolling average goals/assists/clean...

•

Outputs structured CSV datasets for training.
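The rolling-average features described above can be sketched with pandas as follows. This is a minimal illustration, not the contents of data_pipeline.py: the column names (`player_id`, `gameweek`, `points`, `minutes`) and the function name are assumptions, and the `shift(1)` step is one common way to make sure a row only sees gameweeks before it.

```python
import pandas as pd


def add_rolling_features(history: pd.DataFrame) -> pd.DataFrame:
    """Add per-player rolling averages that only look at past gameweeks.

    Expects one row per player per gameweek with the (assumed) columns
    player_id, gameweek, points, minutes.
    """
    df = history.sort_values(["player_id", "gameweek"]).reset_index(drop=True)
    for window in (3, 5):
        # shift(1) excludes the current gameweek, so the feature for GW n
        # is built from GW n-1 and earlier only.
        df[f"points_avg_{window}"] = df.groupby("player_id")["points"].transform(
            lambda s, w=window: s.shift(1).rolling(w, min_periods=1).mean()
        )
    df["minutes_avg_3"] = df.groupby("player_id")["minutes"].transform(
        lambda s: s.shift(1).rolling(3, min_periods=1).mean()
    )
    return df


# Tiny made-up history for two players.
history = pd.DataFrame(
    {
        "player_id": [1, 1, 1, 1, 2, 2],
        "gameweek": [1, 2, 3, 4, 1, 2],
        "points": [2, 6, 10, 3, 1, 5],
        "minutes": [90, 90, 75, 60, 30, 90],
    }
)
features = add_rolling_features(history)
print(features)
```

A player's first gameweek has no history, so its rolling features are NaN and would need to be dropped or imputed before training.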

ML Models

•

XGBoost (train_model.py) — primary prediction model. Features: rolling averages (3-GW, 5-GW points/minutes), FDR, home/away, position one-hot...

•

SGDRegressor (incremental_trainer.py) — online learning model for incremental updates. After each gameweek, new results are used to `partial_fit`...

•

Prediction blending — final next-GW prediction blends XGBoost (primary) and SGDRegressor (secondary) outputs weighted by recency.
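The incremental update and the blend can be illustrated roughly as below. This is a sketch under stated assumptions: the synthetic data, the `update_after_gameweek` and `blend` helpers, the fixed `sgd_weight=0.3`, and the stand-in XGBoost predictions are all hypothetical, and the project's recency weighting is not reproduced here. The scikit-learn calls used (`SGDRegressor.partial_fit`, `StandardScaler.partial_fit`) are the library's standard incremental-learning methods.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# SGD is sensitive to feature scale, so the scaler is updated incrementally too.
scaler = StandardScaler()
sgd = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)


def update_after_gameweek(X_new: np.ndarray, y_new: np.ndarray) -> None:
    """Fold one gameweek of results into the online model without retraining."""
    scaler.partial_fit(X_new)
    sgd.partial_fit(scaler.transform(X_new), y_new)


def blend(xgb_pred, sgd_pred, sgd_weight: float = 0.3) -> np.ndarray:
    """Weighted average of the two models (the weight is a placeholder)."""
    return (1.0 - sgd_weight) * np.asarray(xgb_pred) + sgd_weight * np.asarray(sgd_pred)


# Synthetic stand-in for five gameweeks of (features, actual points).
rng = np.random.default_rng(0)
true_coef = np.array([1.5, 0.5, -0.8, 0.3])
for gameweek in range(1, 6):
    X_gw = rng.normal(size=(50, 4))
    y_gw = 2.0 + X_gw @ true_coef + rng.normal(scale=0.5, size=50)
    update_after_gameweek(X_gw, y_gw)

X_next = rng.normal(size=(3, 4))
sgd_pred = sgd.predict(scaler.transform(X_next))
xgb_pred = np.array([4.2, 2.1, 6.0])  # stand-in for the XGBoost model's output
final_pred = blend(xgb_pred, sgd_pred)
print(final_pred)
```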

Scoring Engine (scoring_engine.py)

•

Implements official FPL 2025/26 scoring rules. Goals: DEF = 6pts, MID = 5pts, FWD = 4pts; GK goals = 10pts (new 2025/26 rule). Clean sheets:...
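As a small illustration, the goal-scoring part of these rules fits in a lookup table. The function name and structure are assumptions, and only the goal values quoted above are encoded (goalkeepers use the 10-point value); the remaining rules are not reproduced here.

```python
# Points per goal by position, using the values quoted above.
GOAL_POINTS = {"GK": 10, "DEF": 6, "MID": 5, "FWD": 4}


def goal_points(position: str, goals: int) -> int:
    """Return the points earned from goals alone for one player."""
    if goals < 0:
        raise ValueError("goals must be non-negative")
    if position not in GOAL_POINTS:
        raise ValueError(f"unknown position: {position!r}")
    return GOAL_POINTS[position] * goals


print(goal_points("MID", 2))
print(goal_points("GK", 1))
```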

Decision Engine (decision_engine.py)

•

Fixture-adjusted predictions — multiply raw model prediction by FDR adjustment factor (FDR 1 = ×1.2, FDR 5 = ×0.7).

•

Rotation penalty: players whose rolling average is under 60 minutes, or who are flagged as rotation risks, have their predictions scaled down.

•

Captain/VC ranking — score each outfield player by: fixture-adjusted prediction × form multiplier × home advantage × historical haul frequency....

•

Transfer simulation — for each proposed transfer (out, in, cost), compute expected points delta over next 3 GWs vs. -4pt hit cost. Recommend...

•

Chip recommendations — evaluate chip value:

•

Wildcard: trigger if squad expected points (next 3 GWs) fall below the league average by more than a set threshold

•

Bench Boost: trigger if bench players have high expected points (double GW or strong bench fixtures)

•

Triple Captain: trigger if top captain candidate has exceptional fixture (FDR 1, home, great form)

•

Free Hit: trigger on blank/double gameweeks affecting >3 squad players

•

Wildcard/Free Hit builder — when wildcard or free hit is recommended, builds optimal 15-player squad from scratch: enumerate top predicted...
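Two of the decision-engine steps above, the FDR adjustment and the transfer hit check, can be sketched as follows. Only the endpoints (FDR 1 = ×1.2, FDR 5 = ×0.7) and the 4-point hit come from the description; the linear interpolation for FDR 2 to 4, the function names, and the rule that a free transfer costs nothing are assumptions made for illustration.

```python
def fdr_factor(fdr: int) -> float:
    """Multiplier for a fixture difficulty rating from 1 (easy) to 5 (hard).

    Assumes a straight line between the two stated endpoints.
    """
    if not 1 <= fdr <= 5:
        raise ValueError("FDR must be between 1 and 5")
    return 1.2 - (fdr - 1) * (1.2 - 0.7) / 4


def fixture_adjusted(raw_prediction: float, fdr: int) -> float:
    """Scale a raw model prediction by the fixture multiplier."""
    return raw_prediction * fdr_factor(fdr)


def transfer_net_gain(out_preds, in_preds, free_transfers: int, hit_cost: float = 4.0) -> float:
    """Expected points gained over the horizon, minus any hit.

    out_preds / in_preds hold per-gameweek predictions (e.g. the next 3 GWs)
    for the player leaving and the player arriving.
    """
    hit = 0.0 if free_transfers > 0 else hit_cost
    return sum(in_preds) - sum(out_preds) - hit


print(fixture_adjusted(5.0, 1))
print(transfer_net_gain([3.0, 3.0, 3.0], [5.0, 5.0, 5.0], free_transfers=0))
```

Under this sketch a transfer is worth making only when the net gain is positive, which is why a paid transfer needs to clear the 4-point hit over the horizon.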

Monte Carlo Simulation (monte_carlo.py)

•

1,000-run simulation of team total points for the upcoming gameweek.

•

Each run: sample each player's points from a distribution (mean = model prediction, std = historical standard deviation for that player/position).

•

Apply starting XI selection (highest expected points, formation constraints).

•

Track captain selection: each run picks the player who scored highest as captain, doubles their score.

•

Output: score distribution histogram, P(score > 60), P(captain haul: captain score > 15), P(captain blank: captain score < 3).
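The simulation loop above can be sketched with the standard library alone. This is a simplified illustration, not monte_carlo.py: it assumes the starting XI is already chosen (formation constraints are skipped), draws points from a normal distribution, and measures captain haul and blank on the captain's score before doubling. The function name, the seed parameter, and the sample inputs are hypothetical.

```python
import random


def simulate_gameweek(players, runs: int = 1000, seed: int = 42) -> dict:
    """Simulate team totals for a fixed starting XI.

    players: list of (mean, std) pairs, one per starter.
    """
    rng = random.Random(seed)
    totals, captain_scores = [], []
    for _ in range(runs):
        scores = [rng.gauss(mean, std) for mean, std in players]
        captain = max(scores)  # best scorer in this run wears the armband
        totals.append(sum(scores) + captain)  # adding it again doubles it
        captain_scores.append(captain)
    return {
        "totals": totals,
        "p_score_over_60": sum(t > 60 for t in totals) / runs,
        "p_captain_haul": sum(c > 15 for c in captain_scores) / runs,
        "p_captain_blank": sum(c < 3 for c in captain_scores) / runs,
    }


# Made-up XI: (predicted points, historical standard deviation).
starting_xi = [(3.5, 2.0), (4.0, 2.5), (4.2, 2.5), (3.8, 2.2), (5.5, 3.5),
               (6.0, 4.0), (4.8, 3.0), (4.5, 3.0), (7.0, 4.5), (5.2, 3.8),
               (4.9, 3.6)]
result = simulate_gameweek(starting_xi)
print(result["p_score_over_60"], result["p_captain_haul"], result["p_captain_blank"])
```

Fixing the seed makes a run reproducible, which helps when the output feeds a chart or a test.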

Live Data Service (live_data_service.py)

•

Background thread refreshing FPL API data every 10 minutes.

•

Caches bootstrap-static and fixtures in memory.

•

Exposes current GW number, deadline, all player stats (injuries, news, availability, price), fixture list.

•

No external API key required — FPL API is public.
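A background-refresh cache of this kind might look roughly like the sketch below. The class name, the injected `fetch` callable, and the fake payload are assumptions; no real FPL request is made here, and the actual live_data_service.py may be organised differently.

```python
import threading
from typing import Any, Callable


class RefreshingCache:
    """Keep the latest snapshot in memory and refresh it on a timer."""

    def __init__(self, fetch: Callable[[], dict], interval_s: float = 600.0):
        self._fetch = fetch
        self._interval_s = interval_s
        self._lock = threading.Lock()
        self._data: dict = {}
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def refresh(self) -> None:
        """Fetch once; on failure keep serving the last good snapshot."""
        try:
            fresh = self._fetch()
        except Exception:
            return
        with self._lock:
            self._data = fresh

    def _run(self) -> None:
        while not self._stop.is_set():
            self.refresh()
            # wait() returns early when stop() is called, so shutdown is prompt.
            self._stop.wait(self._interval_s)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def get(self, key: str, default: Any = None) -> Any:
        with self._lock:
            return self._data.get(key, default)


def fake_fetch() -> dict:
    # Placeholder for the real API calls (bootstrap-static, fixtures).
    return {"current_gw": 12, "players": []}


cache = RefreshingCache(fake_fetch, interval_s=600.0)
cache.refresh()  # a service would call cache.start() instead
print(cache.get("current_gw"))
```

Swapping the whole dictionary under a lock means readers never see a half-updated snapshot.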

Backend API (api.py — FastAPI, 5 endpoints)

•

`GET /health` — health check.

•

`GET /status` — current GW number, deadline timestamp, all fixtures, team names.

•

`GET /players` — all 600+ players with live stats (form, price, injuries, news, predicted points).

•

`POST /analyze-squad` — accepts 15 player IDs + ITB + free transfers + chips available. Returns: optimal starting XI, bench order, captain +...

•

`POST /optimize-team` — full strategy report: all captain candidates ranked, all transfer options evaluated, chip timing analysis, and (if...
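The input that `POST /analyze-squad` accepts could be validated along the following lines. This is a framework-agnostic sketch using a dataclass; in a FastAPI app the same checks would more typically live in a Pydantic request model. The field names and chip identifiers are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field

# Chip identifiers are hypothetical spellings of the four FPL chips.
KNOWN_CHIPS = frozenset({"wildcard", "bench_boost", "triple_captain", "free_hit"})


@dataclass(frozen=True)
class AnalyzeSquadRequest:
    player_ids: tuple
    itb: float  # money in the bank
    free_transfers: int
    chips_available: frozenset = field(default_factory=frozenset)

    def __post_init__(self) -> None:
        if len(self.player_ids) != 15:
            raise ValueError("a squad needs exactly 15 player IDs")
        if len(set(self.player_ids)) != 15:
            raise ValueError("player IDs must be unique")
        if self.itb < 0:
            raise ValueError("ITB cannot be negative")
        if self.free_transfers < 0:
            raise ValueError("free transfers cannot be negative")
        unknown = set(self.chips_available) - KNOWN_CHIPS
        if unknown:
            raise ValueError(f"unknown chips: {sorted(unknown)}")


request = AnalyzeSquadRequest(
    player_ids=tuple(range(1, 16)),
    itb=1.5,
    free_transfers=1,
    chips_available=frozenset({"wildcard", "free_hit"}),
)
print(request)
```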

Frontend (React 19 + Vite + Tailwind CSS v4)

•

TeamBuilder.jsx — squad input page: search and add players by name, set ITB and free transfers, select available chips. Validates squad (15...

•

Analysis.jsx — full AI analysis dashboard: captain recommendation card, transfer suggestion, chip advice, Monte Carlo chart, fixture difficulty...

•

Pitch.jsx — interactive football pitch with drag-and-drop player positions (@hello-pangea/dnd). Shows injury flags, FDR badges, predicted points,...

•

PredictionChart.jsx — Recharts bar chart of predicted points for all starting XI players.

•

CaptainChart.jsx — Recharts chart ranking top captain candidates by fixture-adjusted prediction score.

•

MonteCarloChart.jsx — Recharts area chart showing 1,000-run score distribution with haul/blank probability markers.

•

TransferImpact.jsx — shows recommended transfer (out → in), expected delta over 3 GWs, hit cost if applicable.

•

FixtureDifficulty.jsx — colour-coded FDR badges (green = easy, red = hard) for next 5 gameweeks.

Testing (176 unit tests, 16 test files)

•

CI-safe: all tests run without a trained model (mock model fixtures).

•

Coverage: data_pipeline.py: feature engineering, rolling averages, FDR assignment; scoring_engine.py: all 2025/26 scoring rule combinations (GK...
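A CI-safe test with a mocked model might look like the sketch below. The `predict_points` wrapper and the test name are hypothetical stand-ins for the project's own code; the point is that `unittest.mock.MagicMock` replaces the trained model, so no model.pkl is needed.

```python
from unittest.mock import MagicMock


def predict_points(model, feature_rows):
    """Hypothetical wrapper around a trained model's predict() call."""
    return [round(float(p), 2) for p in model.predict(feature_rows)]


def test_predict_points_uses_model_output():
    # The mock stands in for the trained model, so the test needs no file on disk.
    model = MagicMock()
    model.predict.return_value = [4.254, 2.0]
    rows = [[5.0, 80.0, 2], [1.0, 20.0, 4]]

    assert predict_points(model, rows) == [4.25, 2.0]
    model.predict.assert_called_once_with(rows)
```

pytest would collect the function by its `test_` prefix; the mock also records how it was called, which lets the test check the inputs passed to the model.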

Design Decisions

•

XGBoost as primary model over LinearRegression — handles non-linear feature interactions (e.g., FDR × form × home/away) better than linear models....

•

SGDRegressor for incremental updates — avoids full retraining after each gameweek. `partial_fit` updates model weights from new results in seconds,...

•

Decision engine as a separate module (decision_engine.py) — keeps strategy logic (captain ranking, transfer evaluation, chip timing, wildcard...

•

Monte Carlo simulation over deterministic point ranges — captures the variance of FPL scoring (a player can score 2 or 20 points with the same...

•

Live FPL API with background refresh thread — the official FPL API is public and returns all player/fixture data in two endpoints. Background thread...

•

Greedy wildcard builder over integer programming — sorting by predicted pts / price ratio and applying FPL constraints greedily is fast (~ms) and...

•

176 CI-safe unit tests with mock model — all tests run without a trained model.pkl file (mocked via unittest.mock). Enables CI to validate logic...

•

Two GitHub Actions workflows (CI + CD) — CI on every push/PR (lint + tests + build), CD on main/master (full build + artifact upload). Separation...

•

Recharts over Chart.js/D3: a charting library built natively for React, with composable components, good TypeScript support, and sufficient for the 5 chart types needed...

•

@hello-pangea/dnd for pitch drag-and-drop — maintained fork of react-beautiful-dnd, actively supported for React 18+. Provides accessible...
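The greedy wildcard builder mentioned in the design decisions above can be sketched as follows. It sorts by predicted points per unit price and applies the standard FPL squad constraints (2 GK, 5 DEF, 5 MID, 3 FWD, at most 3 players per club, £100.0m budget). The `Player` type, the function name, and the sample pool are assumptions; the sketch does not reserve budget for unfilled slots, so with a tight budget it can return fewer than 15 players.

```python
from collections import Counter
from typing import NamedTuple


class Player(NamedTuple):
    name: str
    position: str
    team: str
    price: float
    predicted: float


QUOTA = {"GK": 2, "DEF": 5, "MID": 5, "FWD": 3}


def build_squad(pool, budget: float = 100.0, max_per_team: int = 3):
    """Pick players greedily by predicted points per unit price."""
    squad = []
    by_position, by_team = Counter(), Counter()
    remaining = budget
    for p in sorted(pool, key=lambda p: p.predicted / p.price, reverse=True):
        if by_position[p.position] >= QUOTA[p.position]:
            continue  # position already full
        if by_team[p.team] >= max_per_team:
            continue  # club limit reached
        if p.price > remaining:
            continue  # cannot afford
        squad.append(p)
        by_position[p.position] += 1
        by_team[p.team] += 1
        remaining -= p.price
        if len(squad) == sum(QUOTA.values()):
            break
    return squad


# Synthetic pool: 8 clubs, one player per position each.
PRICES = {"GK": 4.5, "DEF": 5.0, "MID": 6.5, "FWD": 6.5}
BASE = {"GK": 3.5, "DEF": 4.0, "MID": 5.0, "FWD": 5.5}
pool = [
    Player(f"{pos}-{i}", pos, f"Club{i}", PRICES[pos], BASE[pos] + 0.1 * i)
    for i in range(8)
    for pos in QUOTA
]
squad = build_squad(pool)
print(len(squad), round(sum(p.price for p in squad), 1))
```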

Tradeoffs & Constraints

•

Model accuracy ceiling — XGBoost on rolling averages predicts average form well but can't account for unexpected events (injuries in warm-up,...

•

model.pkl gitignored — model must be generated locally via `make pipeline` before the app runs. Adds a 2-minute setup step. Production deployment...

•

Greedy wildcard builder — near-optimal but not guaranteed optimal. A full integer linear program (pulp or ortools) would guarantee the optimal squad...

•

FPL API rate limits — the official FPL API has undocumented rate limits (~10 req/min per IP). The 10-minute background refresh TTL keeps requests...

•

Monte Carlo variance: 1,000 runs provide stable probability estimates but add ~200ms to /optimize-team response time. Could be reduced to 500 runs...

•

SGDRegressor as secondary model only — SGD is sensitive to feature scaling and learning rate. Used as a blend component (not primary) to avoid...

•

No persistent database — player predictions and historical analysis stored in-memory and in CSV files. Adding PostgreSQL/Firestore would enable...

•

Would improve: Add LSTM/time-series model for better form capture, implement per-user squad persistence (PostgreSQL), add Slack/WhatsApp alerts for...

Outcome & Impact

•

Full-stack FPL analytics platform with XGBoost predictions, incremental SGDRegressor updates, decision engine, Monte Carlo risk profiling, and React...

•

XGBoost model trained on per-gameweek rolling averages (3-GW, 5-GW points/minutes, FDR, home/away, position, team strength) predicts next-GW points...

•

SGDRegressor incremental trainer updates model weights after each gameweek via partial_fit — adapts to in-season form changes without full...

•

Decision engine covers complete FPL weekly workflow: fixture-adjusted predictions, rotation penalties, captain/VC ranking with confidence scores,...

•

1,000-run Monte Carlo simulation produces team score distribution with haul/blank probability markers — quantifies captain upside and blank gameweek...

•

Live FPL API integration with 10-minute background refresh — all 600+ players with current form, price, injuries, news, and fixture list available...

•

React 19 interactive pitch with drag-and-drop player positioning, injury flags, FDR badges, deadline countdown, and fixture difficulty strip — ...

•

5 FastAPI endpoints cover the full workflow: health, status, player list, squad analysis (starting XI, bench, captain, transfers, chips, Monte...

•

176 pytest unit tests across 16 test files validate all business logic without requiring a trained model — CI runs on every push in ~30s.

•

GitHub Actions CI/CD: ruff lint + 176 unit tests + frontend Vite build on every push/PR; full build + artifact upload on main/master.

Tech Stack

•

ML: XGBoost (primary prediction model), scikit-learn LinearRegression + SGDRegressor (incremental online learning), joblib (model serialisation)

•

Feature Engineering: pandas (per-GW rolling averages, FDR, home/away, form), numpy (numerical operations)

•

Live Data: Official FPL API (bootstrap-static, fixtures, element-summary — public, no auth required), background refresh thread (10-min TTL)

•

Decision Engine: Custom Python — fixture-adjusted predictions, rotation penalties, captain/VC ranking, transfer simulation, chip timing, greedy wildcard builder

•

Simulation: Monte Carlo (1,000-run team score distribution, haul/blank risk profiling)

•

Backend: FastAPI (5 endpoints), Uvicorn, Python 3.11

•

Frontend: React 19, Vite, Tailwind CSS v4, Recharts (bar/area/scatter charts), @hello-pangea/dnd (pitch drag-and-drop), Axios

•

Testing: pytest (176 unit tests, 16 test files, CI-safe with mock model fixtures), unittest.mock

•

CI/CD: GitHub Actions — CI (ruff lint + pytest + frontend build on every push/PR), CD (full build + artifact upload on main/master)

•

Automation: Makefile (setup, pipeline, dev, backend, frontend, test, test-all, lint, build-frontend, clean, all)
