Football Vision AI
Real-time football player detection, team classification, player tracking, match statistics, and heatmap generation — powered by YOLOv8, DeepSort, and K-Means clustering. FastAPI backend with image and streaming NDJSON video endpoints.
Role
ML Engineer & Full-stack Developer
Team
Solo
Company/Organization
Personal Project
The Problem
Automated football video analysis required combining object detection, multi-object tracking, team classification, and statistical aggregation — ...
Jersey-colour-based team classification was unstable: re-running K-Means each frame caused cluster assignments to randomly flip, assigning the same...
DeepSort tracking required integrating YOLO bounding boxes with the tracker's expected format and maintaining stable IDs across frames with...
Match statistics (possession %, pitch coverage %, attacking third %) required aggregating per-frame detection data consistently, including...
Streaming annotated video analysis required a non-blocking API design — processing a 90-minute video synchronously would time out any HTTP client.
No open-source project combined all of: YOLOv8 inference, stateful team classification, DeepSort tracking, statistics engine, heatmap generation,...
The Solution
Built an end-to-end football vision pipeline with stateful team classification and streaming API.
Detection Pipeline (detect.py)
YOLOv8 Inference: Ultralytics YOLOv8 model, custom-trained when available and otherwise falling back to yolov8n.pt. Detects 4 classes: player (0), ball (1), goalkeeper...
Team Classification (team_classifier.py), a stateful jersey-colour classifier:
- First frame: extract torso crop from each player bounding box, compute mean RGB colour, run K-Means (k=2) to initialise two cluster centroids.
- Subsequent frames: assign each player to the nearest centroid (Euclidean distance in RGB space) — no re-clustering. Centroids locked for the entire...
- Goalkeepers and referees classified separately to avoid polluting team centroids.
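The initialise-once, assign-thereafter behaviour described above can be sketched as follows. This is a minimal, dependency-free illustration and not the project's code: the write-up says the real classifier uses scikit-learn's K-Means, whereas `two_means` here is a small stand-in, and the class and method names are hypothetical. Inputs are assumed to be mean RGB tuples already extracted from torso crops of outfield players only.

```python
import math
from typing import List, Optional, Sequence, Tuple

Colour = Tuple[float, float, float]  # mean RGB of a torso crop


def _dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two RGB colours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(colours: Sequence[Colour]) -> Colour:
    """Component-wise mean of a non-empty list of colours."""
    n = len(colours)
    return tuple(sum(c[i] for c in colours) / n for i in range(3))


def two_means(colours: Sequence[Colour], iters: int = 10) -> List[Colour]:
    """Stand-in for K-Means with k=2 (assumption: not the scikit-learn call)."""
    # Deterministic seeding: start from the two most distant colours.
    a, b = max(
        ((p, q) for p in colours for q in colours),
        key=lambda pq: _dist(pq[0], pq[1]),
    )
    centroids = [a, b]
    for _ in range(iters):
        groups: Tuple[list, list] = ([], [])
        for c in colours:
            nearest = 0 if _dist(c, centroids[0]) <= _dist(c, centroids[1]) else 1
            groups[nearest].append(c)
        # Keep the old centroid if a group ends up empty.
        centroids = [_mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids


class StatefulTeamClassifier:
    """Fits two centroids on the first usable frame, then never re-clusters."""

    def __init__(self) -> None:
        self.centroids: Optional[List[Colour]] = None

    def classify(self, torso_colours: Sequence[Colour]) -> List[Optional[int]]:
        if self.centroids is None:
            if len(torso_colours) < 2:
                return [None] * len(torso_colours)  # not enough players to fit yet
            self.centroids = two_means(torso_colours)  # locked from here on
        return [
            min((0, 1), key=lambda team: _dist(c, self.centroids[team]))
            for c in torso_colours
        ]
```

Because later calls only measure distance to the stored centroids, team labels 0 and 1 keep the same meaning for the whole video, which is the property the flipping problem lacked.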
Player Tracking (tracker.py) — DeepSort multi-object tracker. Converts YOLO bounding boxes to DeepSort format, updates tracker each frame, maps...
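As an illustration of the box-format conversion, the sketch below turns corner-style boxes `(x1, y1, x2, y2)` into `([left, top, width, height], confidence, class_id)` tuples. That target layout is an assumption: it matches what the deep-sort-realtime package accepts, but the write-up does not say which DeepSort implementation is used. The function name is hypothetical, and the 0.5 threshold mirrors the confidence cut-off listed in the tech stack.

```python
from typing import List, Sequence, Tuple

# ([left, top, width, height], confidence, class_id)
TrackerDetection = Tuple[List[float], float, int]


def yolo_to_deepsort(
    boxes_xyxy: Sequence[Sequence[float]],
    confidences: Sequence[float],
    class_ids: Sequence[int],
    min_conf: float = 0.5,
) -> List[TrackerDetection]:
    """Convert corner-format boxes to left/top/width/height detections."""
    detections: List[TrackerDetection] = []
    for (x1, y1, x2, y2), conf, cls in zip(boxes_xyxy, confidences, class_ids):
        if conf < min_conf:
            continue  # drop low-confidence detections before tracking
        width, height = x2 - x1, y2 - y1
        if width <= 0 or height <= 0:
            continue  # skip degenerate boxes
        detections.append(
            ([float(x1), float(y1), float(width), float(height)], float(conf), int(cls))
        )
    return detections
```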
Statistics Engine (stats_engine.py) — Per-frame aggregation:
- Player counts: median-capped at 11 per team to suppress spurious detections.
- Ball possession: assigned to the team with the nearest player to the ball centroid.
- Pitch coverage %: bounding box area of all players / total frame area.
- Attacking third %: share of detected players positioned in the attacking third of the pitch.
- Possession timeline: per-bucket (every N frames) possession chart data.
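The aggregation rules in this list can be sketched roughly as below. Several details are assumptions rather than facts from the project: teams are labelled 0 and 1, boxes are `(x1, y1, x2, y2)` in pixels, "median-capped" is read as "median of the per-frame counts, limited to 11", coverage sums box areas without correcting for overlap, and all function names are hypothetical.

```python
import math
from collections import Counter
from statistics import median
from typing import Dict, List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def box_centre(box: Box) -> Tuple[float, float]:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def capped_player_count(per_frame_counts: Sequence[int], cap: int = 11) -> int:
    """Median of the per-frame counts for one team, limited to `cap`."""
    if not per_frame_counts:
        return 0
    return min(int(median(per_frame_counts)), cap)


def frame_possession(
    players: Sequence[Tuple[int, Box]], ball: Optional[Box]
) -> Optional[int]:
    """Team id of the player whose centre is nearest the ball centre."""
    if ball is None or not players:
        return None  # no ball detected this frame: possession unknown
    ball_centre = box_centre(ball)
    team, _ = min(players, key=lambda p: math.dist(box_centre(p[1]), ball_centre))
    return team


def pitch_coverage_pct(boxes: Sequence[Box], frame_w: int, frame_h: int) -> float:
    """Summed player box area as a percentage of the frame area."""
    covered = sum(max(0.0, x2 - x1) * max(0.0, y2 - y1) for x1, y1, x2, y2 in boxes)
    return 100.0 * covered / (frame_w * frame_h)


def possession_timeline(
    per_frame: Sequence[Optional[int]], bucket_size: int
) -> List[Dict[int, float]]:
    """Possession percentage per team for each bucket of `bucket_size` frames."""
    timeline: List[Dict[int, float]] = []
    for start in range(0, len(per_frame), bucket_size):
        bucket = per_frame[start:start + bucket_size]
        counts = Counter(team for team in bucket if team is not None)
        total = sum(counts.values())
        timeline.append(
            {team: (100.0 * counts[team] / total if total else 0.0) for team in (0, 1)}
        )
    return timeline
```

Frames without a ball detection are left out of a bucket's denominator here, so each bucket's two percentages add up to 100 whenever the ball was seen at least once.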
Heatmap Generation (heatmap.py) — Accumulates player centroid positions over all frames, renders per-team and combined position heatmaps on a...
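The accumulation step can be illustrated with a plain grid of counters. This is only a sketch under assumptions: the grid resolution, the function name, and the use of raw frame coordinates (with no mapping onto pitch coordinates) are all hypothetical, and rendering is left out entirely. The project itself is described as drawing the result with Matplotlib.

```python
from typing import Iterable, List, Tuple


def accumulate_heatmap(
    centroids: Iterable[Tuple[float, float]],
    frame_w: int,
    frame_h: int,
    bins_x: int = 12,
    bins_y: int = 8,
) -> List[List[int]]:
    """Count how many player centroids fall into each cell of a coarse grid."""
    grid = [[0] * bins_x for _ in range(bins_y)]
    for x, y in centroids:
        if not (0 <= x < frame_w and 0 <= y < frame_h):
            continue  # ignore points outside the frame
        col = int(x * bins_x / frame_w)
        row = int(y * bins_y / frame_h)
        grid[row][col] += 1
    return grid
```

Calling this once per team and once on the combined positions would give the three grids; each could then be passed to an image-plotting call such as Matplotlib's `imshow` over a pitch drawing.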
API Backend (api.py — FastAPI)
`GET /` — Health check.
`POST /detect/image` — Synchronous: upload image → YOLOv8 inference + team classification → return detections JSON + stats + annotated image filename.
`POST /detect/video` — Streaming: upload video → process frame-by-frame → stream NDJSON (one JSON object per line) with per-frame progress, then...
`GET /heatmaps/{filename}` — Serve generated heatmap PNG.
`GET /outputs/{filename}` — Serve annotated video (MP4/H.264) or annotated image (JPG).
Auto-cleanup: deletes output files older than 24 hours on server startup.
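A startup cleanup of this kind could look like the sketch below. The function name, the single flat output directory, and the use of file modification time as the age measure are assumptions made for illustration.

```python
import time
from pathlib import Path
from typing import List, Optional


def cleanup_old_outputs(
    directory: str, max_age_hours: float = 24.0, now: Optional[float] = None
) -> List[str]:
    """Delete files older than `max_age_hours`; return the removed file names."""
    current = time.time() if now is None else now
    cutoff = current - max_age_hours * 3600
    removed: List[str] = []
    for path in Path(directory).iterdir():
        # Only regular files are considered; subdirectories are left alone.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

In a FastAPI application such a function would typically be called from a startup or lifespan hook, once per output directory.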
Frontend (React 18 + Tailwind CSS v4)
Upload.js (Detection Studio) — File upload widget (image or video), calls appropriate API endpoint, shows streaming progress bar for video via...
Results display — Annotated image/video preview, detection table, stats panel (possession %, pitch coverage, attacking third), possession...
Download buttons — One-click download for annotated video, annotated images, and heatmap PNGs.
Dark sports-analytics UI — Tailwind CSS v4 dark theme styled for a sports data platform.
Infrastructure
Docker Compose: two services (frontend nginx on :3000, backend FastAPI + Uvicorn on :8000). Backend container includes ffmpeg for H.264 video...
GitHub Actions CI/CD: lint (ruff), syntax check, build, project structure validation.
Makefile: 14 commands (install, dev, backend, frontend, test, lint, train, download, detect, heatmaps, kill, clean, docker-up, docker-down).
Training pipeline (train.py): downloads football dataset from Kaggle, converts to YOLO format, trains YOLOv8 for ~50 epochs.
Design Decisions
Stateful K-Means team classifier — centroids initialised on the first frame and locked for the entire video. Nearest-centroid assignment (no...
Streaming NDJSON for video endpoint: processing long videos synchronously would time out HTTP clients. The NDJSON stream sends per-frame progress JSON...
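The server side of this design can be sketched as a generator that yields one JSON line per processed frame. The message fields (`type`, `frame`, `total`, `detections`, `frames_processed`) and the closing summary message are hypothetical, since the write-up does not give the actual schema, and `analyse` stands in for the real per-frame pipeline.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def ndjson_stream(
    frames: Iterable[Any],
    analyse: Callable[[Any], Any],
    total_frames: int,
) -> Iterator[str]:
    """Yield one newline-terminated JSON object per frame, then a final summary."""
    processed = 0
    for frame in frames:
        result = analyse(frame)  # must return something JSON-serialisable
        processed += 1
        yield json.dumps(
            {
                "type": "progress",
                "frame": processed,
                "total": total_frames,
                "detections": result,
            }
        ) + "\n"
    # Final line tells the client that the stream is complete.
    yield json.dumps({"type": "done", "frames_processed": processed}) + "\n"
```

With FastAPI, a generator like this can be wrapped in `StreamingResponse` with a media type such as `application/x-ndjson`, so the client starts receiving lines while later frames are still being processed.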
Separate goalkeeper/referee classification — goalkeepers and referees are detected as distinct YOLO classes. Their jersey colours are excluded from...
Median-capped player counts (max 11 per team): YOLOv8 occasionally produces spurious detections. Capping at 11 (the most players a team can field at once)...
Nearest-ball possession assignment — ball possession attributed to the team whose player centroid is closest to the ball detection centroid each...
Auto-cleanup of outputs on startup — files older than 24 hours are deleted when the server starts. Prevents disk exhaustion on long-running...
Falls back to yolov8n.pt if no custom model found — makes the system runnable out-of-the-box without training. Custom-trained model improves accuracy...
ffmpeg inside Docker container — H.264 re-encoding of annotated video handled inside the backend container, keeping the output compatible with all...
Tradeoffs & Constraints
K-Means team classifier assumes two teams with visually distinct jersey colours — struggles with similar colours (e.g., both teams in white/light...
DeepSort tracking degrades with heavy occlusion and fast camera movement — IDs can be lost and re-assigned. A more robust tracker (ByteTrack,...
NDJSON streaming requires the client to parse a ReadableStream line-by-line — more complex than a standard JSON response. Worth the complexity for...
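The extra client-side work comes from chunk boundaries not lining up with line boundaries, so a partial line has to be held back until the rest arrives. The buffering logic is shown here in Python for illustration only; the project's frontend does this in JavaScript with a ReadableStream, and the class name is hypothetical.

```python
import json
from typing import Any, List


class NDJSONLineBuffer:
    """Collects text chunks and returns only the complete JSON lines."""

    def __init__(self) -> None:
        self._buffer = ""

    def feed(self, chunk: str) -> List[Any]:
        """Add a chunk; return parsed objects for every line completed so far."""
        self._buffer += chunk
        # Everything before the last newline is complete; the tail is kept.
        *complete, self._buffer = self._buffer.split("\n")
        return [json.loads(line) for line in complete if line.strip()]
```

Each call to `feed` corresponds to one decoded chunk read from the response body; the returned objects can be used to update the progress bar immediately.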
Stats engine uses frame-level aggregation without temporal smoothing — individual noisy frames can affect per-bucket possession charts. A...
Custom YOLOv8 model requires Kaggle API credentials and ~50 epochs of training (~hours on GPU) — without training, the system falls back to...
Auto-cleanup at startup only — outputs generated during a long server session may accumulate. A background thread with periodic cleanup would be more...
CORS currently allows all origins (*) — acceptable for local development but must be restricted to the frontend origin in production.
Would improve: add ByteTrack for more robust player tracking, implement a fine-tuned jersey colour classifier for similar-coloured kits, add...
Outcome & Impact
Production-ready football vision system combining YOLOv8 detection, stateful K-Means team classification, DeepSort tracking, and statistics engine...
Stateful jersey-colour classifier with centroid locking solves the team identity flipping problem: consistent team assignment from first frame to...
Streaming NDJSON video endpoint enables real-time progress reporting for long video files — client receives per-frame progress updates via...
Match statistics engine produces possession %, pitch coverage %, attacking third %, and per-bucket possession timeline — all derived from YOLOv8...
Per-team and combined heatmaps on a rendered football pitch visualise player positioning patterns across the full match duration.
React dark sports-analytics dashboard with annotated video preview, detection table, stats panel, possession timeline chart, heatmap display, and...
Docker Compose production deployment with nginx frontend and FastAPI backend — backend includes ffmpeg for H.264 output compatibility.
GitHub Actions CI/CD pipeline validates lint (ruff), syntax, build, and project structure on every push.
Makefile with 14 commands covers full development lifecycle: install, dev, test, lint, train, download dataset, detect, view heatmaps, kill servers,...
Tech Stack
Detection: Python 3.11, Ultralytics YOLOv8 (players, goalkeepers, referees, ball — conf ≥ 0.5, IoU 0.4)
Tracking: DeepSort (multi-object tracker, stable per-player IDs across frames)
Team Classification: scikit-learn K-Means (stateful centroid locking), OpenCV (torso crop extraction, RGB colour)
Statistics: Custom stats engine (possession %, pitch coverage %, attacking third %, possession timeline)
Heatmaps: Matplotlib (per-team + combined heatmaps on rendered football pitch)
Video Processing: OpenCV, imageio-ffmpeg (H.264 re-encoding), ffmpeg (in Docker container)
Backend: FastAPI, Uvicorn, NDJSON streaming (video endpoint), auto-cleanup (24h output files)
Frontend: React 18, Tailwind CSS v4, Axios, React Router v6, ReadableStream (NDJSON parsing)
Infrastructure: Docker, Docker Compose, nginx (SPA + API proxy), GitHub Actions CI/CD (ruff lint, build, structure validation)
Automation: Makefile (14 commands: install, dev, backend, frontend, test, lint, train, download, detect, heatmaps, kill, clean, docker-up, docker-down)