Hypothesis
A Dixon-Coles-style Poisson goal model (independent Poisson goals, without the low-score correction) trained on the 2 prior EPL seasons will beat a uniform 1/3 prior on 1x2 outcome prediction for the next season, as measured by log-loss and Brier score.
Data Used
- Source: football-data.co.uk CSVs (free, no key)
- /tmp/pm_data/epl/E0_2223.csv: 380 games
- /tmp/pm_data/epl/E0_2324.csv: 380 games
- /tmp/pm_data/epl/E0_2425.csv: 380 games
- Train: 22/23 + 23/24 (760 games, 23 teams)
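- Test: 24/25 (342 of 380 games after filtering; the 38 dropped fixtures equal one team's full schedule, which is consistent with removing a team that has no training data rather than relegated teams)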
- Chronological split: the model is fit once on the two training seasons and never sees 24/25 data, so there is no look-ahead
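The split and filtering described above can be sketched as follows. This is a minimal illustration, not the project's script: the column names `HomeTeam` and `AwayTeam` follow the usual football-data.co.uk layout and are an assumption here, `load_rows` and `filter_to_known_teams` are hypothetical helper names, and the filtering rule (keep a test fixture only if both teams appear in training) is inferred rather than stated in the note.

```python
import csv

def load_rows(path):
    """Read one season CSV into a list of dicts (assumed football-data.co.uk layout)."""
    with open(path, newline="", encoding="utf-8-sig") as fh:
        return list(csv.DictReader(fh))

def filter_to_known_teams(train_rows, test_rows):
    """Keep only test fixtures whose two teams both appear in the training rows."""
    known = set()
    for row in train_rows:
        known.add(row["HomeTeam"])
        known.add(row["AwayTeam"])
    kept = [r for r in test_rows
            if r["HomeTeam"] in known and r["AwayTeam"] in known]
    return known, kept

# Toy fixtures (hypothetical, not taken from the CSVs)
train = [{"HomeTeam": "A", "AwayTeam": "B"}, {"HomeTeam": "B", "AwayTeam": "C"}]
test = [{"HomeTeam": "A", "AwayTeam": "C"}, {"HomeTeam": "A", "AwayTeam": "D"}]
known, kept = filter_to_known_teams(train, test)
print(len(known), len(kept))
```

In the toy data, team "D" never appears in training, so the second test fixture is dropped.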
Sample rows (test set):
| Date | Home | Away | FTHG | FTAG | FTR | model_h | model_a |
|---|---|---|---|---|---|---|---|
| 16/08/2024 | Man United | Fulham | 1 | 0 | H | 0.615 | 0.173 |
| 17/08/2024 | Arsenal | Wolves | 3 | 0 | H | 0.812 | 0.067 |
| 17/08/2024 | Everton | Brighton | 0 | 3 | A | 0.314 | 0.376 |
| 17/08/2024 | Newcastle | Southampton | 1 | 0 | H | 0.676 | 0.143 |
| 17/08/2024 | West Ham | Aston Villa | 3 | 1 | H | 0.344 | 0.340 |
Method
Model: Independent Poisson for home goals $g_h$ and away goals $g_a$:
$$\lambda_h = \exp(\alpha_i + \delta_j + \eta), \quad \lambda_a = \exp(\alpha_j + \delta_i)$$
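where $\alpha_i$ = attack strength of team $i$, $\delta_i$ = defense weakness of team $i$ (signed; larger values mean more goals conceded), and $\eta$ = home-field advantage (HFA, log scale). Here $i$ is the home team and $j$ the away team.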
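As a small worked example of the rate equations, the sketch below plugs in made-up parameters. The team labels and the attack/defense values are hypothetical; only the HFA multiplier of 1.273 comes from the results reported in this note.

```python
import math

def expected_goals(attack, defense, hfa, home, away):
    """Return (lambda_home, lambda_away) for one fixture under the log-linear model."""
    lam_home = math.exp(attack[home] + defense[away] + hfa)
    lam_away = math.exp(attack[away] + defense[home])
    return lam_home, lam_away

# Hypothetical parameter values, for illustration only
attack = {"X": 0.3, "Y": -0.1}
defense = {"X": -0.2, "Y": 0.1}
hfa = math.log(1.273)  # reported multiplicative HFA, converted to log scale

lam_h, lam_a = expected_goals(attack, defense, hfa, "X", "Y")
print(round(lam_h, 3), round(lam_a, 3))
```

With these numbers the home side is expected to score roughly 1.90 goals and the away side roughly 0.74.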
Fit: MLE via L-BFGS-B minimizing negative log-likelihood: $$\mathcal{L} = -\sum_k [\log P(g_h^k|\lambda_h^k) + \log P(g_a^k|\lambda_a^k)]$$
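The objective above can be written as a plain function of a flat parameter vector. The sketch below is one possible layout, not the project's code: the parameter ordering (attacks, then defenses, then HFA) and the integer team indices are assumptions. A function like this could be passed to an optimizer such as `scipy.optimize.minimize` with `method="L-BFGS-B"`.

```python
import math

def poisson_logpmf(k, lam):
    """log P(K = k) for a Poisson variable with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def neg_log_likelihood(params, matches, n_teams):
    """Negative log-likelihood of independent-Poisson home and away goals.

    params  : [attack_0..attack_{n-1}, defense_0..defense_{n-1}, hfa] (assumed layout)
    matches : iterable of (home_idx, away_idx, home_goals, away_goals)
    """
    attack = params[:n_teams]
    defense = params[n_teams:2 * n_teams]
    hfa = params[2 * n_teams]
    nll = 0.0
    for home, away, gh, ga in matches:
        lam_h = math.exp(attack[home] + defense[away] + hfa)
        lam_a = math.exp(attack[away] + defense[home])
        nll -= poisson_logpmf(gh, lam_h) + poisson_logpmf(ga, lam_a)
    return nll

# Two toy matches between teams 0 and 1 (hypothetical scores)
matches = [(0, 1, 2, 1), (1, 0, 0, 0)]
nll_at_zero = neg_log_likelihood([0.0] * 5, matches, n_teams=2)
print(nll_at_zero)
```

With every parameter at zero all rates equal 1, so the value reduces to $4 + \ln 2$ for these two matches, which is a convenient sanity check before fitting.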
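Outcomes: Sum the joint Poisson probabilities over an $8\times8$ score grid (0 to 7 goals per side): $$P(H) = \sum_{g_h > g_a} \text{Poisson}(g_h|\lambda_h)\cdot\text{Poisson}(g_a|\lambda_a)$$ with $P(D)$ and $P(A)$ defined analogously over $g_h = g_a$ and $g_h < g_a$.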
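The grid summation can be sketched as below. The final renormalization step is an assumption (the note does not say how the small probability mass beyond 7 goals is handled), and `outcome_probs` is a hypothetical name.

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for a Poisson variable with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def outcome_probs(lam_h, lam_a, max_goals=7):
    """Home/draw/away probabilities from an (max_goals+1) x (max_goals+1) score grid."""
    p_home = p_draw = p_away = 0.0
    for gh in range(max_goals + 1):
        for ga in range(max_goals + 1):
            p = poisson_pmf(gh, lam_h) * poisson_pmf(ga, lam_a)
            if gh > ga:
                p_home += p
            elif gh == ga:
                p_draw += p
            else:
                p_away += p
    # Assumption: renormalize so the truncated grid sums to exactly 1
    total = p_home + p_draw + p_away
    return p_home / total, p_draw / total, p_away / total

p_h, p_d, p_a = outcome_probs(1.9, 0.74)
print(round(p_h, 3), round(p_d, 3), round(p_a, 3))
```

For typical Premier League scoring rates the mass outside an 8 by 8 grid is very small, so the renormalization changes the probabilities only marginally.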
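Identifiability: 23 attack + 23 defense params + 1 HFA = 47 params. These are identified only up to a constant shift: adding $c$ to every $\alpha$ and subtracting $c$ from every $\delta$ leaves all $\lambda$ unchanged, so at most 46 are effectively free. Predictions are unaffected by the shift, but individual attack and defense values are comparable only under a fixed convention such as $\sum_i \alpha_i = 0$.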
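The parameter count hides one redundant direction, which the sketch below demonstrates numerically: shifting all attack values up and all defense values down by the same constant leaves the scoring rates untouched. All parameter values are hypothetical, and the zero-mean attack convention shown at the end is one common choice, not necessarily the one used in this project.

```python
import math

def rates(attack, defense, hfa, home, away):
    """Scoring rates (lambda_home, lambda_away) for one fixture."""
    return (math.exp(attack[home] + defense[away] + hfa),
            math.exp(attack[away] + defense[home]))

# Hypothetical parameters for three teams
attack = [0.4, -0.1, 0.2]
defense = [-0.3, 0.2, 0.0]
hfa = 0.24

c = 0.7  # arbitrary shift
shifted_attack = [a + c for a in attack]
shifted_defense = [d - c for d in defense]

original = rates(attack, defense, hfa, 0, 1)
shifted = rates(shifted_attack, shifted_defense, hfa, 0, 1)

# One way to pin the shift: center attack at zero mean, compensate in defense
mean_attack = sum(attack) / len(attack)
centered_attack = [a - mean_attack for a in attack]
centered_defense = [d + mean_attack for d in defense]
centered = rates(centered_attack, centered_defense, hfa, 0, 1)
print(original, shifted, centered)
```

All three parameter sets give the same rates, which is why an unconstrained optimizer can land on any of them.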
Result
| Metric | Model | Baseline |
|---|---|---|
| Brier Score (home win) | 0.2219 | 0.2500 (naive constant 0.5 forecast) |
| Brier Score (away win) | 0.2058 | 0.2500 (naive constant 0.5 forecast) |
| Mean Log-loss (3-way) | 1.0193 | 1.0986 (uniform 1/3, $\ln 3$) |
| Log-loss skill vs uniform | +7.2% | n/a |
| HFA (multiplicative, $e^{\eta}$) | 1.273x | n/a |
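The metrics in the table can be computed along the following lines. This is a sketch with hypothetical function names; the only figures taken from the note are the model log-loss of 1.0193 and the baseline values it is compared against. Note that a Brier score of exactly 0.25 is what a constant 0.5 forecast produces on any binary outcome, while $\ln 3 \approx 1.0986$ is the log-loss of a uniform 1/3 forecast.

```python
import math

def brier(probs, outcomes):
    """Mean squared error between binary outcomes (0/1) and predicted probabilities."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def log_loss_3way(prob_rows, labels, eps=1e-15):
    """Mean negative log probability assigned to the realized 'H'/'D'/'A' label."""
    total = 0.0
    for row, label in zip(prob_rows, labels):
        total -= math.log(max(row[label], eps))
    return total / len(labels)

# Baselines
uniform_row = {"H": 1 / 3, "D": 1 / 3, "A": 1 / 3}
uniform_ll = log_loss_3way([uniform_row] * 3, ["H", "D", "A"])
coin_flip_brier = brier([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1])

# Relative log-loss improvement of the reported model over the uniform forecast
skill = 1.0 - 1.0193 / uniform_ll
print(round(uniform_ll, 4), coin_flip_brier, round(skill, 3))
```

The skill figure of about 7.2% is therefore a log-loss ratio, not a Brier-based quantity.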
Top team ratings (attack − defense, log scale):
- Man City: +1.075
- Arsenal: +0.950
- Liverpool: +0.658
- Newcastle: +0.537
Calibration (home win, 10 buckets):
| Bucket | mean_pred | mean_actual | n |
|---|---|---|---|
| 0 | 0.116 | 0.188 | 16 |
| 1 | 0.198 | 0.069 | 29 |
| 2 | 0.285 | 0.325 | 40 |
| 3 | 0.358 | 0.327 | 55 |
| 4 | 0.443 | 0.490 | 51 |
| 5 | 0.517 | 0.457 | 46 |
| 6 | 0.604 | 0.532 | 47 |
| 7 | 0.686 | 0.522 | 23 |
| 8 | 0.761 | 0.609 | 23 |
| 9 | 0.846 | 0.917 | 12 |
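A table of this shape could be produced by a helper like the one below. The binning rule is an assumption: the note does not say how the 10 buckets were defined, and equal-width bins over the observed range of predictions are used here only because the reported bucket means are roughly evenly spaced. `calibration_buckets` is a hypothetical name.

```python
def calibration_buckets(probs, outcomes, n_buckets=10):
    """Return rows of (bucket, mean_pred, mean_actual, n) for non-empty buckets.

    Assumption: equal-width bins between the smallest and largest prediction.
    """
    lo, hi = min(probs), max(probs)
    width = (hi - lo) / n_buckets or 1.0  # guard against identical predictions
    sums = [[0.0, 0.0, 0] for _ in range(n_buckets)]
    for p, y in zip(probs, outcomes):
        idx = min(int((p - lo) / width), n_buckets - 1)
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(i, sp / n, sy / n, n) for i, (sp, sy, n) in enumerate(sums) if n > 0]

# Toy predictions and outcomes (hypothetical)
rows = calibration_buckets([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], n_buckets=2)
for row in rows:
    print(row)
```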
Key miscalibration: in buckets 7–8 (strong home favorites, mean predicted probability 0.69–0.76) the observed home-win rate is only 0.52–0.61, so the model is overconfident on strong home teams. This is the primary failure mode, although each of these buckets holds only 23 matches.
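To gauge how much of each gap could be sampling noise, one rough check is a binomial z-score per bucket, computed directly from the calibration table. This is a simplification, not part of the original analysis: it treats every prediction in a bucket as equal to the bucket's mean prediction.

```python
import math

# (bucket, mean_pred, mean_actual, n) copied from the calibration table above
TABLE = [
    (0, 0.116, 0.188, 16), (1, 0.198, 0.069, 29), (2, 0.285, 0.325, 40),
    (3, 0.358, 0.327, 55), (4, 0.443, 0.490, 51), (5, 0.517, 0.457, 46),
    (6, 0.604, 0.532, 47), (7, 0.686, 0.522, 23), (8, 0.761, 0.609, 23),
    (9, 0.846, 0.917, 12),
]

def bucket_z(mean_pred, mean_actual, n):
    """Gap between actual and predicted rate, in binomial standard errors."""
    std_err = math.sqrt(mean_pred * (1.0 - mean_pred) / n)
    return (mean_actual - mean_pred) / std_err

z_scores = {b: bucket_z(p, a, n) for b, p, a, n in TABLE}
for b, z in z_scores.items():
    print(f"bucket {b}: z = {z:+.2f}")
```

Under this approximation, buckets 1, 7 and 8 each sit roughly 1.7 standard errors below the predicted rate and no bucket exceeds 2, so the overconfidence pattern is suggestive but rests on small samples.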
Reproduction
source ~/.pmvenv/bin/activate
python3 /mnt/projects/tnt_85c10df4451042ca/prj_c7cb91b70b2f42ac/d5_epl_poisson.py
# Data: /tmp/pm_data/epl/E0_{2223,2324,2425}.csv
# Output: /tmp/pm_data/model_results.json
Failure Mode / Next Step
- The model overestimates strong home favorites (Man City and Arsenal collapsed relative to model expectation in 24/25). Likely cause: 24/25 was an unusual season in which Man City and Tottenham regressed sharply from their 22/23 + 23/24 form, and a static cross-season model cannot adapt.
- Fix: add a time-decay weight on the training data (recent matches weighted higher), or use a rolling Elo update. Alternatively, refit within the season on a rolling basis after gameweek 10 (GW10).
- The Dixon-Coles low-score correlation correction is not implemented; this is a known gap for the frequency of 0-0 and 1-0 results.
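Two of the next steps listed above, time-decay weighting and the Dixon-Coles low-score correction, could be sketched as follows. Neither is implemented in the current model. The half-life of 180 days and the value of `rho` are hypothetical placeholders that would need to be tuned or fitted; the correction factor follows the standard Dixon-Coles form for the four low-score cells.

```python
import math

def time_decay_weight(age_days, half_life_days=180.0):
    """Exponential down-weighting: a match half_life_days old counts half as much."""
    return 0.5 ** (age_days / half_life_days)

def dc_tau(gh, ga, lam_h, lam_a, rho):
    """Dixon-Coles adjustment factor for the scores 0-0, 0-1, 1-0 and 1-1."""
    if gh == 0 and ga == 0:
        return 1.0 - lam_h * lam_a * rho
    if gh == 0 and ga == 1:
        return 1.0 + lam_h * rho
    if gh == 1 and ga == 0:
        return 1.0 + lam_a * rho
    if gh == 1 and ga == 1:
        return 1.0 - rho
    return 1.0

def poisson_logpmf(k, lam):
    """log P(K = k) for a Poisson variable with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def weighted_match_loglik(gh, ga, lam_h, lam_a, rho, age_days):
    """Time-weighted log-likelihood contribution of one match with the correction."""
    loglik = (math.log(dc_tau(gh, ga, lam_h, lam_a, rho))
              + poisson_logpmf(gh, lam_h) + poisson_logpmf(ga, lam_a))
    return time_decay_weight(age_days) * loglik

# Hypothetical rates and correlation parameter
lam_h, lam_a, rho = 1.5, 1.1, -0.1
low_scores = [(0, 0), (0, 1), (1, 0), (1, 1)]
plain = sum(math.exp(poisson_logpmf(h, lam_h) + poisson_logpmf(a, lam_a))
            for h, a in low_scores)
adjusted = sum(dc_tau(h, a, lam_h, lam_a, rho)
               * math.exp(poisson_logpmf(h, lam_h) + poisson_logpmf(a, lam_a))
               for h, a in low_scores)
print(plain, adjusted)
```

The correction only moves probability among the four low-score cells, so their combined mass is unchanged; a negative `rho` shifts mass toward 0-0 and 1-1 and away from 1-0 and 0-1.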