[polymarket][sports] EPL Poisson model calibration — PASS (beats uniform) / FAIL (overconfident on favorites)

Status: active

Tags: polymarket, sports, epl, poisson, calibration, pass | Priority: 3 | Source: polymarket-sports | Created: 2026-05-20 | Updated: 2026-05-20

Hypothesis

An independent Poisson goal model with Dixon-Coles-style attack/defense parameters (no low-score correction), trained on 2 prior EPL seasons, will beat a uniform 1/3 prior on 1x2 outcome prediction for the next season, measured by log-loss and Brier score.

Data Used

Source: football-data.co.uk CSVs (free, no key)
/tmp/pm_data/epl/E0_2223.csv — 380 games
/tmp/pm_data/epl/E0_2324.csv — 380 games
/tmp/pm_data/epl/E0_2425.csv — 380 games
Train: 22/23 + 23/24 (760 games, 23 teams)
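Test: 24/25 (342 games after dropping fixtures that involve a team absent from the training seasons, i.e. newly promoted rather than relegated; 380 − 342 = 38 fixtures, one team's worth)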
Walk-forward: model trained before any test data — no look-ahead
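The filtering step can be sketched as below. This is a minimal illustration, not the author's script: the `HomeTeam` / `AwayTeam` keys are assumed column names, and the rows are toy data rather than values from the real CSVs.

```python
def known_team_fixtures(train_rows, test_rows):
    """Keep test fixtures where both sides appear somewhere in the training rows."""
    known = set()
    for row in train_rows:
        known.add(row["HomeTeam"])
        known.add(row["AwayTeam"])
    return [
        r for r in test_rows
        if r["HomeTeam"] in known and r["AwayTeam"] in known
    ]


# Toy rows (hypothetical, for illustration only).
train = [
    {"HomeTeam": "Arsenal", "AwayTeam": "Fulham"},
    {"HomeTeam": "Fulham", "AwayTeam": "Everton"},
]
test = [
    {"HomeTeam": "Arsenal", "AwayTeam": "Everton"},      # both seen in training
    {"HomeTeam": "Promoted FC", "AwayTeam": "Arsenal"},  # unseen team, dropped
]
kept = known_team_fixtures(train, test)
print(len(kept))

# In a 20-team double round-robin each team plays 38 fixtures, so dropping
# one unseen team removes 38 of the 380 games.
print(380 - 38)
```

Building the set of known teams from the training rows only keeps the filter itself free of look-ahead.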

Sample rows (test set):

Date	Home	Away	FTHG	FTAG	FTR	model_h	model_a
16/08/2024	Man United	Fulham	1	0	H	0.615	0.173
17/08/2024	Arsenal	Wolves	3	0	H	0.812	0.067
17/08/2024	Everton	Brighton	0	3	A	0.314	0.376
17/08/2024	Newcastle	Southampton	1	0	H	0.676	0.143
17/08/2024	West Ham	Aston Villa	3	1	H	0.344	0.340

Method

Model: Independent Poisson for home goals $g_h$ and away goals $g_a$:

$$\lambda_h = \exp(\alpha_i + \delta_j + \eta), \quad \lambda_a = \exp(\alpha_j + \delta_i)$$

where $i$ is the home team, $j$ the away team, $\alpha_i$ = attack strength, $\delta_i$ = defense weakness (signed; larger means more goals conceded), $\eta$ = home-field advantage (HFA, log scale).

Fit: MLE via L-BFGS-B minimizing negative log-likelihood: $$\mathcal{L} = -\sum_k [\log P(g_h^k|\lambda_h^k) + \log P(g_a^k|\lambda_a^k)]$$
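A minimal sketch of the objective above, in pure Python. The parameter layout (attack block, then defense block, then the HFA term) and the toy matches are assumptions for illustration; the source does not show how the script packs its parameter vector.

```python
import math


def poisson_logpmf(k, lam):
    """log P(K = k) for K ~ Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def neg_log_likelihood(params, matches, teams):
    """Negative log-likelihood of the independent Poisson model.

    params layout (assumed): [attack_0..attack_{n-1}, defense_0..defense_{n-1}, eta].
    matches: iterable of (home_team, away_team, home_goals, away_goals).
    """
    n = len(teams)
    idx = {t: i for i, t in enumerate(teams)}
    attack, defense, eta = params[:n], params[n:2 * n], params[2 * n]
    nll = 0.0
    for home, away, gh, ga in matches:
        i, j = idx[home], idx[away]
        lam_h = math.exp(attack[i] + defense[j] + eta)  # home scoring rate
        lam_a = math.exp(attack[j] + defense[i])        # away scoring rate
        nll -= poisson_logpmf(gh, lam_h) + poisson_logpmf(ga, lam_a)
    return nll


# Toy example (hypothetical teams and scores).
teams = ["A", "B"]
matches = [("A", "B", 2, 1), ("B", "A", 0, 0)]
nll0 = neg_log_likelihood([0.0] * 5, matches, teams)
print(nll0)
```

A function of this shape could be handed to `scipy.optimize.minimize(..., method="L-BFGS-B")` with `args=(matches, teams)`; that call is not exercised here.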

Outcomes: Sum the joint Poisson pmf over an $8\times8$ score grid: $$P(H) = \sum_{g_h > g_a} \text{Poisson}(g_h|\lambda_h)\cdot\text{Poisson}(g_a|\lambda_a)$$ with $P(D)$ and $P(A)$ defined analogously over $g_h = g_a$ and $g_h < g_a$.
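The grid summation can be sketched as follows. Two assumptions are made here: the grid covers scores 0 to 7 per side, and the three probabilities are renormalized by the mass captured on the grid. The source does not say whether the script renormalizes, and the example rates are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(lam_h, lam_a, max_goals=7):
    """Return (P(home), P(draw), P(away), grid_mass) on a truncated score grid.

    max_goals=7 covers scores 0..7 per side, i.e. an 8x8 grid. The three
    outcome probabilities are divided by grid_mass so they sum to 1.
    """
    p_home = p_draw = p_away = 0.0
    for gh in range(max_goals + 1):
        for ga in range(max_goals + 1):
            p = poisson_pmf(gh, lam_h) * poisson_pmf(ga, lam_a)
            if gh > ga:
                p_home += p
            elif gh == ga:
                p_draw += p
            else:
                p_away += p
    mass = p_home + p_draw + p_away
    return p_home / mass, p_draw / mass, p_away / mass, mass


# Hypothetical scoring rates for a moderate home favorite.
ph, pd, pa, mass = outcome_probs(1.6, 1.1)
print(round(ph, 3), round(pd, 3), round(pa, 3), mass)
```

For typical EPL scoring rates the truncated tail is small, but returning `grid_mass` makes the truncation visible if a fixture has an unusually high rate.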

Parameter count: 23 attack + 23 defense params + 1 HFA = 47 params. No identifiability constraint (e.g. sum-to-zero on attack) is recorded here.
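The parameter count alone does not settle identifiability. A short check using only the rate equations above: for any constant $c$,

$$\exp\big((\alpha_i + c) + (\delta_j - c) + \eta\big) = \exp(\alpha_i + \delta_j + \eta) = \lambda_h,$$

and likewise for $\lambda_a$. Shifting every attack parameter by $+c$ and every defense parameter by $-c$ therefore leaves the likelihood unchanged. One common remedy (an assumption here; the source does not say which constraint, if any, the script applies) is a sum-to-zero constraint on attack:

$$\sum_{i} \alpha_i = 0,$$

which leaves 46 free parameters. The same shift moves every team's attack − defense rating by $2c$, so without such a constraint only the differences between team ratings are pinned down by the data.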

Result

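Metric	Model	Baseline
Brier Score (home win)	0.2219	0.2500 (naive; matches a constant 0.5 forecast, not 1/3)
Brier Score (away win)	0.2058	0.2500 (naive; matches a constant 0.5 forecast, not 1/3)
Mean Log-loss (3-way)	1.0193	1.0986 (uniform 1/3, = ln 3)
Skill vs uniform (log-loss)	+7.2%	n/a
HFA (multiplicative)	1.273x	n/a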
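The baseline figures can be checked with a few lines of arithmetic. This is a sketch, not part of the original script: the home-win base rate of 144/342 is an approximation recovered from the calibration table (sum of n × mean_actual, rounded, over total n), and the helper name is hypothetical.

```python
import math

model_logloss = 1.0193
uniform_logloss = math.log(3)  # log-loss of a constant (1/3, 1/3, 1/3) forecast
skill = 1 - model_logloss / uniform_logloss


def constant_forecast_brier(p, base_rate):
    """Brier score of always forecasting p for a binary event with this base rate."""
    return base_rate * (1 - p) ** 2 + (1 - base_rate) * p ** 2


home_rate = 144 / 342  # approximate, recovered from the calibration table

print("uniform log-loss:", round(uniform_logloss, 4))
print("log-loss skill:", round(skill, 3))
print("Brier, constant 0.5:", constant_forecast_brier(0.5, home_rate))
print("Brier, constant 1/3:", round(constant_forecast_brier(1 / 3, home_rate), 4))
```

A constant 0.5 forecast scores 0.25 whatever the base rate, whereas a constant 1/3 forecast scores (1 + 3 × base_rate) / 9, so the two baselines differ slightly for the home-win Brier comparison.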

Top team ratings (attack − defense, log scale):
Man City: +1.075 | Arsenal: +0.950 | Liverpool: +0.658 | Newcastle: +0.537

Calibration (home win, 10 buckets):

Bucket	mean_pred	mean_actual	n
0	0.116	0.188	16
1	0.198	0.069	29
2	0.285	0.325	40
3	0.358	0.327	55
4	0.443	0.490	51
5	0.517	0.457	46
6	0.604	0.532	47
7	0.686	0.522	23
8	0.761	0.609	23
9	0.846	0.917	12

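Key miscalibration: Buckets 7–8 (strong home favorites, mean_pred 0.69–0.76) see actual home-win rates of only 0.52–0.61, so the model is overconfident on strong home teams. This is the primary failure mode. The pattern is not monotone: bucket 9 (mean_pred 0.846, n = 12) runs the other way, and bucket 1 overpredicts too (0.198 predicted vs 0.069 actual).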
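To put the bucket gaps in context, the sketch below recomputes a sample-weighted calibration error and a rough z-score per bucket from the table above. The z-score uses the binomial standard error implied by mean_pred, which treats all matches in a bucket as independent with a common win probability; that is a simplifying assumption, not something the source states.

```python
import math

# (mean_pred, mean_actual, n) copied from the calibration table.
buckets = [
    (0.116, 0.188, 16), (0.198, 0.069, 29), (0.285, 0.325, 40),
    (0.358, 0.327, 55), (0.443, 0.490, 51), (0.517, 0.457, 46),
    (0.604, 0.532, 47), (0.686, 0.522, 23), (0.761, 0.609, 23),
    (0.846, 0.917, 12),
]

total = sum(n for _, _, n in buckets)
# Expected calibration error: sample-weighted mean absolute gap.
ece = sum(n * abs(p - a) for p, a, n in buckets) / total


def z_score(pred, actual, n):
    """Gap between actual and predicted rate in binomial standard errors."""
    se = math.sqrt(pred * (1 - pred) / n)
    return (actual - pred) / se


for b, (p, a, n) in enumerate(buckets):
    print(b, n, round(a - p, 3), round(z_score(p, a, n), 2))
print("total n:", total, "ECE:", round(ece, 4))
```

With 23 games per bucket the standard error is close to 0.1, so the gaps in buckets 7 and 8 each come out a little under two standard errors under this approximation. The overconfidence pattern is suggestive at this sample size rather than conclusive.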

Reproduction

source ~/.pmvenv/bin/activate
python3 /mnt/projects/tnt_85c10df4451042ca/prj_c7cb91b70b2f42ac/d5_epl_poisson.py
# Data: /tmp/pm_data/epl/E0_{2223,2324,2425}.csv
# Output: /tmp/pm_data/model_results.json

Failure Mode / Next Step

Model overestimates strong home favorites (Man City, Arsenal away collapses vs. model expectation in 24/25). Likely cause: 24/25 was an unusual season in which Man City and Tottenham regressed sharply from their 22/23 + 23/24 form, and a static cross-season model cannot adapt to that.
Fix: Add a time-decay weight on training data (recent matches weighted higher), or use a rolling Elo update. Alternatively, refit on a rolling within-season basis after gameweek 10 (GW10).
No Dixon-Coles low-score correlation correction implemented; this is a known gap for 0-0 and 1-0 frequency.
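Two of the next steps above can be sketched briefly: an exponential time-decay weight for training matches, and the Dixon-Coles low-score adjustment factor. The decay rate `xi`, the value of `rho`, and the scoring rates are hypothetical placeholders; in practice both `xi` and `rho` would need to be tuned or estimated, and none of this is in the current script.

```python
import math


def time_decay_weight(days_ago, xi=0.002):
    """Exponential down-weighting of older matches (xi: hypothetical per-day rate)."""
    return math.exp(-xi * days_ago)


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def dc_tau(gh, ga, lam_h, lam_a, rho):
    """Dixon-Coles adjustment factor; differs from 1 only for 0-0, 0-1, 1-0, 1-1."""
    if gh == 0 and ga == 0:
        return 1 - lam_h * lam_a * rho
    if gh == 0 and ga == 1:
        return 1 + lam_h * rho
    if gh == 1 and ga == 0:
        return 1 + lam_a * rho
    if gh == 1 and ga == 1:
        return 1 - rho
    return 1.0


def adjusted_score_prob(gh, ga, lam_h, lam_a, rho):
    """Joint score probability with the low-score adjustment applied."""
    return dc_tau(gh, ga, lam_h, lam_a, rho) * poisson_pmf(gh, lam_h) * poisson_pmf(ga, lam_a)


# Hypothetical rates and dependence parameter.
lam_h, lam_a, rho = 1.5, 1.1, -0.1
grid_total = sum(
    adjusted_score_prob(h, a, lam_h, lam_a, rho)
    for h in range(25) for a in range(25)
)
print(grid_total)
print(time_decay_weight(30), time_decay_weight(365))
```

In a weighted fit, each match's log-likelihood term would be multiplied by its `time_decay_weight` before summing. The adjustment factor moves probability among the four low scores without changing the total, so the adjusted grid still sums to 1.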
