Hypothesis
When the EPL Poisson model's probability differs from the Bet365 closing implied probability by more than 5 percentage points (|model_p − implied_mkt_p| > 0.05) on the home or away side, betting the side the model favors yields positive EV after Polymarket's 2% taker fee plus a 2c half-spread.
Data Used
- Model predictions: EPL Poisson (trained 22/23+23/24, walk-forward tested on 24/25)
- Market proxy: Bet365 closing odds (B365H, B365D, B365A) from football-data.co.uk 24/25 CSV
- Endpoint: https://www.football-data.co.uk/mmz4281/2425/E0.csv
- n = 342 games with valid odds and known teams
- Polymarket comparison markets (live, as of 2026-05-20):
  - Endpoint: GET https://gamma-api.polymarket.com/markets?closed=false&active=true
  - Arsenal CL: price=0.43, liq=$989k
  - PSG CL: price=0.59, liq=$1.16M
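
As a rough illustration of the "valid odds and known teams" filter, the sketch below parses football-data.co.uk-style CSV text with the standard library. The column names (Date, HomeTeam, AwayTeam, FTR, B365H, B365D, B365A) follow the site's published format. The function name `load_matches`, the `known_teams` parameter, and the two sample rows are hypothetical and are not taken from the backtest script.

```python
import csv
import io

ODDS_COLS = ("B365H", "B365D", "B365A")


def load_matches(csv_text, known_teams=None):
    """Parse CSV text and keep rows with usable Bet365 closing odds.

    known_teams is a hypothetical optional set of team names the model
    was trained on; rows involving other teams are dropped.
    """
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            odds = tuple(float(row[c]) for c in ODDS_COLS)
        except (KeyError, ValueError, TypeError):
            continue  # missing or non-numeric odds
        if min(odds) <= 1.0:
            continue  # decimal odds must exceed 1
        home, away = row.get("HomeTeam", ""), row.get("AwayTeam", "")
        if known_teams is not None and not {home, away} <= known_teams:
            continue  # at least one team unseen in training
        matches.append({
            "date": row.get("Date", ""),
            "home": home,
            "away": away,
            "result": row.get("FTR", ""),  # "H", "D" or "A"
            "odds": odds,
        })
    return matches


# Made-up rows for illustration only; the second has a missing home price.
sample = """Date,HomeTeam,AwayTeam,FTR,B365H,B365D,B365A
16/08/2024,Team A,Team B,H,1.60,4.20,5.25
17/08/2024,Team C,Team D,A,,3.50,2.10
"""
rows = load_matches(sample)
print(len(rows), rows[0]["odds"])
```
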
Sample bets (edge > 5%):
| Date | Match | Side | model_p | mkt_p | edge | odds | Won | PnL |
|---|---|---|---|---|---|---|---|---|
| 03/05/25 | Arsenal vs Bournemouth | H | 0.839 | 0.464 | +0.375 | 2.05 | No | −1.02 |
| 25/01/25 | Man City vs Chelsea | H | 0.760 | 0.469 | +0.291 | 2.05 | Yes | +0.99 |
| 10/11/24 | Forest vs Newcastle | A | 0.591 | 0.355 | +0.236 | 2.70 | Yes | +1.63 |
| 08/03/25 | Forest vs Man City | A | 0.746 | 0.511 | +0.235 | 1.85 | No | −1.02 |
Method
- Odds → implied prob (vig-adjusted): $p_i^{mkt} = \frac{1/o_i}{\sum_k 1/o_k}$
- Edge: $e_i = p_i^{model} - p_i^{mkt}$
- Bet \$1 when $e_i > \theta$ (home or away side, whichever has larger positive edge)
- Net PnL per win: $\text{profit} = (o_i - 1) - o_i \cdot 0.02 - 0.02$ (taker fee on payout + half-spread)
- Net PnL per loss: $-1.02$
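
The steps above can be sketched in a few lines of Python. This is a minimal illustration of the stated formulas, not the backtest script itself; the function names and the example odds are assumptions made for the sketch.

```python
def implied_probs(odds):
    """Vig-adjusted implied probabilities from decimal odds (H, D, A)."""
    inv = [1.0 / o for o in odds]
    total = sum(inv)  # exceeds 1 by the bookmaker margin
    return [x / total for x in inv]


def pick_side(model_p, mkt_p, theta):
    """Return 'H' or 'A' (whichever has the larger edge) if it exceeds theta.

    model_p and mkt_p are (home, draw, away) probabilities; the draw is
    never bet, matching the rule above.
    """
    edges = {"H": model_p[0] - mkt_p[0], "A": model_p[2] - mkt_p[2]}
    side = max(edges, key=edges.get)
    return side if edges[side] > theta else None


def net_pnl(odds, won, fee=0.02, half_spread=0.02):
    """Net PnL of a one-unit stake: fee on the payout, half-spread always."""
    if won:
        return (odds - 1.0) - odds * fee - half_spread
    return -1.0 - half_spread


# Hypothetical odds and model probabilities, for illustration only.
mkt = implied_probs((2.05, 3.60, 3.50))
side = pick_side((0.60, 0.22, 0.18), mkt, theta=0.05)
print(side, round(net_pnl(2.05, won=True), 3))
```

With these definitions, a win at odds 2.05 nets about 0.99 and a win at 2.70 about 1.63, which agrees with the PnL column in the sample bets table.
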
Result
| Threshold $\theta$ | N bets | Hit% | ROI (net) | Sharpe |
|---|---|---|---|---|
| 2% | 197 | 40.6% | −14.6% | −0.128 |
| 3% | 180 | 40.6% | −15.5% | −0.137 |
| 5% | 147 | 39.5% | −16.4% | −0.144 |
| 7% | 121 | 39.7% | −16.4% | −0.148 |
| 10% | 100 | 39.0% | −18.4% | −0.167 |
| 15% | 48 | 37.5% | −26.3% | −0.257 |
Verdict: FAIL. Net ROI is negative at every threshold tested and does not improve as the threshold rises (flat between 5% and 7%, worsening elsewhere).
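
A threshold sweep of this kind could be computed as sketched below. The exact ROI and Sharpe definitions used for the table are not stated, so the sketch assumes ROI is mean net PnL per unit staked and Sharpe is the per-bet mean divided by the sample standard deviation. The function name `summarize` is hypothetical. The input rows are the four sample bets listed earlier, which are far too few to reproduce the table.

```python
from statistics import mean, stdev


def summarize(bets, theta):
    """Summarize flat one-unit bets whose edge exceeds theta.

    bets: iterable of (edge, won, net_pnl) tuples.
    Returns None if fewer than two bets qualify or PnL has zero variance.
    """
    picked = [(won, pnl) for edge, won, pnl in bets if edge > theta]
    if len(picked) < 2:
        return None
    pnls = [pnl for _, pnl in picked]
    sd = stdev(pnls)
    if sd == 0:
        return None
    avg = mean(pnls)
    return {
        "n": len(picked),
        "hit": sum(1 for won, _ in picked if won) / len(picked),
        "roi": avg,           # assumed: mean net PnL per unit staked
        "sharpe": avg / sd,   # assumed: per-bet, not annualized
    }


# The four sample bets from the table above: (edge, won, net PnL).
sample_bets = [
    (0.375, False, -1.02),
    (0.291, True, 0.99),
    (0.236, True, 1.63),
    (0.235, False, -1.02),
]
for theta in (0.05, 0.25):
    print(theta, summarize(sample_bets, theta))
```
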
CL Final comparison (high-uncertainty illustration only):
- Model (Man City as PSG proxy, neutral ground): Arsenal 0.462, PSG 0.538
- Market: Arsenal 0.43, PSG 0.59 → model slightly prefers Arsenal vs market
- This comparison is UNRELIABLE: Man City ≠ PSG; no UCL form data; proxy error dominates
Reproduction
source ~/.pmvenv/bin/activate
python3 /mnt/projects/tnt_85c10df4451042ca/prj_c7cb91b70b2f42ac/d5_backtest_vs_market.py
# Output: /tmp/pm_data/backtest_results.json
# Plot: /tmp/pm_data/d5_calibration_plot.png
Failure Mode / Next Step
Why it fails:
1. The model's largest edges are on strong home teams (Man City, Arsenal, Tottenham), precisely where calibration is worst: in buckets 7–8 the actual rate is 0.52–0.61 against a model probability of 0.69–0.76.
2. Bet365 already correctly discounts Man City and Tottenham in 24/25 (recognizing their form drop); the model, trained on 22/23 and 23/24, does not adapt.
3. Higher thresholds select the bets with the largest model edges, which fall disproportionately in the miscalibrated region, so PnL worsens.
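
The calibration check behind point 1 can be approximated with a bucketed comparison of predicted and realized rates. The bucketing scheme used for the calibration plot is not stated here, so the sketch below assumes equal-count buckets ordered by predicted probability. The function name and the toy inputs are hypothetical.

```python
def calibration_table(preds, outcomes, n_buckets=10):
    """Compare mean predicted probability with the realized rate per bucket.

    Buckets are equal-count slices of the predictions sorted ascending
    (an assumption; the original bucketing may differ).
    Returns a list of (bucket_index, count, mean_pred, actual_rate, gap).
    """
    pairs = sorted(zip(preds, outcomes))
    n = len(pairs)
    table = []
    for b in range(n_buckets):
        chunk = pairs[b * n // n_buckets:(b + 1) * n // n_buckets]
        if not chunk:
            continue  # fewer predictions than buckets
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        actual = sum(y for _, y in chunk) / len(chunk)
        # A positive gap means the model is overconfident in this bucket.
        table.append((b, len(chunk), mean_pred, actual, mean_pred - actual))
    return table


# Toy inputs for illustration: predicted probability and 0/1 outcome.
toy = calibration_table([0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1], n_buckets=2)
for row in toy:
    print(row)
```

In the terms used above, buckets 7–8 would show a gap of roughly 0.15 to 0.17 if the quoted ranges pair up end to end.
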
To fix:
- Season-to-date rolling update: refit model every N matchweeks using only current-season data after GW8+
- Add a form feature: last-5-game xG or points as a covariate
- Use Bayesian updating: shrink priors toward current-season results as the season progresses
- For Polymarket specifically: EPL game-level markets are rare (mostly seasonal/futures); UCL Final is the best live target but requires Ligue 1 data for PSG ratings
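
One way the Bayesian-updating idea might look for a Poisson scoring rate is the standard Gamma-Poisson conjugate update, sketched below. How the existing model parametrizes team strength is not described here, so this is only an illustration: the function name, the `prior_matches` pseudo-count, and the example numbers are all assumptions.

```python
def posterior_rate(prior_rate, prior_matches, goals, matches):
    """Posterior mean goals-per-match under a Gamma prior on a Poisson rate.

    The prior is Gamma(alpha, beta) with mean prior_rate, expressed as
    prior_matches pseudo-matches of evidence. Observing `goals` in
    `matches` current-season games gives Gamma(alpha + goals, beta + matches).
    """
    alpha = prior_rate * prior_matches
    beta = prior_matches
    return (alpha + goals) / (beta + matches)


# Hypothetical team: prior of 2.0 goals/match worth 10 matches of evidence,
# currently scoring 0.8 goals/match. The estimate drifts toward 0.8.
for played in (0, 5, 10, 20):
    print(played, round(posterior_rate(2.0, 10, 0.8 * played, played), 3))
```

A smaller `prior_matches` lets current-season results dominate sooner, which is the trade-off the rolling-refit and shrinkage options above are both trying to manage.
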