[polymarket][sports] EPL model trade rule vs bookmaker (>5% edge) — FAIL

Status: active
Tags: polymarket, sports, epl, trade-rule, backtest, fail
Priority: 3
Source: polymarket-sports
Created: 2026-05-20
Updated: 2026-05-20

Hypothesis

When the EPL Poisson model's probability for the home or away side exceeds the vig-adjusted probability implied by Bet365 closing odds by more than 5 percentage points (model_p − implied_mkt_p > 0.05), a bet on that side has positive expected value after Polymarket's 2% taker fee and a 2c half-spread.

Data Used

- Model predictions: EPL Poisson model, trained on 22/23 and 23/24, walk-forward tested on 24/25
- Market proxy: Bet365 closing odds (B365H, B365D, B365A) from the football-data.co.uk 24/25 CSV
  - Endpoint: https://www.football-data.co.uk/mmz4281/2425/E0.csv
- n = 342 games with valid odds and known teams
- Polymarket comparison markets (live, as of 2026-05-20), from GET https://gamma-api.polymarket.com/markets?closed=false&active=true
  - Arsenal CL: price=0.43, liquidity=$989k
  - PSG CL: price=0.59, liquidity=$1.16M
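The filtering step behind "n = 342 games with valid odds and known teams" could look roughly like the sketch below. The column names (`Date`, `HomeTeam`, `AwayTeam`, `FTR`, `B365H`, `B365D`, `B365A`) follow the football-data.co.uk CSV layout. The function name `load_matches`, the `known_teams` parameter, and the reading of "known teams" as "teams the model has ratings for" are assumptions made for illustration. The sample rows are placeholders, not real fixtures.

```python
import csv
import io

ODDS_COLUMNS = ("B365H", "B365D", "B365A")

# Placeholder rows for illustration only; not real matches or odds.
SAMPLE = """Date,HomeTeam,AwayTeam,FTR,B365H,B365D,B365A
01/01/2025,Team A,Team B,H,2.00,3.50,4.00
02/01/2025,Team C,Team D,A,,3.40,2.10
"""


def load_matches(csv_text, known_teams=None):
    """Keep rows with three numeric Bet365 odds and (optionally) known teams.

    known_teams is a hypothetical parameter: the set of teams the model can
    rate. Pass None to skip that filter.
    """
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            odds = {col: float(row[col]) for col in ODDS_COLUMNS}
        except (KeyError, TypeError, ValueError):
            continue  # missing column, short row, or non-numeric odds
        if min(odds.values()) <= 1.0:
            continue  # decimal odds must exceed 1.0 to be meaningful
        home, away = row.get("HomeTeam"), row.get("AwayTeam")
        if known_teams is not None and not {home, away} <= set(known_teams):
            continue  # at least one team has no model rating
        matches.append(
            {"date": row.get("Date"), "home": home, "away": away,
             "result": row.get("FTR"), **odds}
        )
    return matches
```

In practice the text of the E0.csv endpoint above would be passed in place of `SAMPLE`.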

Sample bets (edge > 5%):

| Date (dd/mm/yy) | Match | Side | model_p | mkt_p | edge | odds | Won | PnL |
|---|---|---|---|---|---|---|---|---|
| 03/05/25 | Arsenal vs Bournemouth | H | 0.839 | 0.464 | +0.375 | 2.05 | No | −1.02 |
| 25/01/25 | Man City vs Chelsea | H | 0.760 | 0.469 | +0.291 | 2.05 | Yes | +0.99 |
| 10/11/24 | Forest vs Newcastle | A | 0.591 | 0.355 | +0.236 | 2.70 | Yes | +1.63 |
| 08/03/25 | Forest vs Man City | A | 0.746 | 0.511 | +0.235 | 1.85 | No | −1.02 |

Method

1. Odds → implied probability (vig-adjusted): $p_i^{mkt} = \frac{1/o_i}{\sum_k 1/o_k}$, where $k$ runs over the home, draw, and away outcomes
2. Edge: $e_i = p_i^{model} - p_i^{mkt}$
3. Bet \$1 when $e_i > \theta$, on the home or away side, whichever has the larger positive edge
4. Net PnL per win: $\text{profit} = (o_i - 1) - 0.02\,o_i - 0.02$ (2% taker fee on the payout $o_i$, plus the 0.02 half-spread)
5. Net PnL per loss: $-1.02$ (the \$1 stake plus the 0.02 half-spread)
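The five steps above can be sketched in a few lines of Python. This is a minimal illustration of the formulas as written, not the code in `d5_backtest_vs_market.py`; the function names and the `FEE` / `HALF_SPREAD` constants are chosen here for readability.

```python
FEE = 0.02          # taker fee, applied to the payout of a winning bet
HALF_SPREAD = 0.02  # half-spread cost, applied to every bet


def implied_probs(odds_h, odds_d, odds_a):
    """Vig-adjusted probabilities: normalise inverse odds so they sum to 1."""
    inverse = [1.0 / odds_h, 1.0 / odds_d, 1.0 / odds_a]
    total = sum(inverse)
    return tuple(x / total for x in inverse)


def pick_side(model_h, model_a, mkt_h, mkt_a, theta):
    """Return (side, edge) for the larger home/away edge if it exceeds theta."""
    edges = {"H": model_h - mkt_h, "A": model_a - mkt_a}
    side = max(edges, key=edges.get)
    return (side, edges[side]) if edges[side] > theta else None


def net_pnl(odds, won):
    """Net PnL of a $1 bet at decimal odds, after fee and half-spread."""
    if won:
        return (odds - 1.0) - odds * FEE - HALF_SPREAD
    return -1.0 - HALF_SPREAD
```

As a check against the sample bets, `net_pnl(2.05, True)` is 1.05 − 0.041 − 0.02 = 0.989 and `net_pnl(2.70, True)` is 1.70 − 0.054 − 0.02 = 1.626, which round to the +0.99 and +1.63 shown in the table.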

Result

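| Threshold $\theta$ | N bets | Hit% | ROI (net) | Sharpe |
|---|---|---|---|---|
| 2% | 197 | 40.6% | −14.6% | −0.128 |
| 3% | 180 | 40.6% | −15.5% | −0.137 |
| **5%** | **147** | **39.5%** | **−16.4%** | **−0.144** |
| 7% | 121 | 39.7% | −16.4% | −0.148 |
| 10% | 100 | 39.0% | −18.4% | −0.167 |
| 15% | 48 | 37.5% | −26.3% | −0.257 |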

Verdict: FAIL. Net ROI is negative at every threshold tested and never improves as the threshold rises: it is flat between 5% and 7% (−16.4%) and worse at every other step.
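For readers who want to recompute the table columns from a list of per-bet PnLs, the sketch below shows one way to do it. The definitions are assumptions, since the note does not state them: ROI is taken as mean net PnL per \$1 staked, and Sharpe as the per-bet mean divided by the per-bet sample standard deviation (not annualised). The break-even hit rate follows directly from the win and loss formulas in the Method section: at decimal odds $o$, a win nets $0.98\,o - 1.02$ and a loss costs $1.02$, so the break-even rate is $1.02 / (0.98\,o)$.

```python
import math


def summarize(pnls):
    """Hit rate, ROI and per-bet Sharpe for a list of $1-stake net PnLs.

    Assumed definitions: ROI = mean PnL per $1 staked;
    Sharpe = mean / sample standard deviation, per bet.
    """
    n = len(pnls)
    mean = sum(pnls) / n
    variance = sum((x - mean) ** 2 for x in pnls) / (n - 1)
    return {
        "n": n,
        "hit_rate": sum(1 for x in pnls if x > 0) / n,
        "roi": mean,
        "sharpe": mean / math.sqrt(variance),
    }


def break_even_hit_rate(odds, fee=0.02, half_spread=0.02):
    """Hit rate at which a $1 bet at the given decimal odds has zero net EV."""
    win = (odds - 1.0) - odds * fee - half_spread
    loss = 1.0 + half_spread
    return loss / (loss + win)
```

For example, `break_even_hit_rate(2.05)` is about 0.508, so a bet at those odds needs to win roughly 51% of the time to break even after costs.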

CL Final comparison (high-uncertainty illustration only):
- Model (Man City as PSG proxy, neutral ground): Arsenal 0.462, PSG 0.538
- Market: Arsenal 0.43, PSG 0.59 → model slightly prefers Arsenal vs market
- This comparison is UNRELIABLE: Man City ≠ PSG; no UCL form data; proxy error dominates

Reproduction

```bash
source ~/.pmvenv/bin/activate
python3 /mnt/projects/tnt_85c10df4451042ca/prj_c7cb91b70b2f42ac/d5_backtest_vs_market.py
# Output: /tmp/pm_data/backtest_results.json
# Plot:   /tmp/pm_data/d5_calibration_plot.png
```

Failure Mode / Next Step

Why it fails:
1. The model's largest edges are on strong home teams (Man City, Arsenal, Tottenham), precisely where calibration is worst: in buckets 7–8 the actual rate is 0.52–0.61 against a model probability of 0.69–0.76.
2. Bet365 already discounts Man City and Tottenham in 24/25, reflecting their drop in form; the model, trained on the two prior seasons, does not adapt.
3. Stakes are fixed, so raising the threshold does not change bet size; it concentrates a larger share of the bets in this miscalibrated region, and ROI worsens.
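The calibration gap cited in point 1 can be measured with a simple bucket table. The sketch below uses ten equal-width probability buckets indexed from 0, which is an assumption: the note does not say how its "buckets 7–8" were defined, so the indices here may not line up with the author's.

```python
def calibration_table(probs, outcomes, n_buckets=10):
    """Compare mean predicted probability with observed rate per bucket.

    probs: model probabilities in [0, 1]; outcomes: 1 if the event happened,
    else 0. Buckets are equal-width (an assumed scheme); empty ones are skipped.
    """
    buckets = [[] for _ in range(n_buckets)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_buckets), n_buckets - 1)  # p == 1.0 goes in the top bucket
        buckets[index].append((p, y))

    table = []
    for index, items in enumerate(buckets):
        if not items:
            continue
        mean_p = sum(p for p, _ in items) / len(items)
        rate = sum(y for _, y in items) / len(items)
        table.append(
            {"bucket": index, "n": len(items), "mean_model_p": mean_p,
             "actual_rate": rate, "gap": mean_p - rate}
        )
    return table
```

A positive `gap` in the high-probability buckets is the overconfidence pattern described above: the model assigns more probability than the outcomes support.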

To fix:
- Season-to-date rolling update: from GW8 onward, refit the model every N matchweeks using only current-season data
- Add a form feature: last-5-game xG or points as a covariate
- Use Bayesian updating: treat prior-season ratings as the prior and shrink them toward current-season results as the season progresses
- For Polymarket specifically: EPL game-level markets are rare (most are seasonal/futures markets); the UCL Final is the best live target, but rating PSG requires Ligue 1 data
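One way to picture the Bayesian-updating idea is a Gamma-Poisson conjugate update on a single team's scoring rate. This is a deliberately simplified sketch, not the model used in the backtest: it ignores opponent strength and home advantage, and `prior_weight` (how many matches the prior-season estimate is worth) is a hypothetical tuning parameter.

```python
def posterior_scoring_rate(prior_rate, prior_weight, goals, matches):
    """Posterior mean goals per match under a Gamma prior and Poisson likelihood.

    The prior is Gamma(shape=prior_rate * prior_weight, rate=prior_weight), so
    its mean is prior_rate and it counts as prior_weight pseudo-matches.
    After observing `goals` in `matches` current-season games, the posterior is
    Gamma(shape + goals, rate + matches), whose mean is returned.
    """
    shape = prior_rate * prior_weight + goals
    rate = prior_weight + matches
    return shape / rate
```

With no current-season matches the function returns the prior rate unchanged; as `matches` grows, the estimate moves toward the current-season average `goals / matches`, which is the shrinkage behaviour the fix list describes. For instance, a prior of 2.0 goals per match worth 10 matches, followed by 8 goals in 8 matches, gives (20 + 8) / (10 + 8) ≈ 1.56.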
