Hypothesis
Polymarket prices on Iran-linked outcome markets exhibit statistically distinguishable cumulative abnormal returns (CAR) in the ±30-hour window around publicly reported geopolitical news shocks, versus a pre-event drift baseline.
Data used
- Endpoints:
  https://clob.polymarket.com/prices-history?market=<token_id>&interval=max&fidelity=60
- Markets:

| Market slug | Token ID (YES) | Volume |
|---|---|---|
| will-bitcoin-hit-150k-by-june-30-2026 | 13915689...586 | \$15.7M |
| will-the-iranian-regime-fall-by-may-31 | 57360053...996 | \$24.3M |
| will-trump-say-iran-during-events-with-xi-jinping | 31868121...368 | \$13.6M |
| iran-closes-its-airspace-by-may-21 | 25765184...465 | \$2.8M |

- Date range: 2026-04-20 to 2026-05-20 (hourly candles via fidelity=60)
- N events: 6 (manually identified from news search)
- Sample rows (btc_150k):
{t:1776718834, p:0.0115}...{t:1779304326, p:0.0105}
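A minimal sketch of how the hourly series could be requested and parsed, using only the Python standard library. The query parameters mirror the endpoint listed above. The `history` wrapper key in the response body is an assumption inferred from the `{t, p}` sample rows, and `build_history_url` and `parse_history` are hypothetical helper names, not functions from `d3_event_study.py`.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://clob.polymarket.com/prices-history"


def build_history_url(token_id: str, interval: str = "max", fidelity: int = 60) -> str:
    """Build the prices-history URL for one YES token (fidelity is in minutes)."""
    query = urlencode({"market": token_id, "interval": interval, "fidelity": fidelity})
    return f"{BASE_URL}?{query}"


def parse_history(payload: str) -> list[tuple[int, float]]:
    """Turn a response body into (unix_seconds, price) pairs sorted by time.

    Assumption: the body looks like {"history": [{"t": ..., "p": ...}, ...]}.
    """
    data = json.loads(payload)
    rows = data.get("history", [])
    return sorted((int(row["t"]), float(row["p"])) for row in rows)


if __name__ == "__main__":
    # Offline example built from the two sample rows quoted above.
    body = '{"history": [{"t": 1776718834, "p": 0.0115}, {"t": 1779304326, "p": 0.0105}]}'
    print(build_history_url("<token_id>"))
    print(parse_history(body))
```

Keeping the URL builder and the parser separate means the parser can be exercised against the saved snapshot without touching the network.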
Method
For each event $e$ at time $t_0$ (in hours):

$$\text{baseline\_drift} = \frac{1}{T_{pre}}\sum_{t=t_0-30}^{t_0-1} r_t$$

$$\text{CAR}_{30h} = \frac{p_{t_0+30} - p_{t_0-1}}{p_{t_0-1}} - 30 \cdot \text{baseline\_drift}$$

$$\text{signed\_CAR} = \text{direction}(e) \times \text{CAR}_{30h}$$

where $r_t$ is the hourly return at hour $t$, $T_{pre} = 30$ is the number of pre-event hours, and $\text{direction}(e) \in \{+1,-1\}$ encodes the theoretically expected sign from news classification.

Tests: a sign test of $H_0$: $P(\text{signed CAR} > 0) = 0.5$, and a one-sample t-test of the mean 30h signed CAR against zero.
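The three formulas above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in `d3_event_study.py`: it assumes `prices` is a gap-free hourly list, that `t0` is the list index of the event hour, and that $r_t$ is a simple (not log) return. The function name `event_car` is hypothetical.

```python
def event_car(prices, t0, direction, window=30):
    """Return (baseline_drift, car, signed_car) for one event.

    prices    : hourly prices, one per hour, no gaps (assumption)
    t0        : index of the event hour within prices
    direction : +1 or -1, the expected sign from the news classification
    window    : hours before and after the event (30 in the text)
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    if t0 - window - 1 < 0 or t0 + window >= len(prices):
        raise ValueError("not enough data around the event")

    # Simple hourly returns r_t for t = t0-window .. t0-1 (assumed definition).
    pre_returns = [
        (prices[t] - prices[t - 1]) / prices[t - 1]
        for t in range(t0 - window, t0)
    ]
    baseline_drift = sum(pre_returns) / len(pre_returns)

    # Raw return from the last pre-event price to the price `window` hours after.
    raw_return = (prices[t0 + window] - prices[t0 - 1]) / prices[t0 - 1]

    car = raw_return - window * baseline_drift
    return baseline_drift, car, direction * car


if __name__ == "__main__":
    # Synthetic series: flat at 0.50 before the event, 0.60 thirty hours after.
    series = [0.5] * 61 + [0.6]
    print(event_car(series, t0=31, direction=+1))
```

With a flat pre-event series the drift is zero, so the CAR reduces to the raw 20% move, which makes the drift adjustment easy to check by hand.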
Result
| Event | Market | p_pre | p_t0 | p_post30h | CAR% | signed_CAR% | t-stat | p-val |
|---|---|---|---|---|---|---|---|---|
| E1 Ceasefire extension | btc_150k | 0.0145 | 0.0145 | 0.0240 | +34.1 | +34.1 | 1.03 | 0.30 |
| E2 Ceasefire extension | iran_regime_fall | 0.0455 | 0.0450 | 0.0330 | -72.4 | +72.4 | -1.83 | 0.07 |
| E3 Axios one-page memo | iran_regime_fall | 0.0245 | 0.0235 | 0.0215 | -6.4 | +6.4 | -0.29 | 0.77 |
| E4 Trump bomb threat | iran_regime_fall | 0.0205 | 0.0205 | 0.0215 | +22.1 | +22.1 | 1.39 | 0.17 |
| E5 Trump-Xi summit | trump_say_iran_xi | 0.6300 | 0.5400 | 0.7455 | +38.1 | +38.1 | 1.54 | 0.12 |
| E6 Flights resume Tehran | iran_airspace | 0.1800 | 0.1150 | 0.1550 | -11.2 | +11.2 | -0.36 | 0.72 |
Aggregate (30h): Mean signed CAR = +30.7%, SD = 23.9%, hit rate = 6/6, sign test p = 0.0156

Aggregate (6h): Mean signed CAR = +17.6%, SD = 43.2%, hit rate = 4/6, t-stat = 1.00, p = 0.36
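Verdict: INCONCLUSIVE. The sample (n=6) is below the n=30 threshold, and no individual event is statistically significant. The sign test is nominally significant, but p = 0.016 is the one-sided exact value (0.5^6); the two-sided value is 0.031, and with only 6 observations either figure is fragile.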
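The 30h aggregate figures can be re-derived from the signed_CAR% column of the table with the standard library alone. The sketch below recomputes the mean, the sample SD, and the exact one-sided sign-test p-value, and also the one-sample t-statistic implied by that mean and SD, which the aggregate line does not list. The function names are hypothetical, and the sign test assumes no signed CAR is exactly zero.

```python
import math
from statistics import mean, stdev

# signed_CAR% for E1..E6, copied from the result table.
signed_car_30h = [34.1, 72.4, 6.4, 22.1, 38.1, 11.2]


def sign_test_one_sided(values):
    """Exact binomial P(at least k positives out of n) under H0: p = 0.5."""
    n = len(values)
    k = sum(v > 0 for v in values)
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n


def t_statistic(values):
    """One-sample t-statistic of the mean against zero (sample SD, n-1)."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


if __name__ == "__main__":
    p_one = sign_test_one_sided(signed_car_30h)
    print(f"mean = {mean(signed_car_30h):.1f}%")
    print(f"sd   = {stdev(signed_car_30h):.1f}%")
    print(f"sign test p (one-sided) = {p_one:.4f}")
    print(f"sign test p (two-sided) = {min(1.0, 2 * p_one):.4f}")
    print(f"t-stat (df = {len(signed_car_30h) - 1}) = {t_statistic(signed_car_30h):.2f}")
```

Because every signed CAR is positive by construction of the hit rate, the sign test cannot return anything smaller than 1/64 at n=6, which is one reason the verdict treats it as fragile.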
Reproduction
source ~/.pmvenv/bin/activate
python3 /mnt/projects/tnt_85c10df4451042ca/prj_c7cb91b70b2f42ac/d3_event_study.py
# Data snapshot: /tmp/pm_data/prices_history.json
# Results: /tmp/pm_data/event_study_results.csv
Failure mode / next step
- Primary break: n=6 is too small; any single event distorts aggregate stats
- Measurement error: hourly fidelity misses the first-mover advantage window (critical 0–30 min after headline)
- Look-ahead risk: event timestamps were estimated from news article timestamps, not intra-day tick confirmation
- Next step: pull minute-fidelity data during known news windows (fidelity=1), expand to 20+ events using GDELT for Iran-war headlines since Feb 2026
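For the minute-fidelity next step, one possible approach is to request a narrow window around each event timestamp instead of the full history. The sketch below only builds the request URL. The `startTs` and `endTs` parameter names are an assumption about the prices-history endpoint and should be checked against the API documentation before use; `minute_window_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://clob.polymarket.com/prices-history"


def minute_window_url(token_id: str, event_ts: int,
                      before_s: int = 1800, after_s: int = 1800) -> str:
    """Build a fidelity=1 request covering [event_ts - before_s, event_ts + after_s].

    event_ts is a Unix timestamp in seconds. The default of 1800 s on each
    side covers the 0-30 minute post-headline window named in the text,
    plus a matching pre-headline window for comparison.
    """
    params = {
        "market": token_id,
        "startTs": event_ts - before_s,  # assumed parameter name
        "endTs": event_ts + after_s,     # assumed parameter name
        "fidelity": 1,                   # minute candles
    }
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(minute_window_url("<token_id>", event_ts=1776718834))
```

Requesting a pre-headline window as well makes it possible to check whether the price moved before the article timestamp, which bears directly on the look-ahead risk listed above.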