Hypothesis
A mid-price–crossing fill model replaying Polymarket CLOB prices-history can estimate symmetric MM PnL (spread capture minus adverse selection) accurately enough to rank parameterisations.
Data used
- Endpoint: GET https://clob.polymarket.com/prices-history?market=<token_id>&interval=max&fidelity=1
- Markets: BTC-hit-$1M (105267...5810), PSG-CL (104259...0834), Colorado-NHL (101738...9479)
- Sample size: ~4,260 observations per market, 2026-04-20 → 2026-05-20 (30 days)
- Median inter-observation interval: 600 s (~10 min); range 33 s – 3,042 s (irregular)
- Sample rows (btc_1m):

| t (unix) | p |
|---|---|
| 1776718843 | 0.4915 |
| 1776719425 | 0.4920 |
| 1776720037 | 0.4910 |
| 1776721011 | 0.4915 |
| 1776722608 | 0.4905 |
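
The sampling cadence can be checked directly from the (t, p) rows. The following is a minimal sketch, assuming each row is a dict with keys `t` (unix seconds) and `p` (price) as in the sample rows above. The helper name `interval_stats` is hypothetical and is not part of the harness.

```python
from statistics import median


def interval_stats(rows):
    """Return (median, min, max) gap in seconds between consecutive observations."""
    ts = sorted(r["t"] for r in rows)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if not gaps:
        raise ValueError("need at least two observations")
    return median(gaps), min(gaps), max(gaps)


# The five btc_1m sample rows listed above.
sample = [
    {"t": 1776718843, "p": 0.4915},
    {"t": 1776719425, "p": 0.4920},
    {"t": 1776720037, "p": 0.4910},
    {"t": 1776721011, "p": 0.4915},
    {"t": 1776722608, "p": 0.4905},
]
print(interval_stats(sample))
```

On these five rows alone the gaps are 582, 612, 974 and 1,597 s, so the result will differ from the full-sample figures quoted above. The same function applied to a complete snapshot is what would reproduce the median and range.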
Method
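At each observation $t$ with mid-price $m_t$, post a bid $b_t$ and an ask $a_t$ symmetrically around the mid, where $w$ is the full quote width (in price units) and $d$ is the dollar depth per quote: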
$$b_t = \max(0.001,\, m_t - w/2), \quad a_t = \min(0.999,\, m_t + w/2)$$
Fill rule (crossing-price approximation):
- Bid fill if $m_{t+1} \le b_t$: buy $d / m_t$ tokens, pay $b_t$ per token
- Ask fill if $m_{t+1} \ge a_t$: sell $d / m_t$ tokens, receive $a_t$ per token
MTM PnL at each step: $$\text{PnL}_t = \text{cash}_t + \text{inventory}_t \times m_t$$
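
The quoting, fill and mark-to-market rules above can be sketched as a single replay loop. This is an illustrative reconstruction, not the code in `pm_mm_backtest.py`: the function name `simulate` is hypothetical, `w` is read as the full quote width, `d` as the dollar depth per quote, and the position is marked at the next mid once any fill has been applied (a timing assumption the formulas do not pin down).

```python
def simulate(mids, w, d):
    """Replay a mid-price series; return (mtm_pnl_path, n_fills).

    mids: list of mid-prices in observation order
    w:    full quote width in price units (assumed meaning)
    d:    dollar depth per quote (assumed meaning)
    """
    cash, inventory, fills = 0.0, 0.0, 0
    pnl = [0.0]  # flat book at the first observation
    for m_t, m_next in zip(mids, mids[1:]):
        bid = max(0.001, m_t - w / 2)
        ask = min(0.999, m_t + w / 2)
        size = d / m_t  # tokens per quote
        if m_next <= bid:  # bid fill: buy `size` tokens at the bid
            cash -= size * bid
            inventory += size
            fills += 1
        if m_next >= ask:  # ask fill: sell `size` tokens at the ask
            cash += size * ask
            inventory -= size
            fills += 1
        # Mark to market at the newly observed mid.
        pnl.append(cash + inventory * m_next)
    return pnl, fills


pnl_path, n_fills = simulate([0.50, 0.49, 0.50], w=0.01, d=100.0)
print(n_fills, pnl_path)
```

In this toy series the first step fills the bid and the second fills the ask, so the loop exercises both branches. Note that a fill here is marked immediately at the mid that triggered it, which is where the adverse-selection cost shows up in the PnL path.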
Sharpe annualised using time-weighted increments (actual $\Delta t$ in hours): $$S = \frac{\mu(\Delta\text{PnL}/\sqrt{\Delta t})}{\sigma(\Delta\text{PnL}/\sqrt{\Delta t})} \times \sqrt{8760}$$
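
A minimal sketch of the time-weighted Sharpe formula above, using only the standard library. The function name `time_weighted_sharpe` is hypothetical. The choice of sample standard deviation (`statistics.stdev`) rather than population standard deviation is an assumption, since the formula does not say which $\sigma$ is meant.

```python
from math import sqrt
from statistics import mean, stdev

HOURS_PER_YEAR = 8760


def time_weighted_sharpe(pnl, ts):
    """Annualised Sharpe from an MTM PnL path and unix timestamps (seconds)."""
    if len(pnl) != len(ts):
        raise ValueError("pnl and ts must have the same length")
    scaled = []
    for i in range(1, len(pnl)):
        dt_hours = (ts[i] - ts[i - 1]) / 3600.0
        if dt_hours <= 0:
            continue  # skip duplicate or out-of-order timestamps
        scaled.append((pnl[i] - pnl[i - 1]) / sqrt(dt_hours))
    if len(scaled) < 2:
        return float("nan")
    sd = stdev(scaled)  # sample std (assumption; see lead-in)
    if sd == 0:
        return float("nan")
    return mean(scaled) / sd * sqrt(HOURS_PER_YEAR)


# Toy path: PnL increments of 1 and 2 over two one-hour steps.
print(time_weighted_sharpe([0.0, 1.0, 3.0], [0, 3600, 7200]))
```

Dividing each increment by $\sqrt{\Delta t}$ puts irregularly spaced observations on a common per-root-hour scale before the mean and standard deviation are taken, which is why the annualisation factor is $\sqrt{8760}$ regardless of the sampling interval.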
Fees: Polymarket CLOB maker = 0% (confirmed); taker = 2% (borne by counterparty, not us).
Result
Harness runs end-to-end. Grid sweep (7 widths × 3 depths × 3 markets = 63 configs) completes in <5 s. BTC $1M best config: width=0.002, depth=$200 → PnL=$19.91/30d, Sharpe=0.971, 120 fills.
Reproduction
source ~/.pmvenv/bin/activate
python3 /home/workspace/pm_mm_backtest.py --sweep
# Sweep results at /tmp/pm_data/sweep_results.csv
Data snapshots: /tmp/pm_data/{btc_1m,psg_cl,colo_nhl}_prices_f1.json
Failure mode / next step
Critical fill model bias: the crossing-price rule overestimates fills. Within a 10-min bar, the price may cross our quote and revert without a real taker touching our level. Mitigation: use the actual CLOB trade tape (unavailable without an auth key) or apply an intra-bar volatility correction (scale fill probability by $\text{erf}(w / (2\sigma_{\text{bar}}))$). Queue position is also not modelled: real fills compete with other makers at the same price level.
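
The proposed volatility correction can be sketched as a weight applied to each fill that the crossing rule reports. The note gives only the formula, so the helper name `fill_weight` is hypothetical, and how $\sigma_{\text{bar}}$ is estimated is left open; here it is simply passed in as a positive number in the same price units as $w$.

```python
from math import erf


def fill_weight(w, sigma_bar):
    """Weight in [0, 1] for a crossing-rule fill: erf(w / (2 * sigma_bar)).

    w:         full quote width in price units
    sigma_bar: intra-bar price volatility estimate (assumed to be supplied)
    """
    if sigma_bar <= 0:
        raise ValueError("sigma_bar must be positive")
    return erf(w / (2.0 * sigma_bar))


# Example: weight for a 0.002-wide quote under two hypothetical volatility levels.
for sigma in (0.001, 0.004):
    print(sigma, fill_weight(0.002, sigma))
```

One way to use this, under the same assumptions, is to multiply the token size of each crossing-rule fill by the weight, so that fills become fractional rather than all-or-nothing.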