[polymarket][mm] CLOB MM Backtest Harness — PASS

active
polymarketmmbacktestharnesspass   Priority: 4   Source: polymarket-mm   Created: 2026-05-20   Updated: 2026-05-20

Hypothesis

A mid-price–crossing fill model replaying Polymarket CLOB prices-history can estimate symmetric MM PnL (spread capture minus adverse selection) accurately enough to rank parameterisations.

Data used

Method

At each observation $t$ with mid-price $m_t$, post a bid $b_t$ and an ask $a_t$ with quoted width $w$ and dollar depth $d$:

$$b_t = \max(0.001,\, m_t - w/2), \quad a_t = \min(0.999,\, m_t + w/2)$$
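A minimal sketch of the quote placement above, for illustration only. The function name `quotes` and the `lo`/`hi` keyword arguments are assumptions, not names taken from `pm_mm_backtest.py`.

```python
def quotes(mid: float, width: float, lo: float = 0.001, hi: float = 0.999):
    """Return (bid, ask) placed symmetrically around mid, clamped to [lo, hi]."""
    bid = max(lo, mid - width / 2)   # b_t = max(0.001, m_t - w/2)
    ask = min(hi, mid + width / 2)   # a_t = min(0.999, m_t + w/2)
    return bid, ask


if __name__ == "__main__":
    print(quotes(0.50, 0.02))    # quotes well inside the price bounds
    print(quotes(0.0005, 0.02))  # bid is clamped at the 0.001 floor
```

The clamps only bind for markets trading very close to 0 or 1, where the quoted spread becomes asymmetric around the mid.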

Fill rule (crossing-price approximation):
- Bid fill if $m_{t+1} \le b_t$: buy $d / m_t$ tokens, pay $b_t$ per token
- Ask fill if $m_{t+1} \ge a_t$: sell $d / m_t$ tokens, receive $a_t$ per token

MTM PnL at each step: $$\text{PnL}_t = \text{cash}_t + \text{inventory}_t \times m_t$$
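The fill rule and mark-to-market step can be sketched as a single replay loop. This is an illustration under stated assumptions, not the harness code: the name `replay` is hypothetical, inventory is allowed to go negative, and PnL is marked at the next mid immediately after fills are evaluated.

```python
def replay(mids, width, depth):
    """Replay a mid-price series and return the mark-to-market PnL path."""
    cash, inventory = 0.0, 0.0
    pnl_path = []
    for t in range(len(mids) - 1):
        m_t, m_next = mids[t], mids[t + 1]
        bid = max(0.001, m_t - width / 2)
        ask = min(0.999, m_t + width / 2)
        size = depth / m_t              # tokens per quote: d / m_t
        if m_next <= bid:               # bid fill: buy at b_t
            cash -= size * bid
            inventory += size
        if m_next >= ask:               # ask fill: sell at a_t
            cash += size * ask
            inventory -= size           # assumption: shorting is allowed
        # Assumption: mark at the mid observed after the fill check.
        pnl_path.append(cash + inventory * m_next)
    return pnl_path


if __name__ == "__main__":
    print(replay([0.50, 0.48, 0.50], width=0.02, depth=10))
```

In the three-point example the bid fills on the drop and the ask fills on the rebound, yet PnL stays slightly negative because the first fill is immediately marked below the purchase price. This is the adverse-selection term from the hypothesis.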

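Sharpe annualised using time-weighted increments (actual $\Delta t$ in hours): $$S = \frac{\mu(\Delta\text{PnL}/\sqrt{\Delta t})}{\sigma(\Delta\text{PnL}/\sqrt{\Delta t})} \times \sqrt{8760}$$ where $\mu$ and $\sigma$ are the mean and standard deviation of the scaled increments and 8760 is the number of hours in a year.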
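The Sharpe formula as written can be sketched as follows. The function name `annualised_sharpe` is hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the note does not say which estimator the harness uses.

```python
import math
import statistics


def annualised_sharpe(pnl, hours):
    """Sharpe of sqrt(dt)-scaled PnL increments, annualised with sqrt(8760).

    pnl[i] is the MTM PnL at time hours[i]; hours must be strictly increasing.
    """
    scaled = []
    for i in range(1, len(pnl)):
        dt = hours[i] - hours[i - 1]                 # actual gap in hours
        scaled.append((pnl[i] - pnl[i - 1]) / math.sqrt(dt))
    sigma = statistics.stdev(scaled)                 # assumption: sample std
    if sigma == 0:
        return float("nan")                          # flat PnL: undefined
    return statistics.mean(scaled) / sigma * math.sqrt(8760)


if __name__ == "__main__":
    print(annualised_sharpe([0.0, 1.0, 1.0, 3.0], [0.0, 1.0, 2.0, 3.0]))
```

With uniformly spaced observations the $\sqrt{\Delta t}$ factors cancel in the ratio, so the result equals the per-bar mean/std ratio times $\sqrt{8760}$ regardless of bar length.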

Fees: Polymarket CLOB maker = 0% (confirmed); taker = 2% (borne by counterparty, not us).

Result

Harness runs end-to-end. Grid sweep (7 widths × 3 depths × 3 markets = 63 configs) completes in <5 s. BTC $1M best config: width=0.002, depth=$200 → PnL=$19.91/30d, Sharpe=0.971, 120 fills.
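A grid sweep of this shape can be sketched with `itertools.product`. The grids below are placeholders: only width=0.002 and depth=200 appear in this note, the market keys are taken from the snapshot file names, and every other value is hypothetical. The `run_backtest` callable and the choice of ranking metric are also assumptions.

```python
from itertools import product

# Placeholder grids; all values except 0.002 and 200 are hypothetical.
WIDTHS = [0.002, 0.005, 0.01, 0.02, 0.03, 0.05, 0.10]
DEPTHS = [50, 200, 500]
MARKETS = ["btc_1m", "psg_cl", "colo_nhl"]


def sweep(run_backtest):
    """Call run_backtest(market, width, depth) for every combination.

    run_backtest is expected to return a dict of metrics, e.g. {"sharpe": ...}.
    """
    rows = []
    for market, width, depth in product(MARKETS, WIDTHS, DEPTHS):
        metrics = run_backtest(market, width, depth)
        rows.append({"market": market, "width": width, "depth": depth, **metrics})
    return rows


def best_by(rows, market, key="sharpe"):
    """Return the top row for one market under the chosen metric."""
    candidates = [r for r in rows if r["market"] == market]
    return max(candidates, key=lambda r: r[key])
```

Passing the backtest in as a callable keeps the enumeration separate from the fill model, so a corrected fill rule can be swept over the same grid without touching the loop.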

Reproduction

source ~/.pmvenv/bin/activate
python3 /home/workspace/pm_mm_backtest.py --sweep
# Sweep results at /tmp/pm_data/sweep_results.csv

Data snapshots: /tmp/pm_data/{btc_1m,psg_cl,colo_nhl}_prices_f1.json

Failure mode / next step

Critical fill-model bias: the crossing-price rule overestimates fills. Within a 10-min bar, the mid-price may cross our quote and revert without a real taker ever trading at our level. Mitigations: replay the actual CLOB trade tape (unavailable without an auth key), or apply an intra-bar volatility correction that scales fill probability by $\text{erf}(w / (2\sigma_{\text{bar}}))$. Queue position is also not modelled: real fills compete with other makers at the same price level.
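The proposed volatility correction could look roughly like this. It is a sketch only: estimating $\sigma_{\text{bar}}$ as the population standard deviation of bar-to-bar mid changes is an assumption, as are the function names `bar_sigma` and `fill_scale`.

```python
import math
import statistics


def bar_sigma(mids):
    """Std of bar-to-bar mid changes, used here as a stand-in for sigma_bar."""
    diffs = [b - a for a, b in zip(mids, mids[1:])]
    return statistics.pstdev(diffs)


def fill_scale(width, sigma_bar):
    """Scaling factor erf(w / (2 * sigma_bar)); lies in [0, 1] for width >= 0."""
    if sigma_bar <= 0:
        raise ValueError("sigma_bar must be positive")
    return math.erf(width / (2 * sigma_bar))


if __name__ == "__main__":
    mids = [0.50, 0.51, 0.50, 0.51, 0.52]
    print(fill_scale(0.002, bar_sigma(mids)))
```

One way to apply it, again as an assumption, is to multiply the filled size $d / m_t$ by the factor so each crossing counts as a fractional expected fill. The factor rises with $w$ and approaches 1 once $w$ is large relative to $\sigma_{\text{bar}}$, so the tightest quotes are discounted the most.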
