[polymarket][tsstat] D4 time-series factor pipeline — INCONCLUSIVE

active
polymarket  tsstat  factor-pipeline  inconclusive   Priority: 3   Source: polymarket-tsstat   Created: 2026-05-20   Updated: 2026-05-20

Hypothesis

Five pure price-action factors (1d momentum, 7d momentum, realised volatility, distance-from-50, time-to-resolution) computed in logit-space on Polymarket binary markets have a non-zero information coefficient (IC) against 24h-forward logit returns.

Data used

market_id  t                          price   mom_1d  mom_7d  rvol_24h  dist_50  tte_days  fwd_ret_1d
573655     2026-04-27 21:00:00+00:00  0.0205  0.5028  0.5872  0.1791   -0.4795  64.29     0.2234
573655     2026-04-27 22:00:00+00:00  0.0205  0.5028  0.5872  0.1791   -0.4795  64.25     0.2234

Method

Prices $p_t \in [\varepsilon, 1-\varepsilon]$, $\varepsilon=10^{-4}$.
Logit return: $r_t = \text{logit}(p_t) - \text{logit}(p_{t-1})$
Factors at time $t$ (no look-ahead: only $p_s$ for $s \le t$ is used):
- $\text{mom_1d}_t = \text{logit}(p_t) - \text{logit}(p_{t-24})$
- $\text{mom_7d}_t = \text{logit}(p_t) - \text{logit}(p_{t-168})$
- $\text{rvol_24h}_t = \text{std}(r_{t-23:t})$
- $\text{dist_50}_t = p_t - 0.5$
- $\text{tte_days}_t = (t_{\text{resolve}} - t) / 86400$
Forward return: $f_t = \text{logit}(p_{t+24}) - \text{logit}(p_t)$
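The definitions above can be sketched in plain Python for one market's hourly price series. This is a minimal illustration under stated assumptions, not the contents of d4_tsstat.py: the names `logit`, `compute_factors` and `forward_return`, the gap-free hourly list, the use of the sample standard deviation for rvol_24h, and passing time-to-resolution in seconds are all hypothetical choices.

```python
import math
import statistics

EPS = 1e-4  # clipping bound epsilon from the Method section


def logit(p, eps=EPS):
    """Logit of a price after clipping it to [eps, 1 - eps]."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def compute_factors(prices, t, seconds_to_resolve):
    """Factors at hourly index t, using only prices[s] for s <= t.

    prices: hourly prices for one market (assumed gap-free).
    seconds_to_resolve: t_resolve - t in seconds (assumed input).
    Returns None during the 168-hour (7-day) warmup.
    """
    if t < 168:
        return None
    z = [logit(p) for p in prices[: t + 1]]  # nothing after t is read
    # The 24 hourly logit returns r_{t-23}, ..., r_t
    rets = [z[s] - z[s - 1] for s in range(t - 23, t + 1)]
    return {
        "mom_1d": z[t] - z[t - 24],
        "mom_7d": z[t] - z[t - 168],
        "rvol_24h": statistics.stdev(rets),  # sample std is an assumption
        "dist_50": prices[t] - 0.5,
        "tte_days": seconds_to_resolve / 86400.0,
    }


def forward_return(prices, t):
    """24h-forward logit return f_t; None when t + 24 is past the end."""
    if t + 24 >= len(prices):
        return None
    return logit(prices[t + 24]) - logit(prices[t])
```

Keeping the forward return in a separate function makes the look-ahead boundary explicit: `compute_factors` never indexes beyond `t`, and only `forward_return` reads `t + 24`.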
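Cross-sectional IC at each hour $h$: Pearson($\text{factor}_h$, $f_h$) over all markets with data at $h$.
AR(1)-corrected effective N: $n_{\text{eff}} = n \cdot \frac{1-\rho_1}{1+\rho_1}$, where $n$ is the number of hourly IC observations and $\rho_1$ is the lag-1 autocorrelation of the hourly IC series (AC1 in the Result table).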
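A minimal sketch of the IC and effective-N step, assuming the hourly IC values are collected into a plain list. The function names, the three-market minimum, the lagged-pair estimator for $\rho_1$, and the t-statistic formula (mean IC divided by its standard error with $n_{\text{eff}}$ in place of $n$) are assumptions for illustration; the source does not state how t_AR1 is computed.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either side is constant
    return sxy / math.sqrt(sxx * syy)


def hourly_ic(factor_by_market, fwd_by_market, min_markets=3):
    """Cross-sectional IC for one hour.

    Both arguments map market_id -> value at that hour; markets missing
    from either side are skipped. min_markets is a hypothetical floor.
    """
    ids = [m for m in factor_by_market if m in fwd_by_market]
    if len(ids) < min_markets:
        return float("nan")
    return pearson([factor_by_market[m] for m in ids],
                   [fwd_by_market[m] for m in ids])


def effective_n(n, rho1):
    """AR(1)-corrected effective sample size n * (1 - rho1) / (1 + rho1)."""
    return n * (1.0 - rho1) / (1.0 + rho1)


def summarize_ic(ic_series):
    """Mean IC, lag-1 autocorrelation, effective N and an AR(1)-style t."""
    n = len(ic_series)
    mean_ic = sum(ic_series) / n
    sd = math.sqrt(sum((v - mean_ic) ** 2 for v in ic_series) / (n - 1))
    rho1 = pearson(ic_series[:-1], ic_series[1:])  # lagged-pair estimate
    n_eff = effective_n(n, rho1)
    t_ar1 = mean_ic / (sd / math.sqrt(n_eff))  # assumed definition
    return {"IC_mean": mean_ic, "AC1": rho1, "eff_n": n_eff, "t_AR1": t_ar1}
```

Because consecutive hourly ICs share 23 of their 24 forward-return hours, strong positive autocorrelation is expected, and the correction shrinks the sample size sharply as $\rho_1$ approaches 1.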

Result

factor    IC_mean  RankIC_mean  AC1    eff_n  t_AR1
mom_1d    -0.0549  -0.0356      0.755  73.6   -1.91
tte_days  -0.0188  -0.0916      0.846  44.1   -0.72
mom_7d    -0.0008  0.0188       0.900  27.7   -0.01
rvol_24h  0.0060   0.0196       0.843  44.9   +0.16
dist_50   0.0146   0.0536       0.948  14.1   +0.23

No factor clears |t| > 2.0. The closest is mom_1d, whose negative IC points in the mean-reversion direction (t = -1.91, p ≈ 0.06).
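Two quick arithmetic checks on the table can be sketched as follows. The p-value uses a standard normal approximation (an assumption; the source does not say which reference distribution was used), and the second function inverts the effective-N formula from the Method section to recover the raw number of hourly IC observations.

```python
import math


def two_sided_p_normal(t):
    """Two-sided p-value for t under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def implied_n(eff_n, rho1):
    """Invert n_eff = n * (1 - rho1) / (1 + rho1) to recover n."""
    return eff_n * (1.0 + rho1) / (1.0 - rho1)


# (factor, AC1, eff_n) copied from the Result table
ROWS = [
    ("mom_1d", 0.755, 73.6),
    ("tte_days", 0.846, 44.1),
    ("mom_7d", 0.900, 27.7),
    ("rvol_24h", 0.843, 44.9),
    ("dist_50", 0.948, 14.1),
]

p_mom_1d = two_sided_p_normal(-1.91)
raw_n = {name: implied_n(eff_n, ac1) for name, ac1, eff_n in ROWS}
```

Under this approximation the mom_1d p-value rounds to the reported 0.06, and every row implies a raw sample of roughly 526 to 529 hourly observations, in line with 22 days × 24 hours = 528.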

Reproduction

source ~/.pmvenv/bin/activate
cd /mnt/projects/tnt_85c10df4451042ca/prj_c7cb91b70b2f42ac
python d4_tsstat.py
# Results → /tmp/pm_data/d4_results_final.json

Failure mode / next step

Binding data constraint: the CLOB endpoint prices-history?interval=max returns only ~30 days of history. After the 7-day warmup, the usable sample is 22 days. Older markets (opened July 2025) are inaccessible via the Gamma API: the maximum offset is 10k, and those markets sit at offsets above 15k.

To extend: (a) use the Polymarket subgraph on Polygon to get older trade data, (b) use closed=true markets for a longer retrospective, or (c) wait for more history to accumulate.

Small universe: only 25 of 100 fetched markets have non-frozen prices. Most of the others are near-zero-probability markets (e.g. team-wins-World-Cup) with essentially no price movement.
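The source does not define "non-frozen", so the following is only one possible screen, sketched under assumptions: a market is treated as frozen when its hourly price changes too rarely or its price range is too narrow. The function name and both thresholds (`min_changes`, `min_range`) are hypothetical and would need tuning against the actual universe.

```python
def is_frozen(prices, min_changes=10, min_range=0.01):
    """Heuristic frozen-price check for one market's hourly series.

    min_changes and min_range are hypothetical thresholds, not values
    taken from the pipeline.
    """
    if len(prices) < 2:
        return True  # too short to show any movement
    changes = sum(1 for a, b in zip(prices, prices[1:]) if a != b)
    price_range = max(prices) - min(prices)
    return changes < min_changes or price_range < min_range


def live_markets(histories):
    """Keep markets whose series pass the screen.

    histories: dict mapping market_id -> list of hourly prices.
    """
    return {m: p for m, p in histories.items() if not is_frozen(p)}
```

A screen like this would drop the near-zero-probability markets before the cross-sectional IC is computed, since a constant series contributes no forward-return variation.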
