Methodology
Transparent, data-driven sports analytics built on proven quantitative methods from finance and statistics.
What is Expected Value (EV)?
Expected Value (EV) is the mathematical edge you have on a bet. It measures the average return you can expect per dollar allocated over the long run.
The formula is straightforward: EV = (True Probability × Decimal Odds) − 1. A positive EV (+EV) means the position is profitable in expectation.
We only publish predictions with EV above strict per-sport thresholds (5–12.5% depending on the sport), ensuring every recommendation has meaningful mathematical edge.
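As a minimal sketch of the formula above, the snippet below computes EV from a probability and decimal odds and applies a minimum-EV filter. The function names and the 5% threshold in the usage note are illustrative assumptions, not the production values for any particular sport.

```python
def expected_value(true_prob: float, decimal_odds: float) -> float:
    """EV per unit allocated: (true probability x decimal odds) - 1."""
    if not 0.0 <= true_prob <= 1.0:
        raise ValueError("true_prob must be in [0, 1]")
    if decimal_odds <= 1.0:
        raise ValueError("decimal_odds must be greater than 1")
    return true_prob * decimal_odds - 1.0


def passes_threshold(true_prob: float, decimal_odds: float, min_ev: float) -> bool:
    """True when the EV meets or exceeds the minimum (min_ev is a hypothetical setting)."""
    return expected_value(true_prob, decimal_odds) >= min_ev
```

For example, a 55% true probability at decimal odds of 2.10 gives an EV of about 0.155 (15.5%), which would clear an illustrative 5% threshold, while a 50% probability at odds of 2.00 has zero EV and would not.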
How Our ML Models Work
We run a dedicated XGBoost machine learning model for each sport. Each model is trained on sport-specific features, validated out-of-sample, and tested for statistical significance via permutation tests.
Soccer Draw: A 3-class XGBoost classifier (Home/Draw/Away) trained on 20 features including Elo ratings (K=48), attack/defense metrics, and market-implied probabilities. We only surface draw predictions in the English Premier League and French Ligue 1 — leagues where draw mispricing is statistically significant.
UFC Favorites: A 5-seed XGBoost ensemble with L2=10 regularization trained on 33 features covering physical attributes, Elo ratings, fight stats, striking/grappling differentials, and career history.
Boxing Favorites: A 5-seed XGBoost ensemble with L2=10 using 28 features (the “no-form” set which drops streak/recent-form features but keeps market-implied probabilities).
All models use isotonic calibration on held-out data and walk-forward training with data augmentation (swap home/away or fighter A/B for balanced labels).
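The side-swap augmentation can be sketched as follows for a two-outcome sport. The "a_"/"b_" feature prefixes and the "label" key are hypothetical naming conventions chosen for this example; the real feature schema may differ.

```python
def swap_sides(row: dict) -> dict:
    """Mirror one training row: exchange fighter A/B features and flip the label.

    Assumes per-side features are prefixed "a_" / "b_" and that
    row["label"] is 1 when side A won, 0 when side B won.
    """
    swapped = {}
    for key, value in row.items():
        if key == "label":
            swapped[key] = 1 - value
        elif key.startswith("a_"):
            swapped["b_" + key[2:]] = value
        elif key.startswith("b_"):
            swapped["a_" + key[2:]] = value
        else:
            swapped[key] = value  # side-independent feature, kept as-is
    return swapped


def augment(rows: list) -> list:
    """Return the original rows plus their mirrored copies (labels end up balanced)."""
    out = []
    for row in rows:
        out.append(row)
        out.append(swap_sides(row))
    return out
```

Because every row appears once in each orientation, the augmented set has exactly as many label-1 rows as label-0 rows, which removes any bias from how sides were originally ordered.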
Calculated Position Sizing
Finding +EV predictions is only half the equation. Calculated position sizing tells you how much to allocate to maximize long-term capital growth while controlling risk.
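We use quarter-Kelly sizing, a conservative fraction of the full Kelly stake. Under the standard quadratic approximation of log growth, a fraction c of full Kelly retains c(2 − c) of the full-Kelly growth rate, so quarter-Kelly keeps roughly 44% of that growth while cutting the volatility of returns to about a quarter. Every prediction includes a recommended allocation percentage.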
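A minimal sketch of the sizing calculation, assuming a simple binary outcome priced in decimal odds. The full-Kelly fraction used here is the textbook formula (p × d − 1) / (d − 1); the function names and the clamp to zero for negative-EV positions are choices made for this example.

```python
def kelly_fraction(true_prob: float, decimal_odds: float) -> float:
    """Full-Kelly allocation as a fraction of capital for a binary outcome."""
    net_odds = decimal_odds - 1.0  # profit per unit allocated on a win
    if net_odds <= 0.0:
        raise ValueError("decimal_odds must be greater than 1")
    fraction = (true_prob * decimal_odds - 1.0) / net_odds
    return max(fraction, 0.0)  # never allocate to a negative-EV position


def quarter_kelly(true_prob: float, decimal_odds: float, multiplier: float = 0.25) -> float:
    """Scale the full-Kelly fraction down by the chosen multiplier."""
    return multiplier * kelly_fraction(true_prob, decimal_odds)
```

With a 55% true probability at decimal odds of 2.10, full Kelly is about 14.1% of capital, so the quarter-Kelly allocation is about 3.5%.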
Our backtested portfolio shows a $1,000 starting capital growing to $4,172 (+317%) over 2013–2025 using this sizing approach across all three sports.
Walk-Forward Validation
We validate our models using walk-forward analysis, the gold standard for testing predictive models in finance and sports analytics.
The process: train the model on historical data up to date T, make predictions for the next period, record the results, then roll forward. Because each prediction uses only data available before the event, this guards against look-ahead bias. Features are computed before state updates, so no future information leaks into predictions.
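To illustrate the "features before state updates" ordering, here is a sketch using an Elo rating as the state. It assumes the standard logistic Elo formula with a 400-point scale and a 1500 starting rating, neither of which is specified above; K = 48 is the value quoted for the soccer model. The function and key names are hypothetical.

```python
K = 48  # K-factor quoted for the soccer model


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def process_match(ratings: dict, home: str, away: str, home_score: float,
                  default: float = 1500.0) -> dict:
    """Snapshot pre-match features, then update ratings in place.

    home_score is 1.0 for a home win, 0.5 for a draw, 0.0 for a loss.
    """
    r_home = ratings.get(home, default)
    r_away = ratings.get(away, default)

    # 1) Features are taken from the state as it stood BEFORE this match.
    features = {"elo_home": r_home, "elo_away": r_away, "elo_diff": r_home - r_away}

    # 2) Only then is the state updated with the match result.
    delta = K * (home_score - expected_score(r_home, r_away))
    ratings[home] = r_home + delta
    ratings[away] = r_away - delta
    return features
```

Reversing steps 1 and 2 would bake the match result into its own features, which is exactly the leak this ordering avoids.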
Our models are retrained on fresh data every 120 days to capture evolving trends while maintaining sufficient sample size.
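The roll-forward schedule can be sketched as a list of cutoffs spaced 120 days apart. This example assumes an expanding training window (all data before the cutoff); whether the production pipeline uses an expanding or a fixed-length window is not stated here, and the function names are illustrative.

```python
from datetime import date, timedelta


def walk_forward_windows(first_cutoff: date, last_date: date, step_days: int = 120) -> list:
    """Return (cutoff, test_end) pairs.

    For each pair: train on events dated before cutoff,
    test on events with cutoff <= event date < test_end.
    """
    step = timedelta(days=step_days)
    final_end = last_date + timedelta(days=1)  # make last_date inclusive
    windows = []
    cutoff = first_cutoff
    while cutoff <= last_date:
        windows.append((cutoff, min(cutoff + step, final_end)))
        cutoff += step
    return windows


def split(events: list, cutoff: date, test_end: date) -> tuple:
    """Split (event_date, payload) tuples into train and test sets for one window."""
    train = [e for e in events if e[0] < cutoff]
    test = [e for e in events if cutoff <= e[0] < test_end]
    return train, test
```

The test windows are contiguous and non-overlapping, so every event after the first cutoff is predicted exactly once by a model that never saw it during training.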
Every strategy must pass ALL validation gates: positive holdout ROI, permutation test p < 0.05, replication across multiple regularization values, and pre-registration of hypotheses before seeing results.
Permutation Testing
We test statistical significance by shuffling the model’s prediction signal across all matches (keeping outcomes and odds fixed), then re-applying the prediction filter. This tests whether the model’s SELECTION is better than random selection from the same odds pool.
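All three models pass with p < 0.01. For boxing, none of 5,000 permutations beat the actual model, so the empirical p-value is below 1/5,000 = 0.0002 (a permutation test with a finite number of shuffles cannot establish p = 0 exactly). This is strong evidence that the edge is real, not an artifact of data mining.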
We explicitly avoid two common permutation mistakes: (1) shuffling outcomes within selected bets (preserves elevated win rate → invalid), and (2) shuffling the P&L array (preserves the mean → always p ≈ 1.0).
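The signal-shuffling procedure described above can be sketched as follows. This is a simplified illustration, not the production test: it assumes the signal is a model probability, the filter is a minimum-EV rule, ROI is measured with flat one-unit allocations, and the p-value uses the common (hits + 1) / (permutations + 1) correction. All names are hypothetical.

```python
import random


def roi(selected: list) -> float:
    """Flat-allocation ROI over (decimal_odds, won) pairs; 0.0 if nothing is selected."""
    if not selected:
        return 0.0
    profit = sum((odds - 1.0) if won else -1.0 for odds, won in selected)
    return profit / len(selected)


def select(probs: list, matches: list, min_ev: float) -> list:
    """Apply the prediction filter: keep matches whose EV clears min_ev."""
    return [(odds, won) for p, (odds, won) in zip(probs, matches)
            if p * odds - 1.0 >= min_ev]


def permutation_p_value(probs: list, matches: list, min_ev: float,
                        n_perm: int = 5000, seed: int = 0) -> float:
    """Shuffle the signal across matches (odds and outcomes stay paired and fixed)."""
    rng = random.Random(seed)
    actual = roi(select(probs, matches, min_ev))
    shuffled = list(probs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Re-apply the filter to the shuffled signal, then score the new selection.
        if roi(select(shuffled, matches, min_ev)) >= actual:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With this correction the smallest reportable value is 1 / (n_perm + 1), and a signal that carries no information (for example, the same probability for every match) yields p = 1.0 because every shuffle reproduces the original selection.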
Why Closing Line Value (CLV) Matters
Closing Line Value is widely regarded as one of the best predictors of long-term profitability. It measures whether you consistently get better odds than the closing line.
We track CLV on every prediction. Our historical CLV of +3.8% means we are, on average, capturing prices 3.8% better than the final market price.
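One common way to compute CLV, assumed here for illustration, is the ratio of the odds taken to the closing odds, minus one. Other definitions exist (for example, comparing against closing odds with the bookmaker margin removed), and the function names below are hypothetical.

```python
def clv(odds_taken: float, closing_odds: float) -> float:
    """CLV as a fraction: positive when the price taken beat the closing price."""
    if odds_taken <= 1.0 or closing_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return odds_taken / closing_odds - 1.0


def average_clv(bets: list) -> float:
    """Mean CLV over (odds_taken, closing_odds) pairs."""
    if not bets:
        raise ValueError("no bets to average")
    return sum(clv(taken, closing) for taken, closing in bets) / len(bets)
```

Under this definition, taking 2.10 on a selection that closes at 2.00 is a CLV of about +5%.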
Even in periods of variance where results may be negative, sustained positive CLV suggests the underlying edge is real and that profits are likely to materialize over a sufficiently large sample.