Your Backtest Is Lying (By a Known Amount)
S42 calculates a realistic 15% performance haircut that accounts for data mining, multiple testing, and implementation shortfall.
The Three Sources of Overstated Performance
Every backtest overstates live performance for three quantifiable reasons. First, data mining bias: we tested many configurations and kept the best one. Second, multiple testing: we evaluated multiple feature sets, thresholds, and model architectures. Third, implementation shortfall: the gap between simulated and actual execution.
S42 quantifies each source and applies a combined haircut to the raw backtest results. Data mining bias accounts for a haircut of approximately 7%, multiple testing adds 5%, and implementation shortfall adds 3%. Summed, the total haircut is approximately 15%.
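The combination above can be sketched in a few lines. The three component values come from the text; the function names are illustrative, and the compounded variant is an assumption added only for comparison, not something the text says S42 does.

```python
# Component haircuts as stated in the text (fractions of raw performance).
HAIRCUTS = {
    "data_mining": 0.07,
    "multiple_testing": 0.05,
    "implementation_shortfall": 0.03,
}


def additive_haircut(components):
    """Sum the component haircuts, matching the text's 7% + 5% + 3%."""
    return sum(components.values())


def compounded_haircut(components):
    """Comparison only (an assumption): apply each haircut in sequence."""
    kept = 1.0
    for haircut in components.values():
        kept *= 1.0 - haircut
    return 1.0 - kept


def apply_haircut(raw_result, haircut):
    """Scale a raw backtest result down by the given fractional haircut."""
    return raw_result * (1.0 - haircut)


total = additive_haircut(HAIRCUTS)
print(f"additive:   {total:.3f}")
print(f"compounded: {compounded_haircut(HAIRCUTS):.3f}")
```

Compounding gives a slightly smaller number than summing, so the additive 15% is the more conservative of the two.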
How the Haircut Is Calculated
Data mining bias is estimated with a heuristic motivated by White's Reality Check: the haircut scales with the logarithm of the number of configurations tested. V7 tested approximately 200 parameter configurations during development. Under that scaling, 200 configurations translate to an expected bias of about 7%.
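As a rough sketch of the log scaling, assume the haircut is simply proportional to the natural log of the configuration count. The text gives only the scaling rule and one data point (200 configurations, 7%), so the proportional form and the calibrated constant below are assumptions, not S42's actual formula. White's Reality Check itself is a bootstrap test, which this sketch does not implement.

```python
import math


def data_mining_haircut(n_configs, scale):
    """Haircut proportional to ln(number of configurations tested).

    `scale` is a hypothetical calibration constant, not a value from S42.
    """
    if n_configs < 1:
        raise ValueError("n_configs must be at least 1")
    return scale * math.log(n_configs)


# Calibrate the assumed constant so that 200 configurations gives 7%.
SCALE = 0.07 / math.log(200)

for n in (50, 200, 1000):
    print(n, round(data_mining_haircut(n, SCALE), 4))
```

The point of the log form is that testing ten times as many configurations raises the haircut by a fixed increment rather than multiplying it.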
Multiple testing bias uses a Bonferroni-like correction for the number of independent tests conducted. With 38 features, 6 clusters, and 5 model architectures, the effective number of independent tests is approximately 50, yielding a 5% adjustment.
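For reference, a plain Bonferroni correction divides the family-wise significance level by the number of tests. The sketch below shows only that step. The 0.05 significance level is an assumed convention, unrelated to the 5% haircut, and the text does not say how S42 maps the corrected threshold to that haircut, so the mapping is not reproduced here. Summing the three counts is one simple reading of "approximately 50", also an assumption.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Counts from the text; adding them is an assumed reading of "about 50".
raw_count = 38 + 6 + 5

# Assumed family-wise level of 0.05 with the text's ~50 effective tests.
per_test_alpha = bonferroni_alpha(0.05, 50)
print(raw_count, per_test_alpha)
```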
Implementation shortfall is estimated from the difference between S20's slippage model and zero-slippage results, plus an additional buffer for latency, requoting, and platform-specific issues. The 3% figure is conservative.
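The structure of that estimate, modelled slippage cost plus a buffer, might look like the sketch below. The function name and all input values are hypothetical placeholders, not outputs of S20, and the text does not give the actual split between modelled slippage and buffer.

```python
def implementation_shortfall(zero_slippage_result, slippage_result, buffer):
    """Fractional shortfall: modelled slippage cost plus a flat buffer.

    The first two arguments are backtest totals in the same unit (e.g. R);
    `buffer` is a fraction covering latency, requoting, and platform issues.
    """
    if zero_slippage_result <= 0:
        raise ValueError("zero_slippage_result must be positive")
    modelled = (zero_slippage_result - slippage_result) / zero_slippage_result
    return modelled + buffer


# Placeholder inputs for illustration only: 2% modelled slippage, 1% buffer.
example = implementation_shortfall(100.0, 98.0, buffer=0.01)
print(f"{example:.3f}")
```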
Why 15% Is Acceptable
A 15% haircut on 533.9R means the realistic expected total is approximately 454R over 7.5 years, which is still strongly positive. More importantly, at the haircut-adjusted monthly return of approximately 3.7%, the FTMO challenge target of 10% is reachable in roughly 2.7 months, although not within a 30-day window.
S42 exists to prevent self-deception. It is easy to fall in love with your backtest results and forget that they are the ceiling of expected performance, not the floor. Applying a systematic haircut keeps expectations grounded and ensures the system is built to succeed even under conservative assumptions.
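The arithmetic behind the 454R and 2.7-month figures can be checked with a short sketch. All inputs are the numbers quoted in this section; the compounded variant is an added comparison, since the text's 2.7 months corresponds to simple division.

```python
import math

RAW_TOTAL_R = 533.9      # raw backtest total, from the text
HAIRCUT = 0.15           # combined haircut, from the text
MONTHLY_RETURN = 0.037   # haircut-adjusted monthly return, from the text
TARGET = 0.10            # FTMO challenge profit target, from the text

# Haircut-adjusted total R.
adjusted_total_r = RAW_TOTAL_R * (1 - HAIRCUT)

# Months to reach the target without compounding (the text's 2.7 figure).
months_simple = TARGET / MONTHLY_RETURN

# Months to reach the target if monthly returns compound.
months_compound = math.log(1 + TARGET) / math.log(1 + MONTHLY_RETURN)

print(f"adjusted total: {adjusted_total_r:.1f}R")
print(f"months (simple): {months_simple:.2f}")
print(f"months (compounded): {months_compound:.2f}")
```

Either way the target takes between two and three months at this pace, well beyond a single 30-day window.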