Model Validation | 12 min read | January 15, 2025

Regime-Conditioned Block Bootstrap for Trading Strategy Validation

How to validate trading strategies using Monte Carlo simulation with regime-conditional resampling. Addresses the limitation of naive bootstrap that ignores market regime changes.

Monte Carlo, Bootstrap, Regime Detection, Validation

The Problem with Naive Bootstrap

Standard Monte Carlo validation for trading systems works like this: take your historical trade returns, randomly resample them with replacement, create a synthetic equity curve, repeat 5,000+ times, and analyze the distribution of outcomes. Simple and effective.
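As a rough illustration of that loop, here is a minimal sketch using only the Python standard library. The function names, the sample returns, and the choice to compound returns as fractions of equity are assumptions made for the example, not details of the actual pipeline.

```python
import random

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r          # assumption: returns are fractions of current equity
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def naive_bootstrap(returns, n_sims=5000, seed=0):
    """Resample single trades with replacement; return one max drawdown per simulation."""
    rng = random.Random(seed)
    n = len(returns)
    return [max_drawdown(rng.choices(returns, k=n)) for _ in range(n_sims)]

# Hypothetical per-trade returns, for illustration only.
trades = [0.004, -0.002, 0.003, -0.005, 0.001, 0.002, -0.001, 0.006]
drawdowns = naive_bootstrap(trades, n_sims=1000)
```

Each entry in `drawdowns` is the maximum drawdown of one synthetic equity curve, so the list as a whole is the distribution of outcomes to analyze.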

Except it has a critical flaw. It assumes trades are independent and identically distributed (IID). They are not. A trade taken during a trending market has fundamentally different characteristics than one taken during a mean-reverting market. By randomly shuffling all trades together, you destroy the autocorrelation structure that exists in real trading.

This matters because worst-case scenarios often come from consecutive losses during regime changes. If your bootstrap does not preserve this clustering effect, it underestimates tail risk.

Block Bootstrap to the Rescue (Partially)

Block bootstrap preserves local dependence by resampling contiguous blocks of trades instead of individual trades. If you use blocks of 20 trades, the autocorrelation within each 20-trade block is preserved; only the joins between blocks are randomized.
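A minimal sketch of one common variant, the moving block bootstrap with a fixed block length, might look like this. The function name and the fixed `block_len` are assumptions for illustration; the adaptive block length discussed next is not implemented here.

```python
import random

def block_bootstrap(returns, block_len, rng):
    """Moving block bootstrap: stitch together randomly chosen contiguous blocks."""
    n = len(returns)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(returns)")
    sample = []
    while len(sample) < n:
        start = rng.randrange(n - block_len + 1)   # any start that leaves a full block
        sample.extend(returns[start:start + block_len])
    return sample[:n]                              # trim the final block to the original length

# Using trade indices as stand-in "returns" makes the preserved ordering visible.
synthetic = block_bootstrap(list(range(100)), block_len=20, rng=random.Random(42))
```

Within each run of 20 the original order survives, which is exactly the local dependence a trade-by-trade shuffle throws away.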

The challenge is choosing block size. Too small and you are basically doing naive bootstrap. Too large and you do not have enough distinct blocks for meaningful randomization. I use Politis-Romano optimal block length estimation, which adapts the block length to the dependence in the data.

For V7's 4,505 trades, the optimal block length comes out to about 15-25 trades depending on the subsample.

Adding Regime Conditioning

Here is where it gets interesting. Instead of using fixed blocks, I condition the block selection on market regime. The procedure:

1. Label each trade with the regime it occurred in (using S09 K-Means clusters)

2. Group blocks by regime

3. When resampling, maintain the regime sequence from the original data but randomize WITHIN each regime

This preserves both the local dependence (block bootstrap) AND the regime transition structure. The result is a more realistic simulation of how the system would perform through different market conditions.
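The three steps can be sketched roughly as follows. This is one possible reading of the procedure, not the production code: the function names are hypothetical, every trade is treated as an equally likely block start within its regime, and blocks are truncated at the end of a same-regime run so that no block spans two regimes.

```python
import random
from itertools import groupby

def regime_runs(returns, regimes):
    """Split the trade sequence into consecutive runs that share one regime label."""
    runs, i = [], 0
    for label, group in groupby(regimes):
        length = len(list(group))
        runs.append((label, returns[i:i + length]))
        i += length
    return runs

def regime_block_bootstrap(returns, regimes, block_len, rng):
    """Keep the original regime sequence; refill each run with same-regime blocks."""
    if block_len < 1:
        raise ValueError("block_len must be at least 1")
    runs = regime_runs(returns, regimes)
    starts = {}                                    # regime label -> candidate block starts
    for run_idx, (label, run) in enumerate(runs):
        starts.setdefault(label, []).extend((run_idx, k) for k in range(len(run)))
    sample = []
    for label, run in runs:                        # original order and run lengths are kept
        needed = len(run)
        while needed > 0:
            run_idx, k = rng.choice(starts[label])
            block = runs[run_idx][1][k:k + min(block_len, needed)]
            sample.extend(block)
            needed -= len(block)
    return sample

# Hypothetical data: regime "A" trades are gains, regime "B" trades are losses.
regimes = ["A"] * 6 + ["B"] * 4 + ["A"] * 5
returns = [0.01] * 6 + [-0.01] * 4 + [0.02] * 5
sample = regime_block_bootstrap(returns, regimes, block_len=3, rng=random.Random(7))
```

In the synthetic sequence every position still carries a trade from the regime that occupied that position in the original data, so the timing of regime transitions is unchanged while the trades inside each regime are reshuffled in blocks.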

Implementation Details

The regime labels come from S09 K-Means clustering with 5 regimes: Low-Vol Trending, High-Vol Trending, Low-Vol Range, High-Vol Range, and Transitional. Each trade is tagged with the regime active at entry time.
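Tagging by entry time amounts to a lookup against the times at which the regime changed. A small sketch, assuming the clustering output has already been reduced to a sorted list of change times with one label per change (the function name and the integer timestamps are hypothetical):

```python
from bisect import bisect_right

def regime_at(entry_time, change_times, labels):
    """Return the regime label in force at entry_time.

    change_times[i] is when labels[i] became active; it must be sorted ascending.
    """
    i = bisect_right(change_times, entry_time) - 1
    if i < 0:
        raise ValueError("entry_time precedes the first known regime")
    return labels[i]

# Hypothetical regime history, with bar indices standing in for timestamps.
change_times = [0, 120, 300]
labels = ["Low-Vol Trending", "Transitional", "High-Vol Range"]
trade_entries = [15, 120, 299, 450]
trade_regimes = [regime_at(t, change_times, labels) for t in trade_entries]
```

A trade entered exactly at a change time is assigned to the new regime, which is a convention chosen here rather than something the source specifies.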

For V7's validation, I run 5,000 simulations using regime-conditioned block bootstrap. Key results:

• 95th percentile max drawdown: 6.79%

• 99th percentile max drawdown: 8.10%

• Breach probability (>10% DD): 0.08%

• Mean max drawdown: 3.2%

Compare this to the naive bootstrap, which gave a 0.03% breach probability. The regime-conditioned version identifies MORE risk, which is what we want for conservative estimation.
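For reference, statistics of this kind can be derived from a list of simulated maximum drawdowns along these lines. This is a sketch with made-up inputs; the nearest-rank percentile definition and the function names are assumptions, and the pipeline may use a different percentile convention.

```python
import math

def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of the sample at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(drawdowns, limit=0.10):
    """Mean, tail percentiles, and share of simulations whose max drawdown exceeds limit."""
    n = len(drawdowns)
    return {
        "mean": sum(drawdowns) / n,
        "p95": percentile(drawdowns, 95),
        "p99": percentile(drawdowns, 99),
        "breach_prob": sum(d > limit for d in drawdowns) / n,
    }

# Made-up simulation output: 100 max drawdowns, two of them above the 10% limit.
simulated = [0.02] * 90 + [0.07] * 8 + [0.12] * 2
stats = summarize(simulated)
```

The breach probability is simply the fraction of simulated paths whose maximum drawdown exceeds the limit, so it can only move in steps of one path out of the total simulation count.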

Why This Matters for FTMO

FTMO has a hard 10% drawdown limit. If your Monte Carlo says 0.03% breach probability but uses naive bootstrap, you might be underestimating your actual risk. The regime-conditioned approach gives a more honest 0.08%. Still well below the 5% target, but more trustworthy.

In production risk management, I would rather have a conservative 0.08% estimate that I can trust than an optimistic 0.03% that might be missing tail risks.

Code Architecture

In the validation pipeline, S31 Walk-Forward generates the trade sequence and feeds it into the Monte Carlo module. The regime labels are computed once and cached.

Key design decisions:

• Block length: Adaptive via Politis-Romano (typically 15-25 trades)

• Number of simulations: 5,000 (convergence tested up to 20,000)

• Regime transition preservation: Enforced at block boundaries

• Cost modeling: Spread + slippage included in each simulated trade
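One simple way to reason about the simulation count is the binomial standard error of the breach estimate. The sketch below is an illustration rather than the convergence test the pipeline actually runs; the function name is hypothetical. The only number taken from this article is that 0.08% of 5,000 simulations corresponds to 4 breaching paths.

```python
import math

def breach_estimate(n_breaches, n_sims):
    """Breach probability and its binomial standard error."""
    p = n_breaches / n_sims
    se = math.sqrt(p * (1.0 - p) / n_sims)
    return p, se

# 4 breaching paths out of 5,000 simulations is a 0.08% breach probability.
p_5k, se_5k = breach_estimate(4, 5000)

# Same breach rate at 20,000 simulations, to see how the uncertainty shrinks.
p_20k, se_20k = breach_estimate(16, 20000)
```

At 5,000 simulations the standard error is roughly 0.04 percentage points, and it halves at 20,000, which is one reason to check convergence at larger counts when the event being estimated is this rare.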

The Number I Did Not Want to Report

The naive bootstrap gave 0.03% breach probability. The regime-conditioned version gave 0.08%. The tempting result is 0.03% because it makes the system look better. But the right question is which number more accurately represents reality. The higher number does, because it accounts for the clustering of losses during regime transitions that actually happens in live markets. Choosing the more conservative estimate is not pessimism. It is calibration.
