LSTM Exit Timing: Formulating Exits as Sequence Prediction
How LSTM networks can predict optimal exit timing by learning from trade trajectory sequences.
Why Exits Are Harder Than Entries
There is a saying in trading: "Entries are easy, exits are hard." After four years of building trading systems, I can confirm it is true.
Entry signals are static decisions. Given current market conditions, should I open a position? You have a snapshot of features and predict a binary outcome. Standard classification problem.
Exits are sequential decisions. The optimal exit depends on:
This is a sequence prediction problem, and LSTMs (Long Short-Term Memory networks) are specifically designed for sequence data.
The Trade Trajectory Representation
Each open trade generates a sequence of observations at every bar:
Trade-specific features (10):
Market state features (20):
Total: 30 features per bar, forming a variable-length sequence.
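To make the trajectory idea concrete, here is a minimal sketch of how a few trade-specific features could be computed bar by bar for a long position. The individual features are not listed in this text, so the names `bars_held`, `unrealized_r`, and `mfe_r` are illustrative assumptions, and the definition of `mfe_ratio` used here (current unrealized R divided by the maximum favorable excursion so far) is a guess at what that feature means, not the actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BarFeatures:
    bars_held: int       # bars since entry (assumed feature)
    unrealized_r: float  # open profit in R multiples (assumed feature)
    mfe_r: float         # maximum favorable excursion so far, in R
    mfe_ratio: float     # assumed definition: unrealized_r / mfe_r


def trade_trajectory(entry: float, stop: float, closes: List[float]) -> List[BarFeatures]:
    """Build the per-bar feature sequence for one long trade."""
    risk = entry - stop  # 1R is the distance from entry to the initial stop
    if risk <= 0:
        raise ValueError("stop must be below entry for a long trade")
    rows = []
    mfe_r = 0.0
    for bar, close in enumerate(closes, start=1):
        unrealized_r = (close - entry) / risk
        mfe_r = max(mfe_r, unrealized_r)
        # Share of the best open profit that is still on the table
        mfe_ratio = unrealized_r / mfe_r if mfe_r > 0 else 0.0
        rows.append(BarFeatures(bar, unrealized_r, mfe_r, mfe_ratio))
    return rows


# A trade that reaches 2R, pulls back to 1R, then recovers to 1.5R
for row in trade_trajectory(entry=100.0, stop=98.0, closes=[101.0, 104.0, 102.0, 103.0]):
    print(row)
```

Each trade yields one row per bar, so the sequence length equals the number of bars the trade has been open.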
LSTM Architecture
The L3 exit LSTM uses a relatively simple architecture:
Why 64 hidden units? I tested 32, 64, 128, and 256, and 64 gave the best bias-variance tradeoff. Larger models overfit to the training trade trajectories.
Why 2 layers? A single layer could not capture complex trajectory patterns (like "reached 2R, pulled back to 1R, now recovering to 1.5R"). Three layers showed no improvement over two and trained 40% slower.
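One way to see why larger hidden sizes overfit more easily is to count trainable parameters. The sketch below uses the textbook LSTM formula (four gates, each with input weights, recurrent weights, and one bias vector) for a 2-layer stack on 30 input features. It covers only the recurrent layers, not any output head, and frameworks differ slightly (PyTorch's `nn.LSTM`, for example, keeps two bias vectors per gate), so treat the numbers as approximate.

```python
def lstm_param_count(input_size: int, hidden_size: int, num_layers: int) -> int:
    """Weights and biases in a stacked LSTM with one bias vector per gate."""
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        # 4 gates, each with input weights, recurrent weights, and a bias
        total += 4 * hidden_size * (layer_input + hidden_size + 1)
        layer_input = hidden_size  # deeper layers read the previous layer's output
    return total


for hidden in (32, 64, 128, 256):
    print(f"hidden={hidden:>3}: {lstm_param_count(30, hidden, 2):,} parameters")
```

Under this formula the 64-unit model has 57,344 recurrent parameters and the 128-unit model has 212,992, roughly a fourfold increase per doubling of the hidden size, while the number of training trades stays the same.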
Training Data Construction
The trickiest part of LSTM exit models is constructing training labels. What constitutes an "optimal" exit?
I use hindsight-optimal labeling:
This creates a class imbalance (most bars are HOLD, few are EXIT) which I handle with:
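The exact labeling rule and imbalance handling are not spelled out in this text, so the following is only a sketch of one common variant. It assumes the "optimal" exit is the single bar with the highest unrealized R over the whole trade, and it uses inverse-frequency class weights for the imbalance. Both choices, and the function names, are assumptions made for illustration.

```python
from typing import Dict, List

HOLD, EXIT = 0, 1


def hindsight_labels(unrealized_r: List[float]) -> List[int]:
    """Label EXIT at the bar with the highest unrealized R, HOLD elsewhere."""
    if not unrealized_r:
        return []
    # With ties, max() keeps the earliest bar that reached the peak
    best_bar = max(range(len(unrealized_r)), key=lambda i: unrealized_r[i])
    return [EXIT if i == best_bar else HOLD for i in range(len(unrealized_r))]


def inverse_frequency_weights(labels: List[int]) -> Dict[int, float]:
    """Class weights proportional to 1 / class frequency."""
    counts = {c: labels.count(c) for c in set(labels)}
    n, k = len(labels), len(counts)
    return {c: n / (k * count) for c, count in counts.items()}


labels = hindsight_labels([0.5, 2.0, 1.0, 1.5])
print(labels)                             # [0, 1, 0, 0]
print(inverse_frequency_weights(labels))  # EXIT gets the larger weight
```

With one EXIT bar per trade, the imbalance grows with trade length, which is why some form of reweighting or resampling is needed before training.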
Pre-Training with S22
The S22 LSTM Pre-training module addresses the cold-start problem. Instead of training from random initialization:
Pre-training improved exit timing accuracy by approximately 8% on out-of-sample data. The first layer learns general market dynamics; the second layer specializes in exit prediction.
Integration with Regime Rules
The LSTM output (an exit probability between 0 and 1) does not trigger exits directly. It is combined with regime-specific giveback rules:
Exit triggered when ANY of:
The LSTM handles "smart" exits, recognizing trajectory patterns that suggest the optimal exit point. The rule-based exits handle risk management, cutting losses and preventing excessive giveback.
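The "ANY of" logic can be sketched as a simple OR over independent triggers. The specific conditions and thresholds are not given in this text, so everything below is hypothetical: the regime names, the `ExitRules` fields, and the numbers are placeholders chosen only to show the structure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ExitRules:
    lstm_threshold: float  # hypothetical: exit when LSTM probability >= this
    max_giveback: float    # hypothetical: allowed fraction of peak R given back
    stop_r: float          # hypothetical: hard loss limit in R (negative)


# Hypothetical regime table; real regimes and values would come from the system
RULES = {
    "trend": ExitRules(lstm_threshold=0.7, max_giveback=0.5, stop_r=-1.0),
    "range": ExitRules(lstm_threshold=0.6, max_giveback=0.3, stop_r=-1.0),
}


def should_exit(lstm_prob: float, unrealized_r: float, mfe_r: float,
                rules: ExitRules) -> Tuple[bool, List[str]]:
    """Return (exit?, reasons). Any single trigger is enough to exit."""
    reasons = []
    if lstm_prob >= rules.lstm_threshold:
        reasons.append("lstm")
    if mfe_r > 0 and (mfe_r - unrealized_r) / mfe_r >= rules.max_giveback:
        reasons.append("giveback")
    if unrealized_r <= rules.stop_r:
        reasons.append("stop")
    return bool(reasons), reasons


print(should_exit(0.4, 1.0, 2.0, RULES["trend"]))  # giveback rule fires
print(should_exit(0.4, 1.8, 2.0, RULES["trend"]))  # nothing fires, keep holding
```

Returning the list of reasons alongside the decision makes it easy to audit afterwards how often each component was the one that closed the trade.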
Results
Comparing L3 LSTM exits vs fixed trailing stop exits:
| Metric | LSTM Exit | Fixed Trailing |
|---|---|---|
| Avg R per trade | 0.118R | 0.092R |
| Win Rate | 59.2% | 56.8% |
| Profit Factor | 1.60 | 1.43 |
| Avg Hold Time | 8.3 bars | 11.7 bars |
The LSTM exits faster (8.3 vs 11.7 bars average) while capturing more profit per trade. It is particularly good at identifying trades that have peaked and are about to reverse. The mfe_ratio feature is the strongest predictor of exit timing.
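The table can be read in terms of capital efficiency as well as per-trade profit. The short calculation below derives the relative uplift in average R and a rough "R per bar held" figure from the numbers above. Note that dividing average R by average hold time is a ratio of averages, not a per-trade statistic, so it is only a rough indicator.

```python
lstm = {"avg_r": 0.118, "hold_bars": 8.3}
trailing = {"avg_r": 0.092, "hold_bars": 11.7}


def r_per_bar(metrics: dict) -> float:
    """Average R earned per bar of exposure (ratio of the two averages)."""
    return metrics["avg_r"] / metrics["hold_bars"]


uplift = (lstm["avg_r"] - trailing["avg_r"]) / trailing["avg_r"]
efficiency_ratio = r_per_bar(lstm) / r_per_bar(trailing)

print(f"Relative uplift in avg R: {uplift:.1%}")
print(f"R per bar, LSTM:     {r_per_bar(lstm):.4f}")
print(f"R per bar, trailing: {r_per_bar(trailing):.4f}")
print(f"Efficiency ratio:    {efficiency_ratio:.2f}x")
```

The uplift in average R is about 28%, and because trades are also shorter, the R earned per bar of exposure is roughly 1.8 times that of the fixed trailing stop.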
Practical Considerations
Sequence length limits: I cap sequences at 50 bars. Trades lasting longer than 50 bars are rare and the LSTM does not have enough training data for very long sequences.
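A minimal sketch of the 50-bar cap, assuming that longer trades are truncated to their most recent 50 bars and shorter ones are zero-padded with a mask. Whether long trades are truncated or dropped is not stated here, so this choice, like the function name `pad_or_truncate`, is an assumption.

```python
from typing import List, Tuple

MAX_BARS = 50    # sequence cap from the text
N_FEATURES = 30  # features per bar from the text


def pad_or_truncate(seq: List[List[float]],
                    max_bars: int = MAX_BARS,
                    n_features: int = N_FEATURES) -> Tuple[List[List[float]], List[int]]:
    """Return a fixed-length sequence plus a mask (1 = real bar, 0 = padding)."""
    recent = seq[-max_bars:]  # assumption: keep the most recent bars
    pad = max_bars - len(recent)
    padded = [list(row) for row in recent] + [[0.0] * n_features for _ in range(pad)]
    mask = [1] * len(recent) + [0] * pad
    return padded, mask


short_trade = [[1.0] * N_FEATURES for _ in range(3)]
padded, mask = pad_or_truncate(short_trade)
print(len(padded), sum(mask))  # 50 3
```

The mask lets the training loss ignore padded bars, so short trades are not penalized for predictions on bars that never existed.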
Inference speed: LSTM inference needs to happen every bar for every open trade. With batch processing across positions, this takes approximately 2 ms per bar. That is fast enough for M15 bars but could become a bottleneck for tick-level strategies.
Model staleness: The LSTM is retrained quarterly on the most recent 2 years of trade data. Market dynamics evolve, and the exit patterns that were optimal in 2023 may not be optimal in 2025.
Complementary RL agent (S13): The PPO agent provides a second opinion on exit timing. When LSTM and RL agree, exit confidence is high. When they disagree, the system defaults to the more conservative option (usually exit).
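The agreement logic reduces to a small decision table. The sketch below simplifies "usually exit" to "always exit on disagreement" and uses made-up confidence labels, so it illustrates the idea and does not describe the actual arbitration code.

```python
from typing import Tuple


def combined_exit(lstm_exit: bool, rl_exit: bool) -> Tuple[bool, str]:
    """Combine two exit opinions into (decision, confidence)."""
    if lstm_exit == rl_exit:
        # Both models agree, whether on exiting or on holding
        return lstm_exit, "high"
    # Disagreement: assume exiting is the conservative choice
    return True, "low"


for lstm_vote in (True, False):
    for rl_vote in (True, False):
        print(lstm_vote, rl_vote, combined_exit(lstm_vote, rl_vote))
```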
Does the Complexity Earn Its Place?
The LSTM improved average R per trade from 0.092 to 0.118. That is a real improvement. But it is important to ask: how much of that is the LSTM being smart vs the rule-based exits being bad?
When I tested a well-tuned trailing stop (not the naive fixed one), the gap narrowed significantly. The LSTM still won, but by maybe 0.01R per trade, not 0.026R.
Being honest about effect sizes matters. The LSTM adds value. But most of the exit performance comes from the regime-conditional giveback rules, which are simple and interpretable. Complexity should earn its place, and knowing exactly how much value each component provides is the only way to know if it has.
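The effect-size argument can be written out as a small attribution calculation. The 0.01R figure is described above as approximate, so the implied well-tuned baseline and the percentage split below are rough estimates derived from it, not measured results.

```python
naive_trailing_r = 0.092  # fixed trailing stop, from the results table
lstm_r = 0.118            # LSTM exits, from the results table
lstm_edge_vs_tuned = 0.01 # approximate edge over a well-tuned trailing stop

total_gap = lstm_r - naive_trailing_r
implied_tuned_r = lstm_r - lstm_edge_vs_tuned   # roughly 0.108R
baseline_gap = implied_tuned_r - naive_trailing_r

baseline_share = baseline_gap / total_gap       # explained by a better baseline
lstm_share = lstm_edge_vs_tuned / total_gap     # attributable to the LSTM

print(f"Total gap vs naive stop:   {total_gap:.3f}R")
print(f"Implied tuned baseline:    {implied_tuned_r:.3f}R")
print(f"Share from better tuning:  {baseline_share:.0%}")
print(f"Share from the LSTM:       {lstm_share:.0%}")
```

On these numbers, a bit over 60% of the headline gap disappears once the baseline is tuned properly, which is the reason to benchmark against the strongest simple alternative and not the weakest one.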