Walk-forward analysis: an anti-curve-fitting backtest

Warning · YMYL: This article is purely educational and does not constitute investment advice. Forex trading carries a high risk of capital loss; according to ESMA, 74–89% of retail accounts lose money.

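Anna backtested a strategy over 5 years and got a 75% win rate (WR) and +40% per year. "Holy grail!" Six months of live trading then produced a 50% WR and -8%. A catastrophe. Anna had optimized her parameters on historical data, which in practice meant curve-fitting to noise. Walk-forward analysis is designed to prevent this: optimize on 4 years in-sample (IS), then test on 1 year out-of-sample (OS). OS performance approximates what to expect live. This article presents the anti-overfitting framework.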

The curve-fitting problem

Curve-fit symptoms
Equity curve: too smooth (95% of trades win)
Historical Sharpe: > 4 (rare in reality)
Parameter specificity: MA 14.7 ≠ MA 14.0 (high sensitivity)
Strategy fragility: a 1% parameter shift = -20% performance
Multiple optimization: 100+ parameter combinations tested
Gap: backtest 70% WR vs live 45% WR is typical

Walk-forward mechanics

  1. Divide history: e.g. 6 years, 2018-2023
  2. IS window (in-sample): 4 years, 2018-2021; optimize parameters here
  3. OS window (out-of-sample): 1 year, 2022; test with the parameters fixed
  4. Roll forward: IS 2019-2022, OS 2023
  5. Repeat: aggregate OS results across all rolls
  6. Aggregated OS: approximates live performance
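
The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: windows are whole calendar years, and the `Fold` type and `rolling_folds` function are hypothetical names introduced here, not part of any platform mentioned in this article.

```python
from typing import Iterator, NamedTuple


class Fold(NamedTuple):
    is_start: int  # first in-sample year (inclusive)
    is_end: int    # last in-sample year (inclusive)
    os_start: int  # first out-of-sample year (inclusive)
    os_end: int    # last out-of-sample year (inclusive)


def rolling_folds(first_year: int, last_year: int,
                  is_years: int = 4, os_years: int = 1) -> Iterator[Fold]:
    """Yield rolling walk-forward folds over whole calendar years."""
    is_start = first_year
    # Keep rolling while a full IS window plus a full OS window still fits.
    while is_start + is_years + os_years - 1 <= last_year:
        is_end = is_start + is_years - 1
        yield Fold(is_start, is_end, is_end + 1, is_end + os_years)
        # Step by the OS length so that OS periods never overlap.
        is_start += os_years


folds = list(rolling_folds(2018, 2023))
for f in folds:
    print(f"IS {f.is_start}-{f.is_end} -> OS {f.os_start}-{f.os_end}")
```

With 2018-2023 and a 4-year IS window this yields the two folds listed above. In a real run, each fold would optimize on its IS years and record results only from its OS years.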

Walk-forward efficiency (WFE)

WFE interpretation
WFE = 100%: OS = IS (rare, suspect)
WFE 50-75%: robust strategy ✓
WFE 25-50%: mild curve-fit
WFE < 25%: serious curve-fit ✗
Production-ready: WFE > 50% required
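
As a rough sketch, the ratio and the bands above can be written as two small functions. The function names are hypothetical. Two choices are assumptions made here rather than rules from the table: a WFE of exactly 50% is placed in the lower band (because production use requires more than 50%), and values between 75% and 100% are treated as robust, in line with the trend-following example in this article.

```python
def walk_forward_efficiency(os_return: float, is_return: float) -> float:
    """WFE in percent: OS performance as a share of IS performance."""
    if is_return <= 0:
        raise ValueError("WFE is only meaningful for a positive IS return")
    return 100.0 * os_return / is_return


def interpret_wfe(wfe: float) -> str:
    """Map a WFE percentage onto the article's bands (boundaries are assumptions)."""
    if wfe < 25:
        return "serious curve-fit"
    if wfe <= 50:
        return "mild curve-fit"
    if wfe < 100:
        return "robust"
    return "suspect (OS matches or beats IS)"


breakout = walk_forward_efficiency(os_return=12.0, is_return=30.0)
trend = walk_forward_efficiency(os_return=20.0, is_return=25.0)
print(breakout, interpret_wfe(breakout))
print(trend, interpret_wfe(trend))
```

Annual return is used as the performance measure here because that is what the article's examples use; the same ratio could be computed on another metric, provided IS and OS use the same one.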

Example walk-forward results

EUR/USD breakout strategy:

  • IS 4-year (2018-2021): 70% WR, +30% per year
  • OS 1-year (2022): 55% WR, +12% per year
  • WFE = 12/30 = 40%
  • Mild curve-fit, but acceptable
  • Live expected: ~55% WR, +10% per year

Trend-following strategy:

  • IS 4-year: 60% WR, +25% per year
  • OS 1-year: 58% WR, +20% per year
  • WFE = 20/25 = 80%
  • Robust, production-ready

Rolling vs Anchored

Rolling vs Anchored walk-forward
Rolling (fixed IS): the 4-year IS window shifts forward
Anchored (growing IS): IS grows from a fixed start
Rolling pros: regime-adaptive
Rolling cons: less stable parameters
Anchored pros: more data, robust
Anchored cons: slow to adapt to regime changes
Standard: rolling, for active trading
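
The difference between the two variants comes down to where each IS window starts. The sketch below assumes whole calendar years and uses a hypothetical helper, `make_folds`, introduced only for illustration.

```python
def make_folds(first_year, last_year, is_years=4, os_years=1, anchored=False):
    """Return (is_start, is_end, os_start, os_end) tuples, years inclusive."""
    folds = []
    os_start = first_year + is_years
    while os_start + os_years - 1 <= last_year:
        is_end = os_start - 1
        # Anchored: IS always starts at first_year, so it grows each roll.
        # Rolling: IS keeps a fixed length and slides forward.
        is_start = first_year if anchored else is_end - is_years + 1
        folds.append((is_start, is_end, os_start, os_start + os_years - 1))
        os_start += os_years
    return folds


rolling = make_folds(2018, 2024)
anchored = make_folds(2018, 2024, anchored=True)
for r, a in zip(rolling, anchored):
    print(f"rolling IS {r[0]}-{r[1]} | anchored IS {a[0]}-{a[1]} | OS {r[2]}-{r[3]}")
```

Both variants test on the same OS years; only the amount of history used for optimization differs, which is where the trade-off between adaptivity and parameter stability comes from.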

Tools

  • MetaTrader Strategy Tester: built-in walk-forward option
  • TradeStation EasyLanguage: native feature, an industry standard since the 1990s
  • NinjaTrader: built-in, advanced
  • Python backtrader: free, programmatic, full control
  • R quantstrat: academic standard, free
  • Custom Excel: simple strategies via VBA

Best practices

  1. OS window of at least 20% of IS (4-year IS → 1-year OS minimum)
  2. At least 5 rolling iterations for statistical significance
  3. Parameter stability test: a change of more than 50% per iteration means the strategy is too sensitive
  4. Combinatorially Symmetric Cross-Validation: an advanced, academic anti-curve-fit technique
  5. Monte Carlo overlay: adds confidence intervals to the OS results
"Walk-forward is NOT optional for algo trading. A standard backtest is optimistic, and a 30-50% gap is typical. Walk-forward gives a realistic expectation. WFE > 50% = production-ready. WFE < 25% = throw away."
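
One way to read the "Monte Carlo overlay" item is as a bootstrap: resample the OS trades with replacement many times and look at the spread of the results. The sketch below takes that interpretation, which is an assumption of this example, not something the article specifies. The trade returns are made-up illustrative numbers, and `bootstrap_ci` is a hypothetical helper built only on the Python standard library.

```python
import random
import statistics


def bootstrap_ci(trade_returns, n_resamples=2000, level=0.95, seed=42):
    """Percentile bootstrap confidence interval for the mean trade return."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(trade_returns)
    # Resample the trades with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(trade_returns, k=n))
        for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lo = means[int(tail * n_resamples)]
    hi = means[int((1.0 - tail) * n_resamples) - 1]
    return lo, hi


# Hypothetical per-trade OS returns in percent (illustrative only).
os_trades = [1.2, -0.8, 0.5, 2.1, -1.0, 0.9, -0.4, 1.6, -0.7, 0.3,
             1.1, -1.2, 0.8, 0.6, -0.5, 1.4, -0.9, 0.7, 0.2, -0.3]
low, high = bootstrap_ci(os_trades)
print(f"95% CI for mean OS trade return: {low:.2f}% to {high:.2f}%")
```

If the lower bound sits below zero, the OS edge may not be distinguishable from noise at that confidence level, which is a useful cross-check next to WFE.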

Red flags

Throw-away strategy signals
WFE < 25%: serious curve-fit
Parameters jumping: strategy too sensitive
OS performance: < 30% of IS = unreliable
Negative OS: in multiple iterations = no edge
OS Sharpe: < 0.5 = below baseline
Action: abandon the strategy, do NOT deploy live
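
The signals above lend themselves to a simple automated screen. The sketch below is one possible encoding, with several assumptions: performance is measured as annual return in percent, "parameters jumping" uses the 50% change-per-iteration threshold from the best practices, "multiple iterations" is read as two or more, and the names `parameter_jumps` and `red_flags` are hypothetical.

```python
def parameter_jumps(param_by_fold, max_rel_change=0.5):
    """Indices of folds whose parameter moved more than max_rel_change vs the previous fold."""
    jumps = []
    for i in range(1, len(param_by_fold)):
        prev, cur = param_by_fold[i - 1], param_by_fold[i]
        if prev != 0 and abs(cur - prev) / abs(prev) > max_rel_change:
            jumps.append(i)
    return jumps


def red_flags(is_return, os_returns, os_sharpe, param_by_fold):
    """Return the list of throw-away signals that fire (empty list = none)."""
    flags = []
    avg_os = sum(os_returns) / len(os_returns)
    wfe = 100.0 * avg_os / is_return  # assumes a positive IS return
    if wfe < 25:
        flags.append("WFE < 25%")
    if avg_os < 0.30 * is_return:
        flags.append("OS performance < 30% of IS")
    if sum(1 for r in os_returns if r < 0) >= 2:
        flags.append("negative OS in multiple iterations")
    if os_sharpe < 0.5:
        flags.append("OS Sharpe < 0.5")
    if parameter_jumps(param_by_fold):
        flags.append("parameters jumping")
    return flags


# Hypothetical inputs: one unstable strategy, one stable strategy.
bad = red_flags(30.0, [12.0, -3.0, 8.0, -5.0, 6.0], 0.4, [14, 30, 12, 13, 40])
good = red_flags(28.0, [17.0, 15.0, 19.0, 16.0, 18.0], 1.4, [20, 22, 21, 23, 22])
print(bad)
print(good)
```

An empty list only means that none of these particular checks fired; it does not replace the judgment call on whether the strategy has a logical rationale.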

Production-ready criteria

  • WFE > 50%
  • Parameters stable across iterations
  • Positive across multiple OS periods
  • Low OS drawdown (DD)
  • Reasonable Sharpe (1.0-2.5 OS)
  • Logical strategy rationale (not black-box magic)

Krzysiek's case

Krzysiek's walk-forward success
IS 4-year backtest: 62% WR, +28% per year
OS 1-year: 57% WR, +17% per year
WFE: 17/28 ≈ 60%
Live deploy decision: YES (WFE > 50%)
Live performance: 54% WR, +15% per year
OS predicted live correctly: 15% live vs 17% expected

Conclusions

Walk-forward analysis is a backtesting framework that guards against curve-fitting. It has been a professional standard since the 1990s.

Problem: an ordinary backtest optimizes to historical noise. A 70% WR in the backtest turning into a 45% WR live is typical.

Solution: optimize on the IS window (4 years), test on the OS window (1 year), roll forward, and aggregate.

WFE = OS performance / IS performance. Above 50% = production-ready. Below 25% = curve-fit, abandon.

Anna's case: a 75% WR backtest against a 50% WR live is a curve-fit catastrophe. Walk-forward would have caught it.

Rolling (a fixed IS window that shifts) vs Anchored (a growing IS window). Rolling is the standard for active trading.

Tools: MetaTrader, TradeStation, NinjaTrader, Python backtrader, R quantstrat, custom Excel.

Best practices: OS of at least 20% of IS, at least 5 iterations, parameter stability, Monte Carlo overlay.

Red flags: WFE < 25%, parameters jumping, OS < 30% of IS, negative OS in multiple iterations.

Production criteria: WFE > 50%, stable parameters, positive OS, reasonable Sharpe, logical rationale.

Krzysiek's case: WFE 60%, OS Sharpe 1.4, live Sharpe 1.2. Walk-forward correctly predicted real-world performance.

Related: Monte Carlo simulation as a complement, backtesting practice as a baseline, and the expectancy formula as a baseline metric.

About the author

Jarosław Wasiński

Editor-in-chief of MyBank.pl · Financial and market analyst

Independent analyst and practitioner with over 20 years of experience in the financial sector. Founder and editor-in-chief of the MyBank.pl portal, operating since 2004. Fundamental analysis of currency and macroeconomic markets since 2007.


Frequently asked questions

What is the curve-fitting problem in backtesting?

Curve-fitting is the standard problem with backtests. The mechanics: a trader optimizes parameters (MA periods 14/50, RSI thresholds 30/70, etc.) on historical data from 2018-2023 and finds a "best" combination with an 80% WR and +50% per year. It looks amazing. In reality the parameters fit noise in the historical data, not a real pattern. Live trading in 2024 then delivers a 45% WR and -10% per year, a massive performance gap.

Why it happens: (1) Over-optimization: testing 100+ parameter combinations turns up a spurious "best". (2) Lookahead bias: accidentally using future data. (3) Survivorship bias: backtesting only on pairs that still exist today. (4) Selection bias: picking the strategies that worked and ignoring the ones that failed.

Symptoms of a curve-fit strategy: (a) The backtest equity curve is too smooth (95% of trades win). (b) Historical Sharpe above 4 (rare in reality). (c) Very specific parameters (MA 14.7 vs 14.0 makes a big difference). (d) The strategy reacts badly to parameter shifts (high sensitivity).

Anna's case: her 5-year backtest showed a 75% WR and +40% per year. Six months live gave a 50% WR and -8%. A curve-fit catastrophe. Walk-forward analysis is how to avoid this.

How does walk-forward work?

Walk-forward is an anti-curve-fitting framework. Steps: (1) Divide history: e.g. 6 years of data, 2018-2023. (2) In-sample (IS) window: 4 years, 2018-2021. Optimize parameters here. (3) Out-of-sample (OS) window: 1 year, 2022. Test the fixed parameters without changing them. (4) Roll forward: shift the window to IS 2019-2022, OS 2023. (5) Repeat: aggregate OS performance across all rolls. The aggregated OS result approximates live performance.

Walk-forward efficiency (WFE) is the ratio of OS performance to IS performance. WFE = 100% means OS equals IS (very rare, suspect). WFE 50-75% = robust strategy. WFE 25-50% = mild curve-fit. WFE < 25% = serious curve-fit.

Example strategy: IS 4-year: 70% WR, +30% per year. OS 1-year: 55% WR, +12% per year. WFE = 12/30 = 40%. Mild curve-fit. Live expected: ~55% WR, +10% per year. Confidence: a WFE of 40% is OK, not great. WFE > 50% marks a production-ready strategy.

Anchored vs rolling walk-forward?

There are two walk-forward variants.

Rolling walk-forward: a fixed-length IS window slides forward. The standard is a 4-year IS. Rolls: 2018-2021 IS → 2022 OS, then 2019-2022 IS → 2023 OS. Every iteration uses the same window size. Advantages: parameters adapt to changing market regimes (e.g. low-vol 2017-2019 vs high-vol 2020-2023), so the strategy "learns" the recent regime. Disadvantages: more parameter switching (less stable).

Anchored walk-forward: the IS window grows from a fixed start. 2018-2021 IS → 2022 OS, then 2018-2022 IS → 2023 OS, then 2018-2023 IS → 2024 OS. Every iteration starts from the same beginning and the IS grows. Advantages: more data means more robust parameters, which suits stable strategies. Disadvantages: slower adaptation to regime changes.

Decision framework: (1) Trend-following strategies: rolling preferred (regime-adaptive). (2) Stable mean-reversion: anchored is OK (more data). (3) Limited data, under 5 years: anchored (squeezes the most out of the data). (4) Long history, 10+ years: rolling (regime sensitivity).

Industry standard: rolling walk-forward is the most common, especially for active traders. Anchored is used for institutional long-term strategies.

Tools and best practices?

Walk-forward tools: MetaTrader Strategy Tester: built-in walk-forward option with an easy interface; handles EA strategies, optimization, and automatic OS validation. TradeStation EasyLanguage: walk-forward optimization is a native feature, an industry standard since the 1990s. NinjaTrader: walk-forward built in, advanced. Python backtrader: free and programmatic, with full control but a long initial setup. Custom Excel: possible for simple strategies, using VBA macros. R quantstrat: academic standard, free.

Best practices: (1) OS window of at least 20% of IS: e.g. 4-year IS → 1-year OS minimum. A larger OS gives more confidence. (2) At least 5 rolling iterations: 5+ OS periods for statistical significance. (3) Parameter stability test: parameters changing by more than 50% per iteration mean the strategy is too sensitive. (4) Combinatorially Symmetric Cross-Validation: an advanced, academic anti-curve-fit technique. (5) Monte Carlo overlay: apply Monte Carlo to the OS results to get confidence intervals.

Red flags: WFE < 25%, parameters jumping wildly between iterations, OS performance < 30% of IS.

Production-ready criteria: WFE > 50%, stable parameters, positive results across multiple iterations, low OS drawdown, reasonable Sharpe (1.0-2.5 OS).

Krzysiek's case: his walk-forward test gave a WFE of 60% and an OS Sharpe of 1.4; live performance came in at a Sharpe of 1.2. Walk-forward correctly predicted real-world performance.
