Walk-forward analysis — anti-curve-fitting backtest

Q: What is the curve-fitting problem in backtesting?

Curve-fitting is the standard backtesting pitfall. The mechanics: a trader optimizes parameters (MA periods 14/50, RSI thresholds 30/70, and so on) on 2018-2023 historical data and finds a "best" combination with an 80% win rate (WR) and +50% per year. It looks amazing. In reality the parameters fit the noise in the historical data, not a real pattern. Live in 2024 the result is 45% WR and -10% per year, a massive performance gap. Why it happens: (1) Over-optimization: testing 100+ parameter combinations is bound to turn up a spurious "best". (2) Lookahead bias: accidentally using future data. (3) Survivorship bias: backtesting only pairs that still exist today. (4) Selection bias: keeping the strategies that worked and ignoring the ones that failed. Symptoms of a curve-fit strategy: (a) The backtest equity curve is too smooth (95% of trades win). (b) Historical Sharpe above 4, which is rare in reality. (c) Very specific parameters (MA 14.7 vs. 14.0 makes a big difference). (d) High sensitivity: the strategy breaks under small parameter shifts. Anna's case: a 5-year backtest showed 75% WR and +40% per year; 6 months of live trading delivered 50% WR and -8%. A curve-fit catastrophe. Walk-forward analysis is designed to avoid this.

Q: How does walk-forward analysis work?

Walk-forward is an anti-curve-fitting framework. Steps: (1) Divide the history, e.g. 6 years of data, 2018-2023. (2) In-sample (IS) window: 4 years, 2018-2021. Optimize parameters here. (3) Out-of-sample (OS) window: 1 year, 2022. Test the fixed parameters without changing them. (4) Roll forward: shift the window to IS 2019-2022, OS 2023. (5) Repeat and aggregate OS performance across all rolls. The aggregated OS result approximates live performance. Walk-forward efficiency (WFE) is the ratio of OS performance to IS performance. WFE = 100% means OS equals IS (very rare, treat as suspect). WFE 50-75% = robust strategy. WFE 25-50% = mild curve-fit. WFE < 25% = serious curve-fit. Example: IS over 4 years gives 70% WR and +30% per year; OS over 1 year gives 55% WR and +12% per year. WFE = 12/30 = 40%, a mild curve-fit. Expected live: about 55% WR and +10% per year. A WFE of 40% is OK but not great; WFE > 50% marks a production-ready strategy.

Q: Anchored vs. rolling walk-forward?

There are two walk-forward variants. Rolling walk-forward: a fixed-length IS window slides forward, typically 4 years. Roll: 2018-2021 IS → 2022 OS, then 2019-2022 IS → 2023 OS. Every iteration uses the same window size. Advantages: parameters adapt to changing market regimes (e.g. low-vol 2017-2019 vs. high-vol 2020-2023), so the strategy "learns" the recent regime. Disadvantages: more parameter switching, so less stability. Anchored walk-forward: the IS window grows from a fixed start. 2018-2021 IS → 2022 OS, 2018-2022 IS → 2023 OS, 2018-2023 IS → 2024 OS. Every iteration starts from the same date and the IS window grows. Advantages: more data means more robust parameters, which suits stable strategies. Disadvantages: slower adaptation to regime changes. Decision framework: (1) Trend-following strategies: rolling preferred (regime-adaptive). (2) Stable mean-reversion: anchored is fine (more data). (3) Limited data, under 5 years: anchored (squeezes the most out of the data). (4) Long history, 10+ years: rolling (regime sensitivity). Industry standard: rolling walk-forward is the most common, especially among active traders; anchored is used for long-term institutional strategies.

Q: Which tools and best practices?

Walk-forward tools: MetaTrader Strategy Tester: built-in forward-testing option, easy interface; handles EA strategies, optimization, and automatic OS validation. TradeStation (EasyLanguage): walk-forward optimization is a native feature; an industry standard since the 1990s. NinjaTrader: advanced built-in walk-forward. Python backtrader: free and programmatic, with full control but a long initial setup. Custom Excel: possible for simple strategies using VBA macros. R quantstrat: the academic standard, free. Best practices: (1) OS window at least 20% of IS: e.g. a 4-year IS paired with a 1-year OS (25%). A larger OS gives more confidence. (2) At least 5 rolling iterations: 5+ OS periods are needed for statistical significance. (3) Parameter stability test: if parameters change by more than 50% per iteration, the strategy is too sensitive. (4) Combinatorially Symmetric Cross-Validation: an advanced (academic) anti-curve-fitting method. (5) Monte Carlo overlay: run Monte Carlo on the OS results to get confidence intervals. Red flags: WFE < 25%, parameters jumping wildly between iterations, OS performance below 30% of IS. Production-ready criteria: WFE > 50%, stable parameters, positive results across multiple iterations, low OS drawdown, and a reasonable OS Sharpe (1.0-2.5). Krzysiek's case: WFE of about 60%, OS Sharpe 1.4, live Sharpe 1.2. Walk-forward correctly predicted real-world performance.

Jarosław Wasiński · 17.05.2026 · 6 min read

Warning · YMYL: This article is for educational purposes only and does not constitute an investment recommendation. Forex trading carries a high risk of capital loss; according to ESMA, 74–89% of retail accounts lose money.

Short answer

Walk-forward analysis is an anti-curve-fitting backtesting framework. The problem: an ordinary backtest optimizes parameters on historical data, which yields a strategy with inflated PnL (curve-fit) and dramatically worse live performance. The walk-forward solution: (1) Divide the history, e.g. 6 years. (2) In-sample (4 years): optimize parameters. (3) Out-of-sample (1 year): test the optimized parameters unchanged. (4) Repeat with a rolling window (years 2-5 IS, year 6 OS). Output: out-of-sample performance, which approximates expected live performance. Test: optimizing on 2018-2021 gives 70% WR; out-of-sample 2022 gives 55% WR. The real edge is weaker than it appears. Walk-forward efficiency (WFE) is the OS / IS ratio: WFE > 0.5 indicates a robust strategy, WFE < 0.25 a serious curve-fit. Tools: MetaTrader Strategy Tester, TradeStation, Python backtrader, custom Excel. This is the professional standard.

Anna's 5-year backtest showed 75% WR and +40% per year. "Holy grail!" Then 6 months of live trading delivered 50% WR and -8%. A catastrophe. Anna had optimized her parameters on history, which means she had curve-fit the noise. Walk-forward analysis would have prevented this: 4 years of IS optimization, then a 1-year OS test, with OS performance approximating what to expect live. This article presents that anti-overfitting framework.

The curve-fitting problem

Curve-fit symptoms

Equity curve: too smooth (95% of trades win)
Historical Sharpe: > 4 (rare in reality)
Parameter specificity: MA 14.7 ≠ 14.0 (high sensitivity)
Strategy fragility: a 1% parameter shift = -20% performance
Multiple optimization: 100+ parameter combinations tested
Gap: backtest 70% WR vs. live 45% WR is typical
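
The "multiple optimization" symptom can be illustrated with a minimal sketch. It assumes 100 hypothetical "parameter combinations" whose daily returns are pure noise (zero mean, 1% daily volatility, both invented for illustration), picks the best one in-sample, and then looks at the same candidate out-of-sample. The function name `best_of_noise` is not from any library.

```python
import random
import statistics


def best_of_noise(n_strategies=100, n_days=250, seed=0):
    """Pick the best of many zero-edge 'strategies' in-sample, then score it out-of-sample."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_strategies):
        # Each "parameter combination" is pure noise: zero mean, 1% daily volatility (assumed).
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        candidates.append((sum(in_sample), sum(out_sample)))
    # "Optimization": keep the candidate with the highest in-sample total return.
    best_is, best_os = max(candidates, key=lambda c: c[0])
    mean_is = statistics.mean(c[0] for c in candidates)
    return best_is, best_os, mean_is


best_is, best_os, mean_is = best_of_noise()
print(f"best in-sample: {best_is:+.1%}, same candidate out-of-sample: {best_os:+.1%}")
print(f"average in-sample across all candidates: {mean_is:+.1%}")
```

Because none of the candidates has an edge, the winner's in-sample result is selection luck only, and its out-of-sample result is an independent draw. That is the mechanism behind the backtest-to-live gap described above.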

Walk-forward mechanics

Divide history: e.g. 6 years, 2018-2023
IS window (in-sample): 4 years, 2018-2021; optimize parameters
OS window (out-of-sample): 1 year, 2022; test the fixed parameters
Roll forward: IS 2019-2022, OS 2023
Repeat: aggregate OS across all rolls
Aggregated OS: approximates live performance
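
The window-splitting part of these steps can be sketched as below. This is an illustration under stated assumptions, not a full backtester: windows are whole calendar years, the roll step equals the OS length, and `rolling_windows` is a hypothetical helper name. The optimize and test steps for each window are left out.

```python
def rolling_windows(first_year, last_year, is_years=4, os_years=1):
    """Yield (is_start, is_end, os_start, os_end) tuples of inclusive calendar years."""
    start = first_year
    # Stop once the IS + OS span would run past the available history.
    while start + is_years + os_years - 1 <= last_year:
        is_end = start + is_years - 1
        yield (start, is_end, is_end + 1, is_end + os_years)
        start += os_years  # roll forward by one OS period


windows = list(rolling_windows(2018, 2023))
for is_start, is_end, os_start, os_end in windows:
    print(f"IS {is_start}-{is_end} -> OS {os_start}-{os_end}")
```

With 6 years of history, a 4-year IS and a 1-year OS, this yields exactly the two rolls listed above: IS 2018-2021 with OS 2022, then IS 2019-2022 with OS 2023.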

Walk-forward efficiency (WFE)

WFE interpretation

WFE = 100%: OS = IS (rare, suspect)
WFE 50-75%: robust strategy ✓
WFE 25-50%: mild curve-fit
WFE < 25%: serious curve-fit ✗
Production-ready: WFE > 50% required

Example walk-forward results

EUR/USD breakout strategy:

IS 4-year (2018-2021): 70% WR, +30% per year
OS 1-year (2022): 55% WR, +12% per year
WFE = 12/30 = 40%
Mild curve-fit, but acceptable
Expected live: ~55% WR, +10% per year

Trend-following strategy:

IS 4-year: 60% WR, +25% per year
OS 1-year: 58% WR, +20% per year
WFE = 20/25 = 80%
Robust, production-ready
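
The WFE arithmetic in both examples can be reproduced with a short sketch. It assumes WFE is computed from annualized returns, as in the examples above. The function names and the exact handling of band boundaries (for instance, treating exactly 50% as robust and everything from 50% up to 100% as robust) are assumptions made for illustration.

```python
def walk_forward_efficiency(os_annual_return, is_annual_return):
    """WFE = OS performance / IS performance, here using annualized returns."""
    if is_annual_return <= 0:
        raise ValueError("WFE is only meaningful for a positive in-sample return")
    return os_annual_return / is_annual_return


def classify_wfe(wfe):
    """Map a WFE ratio to the article's bands (boundary handling is an assumption)."""
    if wfe >= 1.0:
        return "suspect (OS matches or beats IS)"
    if wfe >= 0.50:
        return "robust"
    if wfe >= 0.25:
        return "mild curve-fit"
    return "serious curve-fit"


for name, os_ret, is_ret in [("EUR/USD breakout", 12, 30), ("trend-following", 20, 25)]:
    wfe = walk_forward_efficiency(os_ret, is_ret)
    print(f"{name}: WFE = {wfe:.0%} ({classify_wfe(wfe)})")
```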

Rolling vs Anchored

Rolling vs Anchored walk-forward

Rolling (fixed IS): 4-year IS shifts forward
Anchored (growing IS): IS grows from a fixed start
Rolling pros: regime-adaptive
Rolling cons: less stable parameters
Anchored pros: more data, robust
Anchored cons: slow regime adaptation
Standard: rolling, for active trading
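
The only mechanical difference between the two variants is where the IS window starts. A minimal sketch, assuming 1-year OS windows and whole calendar years (the function name `walk_forward_windows` and its parameters are hypothetical):

```python
def walk_forward_windows(first_year, last_year, min_is_years=4, anchored=False):
    """Return ((is_start, is_end), os_year) pairs with 1-year OS windows."""
    windows = []
    for os_year in range(first_year + min_is_years, last_year + 1):
        is_end = os_year - 1
        # Anchored: IS always starts at the first year. Rolling: IS keeps a fixed length.
        is_start = first_year if anchored else os_year - min_is_years
        windows.append(((is_start, is_end), os_year))
    return windows


rolling = walk_forward_windows(2018, 2024)
anchored = walk_forward_windows(2018, 2024, anchored=True)
for (r_is, os_year), (a_is, _) in zip(rolling, anchored):
    print(f"OS {os_year}: rolling IS {r_is[0]}-{r_is[1]}, anchored IS {a_is[0]}-{a_is[1]}")
```

Both variants test on the same OS years; by the last roll the rolling IS covers 2020-2023 while the anchored IS covers 2018-2023.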

Tools

MetaTrader Strategy Tester: built-in forward-testing option
TradeStation (EasyLanguage): native feature, an industry standard since the 1990s
NinjaTrader: advanced built-in walk-forward
Python backtrader: free, programmatic, full control
R quantstrat: academic standard, free
Custom Excel: simple strategies via VBA

Best practices

OS window at least 20% of IS (e.g. a 4-year IS with a 1-year OS is 25%)
At least 5 rolling iterations for statistical significance
Parameter stability test: > 50% change per iteration = too sensitive
Combinatorially Symmetric Cross-Validation: advanced
Monte Carlo overlay: adds confidence intervals to the OS results
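
One simple way to realize the Monte Carlo overlay is a bootstrap over OS trades. The sketch below is one possible approach, not the article's prescribed method: it resamples per-trade OS returns with replacement and reports the 5th, 50th and 95th percentiles of the summed return. The trade list is invented for illustration, and `bootstrap_total_return` is a hypothetical helper name.

```python
import random


def bootstrap_total_return(trade_returns, n_sims=2000, seed=42):
    """Resample OS trades with replacement; return (5th, 50th, 95th) percentile totals."""
    rng = random.Random(seed)
    n = len(trade_returns)
    totals = []
    for _ in range(n_sims):
        sample = [rng.choice(trade_returns) for _ in range(n)]
        totals.append(sum(sample))  # simple sum of per-trade returns, no compounding
    totals.sort()

    def pick(q):
        # Nearest-rank style percentile on the sorted simulation results.
        return totals[int(q * (n_sims - 1))]

    return pick(0.05), pick(0.50), pick(0.95)


# Hypothetical per-trade OS returns (fractions of equity), invented for illustration.
os_trades = [0.02, -0.01, 0.015, -0.01, 0.03, -0.01, 0.01, -0.015, 0.02, -0.01]
low, median, high = bootstrap_total_return(os_trades)
print(f"5th pct: {low:+.1%}, median: {median:+.1%}, 95th pct: {high:+.1%}")
```

A wide gap between the 5th and 95th percentiles suggests the OS result rests on too few trades to be trusted on its own.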

"Walk-forward is not optional for algo trading. A standard backtest is optimistic, with a 30-50% gap being typical. Walk-forward gives a realistic expectation. WFE > 50% = production-ready. WFE < 25% = throw it away."

Red flags

Throw-away strategy signals

WFE < 25%: serious curve-fit
Parameters jumping: strategy too sensitive
OS performance: < 30% of IS = unreliable
Negative OS: in multiple iterations = no edge
OS Sharpe: < 0.5 = below baseline
Action: abandon the strategy, do not deploy live
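
These signals are simple enough to check mechanically. A minimal sketch, assuming that "multiple iterations" means at least two negative OS periods and that a parameter "jumps" when it changes by more than 50% between consecutive iterations (the threshold from the best practices). The function name, argument layout, and sample inputs are hypothetical.

```python
def red_flags(wfe, os_returns, os_sharpe, params_by_iteration):
    """Return a list of throw-away signals for one walk-forward run."""
    flags = []
    if wfe < 0.25:
        flags.append("WFE below 25%")
    elif wfe < 0.30:
        flags.append("OS performance below 30% of IS")
    if sum(1 for r in os_returns if r < 0) >= 2:  # "multiple" taken as two or more
        flags.append("negative OS in multiple iterations")
    if os_sharpe < 0.5:
        flags.append("OS Sharpe below 0.5")
    jumpy = []
    for prev, curr in zip(params_by_iteration, params_by_iteration[1:]):
        for name, old in prev.items():
            # Relative change between consecutive iterations; skip zero-valued parameters.
            change = abs(curr[name] - old) / abs(old) if old else 0.0
            if change > 0.5 and name not in jumpy:
                jumpy.append(name)
    flags.extend(f"parameter '{name}' changed by more than 50%" for name in jumpy)
    return flags


# Invented inputs for illustration only.
weak = red_flags(0.20, [0.05, -0.03, -0.02], 0.3,
                 [{"ma_fast": 14, "ma_slow": 50}, {"ma_fast": 30, "ma_slow": 55}])
print(weak)
```

An empty list does not make a strategy production-ready by itself; it only means none of these particular signals fired.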

Production-ready criteria

WFE > 50%
Parameters stable across iterations
Positive across multiple OS periods
Low OS drawdown
Reasonable OS Sharpe (1.0-2.5)
Logical strategy rationale (not black-box magic)

Krzysiek case

Krzysiek walk-forward success

IS 4-year backtest: 62% WR, +28% per year
OS 1-year: 57% WR, +17% per year
WFE: 17/28 ≈ 61%
Live deploy decision: YES (WFE > 50%)
Live performance: 54% WR, +15% per year
OS vs. live: +15% per year live vs. +17% expected from OS, a close match

Conclusions

Walk-forward analysis is an anti-curve-fitting backtesting framework and has been the professional standard since the 1990s.

Problem: an ordinary backtest optimizes on historical noise. A 70% WR backtest turning into 45% WR live is typical.

Solution: optimize on the IS window (4 years), test on the OS window (1 year), roll forward, aggregate.

WFE = OS / IS performance. Above 50% = production-ready. Below 25% = curve-fit, abandon.

Anna's case: 75% WR in the backtest and 50% live was a curve-fit catastrophe. Walk-forward would have caught it.

Rolling (fixed IS that shifts) vs. anchored (growing IS). Rolling is the standard for active trading.

Tools: MetaTrader, TradeStation, NinjaTrader, Python backtrader, R quantstrat, custom Excel.

Best practices: OS at least 20% of IS, at least 5 iterations, parameter stability, Monte Carlo overlay.

Red flags: WFE < 25%, parameters jumping, OS below 30% of IS, negative OS in multiple iterations.

Production criteria: WFE > 50%, stable parameters, positive OS, reasonable Sharpe, logical rationale.

Krzysiek's case: WFE of about 60%, OS Sharpe 1.4, live Sharpe 1.2. Walk-forward correctly predicted real-world performance.

Related: Monte Carlo (complement), backtesting practice (baseline), expectancy formula (baseline metric).

About the author

Jarosław Wasiński

Editor-in-chief of MyBank.pl · Financial and market analyst

Independent analyst and practitioner with over 20 years of experience in the financial sector. Founder and editor-in-chief of the MyBank.pl portal, operating since 2004. Fundamental analysis of currency and macroeconomic markets since 2007.


