Counterfactual

Definition

In financial AI, a counterfactual is the computed outcome of a decision that was not taken: what a trade would have returned with a different exit, size, stop, or timing, derived from the actual price path that followed the decision.

You can only make one decision, but you can observe many alternative outcomes after the fact. Counterfactual learning is how scarce trading decisions become abundant training signal.

What it is

AlphaGo could play millions of games against itself; financial markets allow no such luxury. Each real decision happens once, produces one outcome, and costs real money. But there is a loophole: once a decision is recorded, the market data that followed tells you exactly what every alternative would have returned. A trader enters BTC at $65,000 and exits at $67,000 for +3%, yet during the trade, price touched $68,500 and dipped to $63,500. That one trade contains at least five exit outcomes: the actual +3%, the high at +5.4%, a stop-out at -2.3%, a 2% trailing stop at +4.2%, a 5% trailing stop at +5.0%.

Two counterfactual metrics anchor the analysis. Maximum Favorable Excursion (MFE) is the best unrealized profit available during the trade: the benchmark for exit optimization. A trade that made 3% but could have made 5.4% teaches something different from a trade that made 3% at its peak. Maximum Adverse Excursion (MAE) is the worst unrealized loss endured: the true risk that was taken. A profitable trade with deep MAE was a high-risk trade that got lucky, and only MAE reveals it.
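As a minimal sketch of the two metrics, assuming a long position and a plain list of prices observed while the trade was open: the function name `excursions`, the `Excursions` container, and the sample path are illustrative and not part of any published schema. The path is hypothetical, chosen so its extremes match the 68,520 high and 63,480 low used in the sample block later on this page.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Excursions:
    mfe_price: float
    mfe_pnl_percent: float
    mae_price: float
    mae_pnl_percent: float


def excursions(entry_price: float, path: Sequence[float]) -> Excursions:
    """MFE and MAE for a long position, from prices seen while it was open."""
    if entry_price <= 0 or not path:
        raise ValueError("need a positive entry price and a non-empty path")

    def pnl_percent(price: float) -> float:
        return (price / entry_price - 1.0) * 100.0

    # Including the entry price keeps MFE >= 0 and MAE <= 0 (an assumption;
    # other conventions measure excursions from the path alone).
    high = max(entry_price, *path)
    low = min(entry_price, *path)
    return Excursions(high, round(pnl_percent(high), 2),
                      low, round(pnl_percent(low), 2))


# Hypothetical checkpoints between a 65,000 entry and a 67,000 exit.
path = [64_200, 63_480, 66_100, 68_520, 67_400, 67_000]
result = excursions(65_000, path)
print(result)
```

For a short position the roles of the high and the low would swap; that case is left out to keep the sketch short.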

The pair carries more information than it first appears. Two trades with identical realized outcomes but different MAE are fundamentally different trades: one was smooth sailing, the other a drawdown roller coaster that happened to end well, and only the excursion data distinguishes them. Across many episodes, the distribution of where MFE typically occurs calibrates realistic take-profit targets, while MAE distributions show whether stops are placed appropriately or so tight they convert recoverable dips into losses.
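One way to read those distributions, sketched under simple assumptions: each episode is reduced to a hypothetical `(mae, mfe, realized)` tuple of percentages, a take-profit level is judged by how often MFE reached it, and a stop level is judged by how many eventually profitable trades it would have closed early. The helper names and the five sample episodes are invented for illustration.

```python
from typing import Sequence, Tuple

Episode = Tuple[float, float, float]  # (mae %, mfe %, realized %)


def take_profit_hit_rate(episodes: Sequence[Episode], target: float) -> float:
    """Fraction of episodes whose MFE reached the take-profit target."""
    return sum(mfe >= target for _, mfe, _ in episodes) / len(episodes)


def stopped_winner_rate(episodes: Sequence[Episode], stop: float) -> float:
    """Among trades that ended in profit, the fraction whose MAE breached
    the stop level (stop is negative, e.g. -2.0 for a 2% stop)."""
    winners = [e for e in episodes if e[2] > 0]
    if not winners:
        return 0.0
    return sum(mae <= stop for mae, _, _ in winners) / len(winners)


# Hypothetical episodes, for illustration only.
episodes = [
    (-2.34, 5.42, 3.08),
    (-0.50, 1.20, 0.90),
    (-3.10, 4.00, 2.50),
    (-1.00, 0.80, -1.00),
    (-0.80, 6.50, 4.10),
]

tp_rate = take_profit_hit_rate(episodes, target=4.0)
cut_rate = stopped_winner_rate(episodes, stop=-2.0)
print(tp_rate, cut_rate)
```

In this toy sample a 4% target was reachable in three of five trades, while a 2% stop would have closed half of the trades that went on to finish in profit, which is the "too tight" pattern described above.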

On top of these sit trailing-stop simulations (what each alternative exit policy would have returned), a timing score that normalizes exit quality (capturing 3.08% of an available 5.42% scores 0.56), and discrete error flags like exited_too_early or held_too_long that name the specific failure mode rather than just declaring an exit suboptimal. Counted up, one decision yields nine or more training signals.
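The timing score can be read as the share of the available move that was actually captured. The sketch below assumes the simplest form, realized P&L divided by the best available P&L and clamped to the range 0 to 1; the text does not spell out the exact normalization or rounding, so treat `timing_score` as a hypothetical helper. The plain ratio for the figures above is about 0.568, in line with the quoted 0.56 up to rounding.

```python
def timing_score(actual_pnl_percent: float, optimal_pnl_percent: float) -> float:
    """Share of the best available P&L that the actual exit captured."""
    if optimal_pnl_percent <= 0:
        # Assumption: with no favorable excursion there was nothing to capture.
        return 0.0
    ratio = actual_pnl_percent / optimal_pnl_percent
    # Clamp so a losing exit scores 0 rather than a negative value.
    return max(0.0, min(1.0, ratio))


score = timing_score(3.08, 5.42)
print(f"{score:.3f}")
```

The error flags are not reproduced here because the thresholds that trigger exited_too_early or held_too_long are not defined in the text.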

Computing all this requires infrastructure: the full price path (not just entry and exit), precise timestamps, slippage modeling, and regular position checkpoints, which is why counterfactuals usually live inside replayable environments and complete decision episodes. Execution realism matters too: a 2% trailing stop rarely fills at exactly 2% below the peak, so honest counterfactuals price in the slippage and costs the alternative would actually have faced.
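A rough sketch of a trailing-stop counterfactual with execution costs, assuming a long position, a path of discrete checkpoint prices, and a flat slippage haircut in basis points. The function `trailing_stop_pnl`, the `slippage_bps` parameter, and the sample path are all assumptions made for illustration; a production model would likely use order-book or volatility-aware slippage.

```python
from typing import Sequence


def trailing_stop_pnl(entry_price: float, path: Sequence[float],
                      trail_percent: float, slippage_bps: float = 0.0) -> float:
    """P&L (%) of a long position exited by a percentage trailing stop.

    The stop trails the running peak. With discrete checkpoints the first
    price seen at or below the stop is the best fill we can justify, so the
    exit uses that observed price (not the stop level) minus slippage.
    If the stop never triggers, the position exits at the last price.
    """
    if entry_price <= 0 or not path:
        raise ValueError("need a positive entry price and a non-empty path")
    haircut = 1.0 - slippage_bps / 10_000.0
    peak = entry_price
    exit_price = path[-1]
    for price in path:
        peak = max(peak, price)
        if price <= peak * (1.0 - trail_percent / 100.0):
            exit_price = price
            break
    return (exit_price * haircut / entry_price - 1.0) * 100.0


# Hypothetical path, entry at 100.
path = [101.0, 104.0, 103.0, 101.5, 102.0]
tight = trailing_stop_pnl(100.0, path, trail_percent=2.0, slippage_bps=10)
wide = trailing_stop_pnl(100.0, path, trail_percent=5.0, slippage_bps=10)
print(round(tight, 4), round(wide, 4))
```

In this example the 2% stop sits at 101.92 after the 104 peak, but the next checkpoint below it is 101.5, so the simulated exit lands about half a percentage point worse than the idealized stop level once the gap and the 10 bps haircut are included.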

Why it matters for financial AI

Financial AI is bottlenecked by data scarcity: quality decision data is expensive to generate and limited in quantity. Counterfactuals multiply the training value of each decision. They feed reward shaping (credit relative to what was achievable, not raw P&L), contrastive learning (actual versus alternative trajectories), error classification, and calibration training on realistic outcome distributions.
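To make "credit relative to what was achievable" concrete, here is one possible shaping, offered as an assumption and not as the method used by any particular system: place the realized P&L within the range spanned by MAE and MFE, so 0 means the trade exited at its worst point and 1 means it exited at its best. The name `achievable_range_reward` is hypothetical.

```python
def achievable_range_reward(actual_pnl_percent: float,
                            mae_pnl_percent: float,
                            mfe_pnl_percent: float) -> float:
    """Position of the realized P&L inside the [MAE, MFE] range, in [0, 1]."""
    span = mfe_pnl_percent - mae_pnl_percent
    if span <= 0:
        # Assumption: a flat path offers no alternatives, so no credit either way.
        return 0.0
    position = (actual_pnl_percent - mae_pnl_percent) / span
    return max(0.0, min(1.0, position))


# Same realized P&L, different opportunity sets.
left_money = achievable_range_reward(3.08, -2.34, 5.42)   # peak was higher
exited_at_peak = achievable_range_reward(3.08, -2.34, 3.08)
print(round(left_money, 3), exited_at_peak)
```

Both trades returned 3.08%, yet the second earns full credit because nothing better was available, which is the distinction raw P&L cannot express.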

They are also the key to separating luck from skill. Combined with a reasoning trace, the range of possible outcomes lets you estimate how much of a result came from decision quality versus variance: a trade in the 73rd percentile of its outcome range with a 68% skill component is legitimately good; one in the 95th percentile with a 20% skill component got lucky. Outcome-only training, like naive backtesting, cannot make that distinction.

Every decision episode in UV Labs' Decision Data ships with a computed counterfactuals block alongside the reasoning and outcome.

Counterfactual block
{
  "counterfactuals": {
    "mfe_price": 68520,
    "mfe_pnl_percent": 5.42,
    "mae_price": 63480,
    "mae_pnl_percent": -2.34,
    "pnl_trailing_2pct": 4.18,
    "pnl_trailing_3pct": 4.82,
    "pnl_trailing_5pct": 5.02,
    "actual_pnl_percent": 3.08,
    "optimal_pnl_percent": 5.42,
    "timing_score": 0.56,
    "held_too_long": false,
    "exited_too_early": true
  }
}
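A block like this can be sanity-checked before it is used for training. The sketch below parses the sample with the standard `json` module and runs a few consistency checks that follow from the definitions on this page; the helper `implied_entry` and the specific checks are assumptions for illustration, not a published validation spec.

```python
import json

raw = """
{
  "counterfactuals": {
    "mfe_price": 68520,
    "mfe_pnl_percent": 5.42,
    "mae_price": 63480,
    "mae_pnl_percent": -2.34,
    "pnl_trailing_2pct": 4.18,
    "pnl_trailing_3pct": 4.82,
    "pnl_trailing_5pct": 5.02,
    "actual_pnl_percent": 3.08,
    "optimal_pnl_percent": 5.42,
    "timing_score": 0.56,
    "held_too_long": false,
    "exited_too_early": true
  }
}
"""
block = json.loads(raw)["counterfactuals"]


def implied_entry(price: float, pnl_percent: float) -> float:
    """Entry price implied by an excursion price and its P&L percentage."""
    return price / (1.0 + pnl_percent / 100.0)


entry_from_mfe = implied_entry(block["mfe_price"], block["mfe_pnl_percent"])
entry_from_mae = implied_entry(block["mae_price"], block["mae_pnl_percent"])

checks = {
    # MFE and MAE should point back to the same entry, up to rounding.
    "entries_agree": abs(entry_from_mfe - entry_from_mae) / entry_from_mfe < 1e-3,
    # The realized result must lie between the worst and best excursions.
    "actual_in_range": block["mae_pnl_percent"]
    <= block["actual_pnl_percent"]
    <= block["mfe_pnl_percent"],
    # In this sample the optimal exit is the MFE itself.
    "optimal_is_mfe": block["optimal_pnl_percent"] == block["mfe_pnl_percent"],
}
print(round(entry_from_mfe), round(entry_from_mae), checks)
```

Both excursions imply an entry of roughly 65,000, matching the worked example earlier on the page.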

Common questions

How are counterfactuals different from backtesting?

A backtest runs a whole strategy across historical data to estimate aggregate performance. A counterfactual evaluates alternatives at a single real decision point, a decision someone actually made, with real reasoning and real stakes. Counterfactuals are anchored to genuine decisions; backtests evaluate hypothetical ones.

How many training signals can one trade produce?

A single trade with counterfactual analysis yields nine or more signals: the actual outcome, the MFE comparison, the MAE analysis, several trailing-stop simulations, a timing score, and discrete error flags such as exited_too_early or held_too_long. Counterfactuals multiply scarce decision data into abundant training signal.

Are counterfactual outcomes simulated?

Not in the sense of a modeled or synthetic market. They are computed from the actual recorded price path after the decision. The working assumption is that the alternative action, at the recorded position size, would not have materially moved the market, so the outcome of a different exit or stop can be replayed directly against real data. Realistic counterfactuals still model slippage and execution costs.

UV Labs builds post-training decision data for financial AI.