Decision Episode
A decision episode is the complete record of a single trading decision: the market context the agent observed, its step-by-step reasoning, the action taken, the outcome that followed, and the counterfactual: what alternative choices would have returned.
Where a trade log records what was bought and sold, a decision episode records why. That difference is what makes it usable as training data for financial AI.
What it is
Most datasets sold as "trading data" are price history plus trade logs: buy here, sell there, made 2.3%. Training a model on that is like training a medical AI on prescription records without the patient's symptoms, the doctor's reasoning, or whether the treatment worked for the right reasons. The decision episode is the unit of data designed to fix this: a single self-contained record that captures everything needed to understand, audit, and learn from one financial decision.
A complete episode has six components. Market context is everything the agent observed at decision time: multi-timeframe candles, pre-computed indicators, volatility regime, and order book state, strictly point-in-time so no future information leaks in. Agent reasoning is the reasoning trace: the explicit thesis, a confidence score, and phased thoughts (analysis, decision, execution) with the tool calls made at each step. Trade execution records the action itself: entry price, position size, leverage, stop loss, take-profit levels. Position journey tracks how the trade evolved through regular checkpoints, including maximum favorable and adverse excursion (MFE/MAE), the best and worst unrealized P&L the position reached. Outcomes capture what actually happened: result, realized P&L, hold duration. And counterfactuals record what would have happened under different choices: alternative exits, trailing stops, different sizing.
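The MFE/MAE definition above can be made concrete with a small Python sketch. It assumes the position's unrealized P&L path is available as a list of percentages; the function name `excursions` and the sample path are illustrative and not part of any published schema. If only coarse checkpoints are available rather than the full path, the result is a lower bound on the true excursion.

```python
def excursions(unrealized_pnl_path):
    """Return (mfe, mae) for a path of unrealized P&L values in percent.

    MFE is the best unrealized P&L the position reached, MAE the worst.
    A position opens at 0, so MFE is never below 0 and MAE never above 0.
    """
    if not unrealized_pnl_path:
        raise ValueError("need at least one P&L observation")
    mfe = max(0.0, max(unrealized_pnl_path))
    mae = min(0.0, min(unrealized_pnl_path))
    return mfe, mae


# Illustrative path: dips slightly, peaks, then gives some back before exit.
path = [0.40, -0.54, 1.38, 1.12, 6.31, 4.72]
mfe, mae = excursions(path)
```

Here the position closes at 4.72% after peaking at 6.31%, which is the kind of gap the counterfactual component is meant to expose.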
In the UV Labs schema this comes to 62 fields across those 6 categories. Each component carries a distinct training signal: reasoning traces enable process supervision, the journey data enables exit optimization, and counterfactuals multiply the number of lessons extracted from one decision. Drop any component and the model is learning from partial information; it pattern-matches what happened without understanding why.
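Since a dropped component leaves the model learning from partial information, a dataset pipeline would likely want a completeness check before an episode is used. The sketch below is one way to do that in Python, assuming episodes are plain dictionaries and that the six top-level keys match the sample record in this article; the helper name `missing_components` is hypothetical.

```python
# Top-level keys taken from the sample episode in this article.
REQUIRED_COMPONENTS = (
    "market_context",
    "agent_reasoning",
    "trade_execution",
    "position_journey",
    "outcome",
    "counterfactuals",
)


def missing_components(episode: dict) -> list[str]:
    """Return the names of components that are absent or empty."""
    return [name for name in REQUIRED_COMPONENTS if not episode.get(name)]


# A trade-log-style record: it has the action and the result, but not the why.
trade_log_like = {
    "market_context": {"regime": "low_vol"},
    "trade_execution": {"entry_price": 131.42},
    "outcome": {"result": "win"},
}
gaps = missing_components(trade_log_like)
```

An empty list means all six components are present. This checks presence only, not the full 62-field schema.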
The deeper point is the difference between knowing what a trading system does and understanding how it thinks. The first alone is of little use for training or debugging. The second is exactly what an episode preserves.
Why it matters for financial AI
Frontier language models learn finance from text: filings, news, textbooks. None of that captures how a real decision is made: the live reasoning, the alternatives weighed and rejected, the sizing logic, the exit plan, and how it all turned out. Post-training on decision episodes is how a model that knows about finance becomes one that can do finance, which is the prerequisite for agentic trading.
Episodes also solve the luck problem. Market outcomes are noisy: a profitable trade can come from flawed reasoning and a losing trade from sound analysis. Because an episode pairs the reasoning with the outcome and the counterfactual range of outcomes, training can reward sound process rather than lucky results, something a P&L column alone can never support.
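One way to picture the luck problem is a two-by-two split of process quality against outcome. The Python sketch below assumes each episode has already been given a `process_score` between 0 and 1 by some reviewer or grader; that score, the 0.5 threshold, and the quadrant labels are all hypothetical and not part of the schema described here.

```python
def quadrant(process_score: float, realized_pnl_percent: float,
             threshold: float = 0.5) -> str:
    """Separate how good the reasoning was from how the trade turned out."""
    sound = process_score >= threshold
    won = realized_pnl_percent > 0
    if sound and won:
        return "sound_win"
    if sound:
        return "sound_loss"   # good process, unlucky outcome
    if won:
        return "lucky_win"    # flawed process, good outcome
    return "flawed_loss"


# A losing trade with strong reasoning and a winning trade with weak reasoning.
labels = [quadrant(0.9, -1.5), quadrant(0.2, 4.72)]
```

A training setup that rewards sound process would treat `sound_loss` as a positive example and `lucky_win` as a negative one, which is the opposite of what a P&L column alone would suggest.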
The hard part is that this data barely exists. Professional traders rarely document their reasoning: they work under time pressure and are incentivized to execute rather than explain. Even diligent journalers rarely capture the full context, such as what was on their screens, which alternatives they considered and rejected, and what the exit plan was. Quant funds hold decision data internally and guard it closely, and there is no standard format for aggregating across sources. Building episodes therefore means building infrastructure that captures the reasoning, the market state, and the execution in one structured record as the decision happens, rather than reconstructing it afterward.
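Capturing a record as the decision happens can be sketched as follows. This is a minimal illustration in Python, not a description of any production system: the function names and the `close_time` and `decided_at` fields are assumptions. The key idea is that the market context is frozen at decision time, so candles that closed later never enter the record.

```python
from datetime import datetime, timezone


def snapshot_context(candles, decision_time):
    """Keep only candles that had already closed at decision time."""
    return [c for c in candles if c["close_time"] <= decision_time]


def capture_episode(candles, reasoning, execution, decision_time):
    """Bundle context, reasoning, and execution into one record."""
    return {
        "decided_at": decision_time.isoformat(),
        "market_context": {"candles": snapshot_context(candles, decision_time)},
        "agent_reasoning": reasoning,
        "trade_execution": execution,
    }


def utc(hour):
    return datetime(2024, 1, 2, hour, tzinfo=timezone.utc)


candles = [
    {"close_time": utc(8), "close": 130.90},
    {"close_time": utc(12), "close": 131.40},
    {"close_time": utc(16), "close": 133.00},  # closes after the decision
]
episode = capture_episode(
    candles,
    reasoning={"explicit_reasoning": "illustrative thesis text"},
    execution={"entry_price": 131.42},
    decision_time=utc(12),
)
```

The 16:00 candle is excluded because it closed after the 12:00 decision, which is the point-in-time property the market context requires.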
UV Labs has captured 500K+ full decision episodes from 1M+ live trades over 3 years in production and licenses them to AI labs as Decision Data.
Example episode (abbreviated):
{
  "symbol": "NVDA",
  "direction": "long",
  "market_context": {
    "regime": "low_vol",
    "rsi_4h": 58.2,
    "spread_bps": 1.2,
    "volatility_regime": "compressing"
  },
  "agent_reasoning": {
    "explicit_reasoning": "NVDA consolidating above 200d MA post-earnings. Vol compressing, order book imbalance at 0.63. Entering long with stop below the consolidation range.",
    "decision_confidence": 0.84,
    "tool_calls": [{ "name": "fetch_technicals", "args": { "tf": "4h" } }]
  },
  "trade_execution": {
    "entry_price": 131.42,
    "position_size_usd": 24800,
    "stop_loss": 126.50,
    "take_profits": [138.00, 142.50]
  },
  "position_journey": [
    { "at": "4h", "pnl": 1.12, "mfe": 1.38, "mae": -0.12 },
    { "at": "48h", "pnl": 4.72, "mfe": 6.31, "mae": -0.54 }
  ],
  "outcome": {
    "result": "win",
    "realized_pnl_percent": 4.72,
    "timing_score": 0.91
  },
  "counterfactuals": [{
    "mfe_pnl_percent": 6.31,
    "trailing_stop_pnl": 5.88,
    "exited_too_early": true
  }]
}
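To show how the outcome and counterfactual fields in the sample episode can be read together, here is a short Python sketch. It assumes `trailing_stop_pnl` is expressed in percent like the other P&L fields, which the sample does not state explicitly; the function name `exit_gaps` and the derived metric names are illustrative.

```python
def exit_gaps(episode: dict) -> dict:
    """Compare the realized exit with the counterfactual exits."""
    realized = episode["outcome"]["realized_pnl_percent"]
    cf = episode["counterfactuals"][0]
    mfe = cf["mfe_pnl_percent"]
    return {
        # Share of the best available unrealized gain that was kept.
        "capture_ratio": round(realized / mfe, 3) if mfe > 0 else None,
        # Extra return a trailing stop would have produced (assumed percent).
        "trailing_stop_gap": round(cf["trailing_stop_pnl"] - realized, 2),
        # Distance between the peak and the actual exit.
        "left_on_table": round(mfe - realized, 2),
    }


# Only the fields needed here, copied from the sample episode.
sample = {
    "outcome": {"realized_pnl_percent": 4.72},
    "counterfactuals": [{
        "mfe_pnl_percent": 6.31,
        "trailing_stop_pnl": 5.88,
        "exited_too_early": True,
    }],
}
gaps = exit_gaps(sample)
# gaps == {"capture_ratio": 0.748, "trailing_stop_gap": 1.16, "left_on_table": 1.59}
```

On these numbers the trade kept about three quarters of its peak gain, which is consistent with the `exited_too_early` flag in the record.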
Common questions
How is a decision episode different from a trade log?
A trade log records what was bought and sold and at what price. A decision episode adds everything around the trade: the market context the agent observed, the reasoning that led to the action, how the position evolved, and what alternative choices would have returned. The log shows what happened; the episode shows why it happened and what the alternatives would have returned.
What fields does a decision episode contain?
In the UV Labs schema, a decision episode contains 62 fields across 6 categories: market context (multi-timeframe candles, indicators, order book state), agent reasoning (explicit reasoning, confidence, phased thoughts with tool calls), trade execution (entry, sizing, stops, targets), position journey (checkpoints with MFE and MAE), outcomes (result, realized P&L, hold duration), and counterfactuals (what alternative exits and stops would have returned).
Why can't financial AI be trained on market data alone?
Market data shows what happened, not what to do about it. It is like teaching poker from hand histories: you learn the rules and some patterns, but not how to think about the game. Models need data that pairs observations with reasoning, actions, and outcomes, which is exactly what a decision episode captures.
UV Labs builds post-training decision data for financial AI.