Reasoning Trace

Definition

A reasoning trace is the step-by-step record of how a decision was made (what data was examined, what options were considered, what tools were called, and why a particular action was chosen), captured at the moment the decision happens.

It is the "show your work" of decision-making: the part of a trade that never appears in price data, and the part a model actually needs to learn from.

What it is

A reasoning trace turns an opaque decision into an inspectable sequence. In a structured trace, each step is tagged with a phase: analysis (what observations matter and why), decision (how observations convert to an actionable conclusion), and execution (how the conclusion becomes specific order parameters). Each step records the tools called along the way, and the trace as a whole carries a self-assessed confidence score for the final call.
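As a rough illustration, the phased structure described above could be modeled as follows. This is a minimal sketch, not a published schema: the class names `TraceStep` and `ReasoningTrace`, the `tools_called` field, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# The three phases named in the text.
PHASES = ("analysis", "decision", "execution")


@dataclass
class TraceStep:
    phase: str                      # one of PHASES
    output: str                     # what this step concluded
    tools_called: List[str] = field(default_factory=list)  # hypothetical tool names

    def __post_init__(self) -> None:
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")


@dataclass
class ReasoningTrace:
    steps: List[TraceStep]
    decision_confidence: float      # self-assessed, between 0.0 and 1.0

    def __post_init__(self) -> None:
        if not 0.0 <= self.decision_confidence <= 1.0:
            raise ValueError("decision_confidence must be in [0, 1]")

    def phases_in_order(self) -> List[str]:
        return [step.phase for step in self.steps]


# Illustrative values only.
trace = ReasoningTrace(
    steps=[
        TraceStep("analysis", "Higher-timeframe structure looks bullish", ["get_candles"]),
        TraceStep("decision", "Entry criteria met", ["get_position_size"]),
        TraceStep("execution", "Limit order placed", ["place_order"]),
    ],
    decision_confidence=0.72,
)
print(trace.phases_in_order())
```

Validating the phase tag and the confidence range at construction time is one way to keep malformed steps out of a dataset before they are ever stored.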

It is related to, but distinct from, chain-of-thought. Chain-of-thought is a prompting technique that elicits step-by-step thinking from a model. A reasoning trace is the recorded artifact (including tool calls and confidence) of a decision that actually executed. A pattern particularly well suited to financial decisions is ReAct (Reasoning and Acting), which interleaves thinking with information gathering: think about what you need to know, call a tool to get it, observe the result, and repeat until ready to act. This mirrors how human traders actually work: rather than analyzing everything upfront, they gather information iteratively, guided by what they have already learned.
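The think, call, observe, repeat cycle can be sketched in a few lines. This is a toy under stated assumptions: the tools `get_trend` and `get_rsi` are hypothetical stubs returning canned values, and `think` is a hard-coded stand-in for what would be a model call in a real agent.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical tools with canned return values, for illustration only.
def get_trend(symbol: str) -> str:
    return "uptrend"


def get_rsi(symbol: str) -> str:
    return "55"


TOOLS: Dict[str, Callable[[str], str]] = {"get_trend": get_trend, "get_rsi": get_rsi}


def think(observations: Dict[str, str]) -> Tuple[str, str]:
    """Toy policy: ask for the first tool not yet consulted, then act."""
    for name in TOOLS:
        if name not in observations:
            return (f"I still need {name}", name)
    return ("I have enough information to act", "act")


def react_loop(symbol: str, max_steps: int = 5) -> List[dict]:
    """Interleave thoughts, tool calls, and observations, recording each step."""
    observations: Dict[str, str] = {}
    trace: List[dict] = []
    for _ in range(max_steps):
        thought, action = think(observations)
        if action == "act":
            trace.append({"thought": thought, "action": "act"})
            break
        result = TOOLS[action](symbol)   # call the tool
        observations[action] = result    # observe the result
        trace.append({"thought": thought, "action": action, "observation": result})
    return trace


trace = react_loop("BTC")
for step in trace:
    print(step)
```

Because every iteration appends to `trace`, the loop produces the recorded artifact as a side effect of running, rather than as a summary written afterwards.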

Capture discipline matters as much as format. A trace reconstructed after the outcome is known is post-hoc rationalization, not reasoning; useful traces are captured in real time and linked to execution, so "I decided X" can never sit beside an order log that shows Y. Free-form text is also a weak training signal; structured phases, explicit tool calls, and confidence scores are what make traces learnable. Finally, not all reasoning is good reasoning: human feedback on trace quality, not just on outcomes, is what filters a raw stream of decisions into a training set worth learning from.
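One way to enforce the link between stated reasoning and execution is a simple consistency check. The sketch below assumes a hypothetical record layout (`TraceRecord`, `OrderLogEntry`, and their fields are invented for illustration) and treats a trace as usable only if it refers to the same order, states the same action, and was captured no later than the fill.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TraceRecord:          # hypothetical layout
    order_id: str
    side: str               # "buy" or "sell", as stated in the reasoning
    captured_at: datetime


@dataclass(frozen=True)
class OrderLogEntry:        # hypothetical layout
    order_id: str
    side: str               # what was actually sent to the venue
    executed_at: datetime


def trace_is_usable(trace: TraceRecord, order: OrderLogEntry) -> bool:
    """Reject traces that contradict the order log or were written after the fill."""
    same_order = trace.order_id == order.order_id
    same_action = trace.side == order.side
    captured_in_time = trace.captured_at <= order.executed_at
    return same_order and same_action and captured_in_time


# Illustrative timestamps only.
decided = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
filled = decided + timedelta(seconds=5)
order = OrderLogEntry("ord-1", "buy", filled)

live = TraceRecord("ord-1", "buy", decided)
reconstructed = TraceRecord("ord-1", "buy", filled + timedelta(hours=6))
contradictory = TraceRecord("ord-1", "sell", decided)

print(trace_is_usable(live, order))
print(trace_is_usable(reconstructed, order))
print(trace_is_usable(contradictory, order))
```

Only the first trace passes. The second was written after the outcome was knowable, and the third says "sell" beside an order log that shows "buy".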

Confidence scoring deserves special mention. A field like decision_confidence: 0.72, paired with outcomes over many episodes, trains calibration: the model learns that when it says 70%, it should be right about 70% of the time, and learns which situations warrant conviction versus hedging.
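To make the calibration idea concrete, the sketch below buckets episodes by stated confidence and compares each bucket's average confidence with its realized hit rate. The function name `calibration_table` and the sample episodes are assumptions made for illustration; a real evaluation would use far more episodes per bucket.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def calibration_table(
    episodes: Iterable[Tuple[float, bool]], n_bins: int = 10
) -> Dict[int, Tuple[float, float, int]]:
    """Map bin index -> (mean stated confidence, realized hit rate, episode count).

    With n_bins=10, bin 7 covers confidences from 0.7 up to (not including) 0.8.
    """
    buckets: Dict[int, List[Tuple[float, bool]]] = defaultdict(list)
    for confidence, correct in episodes:
        index = min(int(confidence * n_bins), n_bins - 1)  # keep 1.0 in the top bin
        buckets[index].append((confidence, correct))

    table: Dict[int, Tuple[float, float, int]] = {}
    for index in sorted(buckets):
        rows = buckets[index]
        mean_confidence = sum(c for c, _ in rows) / len(rows)
        hit_rate = sum(1 for _, ok in rows if ok) / len(rows)
        table[index] = (mean_confidence, hit_rate, len(rows))
    return table


# Made-up episodes: ten calls at 0.72 confidence, seven of which were right.
episodes = [(0.72, True)] * 7 + [(0.72, False)] * 3
table = calibration_table(episodes)
mean_confidence, hit_rate, count = table[7]
print(f"stated {mean_confidence:.2f}, realized {hit_rate:.2f}, n={count}")
```

A well-calibrated agent shows stated and realized values close together in every bucket; a large gap in one bucket points to the situations where its conviction is misplaced.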

Why it matters for financial AI

Here is a trade: BTC long at $65,000, closed at $67,000, roughly +3%. Good or bad? You cannot tell. Maybe it was a textbook breakout entry with disciplined risk management; maybe someone aped in on a tweet. The outcome is identical, the reasoning is opposite, and a model trained on outcomes alone has no idea which behavior to replicate. Behavioral patterns like FOMO entries show meaningfully lower win rates than planned entries, but that difference shows up only in reasoning data, never in the P&L column.

Traces are also what make process supervision possible. OpenAI's "Let's Verify Step by Step" research found that, on math problems, feedback on each reasoning step significantly outperformed feedback on final answers alone. The case for process supervision is arguably even stronger in markets, where noise routinely rewards bad decisions. As a bonus, traces make deployed systems auditable: you can see how the model is thinking, not just what it concluded.
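The contrast between outcome-based and step-based feedback can be shown with a toy scoring example. This is a deliberately simplified sketch, not the method from the research mentioned above: `outcome_reward` and `process_reward` are hypothetical helpers, the step labels stand in for human judgments of each step, and the trades are invented.

```python
from typing import List


def outcome_reward(pnl: float) -> float:
    """Outcome supervision: reward depends only on whether the trade made money."""
    return 1.0 if pnl > 0 else 0.0


def process_reward(step_labels: List[bool]) -> float:
    """Process supervision (simplified): fraction of steps judged sound."""
    if not step_labels:
        return 0.0
    return sum(step_labels) / len(step_labels)


# Invented examples: labels are per-step judgments (analysis, decision, execution).
lucky_bad_trade = {"pnl": 2000.0, "step_labels": [False, False, True]}
unlucky_good_trade = {"pnl": -500.0, "step_labels": [True, True, True]}

for name, trade in [("lucky bad", lucky_bad_trade), ("unlucky good", unlucky_good_trade)]:
    print(
        name,
        "outcome:", outcome_reward(trade["pnl"]),
        "process:", round(process_reward(trade["step_labels"]), 2),
    )
```

Outcome scoring ranks the lucky trade above the disciplined one; step-level scoring reverses that ranking, which is the behavior a training signal needs when noise can reward bad decisions.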

Trained on high-quality traces, models develop capabilities outcome-only training cannot produce: structured analysis that breaks a situation into components, appropriate information seeking (knowing what to look at and what to ignore), consistent frameworks for converting analysis into action, risk awareness woven into every decision, and self-monitoring (recognizing when their own reasoning is solid versus shaky). These are the capabilities that separate usable financial AI from models that can merely talk about finance.

Every decision episode in UV Labs' Decision Data carries a full trace from intent through execution, captured live from agents operating real capital.

Trace (condensed)
{
  "agent_reasoning": {
    "explicit_reasoning": "BTC showing strength above 65k with decreasing sell pressure. 4h RSI resetting from overbought. Looking for continuation to 68k resistance.",
    "decision_confidence": 0.72,
    "thoughts": [
      {
        "phase": "analysis",
        "reasoning_type": "chain-of-thought",
        "output": "Market structure bullish on higher timeframes...",
        "tool_calls": 3
      },
      {
        "phase": "decision",
        "reasoning_type": "react",
        "output": "Entry criteria met. Sizing for 2% portfolio risk...",
        "tool_calls": 1
      },
      {
        "phase": "execution",
        "reasoning_type": "react",
        "output": "Limit order placed at 65,240. Stop at 63,800...",
        "tool_calls": 2
      }
    ]
  }
}
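A trace in this shape is straightforward to load and inspect. The sketch below parses a copy of the condensed trace with Python's standard `json` module and pulls out the phase sequence, the total number of tool calls, and the confidence score. The field names come from the sample above; nothing here implies an official client library.

```python
import json

# A copy of the condensed trace, with the reasoning string on a single line
# so that it is valid JSON.
RAW = """
{
  "agent_reasoning": {
    "explicit_reasoning": "BTC showing strength above 65k with decreasing sell pressure. 4h RSI resetting from overbought. Looking for continuation to 68k resistance.",
    "decision_confidence": 0.72,
    "thoughts": [
      {"phase": "analysis", "reasoning_type": "chain-of-thought",
       "output": "Market structure bullish on higher timeframes...", "tool_calls": 3},
      {"phase": "decision", "reasoning_type": "react",
       "output": "Entry criteria met. Sizing for 2% portfolio risk...", "tool_calls": 1},
      {"phase": "execution", "reasoning_type": "react",
       "output": "Limit order placed at 65,240. Stop at 63,800...", "tool_calls": 2}
    ]
  }
}
"""

reasoning = json.loads(RAW)["agent_reasoning"]
phases = [thought["phase"] for thought in reasoning["thoughts"]]
total_tool_calls = sum(thought["tool_calls"] for thought in reasoning["thoughts"])

print(phases)                            # ['analysis', 'decision', 'execution']
print(total_tool_calls)                  # 6
print(reasoning["decision_confidence"])  # 0.72
```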

Common questions

Is a reasoning trace the same as chain-of-thought?

They are related but not the same. Chain-of-thought is a prompting technique that asks a model to think step by step before answering. A reasoning trace is the recorded artifact of a real decision process: the phased analysis, the tool calls, the confidence scores, and the link to the action actually taken. A trace can contain chain-of-thought reasoning, but it is captured data, not a prompting trick.

Why do reasoning traces matter more in trading than in math?

In math, a wrong answer is clearly wrong, so outcome checking works. In markets, a bad decision can produce a profit through luck and a good decision can lose money through variance. The reasoning trace is the only way to tell those cases apart with limited samples; you evaluate whether the process was sound, regardless of how this particular outcome landed.
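A small calculation shows how easily luck hides a poor process in a short track record. The sketch assumes a hypothetical strategy with a 45% win rate, independent trades, and equal-size wins and losses, so it loses money in expectation; all of these figures are invented for illustration.

```python
from math import comb


def prob_net_profitable(win_rate: float, n_trades: int) -> float:
    """Probability that wins outnumber losses over n_trades.

    Assumes independent trades with equal-size wins and losses, so being
    net profitable means winning more than half of the trades.
    """
    min_wins = n_trades // 2 + 1
    return sum(
        comb(n_trades, k) * win_rate**k * (1 - win_rate) ** (n_trades - k)
        for k in range(min_wins, n_trades + 1)
    )


p_small = prob_net_profitable(0.45, 10)    # short track record
p_large = prob_net_profitable(0.45, 200)   # long track record

print(f"10 trades:  {p_small:.1%}")
print(f"200 trades: {p_large:.1%}")
```

Under these assumptions the losing strategy still finishes ahead in about 26% of 10-trade samples, and that share falls below 10% only once the sample grows to 200 trades. Outcomes alone need many observations to expose a bad process, whereas a trace can be judged on each decision.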

Can reasoning traces be reconstructed after the fact?

Not reliably. Post-hoc rationalization is not the same as real reasoning; people and models both invent tidy explanations after the outcome is known. A useful trace is captured as the decision happens and is linked to the actual execution, so the stated reasoning and the action taken cannot drift apart.

UV Labs builds post-training decision data for financial AI.