UV Labs

Train Models That Transact

Financial AI
Post-Training
Decision Data

Your models need financial reasoning, not just financial facts. UV captures complete decision episodes so your models learn how expert agents think and trade.

The Data

01

Replayable Episodes

Every episode preserves exact market conditions, order book depth, and agent context. Train a model, replay the episode, measure improvement. Same inputs, better outputs.

02

Full Reasoning Traces

Not just what happened, but how and why. Chain-of-thought reasoning, multi-timeframe data, order book state, sentiment, and every tool call. Full supervision for full decisions.

03

Outcomes + Counterfactuals

Real P&L from real trades. Verified results plus counterfactual analysis: optimal exits, alternative strategies, luck-vs-skill breakdown. What happened, and what should have.

Not Just Data. A Full RL Environment.

Stream episodes for training or plug directly into our environment via API. Replay historical decisions, run your agents against real market conditions, test strategies before deployment.

Streaming API · Episode Replay · Live Testing · Python SDK
Data Architecture

Episode Structure

Each episode captures the complete decision lifecycle, from market state through reasoning to verified outcome.

01

Market Context

18 fields
Price Data: 15m · 1h · 4h · 1d candles
Order Book: spread_bps · imbalance_ratio · bid_depth · institutional_activity
Indicators: RSI · MACD · ATR · volatility_regime
02

Agent Reasoning

12 fields
Decision: explicit_reasoning · decision_confidence
Thought Process: phase · reasoning_type · tool_calls
Graph State: checkpoints · messages · tool_results
03

Trade Execution

8 fields
Position: entry_price · position_size_usd · leverage
Risk Management: stop_loss · take_profits[]
Metadata: exchange · symbol · direction
04

Position Journey

8 checkpoints
Time Series: 1h · 4h · 8h · 24h · 48h · 168h
Per Checkpoint: price · pnl_percent · mfe · mae
05

Outcome

6 fields
Result: win/loss/breakeven · realized_pnl · realized_pnl_percent
Exit Details: exit_price · hold_duration · closed_at
06

Counterfactuals

10 fields
Optimal Exits: mfe_price · mfe_pnl_percent · optimal_pnl
Strategy Eval: trailing_2pct · trailing_5pct · timing_score
Error Flags: held_too_long · exited_too_early
Every episode contains human intent · Select episodes include explicit feedback
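The six sections above can be pictured as one nested record per episode. A minimal sketch in Python follows; the field names mirror the labels above, but the exact nesting and all values are illustrative, not the actual export schema:

```python
# Illustrative episode record mirroring the six sections above.
# Field names follow the section labels; all values are made up.
episode = {
    "market_context": {
        "candles": {"15m": [], "1h": [], "4h": [], "1d": []},
        "order_book": {"spread_bps": 2.1, "imbalance_ratio": 0.6,
                       "bid_depth": 1_250_000, "institutional_activity": 0.3},
        "indicators": {"RSI": 41.2, "MACD": -0.8, "ATR": 310.0,
                       "volatility_regime": "normal"},
    },
    "agent_reasoning": {
        "explicit_reasoning": "RSI oversold on 4h; entering long.",
        "decision_confidence": 0.72,
        "tool_calls": ["fetch_order_book", "fetch_sentiment"],
    },
    "trade_execution": {
        "entry_price": 64_150.0, "position_size_usd": 10_000.0, "leverage": 2,
        "stop_loss": 62_900.0, "take_profits": [65_400.0, 66_800.0],
        "exchange": "binance", "symbol": "BTC-USDT", "direction": "long",
    },
    "position_journey": [  # one entry per checkpoint (1h, 4h, 8h, ...)
        {"t": "1h", "price": 64_500.0, "pnl_percent": 0.55,
         "mfe": 0.6, "mae": -0.2},
    ],
    "outcome": {"result": "win", "realized_pnl": 390.0,
                "realized_pnl_percent": 3.9, "exit_price": 65_400.0},
    "counterfactuals": {"mfe_pnl_percent": 5.1, "timing_score": 0.77,
                        "held_too_long": False, "exited_too_early": True},
}
```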
Trainable Capabilities

Train Real-World Financial Capabilities

Train your models to transact with confidence. From payments to portfolio management, UV delivers the capabilities AI needs to operate as a trusted financial actor.

Reasoning · Risk · Timing · Execution · Sentiment · Learning

Verification Depth

Full Data Access · Annual -20%
01
Core
$4,000
  • Full decision sequences
  • Multi-timeframe context
  • Outcomes with P&L
  • Streaming + batch API
Get Started
02
Verified
$12,000
  • Everything in Core
  • Verified against prices
  • Counterfactual analysis
  • Position journey data
Get Started
04
Institutional
Custom
  • Everything in Audited
  • Custom filtering + assets
  • Early access to new data
  • Dedicated support
Contact Sales

Academic & Research Program

We offer flexible arrangements for universities, research labs, and individual researchers. Let's discuss how UV can support your work in financial AI.

PhD Students · Research Labs · Universities · Consortiums
Schedule a Conversation · Significant discounts available

FAQ

Data · RL · Integration · Research
01 Data

What makes UV different from market data?

Market data shows what happened. UV shows how decisions were made: complete reasoning traces, tool calls, and verified outcomes. It's the difference between price feeds and decision supervision.

02 Data

What format is the data in?

Episodes are delivered as structured JSON via streaming API or batch export. Each episode includes market state, reasoning trace, action taken, and outcome. Compatible with standard ML pipelines.
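As a rough illustration of consuming a batch export: one JSON object per line (JSONL) is a common layout for episode dumps, though the actual UV export layout and field names may differ. Everything below is a stand-in:

```python
import json

# Pretend batch export: one episode object per line (JSONL).
# The keys here are illustrative, not the documented schema.
raw = "\n".join([
    json.dumps({"market_state": {"symbol": "BTC-USDT"},
                "reasoning": {"confidence": 0.72},
                "action": {"direction": "long"},
                "outcome": {"realized_pnl_percent": 3.9}}),
    json.dumps({"market_state": {"symbol": "ETH-USDT"},
                "reasoning": {"confidence": 0.55},
                "action": {"direction": "short"},
                "outcome": {"realized_pnl_percent": -1.2}}),
])

# Parse line by line, then filter however the pipeline needs.
episodes = [json.loads(line) for line in raw.splitlines() if line]
wins = [e for e in episodes if e["outcome"]["realized_pnl_percent"] > 0]
```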

03 Data

Who are the agents generating this data?

A mix of human traders and AI agents with varying strategies and skill levels. We include the full distribution (wins, losses, and breakevens) to avoid survivorship bias in your training data.

04 RL

Is this suitable for offline RL?

Yes. Episodes contain full state-action-reward-next_state tuples. The environment also supports online interaction for policy evaluation and live testing against real market conditions.
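One way to use this for offline RL is to flatten each episode into per-checkpoint (state, action, reward, next_state) transitions. The helper below is a sketch under assumed field names (`entry_state`, `checkpoints`, `pnl_percent`), with per-step reward taken as the change in unrealized P&L percent; none of this is the SDK's actual API:

```python
def episode_to_transitions(episode):
    """Flatten one episode into (s, a, r, s') tuples, one per checkpoint.
    Reward is the step-to-step change in unrealized P&L percent."""
    states = [episode["entry_state"]] + episode["checkpoints"]
    action = episode["action"]
    transitions = []
    prev_pnl = episode["entry_state"]["pnl_percent"]
    for state, next_state in zip(states, states[1:]):
        reward = next_state["pnl_percent"] - prev_pnl
        prev_pnl = next_state["pnl_percent"]
        transitions.append((state, action, reward, next_state))
    return transitions

# Tiny made-up episode: long position checked at two checkpoints.
demo = {
    "entry_state": {"pnl_percent": 0.0, "price": 100.0},
    "checkpoints": [{"pnl_percent": 1.0, "price": 101.0},
                    {"pnl_percent": 3.9, "price": 103.9}],
    "action": {"direction": "long"},
}
transitions = episode_to_transitions(demo)  # rewards: 1.0, then 2.9
```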

05 RL

How do you handle lookahead bias?

All market context is point-in-time. State snapshots reflect only information available at decision time. Outcomes and counterfactuals are computed post-hoc and clearly separated.
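The same discipline applies if you derive extra features yourself: use only bars fully closed at or before the decision timestamp. A small sketch (the timestamps and field names are illustrative):

```python
def point_in_time(candles, decision_ts):
    """Keep only candles fully closed at or before the decision time,
    so features derived from them cannot leak future information."""
    return [c for c in candles if c["close_ts"] <= decision_ts]

# Made-up candles; the third closes after the decision and must be dropped.
candles = [{"close_ts": 100, "close": 10.0},
           {"close_ts": 200, "close": 11.0},
           {"close_ts": 300, "close": 9.5}]
visible = point_in_time(candles, decision_ts=250)
```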

06 RL

What's the reward signal?

Realized P&L is the ground truth. Higher tiers include shaped rewards: risk-adjusted returns, timing scores, and counterfactual comparisons. You can also define custom reward functions via the API.
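As a sketch of what reward shaping on top of realized P&L might look like: the blend below mixes realized P&L percent with a timing score. The weighting scheme and field names are assumptions for illustration, not how UV's shaped rewards are computed:

```python
def shaped_reward(outcome, counterfactuals, w_timing=0.3):
    """Blend realized P&L percent with a timing score in [0, 1].
    Purely illustrative weighting; a real setup would tune or
    replace this via a custom reward function."""
    base = outcome["realized_pnl_percent"]
    timing = counterfactuals.get("timing_score", 0.0)
    return (1 - w_timing) * base + w_timing * timing

# Made-up winning trade with decent but imperfect exit timing.
reward = shaped_reward({"realized_pnl_percent": 3.9},
                       {"timing_score": 0.77})
```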

07 Integration

Is the environment Gym-compatible?

The Python SDK provides a Gym-style interface for episode replay and live interaction. Standard observation and action spaces with configurable wrappers for your architecture.
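A Gym-style replay loop could look like the following. `ReplayEnv` is a hypothetical stand-in written for this sketch, not the actual SDK class; it replays pre-recorded steps with the classic `reset`/`step` 4-tuple interface:

```python
class ReplayEnv:
    """Minimal Gym-style wrapper that replays recorded episode steps.
    Illustrative only; the real SDK interface may differ."""

    def __init__(self, steps):
        self.steps = steps  # list of (observation, reward) pairs
        self.i = 0

    def reset(self):
        self.i = 0
        obs, _ = self.steps[0]
        return obs

    def step(self, action):
        # During replay the recorded trajectory is fixed, so the
        # action is accepted but does not change what comes next.
        self.i += 1
        obs, reward = self.steps[self.i]
        done = self.i == len(self.steps) - 1
        return obs, reward, done, {}

# Replay a tiny made-up trajectory and accumulate reward.
env = ReplayEnv([({"price": 100.0}, 0.0),
                 ({"price": 101.0}, 1.0),
                 ({"price": 103.9}, 2.9)])
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, reward, done, info = env.step({"hold": True})
    total += reward
```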

08 Research

Can I publish research using this data?

Yes. Academic and research use is encouraged. We provide anonymized datasets and can discuss data licensing for publications. See our Academic Program for discounted access.

Get Started
Ready to train models that transact?

Get access to decision sequences that teach your models how to reason about markets and make decisions that work.

Get in Touch

Let's Talk

Choose your preferred way to reach us