Your models need financial reasoning, not just financial facts. UV captures complete decision episodes so your models learn how expert agents think and trade.
Every episode preserves exact market conditions, order book depth, and agent context. Train a model, replay the episode, measure improvement. Same inputs, better outputs.
Not just what happened, but how and why. Chain-of-thought reasoning, multi-timeframe data, order book state, sentiment, and every tool call. Full supervision for full decisions.
Real P&L from real trades. Verified results plus counterfactual analysis: optimal exits, alternative strategies, luck-vs-skill breakdown. What happened and what should have.
Stream episodes for training or plug directly into our environment via API. Replay historical decisions, run your agents against real market conditions, test strategies before deployment.
Each episode captures the complete decision lifecycle, from market state through reasoning to verified outcome.
Train your models to transact with confidence. From payments to portfolio management, UV delivers the capabilities AI needs to operate as a trusted financial actor.
We offer flexible arrangements for universities, research labs, and individual researchers. Let's discuss how UV can support your work in financial AI.
Market data shows what happened. UV shows how decisions were made: complete reasoning traces, tool calls, and verified outcomes. It's the difference between price feeds and decision supervision.
Episodes are delivered as structured JSON via streaming API or batch export. Each episode includes market state, reasoning trace, action taken, and outcome. Compatible with standard ML pipelines.
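A hypothetical episode record might look like the sketch below. The field names (`episode_id`, `market_state`, `reasoning_trace`, and so on) are illustrative assumptions drawn from the description above, not the actual UV schema.

```python
import json

# Illustrative episode record -- field names are assumptions, not UV's schema.
episode_json = """
{
  "episode_id": "ep-0001",
  "market_state": {"symbol": "XYZ", "mid_price": 101.25, "book_depth": 5},
  "reasoning_trace": ["checked momentum", "sized position at 1% of equity"],
  "action": {"type": "limit_buy", "price": 101.20, "qty": 100},
  "outcome": {"realized_pnl": 42.0, "holding_period_s": 900}
}
"""

episode = json.loads(episode_json)
print(episode["action"]["type"])           # limit_buy
print(episode["outcome"]["realized_pnl"])  # 42.0
```

Because each record is plain JSON, it drops straight into standard tooling: stream records into a dataloader, or batch-load them into a dataframe for analysis.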
A mix of human traders and AI agents with varying strategies and skill levels. We include the full distribution (wins, losses, and breakevens) to avoid survivorship bias in your training data.
Yes. Episodes contain full state-action-reward-next_state tuples. The environment also supports online interaction for policy evaluation and live testing against real market conditions.
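As a sketch of what offline consumption of those tuples looks like, the snippet below defines a minimal transition record and computes a discounted return over one episode. The `Transition` class and `replay_return` helper are illustrative, not part of the UV SDK.

```python
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Transition:
    """One state-action-reward-next_state tuple from an episode (illustrative)."""
    state: Any
    action: Any
    reward: float
    next_state: Any

def replay_return(transitions: List[Transition], gamma: float = 0.99) -> float:
    """Discounted return G = r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    g = 0.0
    for t in reversed(transitions):
        g = t.reward + gamma * g
    return g

episode = [
    Transition({"t": 0}, "buy",  1.0, {"t": 1}),
    Transition({"t": 1}, "hold", 0.0, {"t": 2}),
    Transition({"t": 2}, "sell", 2.0, {"t": 3}),
]
print(replay_return(episode, gamma=1.0))  # 3.0
```

The same tuples feed offline RL (behavior cloning, CQL, etc.) directly; the online environment is only needed when you want fresh rollouts.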
All market context is point-in-time. State snapshots reflect only information available at decision time. Outcomes and counterfactuals are computed post-hoc and clearly separated.
Realized P&L is the ground truth. Higher tiers include shaped rewards: risk-adjusted returns, timing scores, and counterfactual comparisons. You can also define custom reward functions via the API.
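A custom reward function might combine realized P&L with a risk penalty, in the spirit of the shaped rewards described above. This is a hedged sketch: `shaped_reward` and its parameters are invented for illustration and do not reflect the actual API.

```python
import statistics

def shaped_reward(realized_pnl: float,
                  step_returns: list,
                  risk_penalty: float = 0.5) -> float:
    """Risk-adjusted reward (illustrative): realized P&L minus a penalty
    proportional to the volatility of per-step returns along the path."""
    vol = statistics.pstdev(step_returns) if len(step_returns) > 1 else 0.0
    return realized_pnl - risk_penalty * vol

# A smooth path keeps the full P&L; a volatile path is penalized.
print(shaped_reward(10.0, [1.0, 1.0, 1.0]))   # 10.0
print(shaped_reward(10.0, [2.0, -2.0]))       # 9.0
```

Shaping like this keeps realized P&L as the anchor while letting you discourage strategies that reach the same endpoint through riskier paths.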
The Python SDK provides a Gym-style interface for episode replay and live interaction. Standard observation and action spaces with configurable wrappers for your architecture.
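A Gym-style replay loop typically looks like the skeleton below. This is a self-contained sketch of the interface shape (`reset`/`step` returning observation, reward, done, info), not the UV SDK itself; the `ReplayEnv` class and its transition format are assumptions.

```python
class ReplayEnv:
    """Illustrative Gym-style episode-replay environment (a sketch, not
    the UV SDK). Steps through pre-recorded transitions; the agent's
    action is accepted, but the logged outcome is what gets replayed."""

    def __init__(self, transitions):
        self.transitions = transitions
        self._i = 0

    def reset(self):
        self._i = 0
        return self.transitions[0]["state"]

    def step(self, action):
        t = self.transitions[self._i]
        self._i += 1
        done = self._i >= len(self.transitions)
        info = {"logged_action": t["action"]}  # expert action, for comparison
        return t["next_state"], t["reward"], done, info

env = ReplayEnv([
    {"state": 0, "action": "buy",  "reward": 1.0,  "next_state": 1},
    {"state": 1, "action": "sell", "reward": -0.5, "next_state": 2},
])
obs, total, done = env.reset(), 0.0, False
while not done:
    obs, reward, done, info = env.step("hold")  # stub policy
    total += reward
print(total)  # 0.5
```

Exposing the logged expert action in `info` is a common pattern for replay environments: it lets you score your policy's choices against the recorded decision at every step.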
Yes. Academic and research use is encouraged. We provide anonymized datasets and can discuss data licensing for publications. See our Academic Program for discounted access.
Get access to decision episodes that teach your models how to reason about markets and act on that reasoning.