Technical Deep-Dive

Building AI That Can Transact: The Infrastructure Challenge

March 2025 · 13 min read

TL;DR

The gap between "AI that can discuss markets" and "AI that can reliably execute orders" is enormous. The model is the smaller challenge; execution infrastructure is where systems succeed or fail.

Knight Capital lost $440M in 45 minutes from an infrastructure failure, not a bad strategy; execution has at least seven failure points between intention and state update
Training on idealized execution fails in production; data must capture real slippage, partial fills, latency, and API errors
Every execution teaches something: slippage patterns, fill probabilities, timing effects, and error frequencies that can't be described, only learned from data

On August 1, 2012, Knight Capital's trading systems executed 4 million trades in 45 minutes, losing the firm $440 million. The cause wasn't a bad trading strategy. It was an infrastructure failure: old code accidentally activated on one of eight servers, sending erroneous orders at machine speed.

Knight had some of the most sophisticated trading algorithms on Wall Street. The failure was in execution, the unglamorous plumbing that connects decisions to markets. Getting AI to think about trades is one problem. Getting it to actually execute them reliably is a different, harder problem.

Most AI research focuses on model architecture and training data. But the gap between "AI that can discuss markets" and "AI that can reliably place orders" is enormous. Closing it requires infrastructure that most research doesn't even consider.

The Execution Problem

The path from intention to completed transaction involves many steps, each with potential failure modes:

Intention: AI decides "I want to go long BTC"
Specification: What size? What price? What conditions?
Tool selection: Which API? Which order type?
Parameter formation: Correct formats, units, precision
Execution: API call, network, authentication
Confirmation: Did it fill? At what price? Partial or complete?
State update: Update position tracking, available capital, risk metrics

Current LLMs struggle at multiple points in this chain. They'll specify impossible parameters, call functions that don't exist, or fail to handle partial fills correctly.
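
The parameter formation step is one place where "impossible parameters" can be caught before they reach a venue. The following is a minimal sketch, assuming a venue that publishes a tick size, a lot size, and a minimum quantity; the function names and the rule set are hypothetical, and real venues often impose additional constraints such as minimum notional value.

```python
from decimal import Decimal, ROUND_DOWN


def quantize_down(value: Decimal, step: Decimal) -> Decimal:
    """Round value toward zero to the nearest multiple of step."""
    return (value / step).to_integral_value(rounding=ROUND_DOWN) * step


def form_order_params(price: str, qty: str, tick_size: str,
                      lot_size: str, min_qty: str) -> dict:
    """Snap a requested price and quantity onto the venue's grid.

    tick_size, lot_size, and min_qty are hypothetical inputs; a real
    system would load them from the venue's instrument metadata.
    """
    snapped_price = quantize_down(Decimal(price), Decimal(tick_size))
    snapped_qty = quantize_down(Decimal(qty), Decimal(lot_size))
    if snapped_price <= 0:
        raise ValueError(f"price {price} is not positive after rounding")
    if snapped_qty < Decimal(min_qty):
        raise ValueError(f"quantity {qty} is below the minimum {min_qty}")
    # Send strings, not floats, so no precision is lost in serialization.
    return {"price": str(snapped_price), "quantity": str(snapped_qty)}
```

Rounding toward zero is a simplification: it is conservative for a buy limit price, but a sell limit would usually round the price up instead.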

The Reality

Models that ace financial analysis questions in benchmarks often fail at basic execution tasks. The capability gap between "understanding" and "doing" is enormous.

Tool Selection and Integration

Financial AI needs access to actual execution tools. This creates several challenges:

API Diversity

Every exchange, broker, and platform has different APIs. Different authentication, different endpoints, different parameter formats. An AI that works with Binance won't automatically work with Coinbase or traditional brokers.

Order Type Complexity

Beyond simple market orders:

Limit orders with various time-in-force options
Stop orders (stop-market, stop-limit)
Take-profit orders
Trailing stops
OCO (one-cancels-other) orders
Scaled entry/exit orders

Each order type has appropriate use cases. The AI must learn not just how to use them, but when.

State Synchronization

The AI's view of positions, orders, and balances must match reality. Networks can delay, orders can partially fill, positions can be liquidated. State synchronization failures cause cascading problems.
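
One common way to catch synchronization failures early is to periodically compare the locally tracked positions against what the venue reports. The sketch below assumes both views are available as simple symbol-to-size mappings; how the venue snapshot is fetched is left out, and the function name is hypothetical.

```python
from decimal import Decimal


def reconcile_positions(local: dict[str, Decimal],
                        venue: dict[str, Decimal],
                        tolerance: Decimal = Decimal("0")
                        ) -> dict[str, tuple[Decimal, Decimal]]:
    """Return {symbol: (local_size, venue_size)} for every disagreement."""
    mismatches = {}
    for symbol in sorted(set(local) | set(venue)):
        local_size = local.get(symbol, Decimal("0"))
        venue_size = venue.get(symbol, Decimal("0"))
        if abs(local_size - venue_size) > tolerance:
            mismatches[symbol] = (local_size, venue_size)
    return mismatches
```

A non-empty result is a signal to stop trading the affected symbols until the discrepancy is explained, since the venue's view is normally the one that counts.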

Slippage and Execution Quality

The price you intend to trade at and the price you actually get are often different:

Market Orders

Market orders fill at the best available price, but that price moves. Large orders eat through multiple price levels, so the average fill price is worse than the quoted best. Slippage can significantly impact returns, especially in thin markets.
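
To illustrate how a large order eats through price levels, here is a small sketch that walks a static snapshot of the ask side of an order book. The book format (a list of price and size pairs, best price first) and the function name are assumptions; a live book changes while the order is in flight, so this is an estimate at best.

```python
def simulate_market_buy(asks: list[tuple[float, float]], qty: float) -> dict:
    """Walk ask levels (price, size), best first, until qty is consumed."""
    if not asks or qty <= 0:
        raise ValueError("need a non-empty book and a positive quantity")
    remaining = qty
    cost = 0.0
    for price, size in asks:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    filled = qty - remaining
    avg_price = cost / filled
    best_ask = asks[0][0]
    return {
        "filled": filled,
        "unfilled": remaining,  # non-zero when the book is too thin
        "avg_price": avg_price,
        "slippage_bps": (avg_price - best_ask) / best_ask * 10_000,
    }
```

With asks of 1 unit at 100 and 2 units at 101, a buy of 2 units fills at an average of 100.5, which is 50 basis points worse than the best ask.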

Limit Orders

Limit orders guarantee the price but not the fill. The order might never execute, might only partially fill, or might fill after conditions have changed.

Latency Effects

Time between decision and execution matters. Prices move during API round-trips. In fast markets, the opportunity may disappear before the order reaches the exchange.

Training data must capture these realities. An AI trained on idealized execution will fail in production where fills are never exactly as expected.

// Training needs to show real execution
{
  "intended_entry": 65000,
  "actual_entry": 65027.50,
  "slippage_bps": 4.2,
  "partial_fill": false,
  "latency_ms": 247
}
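
The slippage_bps field in a record like the one above can be derived from the intended and actual prices. The sketch below assumes the trade side is known (the record above does not show it, so `side` is an added, hypothetical field) and uses the convention that positive slippage means a worse price than intended.

```python
def slippage_bps(intended: float, actual: float, side: str) -> float:
    """Slippage in basis points; positive means worse than intended."""
    if side not in ("buy", "sell"):
        raise ValueError(f"unknown side: {side}")
    # Paying more hurts a buyer; receiving less hurts a seller.
    sign = 1 if side == "buy" else -1
    return sign * (actual - intended) / intended * 10_000


record = {
    "intended_entry": 65000,
    "actual_entry": 65027.50,
    "slippage_bps": round(slippage_bps(65000, 65027.50, "buy"), 1),
}
```

For the example values, the difference of 27.50 on 65000 works out to about 4.2 basis points, matching the record.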

Order Management

Orders aren't fire-and-forget. Active management is required:

Monitoring

Track order status. Is it still active? Has it filled? Has the market moved away?

Modification

Circumstances change. Limit prices may need adjustment. Sizes may need modification. Some venues support in-place modification; others require cancel-and-replace.

Cancellation

When to cancel unfilled orders? Stale limits can fill at the worst time. But canceling too quickly leaves orders no time to execute.

Position Tracking

Multiple orders across time create positions. The AI must track:

Average entry price
Current size
Unrealized P&L
Associated stops and take-profits
Margin/leverage utilization
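
A minimal sketch of the first three items follows, assuming signed quantities (positive for buys, negative for sells) and average-cost accounting. The class and method names are hypothetical, fees are ignored, and floats are used for brevity where a production system would more likely use decimal arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Position:
    size: float = 0.0        # signed: positive is long, negative is short
    avg_entry: float = 0.0   # average entry price of the open size

    def apply_fill(self, qty: float, price: float) -> float:
        """Apply a signed fill and return the P&L it realizes."""
        if qty == 0:
            return 0.0
        same_direction = self.size == 0 or (self.size > 0) == (qty > 0)
        new_size = self.size + qty
        if same_direction:
            # Adding to the position: blend the entry price by size.
            total_cost = self.avg_entry * abs(self.size) + price * abs(qty)
            self.avg_entry = total_cost / abs(new_size)
            self.size = new_size
            return 0.0
        # Reducing, closing, or flipping: realize P&L on the closed part.
        closed = min(abs(qty), abs(self.size))
        direction = 1 if self.size > 0 else -1
        realized = closed * (price - self.avg_entry) * direction
        if new_size == 0:
            self.avg_entry = 0.0
        elif (new_size > 0) != (self.size > 0):
            self.avg_entry = price  # flipped: remainder opens at fill price
        self.size = new_size
        return realized

    def unrealized_pnl(self, mark_price: float) -> float:
        return self.size * (mark_price - self.avg_entry)
```

Buying 1 unit at 100 and 1 unit at 110 gives an average entry of 105; selling 1 unit at 115 then realizes 10 and leaves the average entry unchanged.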

Training Signal

Every order management interaction, every modification, every cancellation, is training signal. The AI learns not just to place orders but to manage them actively throughout their lifecycle.

Error Handling and Recovery

Things go wrong. APIs return errors. Networks fail. Exchanges have outages. Rate limits are hit. The AI must handle these gracefully:

Error Classification

Is the error transient (retry) or permanent (abort)? Is it a parameter problem (fix and retry) or a system problem (wait)?
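
These questions can be encoded as an explicit classification step. The mapping below is only an illustrative assumption based on common HTTP status conventions; each venue documents its own error codes, and the enum and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class ErrorAction(Enum):
    RETRY = "retry"                  # transient: try again
    FIX_AND_RETRY = "fix_and_retry"  # parameter problem: correct the request
    WAIT = "wait"                    # system problem: back off before retrying
    ABORT = "abort"                  # permanent: stop and escalate


def classify_error(http_status: Optional[int]) -> ErrorAction:
    """Map an HTTP status (None for a network failure) to an action."""
    if http_status is None or http_status in (500, 502, 504):
        return ErrorAction.RETRY
    if http_status in (400, 422):
        return ErrorAction.FIX_AND_RETRY
    if http_status in (429, 503):
        return ErrorAction.WAIT
    # Includes authentication failures (401, 403) and anything unrecognized.
    return ErrorAction.ABORT
```

Defaulting unknown errors to ABORT is a deliberate choice in this sketch: when real money is involved, an unrecognized failure is safer to escalate than to retry.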

Retry Logic

When to retry? How many times? With what backoff? Aggressive retries can trigger rate limits; passive retries miss opportunities.
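
One widely used answer is capped exponential backoff with jitter. The sketch below is one such policy, not a recommendation of specific values: `TransientError` is a hypothetical exception that the caller raises for retryable failures, and the default delays are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def retry_with_backoff(call, max_attempts: int = 5, base_delay: float = 0.5,
                       max_delay: float = 8.0, sleep=time.sleep,
                       rng=random.random):
    """Call `call()` until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay cap doubles each attempt: 0.5, 1, 2, 4, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random fraction of the cap so that many
            # clients do not retry in lockstep.
            sleep(cap * rng())
```

Retrying an order submission is only safe when the request is idempotent, for example because the venue deduplicates on a client-supplied order ID; otherwise a retry can place the same order twice.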

Fallback Strategies

If the primary execution path fails, what's the alternative? Different venue? Different order type? Manual intervention?

State Recovery

After a failure, what's the actual state? Did the order go through before the connection dropped? Is there a position the AI doesn't know about?
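
A common technique for answering "did the order go through?" is to attach a client-generated ID to every order and query the venue by that ID after an ambiguous failure. The sketch below assumes a venue client exposing `submit` and `find_by_client_id`; both methods are hypothetical stand-ins, and many venues offer an equivalent under a different name.

```python
import uuid
from typing import Optional, Protocol


class VenueClient(Protocol):
    """Hypothetical interface; real venue SDKs will differ."""

    def submit(self, client_order_id: str, params: dict) -> dict: ...

    def find_by_client_id(self, client_order_id: str) -> Optional[dict]: ...


def submit_once(client: VenueClient, params: dict,
                client_order_id: Optional[str] = None) -> dict:
    """Submit an order and resolve an ambiguous failure by lookup."""
    order_id = client_order_id or uuid.uuid4().hex
    try:
        return client.submit(order_id, params)
    except ConnectionError:
        # Outcome unknown: the order may have reached the venue before
        # the connection dropped. Ask the venue instead of guessing.
        existing = client.find_by_client_id(order_id)
        if existing is not None:
            return existing
        raise  # the venue has no record, so the caller may safely retry
```

The important property is that the same ID is reused for the lookup and for any retry, so the system never has to infer state from the absence of a response.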

Exchange-Specific Patterns

Different venues have different characteristics:

Liquidity Profiles

Some exchanges are deep; others are thin. Order sizing must adapt. A 100 BTC order might be routine on one venue and market-moving on another.

Fee Structures

Maker/taker fees vary. Some strategies only work with maker rebates. The AI must factor fees into decisions.
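
The arithmetic behind "only works with maker rebates" is simple enough to sketch. The fee levels used in the note below are hypothetical and chosen only to show the sign flip; the function name is likewise an assumption.

```python
def net_edge_bps(gross_edge_bps: float, entry_fee_bps: float,
                 exit_fee_bps: float) -> float:
    """Edge left after fees on both legs, in basis points.

    A negative fee represents a rebate paid to the trader.
    """
    return gross_edge_bps - entry_fee_bps - exit_fee_bps
```

A strategy with a gross edge of 5 basis points loses 5 per round trip if it pays a 5 basis point taker fee on each leg, but nets 7 if it earns a 1 basis point maker rebate on each leg.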

Available Instruments

Not all instruments trade everywhere. Perpetual futures differ from quarterly futures, which in turn differ from spot. Leverage limits vary.

Operational Hours

Crypto trades 24/7. Traditional markets have sessions. Some venues have maintenance windows. Timing awareness is required.

Training Data Requirements

Building AI that can transact requires training data that captures all of this complexity:

Real Execution Traces

Not simulated fills but actual execution records with real slippage, real latency, real partial fills.

Error Scenarios

Examples of errors and appropriate responses. The AI learns from mistakes without having to make them all itself.

Tool Documentation

Accurate API documentation integrated into training so the AI knows what functions exist and how to use them.

State Management Examples

How to track positions, reconcile discrepancies, recover from failures.

Exchange-Specific Patterns

What works on each venue. Liquidity, fees, quirks, and operational details.

Why Execution Data Is Training Data

Every execution teaches something:

Slippage patterns: How much slippage to expect at different sizes, times, volatilities
Fill probabilities: How likely is a limit order at price X to fill?
Timing effects: Does execution quality vary by time of day?
Error frequencies: Which APIs fail most often? Under what conditions?

This isn't abstract knowledge that can be described. It must be learned from actual execution data, lots of it.
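
As one concrete example of learning from execution data, fill probabilities can be estimated empirically by bucketing past limit orders by how far from the market they were placed. The trace format below (distance from the mid price in basis points, plus whether the order filled) is a hypothetical simplification of a real execution record.

```python
from collections import defaultdict
from typing import Iterable


def fill_rate_by_distance(traces: Iterable[tuple[float, bool]],
                          bucket_bps: float = 5.0) -> dict[float, float]:
    """Estimate fill probability per distance bucket.

    Each trace is (distance_bps, filled). Buckets are keyed by their
    lower edge, so 0.0 covers distances in [0, bucket_bps).
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [fills, orders]
    for distance_bps, filled in traces:
        bucket = int(distance_bps // bucket_bps) * bucket_bps
        counts[bucket][0] += int(filled)
        counts[bucket][1] += 1
    return {b: fills / orders for b, (fills, orders) in sorted(counts.items())}
```

With only a handful of traces per bucket the estimates are noisy, which is why the volume of execution data matters as much as its realism.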

The Bottom Line

Financial AI that can actually transact requires infrastructure for execution, comprehensive tracking of execution quality, and training data that captures the messy reality of order management. The model is the smaller challenge. The execution layer is where production systems succeed or fail.

Need Execution Infrastructure?

UV Labs provides integrated environments where AI executes real transactions.

