On August 1, 2012, Knight Capital's trading systems executed 4 million trades in 45 minutes, losing the firm $440 million. The cause wasn't a bad trading strategy. It was an infrastructure failure: old code accidentally activated on one of eight servers, sending erroneous orders at machine speed.
Knight had some of the most sophisticated trading algorithms on Wall Street. The failure was in execution, the unglamorous plumbing that connects decisions to markets. Getting AI to think about trades is one problem. Getting it to actually execute them reliably is a different, harder problem.
Most AI research focuses on model architecture and training data. But the gap between "AI that can discuss markets" and "AI that can reliably place orders" is enormous. Closing it requires infrastructure that most research doesn't even consider.
The Execution Problem
The path from intention to completed transaction involves many steps, each with potential failure modes:
- Intention: AI decides "I want to go long BTC"
- Specification: What size? What price? What conditions?
- Tool selection: Which API? Which order type?
- Parameter formation: Correct formats, units, precision
- Execution: API call, network, authentication
- Confirmation: Did it fill? At what price? Partial or complete?
- State update: Update position tracking, available capital, risk metrics
Current LLMs struggle at multiple points in this chain. They'll specify impossible parameters, call functions that don't exist, or fail to handle partial fills correctly.
Models that ace financial analysis questions on benchmarks often fail at basic execution tasks. Answering a question about a trade and placing that trade correctly are separate capabilities, and a benchmark score measures only the first.
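As an illustration of the parameter formation step, here is a minimal sketch of a pre-submission check that rejects malformed prices and quantities. The `InstrumentRules` fields and the specific tick and lot values are assumptions made for the example, not any real venue's rules.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InstrumentRules:
    """Hypothetical per-instrument constraints published by a venue."""
    tick_size: Decimal  # smallest allowed price increment
    lot_size: Decimal   # smallest allowed quantity increment
    min_qty: Decimal    # minimum order quantity


def validate_order(price: Decimal, qty: Decimal, rules: InstrumentRules) -> list[str]:
    """Return a list of problems; an empty list means the order is well-formed."""
    problems = []
    if price <= 0:
        problems.append("price must be positive")
    elif price % rules.tick_size != 0:
        problems.append(f"price {price} is not a multiple of tick size {rules.tick_size}")
    if qty < rules.min_qty:
        problems.append(f"quantity {qty} is below minimum {rules.min_qty}")
    elif qty % rules.lot_size != 0:
        problems.append(f"quantity {qty} is not a multiple of lot size {rules.lot_size}")
    return problems


# Illustrative values only.
rules = InstrumentRules(
    tick_size=Decimal("0.50"), lot_size=Decimal("0.001"), min_qty=Decimal("0.001")
)
print(validate_order(Decimal("65000.00"), Decimal("0.25"), rules))    # no problems
print(validate_order(Decimal("65000.13"), Decimal("0.0005"), rules))  # two problems
```

Using `Decimal` rather than `float` keeps the multiple-of-tick check exact, which matters when a venue rejects any price with excess precision.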
Tool Selection and Integration
Financial AI needs access to actual execution tools. This creates several challenges:
API Diversity
Every exchange, broker, and platform has different APIs. Different authentication, different endpoints, different parameter formats. An AI that works with Binance won't automatically work with Coinbase or traditional brokers.
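One common way to contain this diversity is an adapter layer: the AI emits a single venue-neutral order, and each adapter translates it. The sketch below assumes two made-up venues, `VenueA` and `VenueB`, whose payload shapes are invented for illustration and do not describe any real exchange's API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Protocol


@dataclass(frozen=True)
class OrderRequest:
    symbol: str           # venue-neutral symbol, e.g. "BTC-USD"
    side: str             # "buy" or "sell"
    qty: Decimal
    limit_price: Decimal


class VenueAdapter(Protocol):
    def build_payload(self, order: OrderRequest) -> dict: ...


class VenueA:
    """Hypothetical venue: concatenated symbols, upper-case sides."""

    def build_payload(self, order: OrderRequest) -> dict:
        return {
            "symbol": order.symbol.replace("-", ""),
            "side": order.side.upper(),
            "quantity": str(order.qty),
            "price": str(order.limit_price),
        }


class VenueB:
    """Hypothetical venue: dashed product ids, lower-case sides, different keys."""

    def build_payload(self, order: OrderRequest) -> dict:
        return {
            "product_id": order.symbol,
            "side": order.side.lower(),
            "size": str(order.qty),
            "limit_price": str(order.limit_price),
        }


order = OrderRequest("BTC-USD", "buy", Decimal("0.25"), Decimal("65000.00"))
for adapter in (VenueA(), VenueB()):
    print(type(adapter).__name__, adapter.build_payload(order))
```

The model only has to learn one order shape; venue quirks live in code that can be tested without the model in the loop.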
Order Type Complexity
Beyond simple market orders:
- Limit orders with various time-in-force options
- Stop orders (stop-market, stop-limit)
- Take-profit orders
- Trailing stops
- OCO (one-cancels-other) orders
- Scaled entry/exit orders
Each order type has appropriate use cases. The AI must learn not just how to use them, but when.
State Synchronization
The AI's view of positions, orders, and balances must match reality. Network responses can be delayed, orders can partially fill, and positions can be liquidated. State synchronization failures cascade, because every later decision builds on the wrong state.
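A periodic reconciliation pass is one way to catch drift early. The following is a rough sketch that compares the locally tracked positions against a snapshot reported by the venue; the function name and the dictionary shapes are assumptions for the example.

```python
from decimal import Decimal


def reconcile(
    local: dict[str, Decimal], venue: dict[str, Decimal]
) -> dict[str, tuple[Decimal, Decimal]]:
    """Return {symbol: (local_qty, venue_qty)} for every symbol where the two disagree."""
    zero = Decimal("0")
    mismatches = {}
    for symbol in sorted(set(local) | set(venue)):
        local_qty = local.get(symbol, zero)
        venue_qty = venue.get(symbol, zero)
        if local_qty != venue_qty:
            mismatches[symbol] = (local_qty, venue_qty)
    return mismatches


# Illustrative snapshot: a partial fill, a liquidated position, and a position
# the local tracker never recorded.
local_positions = {"BTC-USD": Decimal("1.5"), "ETH-USD": Decimal("10")}
venue_positions = {"BTC-USD": Decimal("1.2"), "SOL-USD": Decimal("40")}

diff = reconcile(local_positions, venue_positions)
for symbol, (local_qty, venue_qty) in diff.items():
    print(f"{symbol}: local={local_qty} venue={venue_qty}")
```

In this sketch the venue is treated as the source of truth; any mismatch should pause new orders in that symbol until it is explained.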
Slippage and Execution Quality
The price you intend to trade at and the price you actually get are often different:
Market Orders
Fill at current best price, but that price moves. Large orders eat through multiple price levels. Slippage can significantly impact returns, especially in thin markets.
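To make "eating through price levels" concrete, here is a small sketch that walks a market buy through a list of ask levels and reports the average fill price. The order book values are made up for the example, and the function ignores fees and any book changes during the fill.

```python
def walk_book(asks: list[tuple[float, float]], qty: float) -> tuple[float, float]:
    """Simulate a market buy against ask levels [(price, size), ...] sorted best-first.

    Returns (average_fill_price, filled_qty). filled_qty < qty means the book ran out.
    """
    remaining = qty
    cost = 0.0
    for price, size in asks:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    filled = qty - remaining
    if filled == 0:
        raise ValueError("no liquidity available")
    return cost / filled, filled


# Illustrative book: 0.5 BTC at the best ask, more size at worse prices.
asks = [(65000.0, 0.5), (65010.0, 1.0), (65050.0, 2.0)]
avg_price, filled = walk_book(asks, 2.0)
slippage_bps = (avg_price - asks[0][0]) / asks[0][0] * 10_000
print(f"avg fill {avg_price:.2f}, slippage {slippage_bps:.2f} bps")
# avg fill 65017.50, slippage 2.69 bps
```

The same 2 BTC order against a book with only 0.1 BTC per level would walk much further, which is why order sizing has to account for the depth actually available.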
Limit Orders
Guarantee price but not fill. The order might never execute, partially fill, or fill after conditions have changed.
Latency Effects
Time between decision and execution matters. Prices move during API round-trips. In fast markets, the opportunity may disappear before the order reaches the exchange.
Training data must capture these realities. An AI trained on idealized execution will fail in production where fills are never exactly as expected.
A training record needs to show the real execution, not the intended one:
{
  "intended_entry": 65000,
  "actual_entry": 65027.50,
  "slippage_bps": 4.2,
  "partial_fill": false,
  "latency_ms": 247
}
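The slippage figure in that record follows directly from the two prices. As a quick sketch, assuming slippage is measured relative to the intended price and signed so that a positive value always means a worse fill:

```python
def slippage_bps(intended: float, actual: float, side: str = "buy") -> float:
    """Signed slippage in basis points; positive means a worse price than intended."""
    sign = 1 if side == "buy" else -1
    return sign * (actual - intended) / intended * 10_000


print(round(slippage_bps(65000, 65027.50), 1))  # 4.2
```

For a buy, paying 27.50 more than the intended 65000 is 27.50 / 65000 × 10,000 ≈ 4.2 basis points, which matches the record.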
Order Management
Orders aren't fire-and-forget. Active management is required:
Monitoring
Track order status. Is it still active? Has it filled? Has the market moved away?
Modification
Circumstances change. Limit prices may need adjustment. Sizes may need modification. Some venues support in-place modification; others require cancel-and-replace.
Cancellation
When to cancel unfilled orders? Stale limits can fill at the worst time. But canceling too quickly leaves orders no time to execute.
Position Tracking
Multiple orders across time create positions. The AI must track:
- Average entry price
- Current size
- Unrealized P&L
- Associated stops and take-profits
- Margin/leverage utilization
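The first three items can be derived from the fill stream alone. Below is a minimal sketch of a position tracker, simplified on purpose: it assumes long-only positions and ignores fees, stops, and margin.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """Long-only position tracker; shorts and fees are left out to keep the sketch small."""
    size: float = 0.0
    avg_entry: float = 0.0

    def apply_fill(self, side: str, qty: float, price: float) -> None:
        if qty <= 0:
            raise ValueError("fill quantity must be positive")
        if side == "buy":
            # Size-weighted average of the old entry and the new fill.
            total_cost = self.avg_entry * self.size + price * qty
            self.size += qty
            self.avg_entry = total_cost / self.size
        elif side == "sell":
            if qty > self.size:
                raise ValueError("sell exceeds position size")
            # Reducing a position leaves the average entry unchanged.
            self.size -= qty
            if self.size == 0:
                self.avg_entry = 0.0
        else:
            raise ValueError(f"unknown side: {side}")

    def unrealized_pnl(self, mark_price: float) -> float:
        return (mark_price - self.avg_entry) * self.size


pos = Position()
pos.apply_fill("buy", 1.0, 64000.0)
pos.apply_fill("buy", 1.0, 66000.0)   # average entry becomes 65000
pos.apply_fill("sell", 0.5, 67000.0)  # size 1.5, average entry unchanged
print(pos.size, pos.avg_entry, pos.unrealized_pnl(66000.0))
# 1.5 65000.0 1500.0
```

Each partial fill is applied as its own event, so an order that fills in three pieces updates the average entry three times.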
Every order management interaction, every modification, every cancellation, is a training signal. The AI learns not just to place orders but to manage them actively throughout their lifecycle.
Error Handling and Recovery
Things go wrong. APIs return errors. Networks fail. Exchanges have outages. Rate limits are hit. The AI must handle these gracefully:
Error Classification
Is the error transient (retry) or permanent (abort)? Is it a parameter problem (fix and retry) or a system problem (wait)?
Retry Logic
When to retry? How many times? With what backoff? Aggressive retries can trigger rate limits; overly cautious retries miss opportunities.
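One widely used compromise is capped exponential backoff with jitter, applied only to errors classified as transient. The sketch below assumes a two-way classification; `TransientError`, `PermanentError`, and the default delays are hypothetical names and values chosen for the example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical: timeouts, rate limits, venue briefly unavailable."""


class PermanentError(Exception):
    """Hypothetical: rejected parameters, insufficient balance, unknown symbol."""


def call_with_retry(fn, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Retry fn on TransientError with capped exponential backoff and full jitter.

    PermanentError (or any other exception) propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))


attempts = []


def flaky_submit():
    """Stand-in for an API call that is rate limited twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("rate limited")
    return "accepted"


# The demo passes a no-op sleep so it runs instantly.
result = call_with_retry(flaky_submit, sleep=lambda seconds: None)
print(result, "after", len(attempts), "attempts")
```

Blindly retrying an order submission can place the order twice if the first attempt actually reached the venue, so a retry wrapper like this is only safe for idempotent calls or alongside a state check.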
Fallback Strategies
If the primary execution path fails, what's the alternative? Different venue? Different order type? Manual intervention?
State Recovery
After a failure, what's the actual state? Did the order go through before the connection dropped? Is there a position the AI doesn't know about?
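Many venues let the client attach its own order identifier, which makes these questions answerable. The sketch below assumes such a feature and uses an in-memory `FakeVenue` as a stand-in; the class and its methods are invented for the example and do not mirror a real API.

```python
import uuid


class FakeVenue:
    """In-memory stand-in for a venue that indexes orders by a client-supplied id."""

    def __init__(self):
        self.orders = {}

    def submit(self, client_order_id, symbol, qty, drop_response=False):
        # The order is recorded first, then the response is (optionally) lost.
        self.orders[client_order_id] = {"symbol": symbol, "qty": qty, "status": "open"}
        if drop_response:
            raise ConnectionError("connection dropped before the response arrived")
        return self.orders[client_order_id]

    def lookup(self, client_order_id):
        return self.orders.get(client_order_id)


def submit_safely(venue, symbol, qty, drop_response=False):
    # Generate the id before sending so it is still known after a failure.
    client_order_id = str(uuid.uuid4())
    try:
        return venue.submit(client_order_id, symbol, qty, drop_response)
    except ConnectionError:
        # Ask the venue what happened instead of assuming failure and resubmitting.
        existing = venue.lookup(client_order_id)
        if existing is not None:
            return existing
        raise


venue = FakeVenue()
order = submit_safely(venue, "BTC-USD", 0.25, drop_response=True)
print(order["status"], len(venue.orders))
```

The point of the design is that a dropped connection leads to a lookup, not a second submission, so the order exists exactly once.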
Exchange-Specific Patterns
Different venues have different characteristics:
Liquidity Profiles
Some exchanges are deep; others are thin. Order sizing must adapt. A 100 BTC order might be routine on one venue and market-moving on another.
Fee Structures
Maker/taker fees vary. Some strategies only work with maker rebates. The AI must factor fees into decisions.
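A short arithmetic sketch shows why the fee schedule can decide whether a strategy is viable. All numbers here are hypothetical and expressed in basis points per round trip; a negative fee represents a rebate.

```python
def net_edge_bps(gross_edge_bps: float, entry_fee_bps: float, exit_fee_bps: float) -> float:
    """Edge left after paying fees on both legs; a negative fee is a rebate."""
    return gross_edge_bps - entry_fee_bps - exit_fee_bps


gross = 3.0  # hypothetical expected edge per round trip, in basis points
as_taker = net_edge_bps(gross, entry_fee_bps=5.0, exit_fee_bps=5.0)
as_maker = net_edge_bps(gross, entry_fee_bps=-1.0, exit_fee_bps=-1.0)
print(as_taker, as_maker)  # -7.0 5.0
```

With these assumed fees, the same 3 bps signal loses money when executed with taker orders and is profitable only when both legs earn the maker rebate.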
Available Instruments
Not all instruments trade everywhere. Perpetual futures differ from quarterly futures, and both differ from spot. Leverage limits vary.
Operational Hours
Crypto trades 24/7. Traditional markets have sessions. Some venues have maintenance windows. Timing awareness is required.
Training Data Requirements
Building AI that can transact requires training data that captures all of this complexity:
Real Execution Traces
Not simulated fills but actual execution records with real slippage, real latency, real partial fills.
Error Scenarios
Examples of errors and appropriate responses. The AI learns from mistakes without having to make them all itself.
Tool Documentation
Accurate API documentation integrated into training so the AI knows what functions exist and how to use them.
State Management Examples
How to track positions, reconcile discrepancies, recover from failures.
Exchange-Specific Patterns
What works on each venue. Liquidity, fees, quirks, and operational details.
Why Execution Data Is Training Data
Every execution teaches something:
- Slippage patterns: How much slippage to expect at different sizes, times, volatilities
- Fill probabilities: How likely is a limit order at price X to fill?
- Timing effects: Does execution quality vary by time of day?
- Error frequencies: Which APIs fail most often? Under what conditions?
This knowledge is empirical. It cannot be written down in advance as a set of rules; it must be learned from actual execution data, and a lot of it.
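As one example of turning execution records into such knowledge, the sketch below estimates limit order fill rates by how far the order was placed from the market. The record format and the sample values are toy data invented for illustration, not measurements.

```python
from collections import defaultdict


def fill_rate_by_distance(records, bucket_bps=5):
    """Estimate fill rates from (distance_bps, filled) pairs.

    distance_bps is how far from the mid price the limit was placed;
    results are grouped into buckets of width bucket_bps.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket start -> [filled, total]
    for distance_bps, filled in records:
        bucket = int(distance_bps // bucket_bps) * bucket_bps
        counts[bucket][0] += int(filled)
        counts[bucket][1] += 1
    return {bucket: filled / total for bucket, (filled, total) in sorted(counts.items())}


# Toy records for illustration only.
records = [
    (1.0, True), (2.5, True), (4.0, False),
    (6.0, True), (8.0, False), (9.5, False),
    (12.0, False),
]
rates = fill_rate_by_distance(records)
for bucket, rate in rates.items():
    print(f"{bucket}-{bucket + 5} bps from mid: {rate:.0%} filled")
```

With only a handful of records per bucket the estimates are noisy, which is the practical reason execution data is needed in volume.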
Financial AI that can actually transact requires infrastructure for execution, comprehensive tracking of execution quality, and training data that captures the messy reality of order management. The model is the smaller challenge. The execution layer is where production systems succeed or fail.