Machine learning guide (tradedesk.ml)¶
This guide is a compact reference for the optional machine-learning surface in
tradedesk.
Installation¶
ML support is optional. The [ml] extra installs the dependencies used by the
model and reporting components, including xgboost, scikit-learn, and joblib.
What tradedesk.ml includes¶
The public ML package is organized around four building blocks:
- FeatureBuilder and FeatureConfig for feature engineering over time-indexed OHLC(V) data, with optional bid/ask-aware features
- forward_return_labels(...) and triple_barrier_labels(...) for supervised label generation
- WalkForwardSplitter and walk_forward_evaluate(...) for leakage-aware walk-forward evaluation
- DirectionClassifier and MLDirectionStrategy for model training and event loop integration
Import them as follows:
from tradedesk.ml import (
    FeatureBuilder,
    FeatureConfig,
    LabelConfig,
    WalkForwardConfig,
    WalkForwardSplitter,
    forward_return_labels,
    walk_forward_evaluate,
)
from tradedesk.ml.model import DirectionClassifier, DirectionClassifierConfig
from tradedesk.strategy import MLDirectionStrategy
Feature engineering¶
FeatureBuilder turns a time-indexed pandas.DataFrame of bars into a feature
matrix suitable for training or inference.
Built-in features include:
- Lagged log returns over multiple horizons
- Rolling realised volatility and higher moments
- Time-of-day and weekday encodings
- Outputs from tradedesk.marketdata.indicators
- Optional microstructure features derived from candle shape and bid/ask spread
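To make the first three feature families concrete, here is a minimal pandas sketch that is independent of tradedesk. The function name `basic_features`, the column names, and the choice of realised volatility as the square root of a rolling sum of squared one-bar log returns are all assumptions made for illustration; FeatureBuilder may compute these differently.

```python
import numpy as np
import pandas as pd


def basic_features(close: pd.Series, horizons=(1, 5), vol_window=5) -> pd.DataFrame:
    """Hypothetical helper: a few features from a time-indexed close series."""
    log_close = np.log(close)
    out = pd.DataFrame(index=close.index)
    # Lagged log returns: log(close_t / close_{t-h}) for each horizon h.
    for h in horizons:
        out[f"logret_{h}"] = log_close.diff(h)
    # Rolling realised volatility: sqrt of the sum of squared one-bar log returns.
    one_bar = log_close.diff()
    out[f"rvol_{vol_window}"] = (one_bar ** 2).rolling(vol_window).sum().pow(0.5)
    # Calendar encodings taken directly from the DatetimeIndex.
    out["hour"] = close.index.hour
    out["weekday"] = close.index.dayofweek
    return out


index = pd.date_range("2024-01-01 09:00", periods=7, freq="min")
close = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0], index=index)
features = basic_features(close, horizons=(1, 2), vol_window=3)
```

The leading rows contain NaN because a lag or rolling window needs earlier bars; a training pipeline would typically drop those rows before fitting.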
Example:
from tradedesk.ml import FeatureBuilder, FeatureConfig
builder = FeatureBuilder(config=FeatureConfig())
X = builder.transform(bars)
bars must use a monotonic DatetimeIndex and include the columns required by
the configured feature set.
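A quick pre-flight check on bars can catch index and column problems before they surface inside the feature pipeline. The sketch below is not part of tradedesk: `check_bars` is a hypothetical helper, "monotonic" is assumed to mean monotonically increasing, and the default column names are assumptions that should be adjusted to whatever the configured feature set requires.

```python
import pandas as pd


def check_bars(bars: pd.DataFrame, required=("open", "high", "low", "close")) -> None:
    """Hypothetical validator: raise if bars do not look like a usable input."""
    if not isinstance(bars.index, pd.DatetimeIndex):
        raise TypeError("bars must be indexed by a pandas DatetimeIndex")
    if not bars.index.is_monotonic_increasing:
        raise ValueError("bars index must be sorted in increasing time order")
    missing = [name for name in required if name not in bars.columns]
    if missing:
        raise ValueError(f"bars is missing required columns: {missing}")


index = pd.date_range("2024-01-01", periods=3, freq="min")
good_bars = pd.DataFrame(
    {"open": [1.0, 2.0, 3.0], "high": [2.0, 3.0, 4.0],
     "low": [0.5, 1.5, 2.5], "close": [1.5, 2.5, 3.5]},
    index=index,
)
check_bars(good_bars)  # passes silently
```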
Labels¶
tradedesk.ml.labels supports:
- Forward-return labels via forward_return_labels(...)
- Triple-barrier labels via triple_barrier_labels(...)
- Class-balance summaries for fold diagnostics
Label-specific usage and field semantics are documented in ml_labels_guide.md.
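For readers unfamiliar with the two labelling schemes, the following plain-Python sketch shows the general idea of each. It is an illustration only: the function names, the use of simple (not log) returns, the close-only barrier checks, and the parameters `up` and `down` are assumptions, and the library functions may differ in all of these respects.

```python
from typing import List, Optional


def forward_returns(closes: List[float], horizon: int) -> List[Optional[float]]:
    """Simple return from bar t to bar t + horizon; None where no future bar exists."""
    n = len(closes)
    return [
        closes[t + horizon] / closes[t] - 1.0 if t + horizon < n else None
        for t in range(n)
    ]


def triple_barrier(closes: List[float], horizon: int, up: float, down: float) -> List[Optional[int]]:
    """+1 if the upper barrier is hit first, -1 if the lower one is, 0 on timeout."""
    n = len(closes)
    labels: List[Optional[int]] = []
    for t in range(n):
        if t + horizon >= n:
            labels.append(None)  # not enough future bars to resolve the label
            continue
        label = 0  # vertical barrier: neither price barrier touched within horizon
        for k in range(1, horizon + 1):
            ret = closes[t + k] / closes[t] - 1.0
            if ret >= up:
                label = 1
                break
            if ret <= -down:
                label = -1
                break
        labels.append(label)
    return labels


closes = [100.0, 101.0, 103.0, 99.0, 95.0, 96.0]
fwd = forward_returns(closes, horizon=2)
tb = triple_barrier(closes, horizon=2, up=0.02, down=0.02)
```

In both schemes the final `horizon` bars have no label, which is why label series are usually filtered for missing values before training.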
Walk-forward evaluation¶
WalkForwardSplitter produces ordered train/test folds for time-series model
evaluation. The splitter supports purge and embargo settings to reduce leakage
at fold boundaries.
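The fold geometry can be sketched in a few lines of plain Python. This is one plausible reading, not the WalkForwardSplitter implementation: here `purge` is assumed to drop rows between the end of the training window and the start of the test window, and `embargo` is assumed to skip rows after each test window before the next fold begins. The function name `walk_forward_folds` is hypothetical.

```python
from typing import Iterator, Tuple


def walk_forward_folds(
    n_rows: int, train_window: int, test_window: int, purge: int = 0, embargo: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train_rows, test_rows) as positional ranges in time order."""
    start = 0
    while True:
        train_end = start + train_window
        test_start = train_end + purge  # rows in the purge gap are used by neither set
        test_end = test_start + test_window
        if test_end > n_rows:
            break  # not enough data left for a full test window
        yield range(start, train_end), range(test_start, test_end)
        # Roll forward by one test window plus the embargo gap.
        start += test_window + embargo


folds = list(walk_forward_folds(100, train_window=40, test_window=20, purge=5, embargo=5))
for train_rows, test_rows in folds:
    print(f"train {train_rows.start}-{train_rows.stop - 1}, "
          f"test {test_rows.start}-{test_rows.stop - 1}")
```

Every training row precedes every test row in its fold, and the gap matters most when labels look ahead: with a forward-return horizon of 15 bars, a gap of at least 15 rows keeps training labels from overlapping the test window.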
Example:
from tradedesk.ml import WalkForwardConfig, WalkForwardSplitter
splitter = WalkForwardSplitter(
    WalkForwardConfig(train_window=200_000, test_window=50_000, embargo=15, purge=15)
)
Use walk_forward_evaluate(...) when you want a metrics table across those
folds.
End-to-end example¶
from tradedesk.ml import (
    FeatureBuilder,
    FeatureConfig,
    LabelConfig,
    WalkForwardConfig,
    WalkForwardSplitter,
    forward_return_labels,
    walk_forward_evaluate,
)
from tradedesk.ml.model import DirectionClassifier, DirectionClassifierConfig

# Build the feature matrix from the bar data.
builder = FeatureBuilder(config=FeatureConfig())
X = builder.transform(bars)

# Compute forward returns with horizon=15, align them to the feature index,
# and keep only the rows that have a label.
y_raw = forward_return_labels(bars, LabelConfig(horizon=15)).reindex(X.index)
valid = y_raw.notna()
X = X.loc[valid]
# Binary target: 1 when the forward return is positive, 0 otherwise.
y = (y_raw.loc[valid] > 0).astype(int)
y.index = X.index

splitter = WalkForwardSplitter(
    WalkForwardConfig(train_window=200_000, test_window=50_000, embargo=15, purge=15)
)


def make_model() -> DirectionClassifier:
    # Factory handed to walk_forward_evaluate to construct the classifier.
    return DirectionClassifier(DirectionClassifierConfig(n_estimators=200, n_jobs=4))


metrics = walk_forward_evaluate(X, y, splitter, make_model)
print(metrics[["fold", "accuracy", "auc", "sharpe", "trade_count"]])
Strategy integration¶
MLDirectionStrategy is the runtime bridge between a trained probability model
and the normal tradedesk event loop. It maintains a rolling history buffer,
builds features from incoming candles, converts model probabilities into
signals, and emits the same strategy events used elsewhere in the framework.
Use it when you want ML inference to live inside a standard BaseStrategy
workflow rather than in a separate orchestration layer.
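The buffer-then-predict-then-signal flow described above can be illustrated with a small standalone sketch. Nothing here is the MLDirectionStrategy API: the class `DirectionSignalSketch`, its `on_candle` method, the 0.55/0.45 probability thresholds, the string signals, and the toy model are all hypothetical and exist only to show the shape of the logic.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class DirectionSignalSketch:
    """Hypothetical illustration of a rolling buffer feeding a probability model."""

    def __init__(
        self,
        predict_proba_up: Callable[[List[float]], float],
        history: int = 3,
        long_above: float = 0.55,
        short_below: float = 0.45,
    ) -> None:
        self.predict_proba_up = predict_proba_up
        self.buffer: Deque[float] = deque(maxlen=history)  # oldest closes fall off
        self.long_above = long_above
        self.short_below = short_below

    def on_candle(self, close: float) -> Optional[str]:
        self.buffer.append(close)
        if len(self.buffer) < (self.buffer.maxlen or 0):
            return None  # warm-up: not enough history to build features yet
        p_up = self.predict_proba_up(list(self.buffer))
        if p_up >= self.long_above:
            return "long"
        if p_up <= self.short_below:
            return "short"
        return "flat"


def toy_model(closes: List[float]) -> float:
    # Stand-in for a trained classifier: confident "up" if the window rose.
    return 0.9 if closes[-1] > closes[0] else 0.1


strategy = DirectionSignalSketch(toy_model, history=3)
signals = [strategy.on_candle(c) for c in [1.0, 2.0, 3.0, 0.5]]
```

The warm-up branch reflects a general constraint of this pattern: no signal can be produced until the buffer holds enough candles for the longest feature window.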