
tradedesk-miner
High-performance, agent-operable data-mining engine for historical financial OHLCV
data. Scans cached candle data (typically Dukascopy bid/ask CSVs prepared by
tradedesk-dukascopy) and surfaces statistical anomalies, cross-instrument
relationships, and seasonality effects as raw candidate findings. The primary
consumer is a Quant agent, which turns those findings into testable
trading-strategy hypotheses.
tradedesk-dukascopy (cache) → tradedesk-miner (raw findings)
→ Quant agent (hypotheses)
→ tradedesk based app (strategies, backtests, live)
Status
The v1 scan engine ships 23 scans across three families (single-instrument
anomaly, two-instrument cross, seasonality) under a locked Finding JSON
envelope, with a parallel sweep runner that applies bootstrap CIs, null
distributions, and Benjamini-Hochberg FDR. MCP and HTTP wrappers are
documented and deferred — see docs/future_mcp_http.md.
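For readers unfamiliar with the correction step, the Benjamini-Hochberg procedure fits in a few lines of Python. This is a textbook illustration, not the sweep runner's code; the function name and the `alpha` default are assumptions made for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

Note the step-up behaviour: a p-value that misses its own rank threshold is still rejected when a larger p-value further down the ordering meets its threshold.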
Install (prebuilt binary, no toolchain required)
Download the tarball for your platform from the latest GitHub Release,
extract it, and place miner on your $PATH:
# Example: Linux x86_64. Substitute the asset for your platform from the
# release page; verify against SHA256SUMS in the same release.
curl -fsSL -O https://github.com/radiusred/tradedesk-miner/releases/latest/download/SHA256SUMS
curl -fsSL -O https://github.com/radiusred/tradedesk-miner/releases/latest/download/miner-1.0.0-x86_64-unknown-linux-gnu.tar.gz
shasum -a 256 -c SHA256SUMS --ignore-missing
tar -xzf miner-1.0.0-x86_64-unknown-linux-gnu.tar.gz
install -m 0755 miner-1.0.0-x86_64-unknown-linux-gnu/miner ~/.local/bin/miner
miner --version
Targets currently published per release: x86_64-unknown-linux-gnu,
aarch64-unknown-linux-gnu, aarch64-apple-darwin, x86_64-apple-darwin.
Build from source
Prerequisites: Rust 1.85+ stable (rustup default 1.85) and git.
git clone https://github.com/radiusred/tradedesk-miner
cd tradedesk-miner
./scripts/install-git-hooks.sh # one-time: wires the cargo fmt + clippy pre-commit gate
cargo build --workspace
cargo test --workspace
install-git-hooks.sh points core.hooksPath at the tracked .githooks/
directory so the local pre-commit hook mirrors the CI fmt + clippy gates.
Overrides: MINER_AUTOFIX=1 lets the hook re-stage fmt fixes and continue;
MINER_SKIP_CLIPPY=1 skips clippy for fast WIP commits (CI still enforces
the gate); git commit --no-verify bypasses everything.
Example
The repo ships a synthetic-cache generator (deterministic, no external download) rather than the cache bytes. Generate it once after cloning:
That populates ./tests/fixtures/cache/EURUSD/… and …/GBPUSD/… and writes
a SHA256SUMS manifest that you can re-verify at any time with
(cd tests/fixtures/cache && sha256sum -c SHA256SUMS). The generated files are
byte-identical across machines (Numerical Recipes LCG + single-threaded
zstd-3) and are gitignored to keep the repo lean.
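Where sha256sum is not available, the same re-verification can be approximated in Python. This is a minimal sketch that assumes the manifest uses the standard coreutils line format (hex digest, separator, relative path); the function name `verify_manifest` is hypothetical and not part of the repo.

```python
import hashlib
from pathlib import Path


def verify_manifest(root, manifest_name="SHA256SUMS"):
    """Return the relative paths whose SHA-256 digest does not match the manifest.

    Raises FileNotFoundError if a listed file is missing.
    """
    root = Path(root)
    mismatches = []
    for line in (root / manifest_name).read_text().splitlines():
        if not line.strip():
            continue
        # coreutils format: "<hex>  <path>" (text mode) or "<hex> *<path>" (binary mode).
        expected, _, name = line.partition(" ")
        name = name.lstrip(" *")
        actual = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if actual != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty return value from `verify_manifest("tests/fixtures/cache")` would indicate that every listed file matches its recorded digest.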
Then run a scan over the populated cache and stream NDJSON Finding
envelopes to stdout:
MINER_CACHE_ROOT=./tests/fixtures/cache \
MINER_BAR_CACHE_ROOT=/tmp/bar \
MINER_OUTPUT=stdout \
cargo run -p miner-cli -- scan seas.bucket.hour_of_day@1 \
--instrument EURUSD:bid --timeframe 15m \
--window 2024-01-01:2024-01-31
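Truncated Result envelope (the sample shown comes from a stats.autocorr.ljung_box@1 scan rather than the seasonality scan above):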
{"kind":"result","scan_id@version":"stats.autocorr.ljung_box@1",
"effect":{"metric":"ljung_box_q_stat","value":33.87,"p_value":0.043,
"extra":{"lags":10,"acf":[...]}},
"data_slice":{"sources":[{"symbol":"EURUSD","side":"bid",...}]}, ...}
For the full catalogue of 23 scans, the sweep-manifest TOML grammar, the exit code routing, and a programmatic-consumption walkthrough, see docs/agent_integration.md, docs/scan_catalogue.md, and docs/sweep_manifest.md.
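As a rough illustration of programmatic consumption, the sketch below filters an NDJSON stream down to result envelopes at or below a p-value threshold. It relies only on the fields visible in the truncated sample (`kind`, `scan_id@version`, `effect.metric`, `effect.value`, `effect.p_value`); the function name and the threshold default are assumptions, and docs/agent_integration.md remains the authoritative walkthrough.

```python
import json


def significant_results(lines, max_p=0.05):
    """Yield (scan, metric, value, p_value) for result envelopes with p_value <= max_p."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        envelope = json.loads(line)
        if envelope.get("kind") != "result":
            continue  # other envelope kinds (run_start, run_end, ...) carry no effect
        effect = envelope.get("effect", {})
        p_value = effect.get("p_value")
        if p_value is None or p_value > max_p:
            continue
        yield (
            envelope.get("scan_id@version"),
            effect.get("metric"),
            effect.get("value"),
            p_value,
        )
```

Because findings go to stdout and logs to stderr, the generator can be fed `sys.stdin` directly when the scan command is piped into a small consumer script.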
Data source caveats
tradedesk-miner reads the cache layout tradedesk-dukascopy produces. A few
non-obvious conventions matter for interpreting findings:
- Months are 00-indexed on disk (2024/00/ = January, 2024/11/ = December).
- The volume column is a tick count, not lot volume.
- Bid and ask sides are processed independently; spread reconstruction is out of scope.
- Weekend and exchange-holiday gaps are intentional, not missing data; the gap_policy flag controls how scans treat them.
See docs/data_sources.md for the full reference, including the data licensing posture.
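The 00-indexed month convention is easy to get wrong by one when mapping dates to cache directories. The helpers below are a small sketch covering only the year/month segment shown above; their names are hypothetical, and the rest of the on-disk layout is deliberately left out.

```python
from datetime import date


def month_dir(day):
    """Return the 'YYYY/MM/' path segment for a date, with a 00-indexed month."""
    return f"{day.year:04d}/{day.month - 1:02d}/"


def calendar_month(segment):
    """Inverse mapping: 'YYYY/MM/' on disk -> (year, calendar month in 1..12)."""
    year, month = segment.strip("/").split("/")
    return int(year), int(month) + 1
```

For example, `month_dir(date(2024, 1, 15))` gives `2024/00/`, and `calendar_month("2024/11/")` gives `(2024, 12)`.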
Performance
Wall-clock numbers, the allocation budget, and a reference flamegraph live in BENCHMARKING.md. The README intentionally avoids embedded benchmark numbers because they go stale fast.
Design principles
- Locked Finding envelope. Seven-variant tagged enum (run_start, result, scan_error, gap_aborted, dry_run, sweep_summary, run_end) with frozen common fields. Schema-additive only; ground truth is schemas/findings-v1.schema.json.
- Stdout = findings, stderr = logs. StdoutSink is the only writer to io::stdout() in the workspace; tracing routes structured logs to stderr. CI-enforced via clippy::disallowed_macros.
- Tokio-free miner-core. The scan engine is sync + rayon only; async lives only at the wrapper edges via spawn_blocking. CI-enforced by a cargo tree -p miner-core gate.
- Config precedence. CLI flag > env var > TOML file > error. No hardcoded paths in the library; the CLI owns config-path resolution.
- Envelope-byte determinism. Re-runs with the same seed against the same code revision + cache produce byte-identical NDJSON once the four volatile fields (run_id, timestamps, wall_clock_ms) are masked.
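The determinism principle suggests a simple regression check: mask the volatile fields in two runs and compare what remains. The sketch below is one way to do that from Python, under stated assumptions. `run_id` and `wall_clock_ms` come from the list above, but the timestamp key names are not given here, so the caller must supply them (the names used in the usage note are hypothetical). Because each line is parsed and re-serialized, this is slightly weaker than a true byte comparison: whitespace differences, for instance, are ignored.

```python
import json

MASK = "<masked>"


def mask_volatile(line, volatile_keys):
    """Parse one NDJSON line and replace the value of every volatile key, at any depth."""
    def scrub(node):
        if isinstance(node, dict):
            return {k: (MASK if k in volatile_keys else scrub(v)) for k, v in node.items()}
        if isinstance(node, list):
            return [scrub(item) for item in node]
        return node

    # Key order is preserved by json.loads/json.dumps, so ordering changes still show up.
    return json.dumps(scrub(json.loads(line)), separators=(",", ":"))


def runs_match(run_a, run_b, volatile_keys):
    """True if two runs (lists of NDJSON lines) are identical after masking."""
    masked_a = [mask_volatile(line, volatile_keys) for line in run_a]
    masked_b = [mask_volatile(line, volatile_keys) for line in run_b]
    return masked_a == masked_b
```

A call might look like `runs_match(first, second, {"run_id", "wall_clock_ms", "started_at", "finished_at"})`, where the last two keys stand in for whatever timestamp fields the schema in schemas/findings-v1.schema.json actually defines.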
Architecture
See docs/architecture.md for the system map.
The deeper layered design rationale lives in the internal planning tree
(.planning/research/ARCHITECTURE.md) and is not part of the published docs.
Roadmap
See .planning/ROADMAP.md. Plan-by-plan summaries
live under .planning/phases/<phase>/<phase>-<plan>-SUMMARY.md.
Documentation
Start with:
- docs/architecture.md
- docs/findings_envelope.md
- docs/scan_catalogue.md
- docs/sweep_manifest.md
- docs/agent_integration.md
- docs/future_mcp_http.md
Runnable examples live under docs/examples/.
Contributing
See CONTRIBUTING.md for development setup, quality gates, and PR expectations.
License
Licensed under the Apache License, Version 2.0. See: https://www.apache.org/licenses/LICENSE-2.0
Copyright 2026 Radius Red Ltd.