tradedesk-miner

High-performance, agent-operable data-mining engine for historical financial OHLCV data. Scans cached candle data (typically Dukascopy bid/ask CSVs prepared by tradedesk-dukascopy) and surfaces statistical anomalies, cross-instrument relationships, and seasonality effects as raw candidate findings. The primary consumer is a Quant agent, which turns those findings into testable trading-strategy hypotheses.

tradedesk-dukascopy (cache)  →  tradedesk-miner (raw findings)
                             →  Quant agent (hypotheses)
                             →  tradedesk-based app (strategies, backtests, live)

Status

The v1 scan engine ships 23 scans across three families (single-instrument anomaly, two-instrument cross, seasonality) under a locked Finding JSON envelope, with a parallel sweep runner that applies bootstrap CIs, null distributions, and Benjamini-Hochberg FDR. MCP and HTTP wrappers are documented but deferred; see docs/future_mcp_http.md.
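
For readers unfamiliar with the Benjamini-Hochberg correction named above, the step-up procedure can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not the sweep runner's code: the function name benjamini_hochberg, its signature, and the choice to return a reject mask (rather than adjusted p-values) are all hypothetical.

```rust
/// Benjamini-Hochberg step-up procedure (illustrative; name and signature are hypothetical).
/// Returns one flag per input p-value: `true` means the hypothesis is rejected
/// while controlling the false discovery rate at level `q`.
pub fn benjamini_hochberg(p_values: &[f64], q: f64) -> Vec<bool> {
    let m = p_values.len();

    // Indices of the p-values, sorted ascending by p-value.
    let mut order: Vec<usize> = (0..m).collect();
    order.sort_by(|&a, &b| p_values[a].total_cmp(&p_values[b]));

    // Find the largest 1-based rank k such that p_(k) <= (k / m) * q.
    let mut cutoff = 0;
    for (i, &idx) in order.iter().enumerate() {
        let threshold = (i + 1) as f64 / m as f64 * q;
        if p_values[idx] <= threshold {
            cutoff = i + 1;
        }
    }

    // Step-up: every hypothesis ranked at or below the cutoff is rejected,
    // even if its own p-value missed its own threshold.
    let mut rejected = vec![false; m];
    for &idx in &order[..cutoff] {
        rejected[idx] = true;
    }
    rejected
}
```

With p-values 0.005, 0.035, 0.036, 0.9 and q = 0.05, the third value clears its threshold (0.0375), so the second value is rejected as well even though it exceeds its own threshold (0.025). That carry-over is what distinguishes the step-up rule from a per-test cutoff.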

Install (prebuilt binary — no toolchain required)

Download the right tarball for your platform from the latest GitHub Release, extract, and place miner on $PATH:

# Example: Linux x86_64. Substitute the asset for your platform from the
# release page; verify against SHA256SUMS in the same release.
curl -fsSL -O https://github.com/radiusred/tradedesk-miner/releases/latest/download/SHA256SUMS
curl -fsSL -O https://github.com/radiusred/tradedesk-miner/releases/latest/download/miner-1.0.0-x86_64-unknown-linux-gnu.tar.gz
shasum -a 256 -c SHA256SUMS --ignore-missing
tar -xzf miner-1.0.0-x86_64-unknown-linux-gnu.tar.gz
install -m 0755 miner-1.0.0-x86_64-unknown-linux-gnu/miner ~/.local/bin/miner
miner --version

Targets currently published per release: x86_64-unknown-linux-gnu, aarch64-unknown-linux-gnu, aarch64-apple-darwin, x86_64-apple-darwin.

Build from source

Prerequisites: Rust 1.85+ stable (rustup default 1.85) and git.

git clone https://github.com/radiusred/tradedesk-miner
cd tradedesk-miner
./scripts/install-git-hooks.sh   # one-time: wires the cargo fmt + clippy pre-commit gate
cargo build --workspace
cargo test --workspace

install-git-hooks.sh points core.hooksPath at the tracked .githooks/ directory so the local pre-commit hook mirrors the CI fmt + clippy gates. Overrides: MINER_AUTOFIX=1 lets the hook re-stage fmt fixes and continue; MINER_SKIP_CLIPPY=1 skips clippy for fast WIP commits (CI still enforces the gate); git commit --no-verify bypasses everything.

Example

The repo ships a synthetic-cache generator (deterministic, no external download) rather than the cache bytes. Generate it once after cloning:

bash scripts/generate-fixture-cache.sh

That populates ./tests/fixtures/cache/EURUSD/… and …/GBPUSD/…, and writes a SHA256SUMS manifest that you can re-verify at any time with (cd tests/fixtures/cache && sha256sum -c SHA256SUMS). The generated files are byte-identical across machines (Numerical Recipes LCG + single-threaded zstd-3) and are gitignored to keep the repo lean.
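
The cross-machine determinism follows from the generator being a pure function of its seed. A minimal Rust sketch of the Numerical Recipes LCG (multiplier 1664525, increment 1013904223, modulus 2^32) is shown below. The Lcg type, the sample helper, the seeds, and any mapping from this stream onto candle fields are assumptions made for illustration; they are not taken from the generator script.

```rust
/// Numerical Recipes linear congruential generator over 32-bit state.
/// (`Lcg` is a hypothetical name used only in this sketch.)
pub struct Lcg {
    state: u32,
}

impl Lcg {
    pub fn new(seed: u32) -> Self {
        Lcg { state: seed }
    }

    /// Advances the state: x <- 1664525 * x + 1013904223 (mod 2^32).
    /// Wrapping arithmetic on u32 provides the modulus for free.
    pub fn next_u32(&mut self) -> u32 {
        self.state = self
            .state
            .wrapping_mul(1_664_525)
            .wrapping_add(1_013_904_223);
        self.state
    }
}

/// Draws `n` values from a fresh generator; the same seed always yields the same values.
pub fn sample(seed: u32, n: usize) -> Vec<u32> {
    let mut rng = Lcg::new(seed);
    (0..n).map(|_| rng.next_u32()).collect()
}
```

Because there is no hidden state (no clock, no OS entropy), sample(0, 2) yields 1013904223 followed by 1196435762 on every platform. The single-threaded compression step matters for the same reason: it removes scheduling-dependent ordering from the output bytes.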

Then run a scan over the populated cache and stream NDJSON Finding envelopes to stdout:

MINER_CACHE_ROOT=./tests/fixtures/cache \
MINER_BAR_CACHE_ROOT=/tmp/bar \
MINER_OUTPUT=stdout \
cargo run -p miner-cli -- scan stats.autocorr.ljung_box@1 \
    --instrument EURUSD:bid --timeframe 15m \
    --window 2024-01-01:2024-01-31

Truncated Result envelope:

{"kind":"result","scan_id@version":"stats.autocorr.ljung_box@1",
 "effect":{"metric":"ljung_box_q_stat","value":33.87,"p_value":0.043,
           "extra":{"lags":10,"acf":[...]}},
 "data_slice":{"sources":[{"symbol":"EURUSD","side":"bid",...}]}, ...}

For the full catalogue of 23 scans, the sweep-manifest TOML grammar, exit-code routing, and a programmatic-consumption walkthrough, see docs/scan_catalogue.md, docs/sweep_manifest.md, and docs/agent_integration.md.
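
As a small taste of programmatic consumption, the sketch below routes NDJSON lines by their kind tag using only the Rust standard library. It is a hedged illustration, not the documented integration path: the Kind enum, kind_of, and count_kind are hypothetical names, and the prefix match assumes kind is the first key on each line, as in the sample envelope. A real consumer would more likely use a JSON parser and validate against schemas/findings-v1.schema.json.

```rust
use std::io::{self, BufRead};
use std::str::FromStr;

/// The seven envelope kinds listed under "Design principles".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    RunStart,
    Result,
    ScanError,
    GapAborted,
    DryRun,
    SweepSummary,
    RunEnd,
}

impl FromStr for Kind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "run_start" => Ok(Kind::RunStart),
            "result" => Ok(Kind::Result),
            "scan_error" => Ok(Kind::ScanError),
            "gap_aborted" => Ok(Kind::GapAborted),
            "dry_run" => Ok(Kind::DryRun),
            "sweep_summary" => Ok(Kind::SweepSummary),
            "run_end" => Ok(Kind::RunEnd),
            other => Err(format!("unknown kind: {other}")),
        }
    }
}

/// Extracts the kind tag from one NDJSON line.
/// ASSUMPTION: the line starts with {"kind":"..."; anything else yields None.
pub fn kind_of(line: &str) -> Option<Kind> {
    let rest = line.trim_start().strip_prefix("{\"kind\":\"")?;
    let end = rest.find('"')?;
    rest[..end].parse::<Kind>().ok()
}

/// Counts the lines of a given kind in an NDJSON stream (for example, miner's stdout).
pub fn count_kind<R: BufRead>(reader: R, wanted: Kind) -> io::Result<usize> {
    let mut n = 0;
    for line in reader.lines() {
        if kind_of(&line?) == Some(wanted) {
            n += 1;
        }
    }
    Ok(n)
}
```

Passing std::io::stdin().lock() as the reader lets the same function sit at the end of a shell pipe. Because logs go to stderr, the stream on stdout should contain envelopes only.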

Data source caveats

tradedesk-miner reads the cache layout tradedesk-dukascopy produces. A few non-obvious conventions matter for interpreting findings:

  • Months are 00-indexed on disk (2024/00/ = January, 2024/11/ = December).
  • The volume column is a tick count, not lot volume.
  • Bid and ask sides are processed independently; spread reconstruction is out of scope.
  • Weekend and exchange-holiday gaps are intentional, not missing data; the gap_policy flag controls how scans treat them.
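
The 00-indexed month convention is an easy source of off-by-one errors when locating files by hand. The Rust sketch below converts in both directions. The helper names are hypothetical, and only the year/month segment shape (2024/00/) comes from the list above; where that segment sits in the full cache path is not assumed here.

```rust
/// Maps a calendar month (1 = January) to the `YYYY/MM` directory segment,
/// where MM is 00-indexed on disk. Returns None for months outside 1..=12.
pub fn month_segment(year: u16, calendar_month: u8) -> Option<String> {
    if !(1..=12).contains(&calendar_month) {
        return None;
    }
    Some(format!("{year}/{:02}", calendar_month - 1))
}

/// Inverse direction: turns an on-disk month directory name ("00" to "11")
/// into a calendar month (1 to 12).
pub fn calendar_month(dir_name: &str) -> Option<u8> {
    let zero_based: u8 = dir_name.parse().ok()?;
    if zero_based <= 11 {
        Some(zero_based + 1)
    } else {
        None
    }
}
```

For example, month_segment(2024, 1) gives "2024/00" and calendar_month("11") gives 12, matching the January and December cases in the list.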

See docs/data_sources.md for the full reference, including the data licensing posture.

Performance

Wall-clock numbers, allocation budget, and reference flamegraph live in BENCHMARKING.md. The README intentionally avoids embedded benchmark numbers — they go stale fast.

Design principles

  • Locked Finding envelope. Seven-variant tagged enum (run_start, result, scan_error, gap_aborted, dry_run, sweep_summary, run_end) with frozen common fields. Schema-additive only; ground truth is schemas/findings-v1.schema.json.
  • Stdout = findings, stderr = logs. StdoutSink is the only writer to io::stdout() in the workspace; tracing routes structured logs to stderr. CI-enforced via clippy::disallowed_macros.
  • Tokio-free miner-core. The scan engine is sync + rayon only; async lives only at the wrapper edges via spawn_blocking. CI-enforced by a cargo tree -p miner-core gate.
  • Config precedence. CLI flag > env var > TOML file > error. No hardcoded paths in the library; the CLI owns config-path resolution.
  • Envelope-byte determinism. Re-runs with the same seed against the same code revision + cache produce byte-identical NDJSON once the volatile fields (run_id, timestamps, wall_clock_ms) are masked.
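
The config-precedence rule above (CLI flag > env var > TOML file > error) reduces to a short fallback chain. The Rust sketch below shows the idea only: resolve_setting and its parameters are hypothetical, and the actual CLI may structure its resolution differently.

```rust
/// Resolves one setting using the precedence CLI flag > env var > TOML file > error.
/// Each Option is the value found in that layer, if any. (Hypothetical helper.)
pub fn resolve_setting(
    name: &str,
    cli_flag: Option<&str>,
    env_var: Option<&str>,
    toml_value: Option<&str>,
) -> Result<String, String> {
    cli_flag
        .or(env_var) // fall back to the environment only when no flag was given
        .or(toml_value) // fall back to the TOML file last
        .map(str::to_owned)
        // No layer supplied a value: fail loudly rather than invent a default path.
        .ok_or_else(|| format!("no value for `{name}` in CLI flags, environment, or TOML file"))
}
```

Ending the chain in an error instead of a built-in default is what keeps hardcoded paths out of the library: a caller that supplies nothing gets a clear failure rather than a silently chosen directory.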

Architecture

See docs/architecture.md for the system map. The deeper layered design rationale lives in the internal planning tree (.planning/research/ARCHITECTURE.md) and is not part of the published docs.

Roadmap

See .planning/ROADMAP.md. Plan-by-plan summaries live under .planning/phases/<phase>/<phase>-<plan>-SUMMARY.md.

Documentation

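Start with the guides referenced in the sections above: docs/agent_integration.md, docs/scan_catalogue.md, docs/sweep_manifest.md, docs/data_sources.md, and docs/architecture.md.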

Runnable examples live under docs/examples/.

Contributing

See CONTRIBUTING.md for development setup, quality gates, and PR expectations.

License

Licensed under the Apache License, Version 2.0. See: https://www.apache.org/licenses/LICENSE-2.0

Copyright 2026 Radius Red Ltd.