Advanced Topics¶
BenchBox exposes advanced tooling for teams that need to tune production workloads, run official TPC submissions, or analyse optimiser behavior. Use this section after you are comfortable with the 5‑minute start.
Quick Links¶
Area |
When to use it |
|---|---|
Measure statistical confidence, run throughput tests, and capture multi-stream metrics. |
|
Exercise rule-based and cost-based optimisers with focused SQL workloads. |
|
Track Baselines, attach monitoring snapshots to results, and detect regressions in CI. |
|
Convert benchmark data to Parquet, Delta Lake, or Iceberg for better performance. |
|
Build a new benchmark module, wire data generators, and expose it through the CLI. |
Recommended Workflow¶
Stabilise a baseline – Run the standard benchmark (
benchbox run --phases power) at the desired scale and export the JSON artifact.Enable monitoring – Follow the performance guide to capture snapshots and history so CI can flag regressions automatically.
Experiment with phases – Use
--phases generate,load,power,throughputor--phases maintenanceto isolate problem areas.Iterate safely – Store tuning YAML files under version control and pass them via
--tuning ./path/to/tuning.yaml.
Additional References¶
TPC reference material for official compliance work.
Data lake loading patterns for S3/GCS/ABFSS staging.
Compression strategies for large-scale data generation.
If you discover gaps or need new automation hooks, open a GitHub discussion so we can fold them into the roadmap.