run - Run Benchmarks¶
The primary command for executing benchmarks. Supports interactive mode, direct execution, and various execution phases.
Basic Syntax¶
benchbox run [OPTIONS]
Core Options¶
Platform and Benchmark Selection:

- `--platform TEXT`: Platform type
    - SQL platforms: `duckdb`, `sqlite`, `databricks`, `bigquery`, `snowflake`, `redshift`, `clickhouse`, `datafusion`, `polars`, `trino`, `presto`
    - DataFrame platforms: `polars-df`, `pandas-df`, `pyspark-df`, `datafusion-df` (see DataFrame Platforms)
- `--benchmark TEXT`: Benchmark name (`tpch`, `tpcds`, `tpcdi`, `ssb`, `clickbench`, `h2odb`, etc.)
- `--scale FLOAT`: Scale factor for data generation (default: 0.01)
Execution Control:

- `--phases TEXT`: Benchmark phases to run (default: `power`). Available phases:
    - `generate` - Generate benchmark data files
    - `load` - Load data into database
    - `warmup` - Run warmup queries (optional, prepares caches)
    - `power` - Sequential query execution (Power Test)
    - `throughput` - Concurrent query execution (Throughput Test)
    - `maintenance` - Data modification operations (database reload required afterwards) - see TPC-H Maintenance or TPC-DS Maintenance

    Use a comma-separated list: `--phases generate,load,power`

    Warning: The maintenance phase permanently modifies data; reload the database before running power/throughput again.
- `--force [MODE]`: Force regeneration of data (modes: `all`, `datagen`, `upload`, or `datagen,upload`)
- `--non-interactive`: Use defaults for all prompts (useful for automation)
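The maintenance warning above implies a reload step before re-timing. A sketch of that workflow using only the flags documented on this page (platform and scale are illustrative):

```shell
# Run through maintenance (this permanently modifies the loaded data)
benchbox run --platform duckdb --benchmark tpch --scale 0.1 \
  --phases generate,load,power,maintenance

# Before another power/throughput run, force a fresh load
benchbox run --platform duckdb --benchmark tpch --scale 0.1 \
  --force all --phases load,power
```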
Output and Paths:

- `--output TEXT`: Output directory (supports local paths and cloud storage: `s3://`, `gs://`, `abfss://`)
- `--dry-run OUTPUT_DIR`: Preview configuration and queries without execution
Query Configuration:

- `--seed INTEGER`: RNG seed for query parameter generation (power/throughput tests)
- `--queries TEXT`: Run specific queries in the specified order (comma-separated IDs: `"1,6,17"`). Constraints:
    - Max 100 queries per run
    - Alphanumeric IDs only (letters, numbers, dash, underscore; max 20 chars each)
    - Only works with `power` and `standard` phases (errors only if used exclusively with incompatible phases)
    - Duplicates are automatically removed
    - Order is preserved as specified
    - TPC Compliance: Breaks official TPC-H/TPC-DS compliance (overrides stream permutations)
    - Valid ranges: TPC-H (1-22), TPC-DS (1-99), SSB (1-13)
Examples:

# Single query for debugging
benchbox run --platform duckdb --benchmark tpch --queries "Q6"

# Multiple queries in specific order
benchbox run --platform postgres --benchmark tpch --queries "Q1,Q6,Q17"

# With verbose logging for development
benchbox run --platform duckdb --benchmark tpcds --queries "Q42" --verbose --phases power
Tuning and Optimization:

- `--tuning TEXT`: Tuning mode (default: `notuning`)
    - `tuned`: Enable optimizations (constraints for SQL, parallelism for DataFrame)
    - `notuning`: Disable optimizations for baseline comparison
    - `auto`: Auto-detect optimal settings based on system profile
    - Path to a config file: Custom tuning configuration (YAML)
Data Compression (Advanced):

- `--compression TEXT`: Compression configuration (e.g., `zstd`, `zstd:9`, `gzip:6`, `none`)
    - Format: `TYPE` or `TYPE:LEVEL`
    - Types: `zstd` (default), `gzip`, `none`
    - Levels: 1-9 for gzip, 1-22 for zstd
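For example, combining type and level as described above (platform choice is illustrative):

```shell
# Fast default compression
benchbox run --platform duckdb --benchmark tpch --compression zstd

# Maximum zstd compression (levels 1-22) for smaller data files
benchbox run --platform duckdb --benchmark tpch --compression zstd:22

# Disable compression entirely
benchbox run --platform duckdb --benchmark tpch --compression none
```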
Validation (Advanced):

- `--validation TEXT`: Validation mode
    - `exact`: Exact row count validation (default)
    - `loose`: Loose validation (±5% tolerance)
    - `range`: Min/max bounds validation
    - `disabled`: No validation
    - `full`: All validation checks enabled (preflight, postgen, postload, platform check)
    - `preflight`, `postgen`, `postload`, `check-platforms`: Individual validation types
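Typical usage, with modes as documented above (platform choice is illustrative):

```shell
# Tolerate small row-count drift (±5%)
benchbox run --platform duckdb --benchmark tpch --validation loose

# Skip validation for quick iteration during development
benchbox run --platform duckdb --benchmark tpch --validation disabled
```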
Platform-Specific Configuration:

- `--platform-option KEY=VALUE`: Platform-specific option (can be used multiple times)
    - Use `benchbox platforms status <name>` to view platform details
    - Universal keys available on all platforms:
        - `driver_version` - pin the Python driver package version (e.g. `1.2.0`)
        - `driver_auto_install` - auto-install the requested driver via uv if missing (`true`/`false`)
    - Athena Spark only:
        - `engine_version` - select the Spark engine version (e.g. `PySpark engine version 3`); auto-detected on all other cloud platforms
Logging and Output:

- `--verbose`, `-v`: Enable verbose logging (use `-vv` for very verbose)
- `--quiet`, `-q`: Suppress all output (overrides verbose flags)
Help System:

- `-h`, `--help`: Show common options
- `--help-topic all`: Show all options including advanced
- `--help-topic examples`: Show categorized usage examples
Usage Examples¶
Basic Local Benchmark¶
# Simple TPC-H benchmark on DuckDB
benchbox run --platform duckdb --benchmark tpch --scale 0.01
# With default settings
benchbox run --platform duckdb --benchmark tpch
Cloud Platform Examples¶
# Databricks with custom output location
benchbox run --platform databricks --benchmark tpch --scale 0.1 \
--output dbfs:/Volumes/workspace/raw/benchmarks/
# BigQuery with verbose logging
benchbox run --platform bigquery --benchmark tpcds --scale 0.01 \
--verbose --phases power
# Snowflake with tuning enabled
benchbox run --platform snowflake --benchmark tpch --scale 1 \
--tuning tuned
DataFrame Platform Examples¶
# Polars DataFrame with auto-tuning
benchbox run --platform polars-df --benchmark tpch --scale 1 \
--tuning auto
# Pandas DataFrame with PyArrow backend
benchbox run --platform pandas-df --benchmark tpch --scale 0.1 \
--platform-option dtype_backend=pyarrow
# PySpark DataFrame (local mode)
benchbox run --platform pyspark-df --benchmark tpch --scale 1 \
--platform-option driver_memory=8g
# Compare SQL vs DataFrame on same workload
benchbox run --platform polars --benchmark tpch --scale 0.1 # SQL mode
benchbox run --platform polars-df --benchmark tpch --scale 0.1 # DataFrame mode
Advanced Phase Control¶
# Data generation only
benchbox run --benchmark tpch --scale 0.1 --phases generate \
--output ./tpch-data
# Load data into database
benchbox run --platform duckdb --benchmark tpch --scale 0.1 \
--phases load
# Full TPC-DS power and throughput test
benchbox run --platform duckdb --benchmark tpcds --scale 0.01 \
--phases power,throughput --seed 42
Dry Run and Preview¶
# Preview configuration without execution
benchbox run --platform duckdb --benchmark tpch --scale 0.1 \
--dry-run ./preview
# Preview with specific seed for reproducible queries
benchbox run --platform snowflake --benchmark tpcds \
--phases power --dry-run ./debug --seed 7
Platform-Specific Options¶
# ClickHouse with local mode and TLS
benchbox run --platform clickhouse --benchmark tpch \
--platform-option mode=local \
--platform-option secure=true
# Pin driver version (any platform)
benchbox run --platform duckdb --benchmark tpch \
--platform-option driver_version=1.2.0 \
--platform-option driver_auto_install=true
# Athena Spark: select Spark engine version
benchbox run --platform athena-spark --benchmark tpch --scale 1.0 \
--platform-option workgroup=my-spark-workgroup \
--platform-option s3_staging_dir=s3://my-bucket/benchbox \
--platform-option "engine_version=PySpark engine version 3"
# Show platform details and capabilities
benchbox platforms status clickhouse
Table Storage Options¶
Table Mode:

- `--table-mode [native|external]`: How tables are created (default: `native`)
    - `native`: Standard materialized tables in the target database
    - `external`: External table or view references (e.g., Parquet-backed)
    - Note: `external` is incompatible with `--tuning tuned`
Table Format:

- `--table-format TEXT`: Open table format for data files
    - Format: `FORMAT[:COMPRESSION[:PARTITION:COLS...]]`
    - Examples: `parquet`, `delta:snappy`, `iceberg:zstd`, `vortex`
    - Supported formats depend on the platform adapter
Pre-sorted Data:

- `--presort [parquet-sorted|delta-sorted|iceberg-sorted]`: Pre-sort data into open table formats
    - Only valid for TPC-H and TPC-DS benchmarks
    - TPC-H sorts on `lineitem.l_shipdate`; TPC-DS sorts on `store_sales.ss_sold_date_sk`
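Combining the table-storage flags above (platforms are illustrative; check your platform adapter's supported formats):

```shell
# External Parquet-backed tables (note: incompatible with --tuning tuned)
benchbox run --platform duckdb --benchmark tpch \
  --table-mode external --table-format parquet

# Pre-sorted Delta tables for TPC-H (sorts lineitem on l_shipdate)
benchbox run --platform databricks --benchmark tpch \
  --presort delta-sorted
```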
Cloud Sorted Ingestion Options¶
These flags control cloud data-organization strategy selection across supported cloud warehouses. They are advanced options — use --help-topic all to see them in the CLI help.
- `--sorted-ingestion-mode [off|auto|force]`: Cloud sorted-ingestion strategy mode
- `--sorted-ingestion-method [auto|ctas|z_order|hilbert|liquid_clustering|vacuum_sort]`: Override the clustering method used during sorted ingestion
# Snowflake with forced CTAS-based sorted ingestion
benchbox run --platform snowflake --benchmark tpch --scale 1 \
--sorted-ingestion-mode force \
--sorted-ingestion-method ctas
Databricks Clustering Options¶
These overrides are Databricks-specific and are passed via --platform-option.
- `--platform-option databricks_clustering_strategy=[z_order|liquid_clustering|none]`: SQL tuning strategy for Databricks tables
- `--platform-option liquid_clustering_columns=col1,col2`: Comma-separated liquid clustering columns
# Databricks with Z-order clustering
benchbox run --platform databricks --benchmark tpch --scale 1 \
--platform-option databricks_clustering_strategy=z_order
# Databricks with liquid clustering on specific columns
benchbox run --platform databricks --benchmark tpch --scale 1 \
--platform-option databricks_clustering_strategy=liquid_clustering \
--platform-option liquid_clustering_columns=l_shipdate,l_orderkey
Execution Control Options¶
These flags control monitoring, progress display, memory handling, and caching behavior. They are advanced options — use --help-topic all to see them in the CLI help.
- `--mode [sql|dataframe]`: Force execution mode (validated against platform capabilities)
- `--no-monitoring`: Disable metrics collection during benchmark execution
- `--no-progress`: Disable progress bars
- `--ignore-memory-warnings`: Proceed with execution despite low-memory warnings
- `--global-cache`: Use a shared data cache at `~/.benchbox/datagen/` instead of the project-local `benchmark_runs/` directory
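These flags combine naturally for unattended runs; a sketch for a CI job (platform and scale are illustrative):

```shell
# Headless CI run: no prompts, no progress bars, shared data cache
benchbox run --platform duckdb --benchmark tpch --scale 0.01 \
  --non-interactive --no-progress --global-cache --quiet
```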
Query Plan Capture Options¶
- `--plan-config TEXT`: Fine-grained control over query plan capture
    - Format: comma-separated `key:value` pairs
    - Keys:
        - `sample:FLOAT` - fraction of queries to capture plans for (e.g., `sample:0.1` for 10%)
        - `first:INT` - capture plans for the first N queries only
        - `queries:ID1,ID2,...` - capture plans for specific query IDs
        - `strict:true|false` - fail if plan capture encounters errors
    - Requires `--capture-plans` to also be set
    - Example: `--capture-plans --plan-config "sample:0.1,strict:true"`
    - Note: On DuckDB, plan capture uses `EXPLAIN (ANALYZE, FORMAT JSON)`, which roughly doubles query execution time. Use `--platform-option analyze_plans=false` for estimated plans without the overhead.
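Putting the plan-capture options together (key values are illustrative):

```shell
# Estimated plans for the first 5 queries, avoiding ANALYZE overhead (DuckDB)
benchbox run --platform duckdb --benchmark tpch \
  --capture-plans --plan-config "first:5" \
  --platform-option analyze_plans=false

# Sample a quarter of the queries, failing the run on capture errors
benchbox run --platform duckdb --benchmark tpch \
  --capture-plans --plan-config "sample:0.25,strict:true"
```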