Configuration
BenchBox supports configuration through multiple sources with the following precedence:
1. Command-line arguments (highest priority)
2. Environment variables
3. Configuration files
4. Default values (lowest priority)
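The precedence order above can be illustrated with a minimal Python sketch. It is not BenchBox's actual loader: it assumes each source has already been parsed into a flat dictionary, and the keys and values (`platform`, `scale`, `verbose`) are hypothetical placeholders.

```python
from collections import ChainMap

# Hypothetical, already-parsed settings from each source (illustrative values only).
defaults = {"platform": "duckdb", "scale": 0.01, "verbose": False}
config_file = {"scale": 1.0}
environment = {"platform": "clickhouse"}
cli_args = {"scale": 10.0}

# ChainMap searches its mappings left to right, so the first mapping that
# defines a key wins: CLI > environment > config file > defaults.
settings = ChainMap(cli_args, environment, config_file, defaults)

resolved = dict(settings)
print(resolved)
```

Here `scale` comes from the command line, `platform` from the environment, and `verbose` falls through to the default because no higher-priority source sets it.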
Configuration File Format
BenchBox uses YAML configuration files. Default location: ~/.benchbox/config.yaml
Example configuration:
# Output settings
output:
  compression:
    enabled: true
    type: zstd
    level: 3
  formats:
    - json
    - csv
# Platform settings
platforms:
  databricks:
    enabled: true
    warehouse_id: "abc123"
  bigquery:
    enabled: true
    project_id: "my-project"
# Tuning settings
tuning:
  default_mode: notuning
  enable_constraints: false
Platform Configuration
Each platform can have specific configuration options:
Databricks:
platforms:
  databricks:
    warehouse_id: "warehouse-id"
    catalog: "main"
    schema: "benchbox"
BigQuery:
platforms:
  bigquery:
    project_id: "my-project"
    dataset: "benchbox"
    location: "US"
Snowflake:
platforms:
  snowflake:
    account: "account-name"
    warehouse: "COMPUTE_WH"
    database: "BENCHBOX"
    schema: "PUBLIC"
Environment Variables
BenchBox recognizes these environment variables:
Core Settings
These variables override the corresponding settings in the configuration file (equivalent to editing ~/.benchbox/config.yaml):
| Variable | Config Path | Type | Default | Description |
|---|---|---|---|---|
|  |  | string |  | Preferred database platform |
|  |  | float |  | Default scale factor |
|  |  | boolean |  | Enable verbose output ( |
|  |  | integer |  | Maximum parallel worker threads |
|  |  | boolean |  | Enable table tuning by default |
|  |  | string | - | Path to tuning configuration file |
|  |  | string |  | Output directory for result files |
|  |  | integer |  | Memory limit in GB; |
Boolean parsing: true, 1, yes, on (case-insensitive) are treated as true.
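As a rough illustration of the boolean rule above, the following Python sketch accepts the four documented spellings regardless of case. Treating every other value as false and ignoring surrounding whitespace are assumptions made here, and `parse_bool` is a hypothetical helper name, not a BenchBox function.

```python
TRUE_VALUES = {"true", "1", "yes", "on"}


def parse_bool(raw: str) -> bool:
    """Return True for the documented truthy spellings, ignoring case.

    Assumption: any other value (including an empty string) counts as False,
    and surrounding whitespace is ignored.
    """
    return raw.strip().lower() in TRUE_VALUES


for value in ("TRUE", "1", "Yes", "on", "false", "0", ""):
    print(f"{value!r} -> {parse_bool(value)}")
```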
General Settings
- BENCHBOX_NON_INTERACTIVE=true: Enable non-interactive mode
- BENCHBOX_NO_COMPRESSION=true: Disable data compression
- BENCHBOX_CONFIG_PATH=/path/to/config.yaml: Custom config file location
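A tool honoring `BENCHBOX_CONFIG_PATH` might pick the config file roughly as in the sketch below. This is a guess at the behaviour, not BenchBox's code: `resolve_config_path` is a hypothetical helper, and it assumes that an unset or empty variable falls back to the documented default location.

```python
import os
from pathlib import Path

DEFAULT_CONFIG = "~/.benchbox/config.yaml"  # documented default location


def resolve_config_path(env=None) -> Path:
    """Pick the config file path: BENCHBOX_CONFIG_PATH if set, else the default."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    raw = env.get("BENCHBOX_CONFIG_PATH") or DEFAULT_CONFIG
    return Path(raw).expanduser()


print(resolve_config_path({"BENCHBOX_CONFIG_PATH": "/path/to/config.yaml"}))
print(resolve_config_path({}))
```

Passing the environment as a plain mapping keeps the function easy to exercise without modifying the real process environment.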
Advanced Settings
These variables control lower-level behaviors, useful for CI, offline environments, or advanced workflows:
| Variable | Description |
|---|---|
|  | Override the default local data directory for generated benchmark files |
|  | Override the cache directory for DataFrame benchmark data |
|  | Override the tuning file search root; BenchBox looks for |
|  | TPC-DS query validation mode: |
|  | Set to |
|  | Override the base URL for downloading expected-answer archives (default: GitHub releases) |
|  | Set to |
|  | Maximum bytes held in memory when sorting generated data files (integer; default: 512 MB) |
Platform Authentication
Databricks:
- DATABRICKS_TOKEN: Authentication token
- DATABRICKS_HOST: Workspace URL
BigQuery:
- GOOGLE_APPLICATION_CREDENTIALS: Service account key file path
Snowflake:
- SNOWFLAKE_USER: Username
- SNOWFLAKE_PASSWORD: Password
- SNOWFLAKE_ACCOUNT: Account identifier
Redshift:
- AWS_ACCESS_KEY_ID: AWS access key
- AWS_SECRET_ACCESS_KEY: AWS secret key
ClickHouse:
- CLICKHOUSE_HOST: Server hostname
- CLICKHOUSE_USER: Username
- CLICKHOUSE_PASSWORD: Password
Platform-Specific Options
Each platform supports specific options via --platform-option KEY=VALUE:
Universal Keys (All Platforms)
These keys are available for every platform:
| Key | Description |
|---|---|
|  | Pin the Python driver package to a specific version (e.g. |
|  | When |
Note
uv run benchbox run syncs the environment to uv.lock before Python starts, which
can silently revert a version you installed manually. Use driver_version +
driver_auto_install=true or uv run --with "pkg==X" to reliably test a specific
version. See Driver Version Management for the full guide.
Example: pin DuckDB driver and auto-install it:
benchbox run --platform duckdb --benchmark tpch \
--platform-option driver_version=1.2.0 \
--platform-option driver_auto_install=true
Example: pin Snowflake connector version:
benchbox run --platform snowflake --benchmark tpch \
--platform-option driver_version=3.12.0 \
--platform-option driver_auto_install=true \
--platform-option account=xy12345.us-east-1 \
--platform-option warehouse=COMPUTE_WH
Athena Spark Engine Version
For Athena Spark only, the Spark engine version can be explicitly selected:
| Key | Description |
|---|---|
|  | Spark engine version string (e.g. |
benchbox run --platform athena-spark --benchmark tpch \
--platform-option workgroup=my-spark-workgroup \
--platform-option s3_staging_dir=s3://my-bucket/benchbox \
--platform-option "engine_version=PySpark engine version 3"
ClickHouse Options
- mode=local: Use local ClickHouse instance
- secure=true: Enable TLS encryption
- port=9000: Custom port number
- database=default: Target database name
Example:
benchbox run --platform clickhouse --benchmark tpch \
--platform-option mode=local \
--platform-option secure=true \
--platform-option port=9440
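To make the KEY=VALUE convention concrete, here is a minimal Python sketch of how repeated `--platform-option` values could be collected into a dictionary. It is only an illustration: `parse_platform_options` is a hypothetical helper, splitting on the first `=` is an assumption, and values are left as strings because the source does not describe any type conversion.

```python
def parse_platform_options(pairs):
    """Turn ["key=value", ...] into a dict, splitting on the first '=' only."""
    options = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        options[key] = value  # values stay strings; no type coercion is assumed
    return options


opts = parse_platform_options([
    "mode=local",
    "secure=true",
    "port=9440",
    "engine_version=PySpark engine version 3",
])
print(opts)
```

Splitting only on the first `=` keeps values that themselves contain `=` or spaces intact, which matters for options such as the quoted Athena engine version string.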
View Platform Details
Use benchbox platforms status to see platform information and capabilities:
benchbox platforms status clickhouse
benchbox platforms status databricks
Configuration Sections Reference
Complete reference for all settings in ~/.benchbox/config.yaml (or ./benchbox.yaml). Unset values fall back to the defaults shown here.
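The fallback to defaults can be pictured as a recursive overlay of the user's file on a defaults tree. The Python sketch below shows one way to do that. The `merge_with_defaults` helper and the default values are hypothetical and chosen purely for illustration; they are not BenchBox's actual defaults or implementation.

```python
import copy


def merge_with_defaults(defaults, overrides):
    """Recursively overlay `overrides` on `defaults`; unset keys keep their default."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_with_defaults(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults and user settings (illustrative values, not BenchBox's).
defaults = {
    "execution": {
        "max_workers": 4,
        "power_run": {"iterations": 3, "warm_up_iterations": 1},
    }
}
user = {"execution": {"power_run": {"iterations": 5}}}

effective = merge_with_defaults(defaults, user)
print(effective)
```

Only `iterations` changes in the result; `max_workers` and `warm_up_iterations` keep their default values, and the `defaults` tree itself is left unmodified.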
system
System profiling and detection settings.
| Setting | Type | Default | Description |
|---|---|---|---|
|  | boolean |  | Automatically detect system capabilities on startup |
|  | boolean |  | Persist the detected system profile to disk |
|  | integer |  | Hours to cache the system profile before re-detecting |
database
Database connection settings.
| Setting | Type | Default | Description |
|---|---|---|---|
|  | string |  | Default platform when none is specified on the CLI |
|  | integer |  | Connection timeout in seconds |
|  | boolean |  | Automatically detect available database platforms |
benchmarks
Default benchmark execution parameters.
| Setting | Type | Default | Description |
|---|---|---|---|
|  | float |  | Default scale factor |
|  | integer |  | Maximum benchmark execution time |
|  | integer |  | Maximum memory allocation hint in GB |
|  | boolean |  | Continue running remaining queries when one fails |
output
Result output and export settings.
| Setting | Type | Default | Description |
|---|---|---|---|
|  | list |  | Output format list |
|  | string |  | Results directory |
|  | string |  | Timestamp format for result filenames |
|  | boolean |  | Automatically submit results to the hosted service |
|  | string |  | Results service URL |
|  | boolean |  | Enable result file compression |
|  | string |  | Compression algorithm ( |
|  | integer |  | Compression level; |
execution
Performance and execution control settings.
| Setting | Type | Default | Description |
|---|---|---|---|
|  | boolean |  | Execute queries in parallel |
|  | integer |  | Maximum worker threads for parallel execution |
|  | integer |  | Memory cap in GB; |
|  | boolean |  | Enable verbose progress output |
execution.power_run - settings for multi-iteration power runs:
| Setting | Type | Default | Description |
|---|---|---|---|
|  | integer |  | Number of measurement iterations per query |
|  | integer |  | Warm-up iterations before measurement (not included in stats) |
|  | integer |  | Timeout for each iteration |
|  | boolean |  | Stop immediately when any iteration fails |
|  | boolean |  | Collect resource metrics during execution |
execution.concurrent_queries - settings for throughput / concurrent-stream runs:
| Setting | Type | Default | Description |
|---|---|---|---|
|  | boolean |  | Enable concurrent query streams |
|  | integer |  | Number of concurrent query streams |
|  | integer |  | Per-query timeout in concurrent mode |
|  | integer |  | Timeout for an entire concurrent stream |
|  | boolean |  | Retry failed queries in concurrent streams |
|  | integer |  | Maximum retry attempts per query |
tuning
Table tuning and optimization settings.
| Setting | Type | Default | Description |
|---|---|---|---|
|  | boolean |  | Apply tuning configurations by default |
|  | string |  | Path to a default tuning YAML file |
|  | boolean |  | Validate tuning configurations when loaded |
|  | boolean |  | Allow tuning configs that contain platform-incompatible directives |
Example full configuration file:
system:
  profile_cache_hours: 48
database:
  preferred: duckdb
benchmarks:
  default_scale: 1.0
  timeout_minutes: 120
  continue_on_error: true
output:
  directory: ./results
  compression:
    enabled: true
    type: zstd
execution:
  max_workers: 8
  memory_limit_gb: 16
  power_run:
    iterations: 5
    warm_up_iterations: 2
tuning:
  enabled: true
  default_config_file: ./tuning/my_tuning.yaml
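Nested settings such as `execution.power_run` are addressed by dotted paths. The Python sketch below shows one way such a path could be resolved against the parsed file. It uses an excerpt of the example configuration written as a plain nested dictionary, and `get_setting` is a hypothetical helper rather than part of BenchBox.

```python
# Excerpt of the example configuration, as the nested dict a YAML parser would return.
config = {
    "output": {
        "directory": "./results",
        "compression": {"enabled": True, "type": "zstd"},
    },
    "execution": {
        "max_workers": 8,
        "memory_limit_gb": 16,
        "power_run": {"iterations": 5, "warm_up_iterations": 2},
    },
}


def get_setting(cfg, dotted_path, default=None):
    """Walk a dotted path such as 'execution.power_run.iterations' through nested dicts."""
    node = cfg
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return default  # path not present: fall back to the caller's default
        node = node[part]
    return node


print(get_setting(config, "execution.power_run.iterations"))
print(get_setting(config, "output.compression.type"))
print(get_setting(config, "execution.missing", "n/a"))
```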