Troubleshooting Guide¶
This guide helps diagnose and resolve common issues when running BenchBox benchmarks.
Quick Diagnosis¶
Error Type Matrix¶
| Error Message Contains | Likely Cause | Jump To |
|---|---|---|
| “connection refused” | Platform not running | Connection Refused |
| “authentication failed” | Invalid credentials | Authentication Failures |
| “catalog not found” | Presto/Trino config | Catalog Not Found (Presto/Trino) |
| “permission denied” | Access rights | Permission Errors |
| “timeout” | Slow query/network | Query Timeouts |
| “out of memory” | Scale too large | Memory Issues |
| “ImportError” | Missing package | Missing Platform Dependencies |
| “file not found” | Data loading | Data Loading Issues |
| “command not found: benchbox” | PATH issue | Installation Issues |
Installation Issues¶
command not found: benchbox¶
Problem: The benchbox command is not available in your shell after installation.
Solution:
Check your PATH: Ensure that the Python scripts directory is in your system’s PATH. You can find the base directory by running:
python -m site --user-base
Then add the bin subdirectory of that path to your PATH, as shown in the example below.
Reactivate your virtual environment: If you installed BenchBox in a virtual environment, make sure it’s activated:
source .venv/bin/activate
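For example, on macOS or Linux you can prepend that bin directory to your PATH for the current shell session (add the same line to your shell profile to make it permanent):
# Prepend the Python user-base bin directory to PATH for this session
export PATH="$(python -m site --user-base)/bin:$PATH"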
Missing Platform Dependencies¶
Problem: You get an ImportError when trying to use a specific database platform.
Solution:
Install the required dependencies for that platform:
# Specific platforms
pip install "benchbox[snowflake]"
pip install "benchbox[databricks]"
pip install "benchbox[bigquery]"
# All cloud platforms
pip install "benchbox[cloud]"
# DataFrame platforms
pip install "benchbox[dataframe]"
Check installation status:
benchbox platforms list # Shows available/unavailable
Shell Reports no matches found¶
Problem: Shells such as zsh treat square brackets as glob patterns, producing errors like zsh: no matches found: benchbox[cloud].
Solution: Use modern uv syntax (no quotes needed):
uv add benchbox --extra cloud
uv add benchbox --extra cloud --extra clickhouse
Or quote the pip-compatible syntax:
uv pip install "benchbox[cloud]"
python -m pip install "benchbox[cloud,clickhouse]"
Connection Issues¶
Connection Refused¶
Symptoms:
ConnectionRefusedError: [Errno 111] Connection refused
OperationalError: could not connect to server
Diagnosis:
# Check if service is running
curl -s http://localhost:3473/health # Firebolt Core
curl -s http://localhost:8080 # Trino/Presto
# Docker platforms
docker ps | grep -E 'trino|presto|clickhouse|firebolt'
Solutions:
Start the platform:
# Trino
docker run -d -p 8080:8080 trinodb/trino

# Firebolt Core
docker run -d -p 3473:3473 ghcr.io/firebolt-db/firebolt-core:preview-rc

# ClickHouse
docker run -d -p 9000:9000 clickhouse/clickhouse-server
Check port availability:
lsof -i :8080            # Check if port is in use
netstat -an | grep 8080
Verify host/port in config:
benchbox run --platform trino --benchmark tpch \
  --platform-option host=localhost \
  --platform-option port=8080
Network Timeout¶
Symptoms:
TimeoutError: Connection timed out
socket.timeout: timed out
Solutions:
Cloud platforms - check firewall:
# AWS Security Groups
aws ec2 describe-security-groups --group-ids sg-xxx

# Test connectivity
nc -zv your-cluster.redshift.amazonaws.com 5439
Increase connection timeout:
benchbox run --platform snowflake --benchmark tpch \
  --platform-option connect_timeout=60
Authentication Failures¶
Invalid Credentials¶
Symptoms:
AuthenticationError: Invalid credentials
401 Unauthorized
Access Denied
Platform-Specific Solutions:
Snowflake¶
# Verify credentials work
snowsql -a $SNOWFLAKE_ACCOUNT -u $SNOWFLAKE_USER
# Check account format (should be account_locator.region)
echo $SNOWFLAKE_ACCOUNT
# Correct: xy12345.us-east-1 or xy12345.us-east-1.aws
Databricks¶
# Test token validity
curl -H "Authorization: Bearer $DATABRICKS_TOKEN" \
https://your-workspace.cloud.databricks.com/api/2.0/clusters/list
# Regenerate token if expired (90 days default)
# User Settings > Developer > Access Tokens
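If the Databricks CLI is installed and configured, it can also mint a new personal access token; a sketch (the lifetime and comment values are placeholders):
# Create a new personal access token (90-day lifetime, in seconds)
databricks tokens create --lifetime-seconds 7776000 --comment "benchbox"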
BigQuery¶
# Test service account
gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS
gcloud auth application-default print-access-token
# Verify project access
gcloud projects describe $BIGQUERY_PROJECT
Redshift¶
# Test connection
psql -h $REDSHIFT_HOST -p 5439 -U $REDSHIFT_USER -d dev
# For IAM auth, verify role
aws sts get-caller-identity
Token Expired¶
Symptoms:
Token has expired
Session expired
Solutions:
Regenerate tokens:
Databricks: User Settings > Access Tokens > Generate New
Snowflake: Tokens don’t expire, check password
BigQuery:
gcloud auth application-default login
Use refresh tokens where supported:
# BigQuery - auto-refresh with ADC
gcloud auth application-default login
Catalog Not Found (Presto/Trino)¶
Symptoms:
ConfigurationError: Catalog 'memory' not found
CatalogNotFoundError: Catalog does not exist
Cause: The default memory catalog rarely exists on production servers.
Solutions:
List available catalogs:
# Trino/Presto
benchbox platforms check --platform trino \
  --platform-option host=localhost
Specify the correct catalog:
benchbox run --platform trino --benchmark tpch \
  --platform-option catalog=hive  # Or iceberg, delta, etc.
Common catalog names:
hive - Hive Metastore
iceberg - Apache Iceberg
delta - Delta Lake
tpch - TPC-H connector (built-in)
mysql, postgresql - Database connectors
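If the Trino CLI is available, you can also list the catalogs the server exposes directly (assuming the localhost:8080 endpoint used above):
trino --server localhost:8080 --execute "SHOW CATALOGS"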
Permission Errors¶
Insufficient Privileges¶
Symptoms:
PermissionDenied: User does not have permission
AccessDenied: Access Denied
Solutions by Platform:
Snowflake¶
-- Grant required permissions
GRANT USAGE ON WAREHOUSE compute_wh TO ROLE benchbox_role;
GRANT CREATE DATABASE ON ACCOUNT TO ROLE benchbox_role;
GRANT USAGE ON DATABASE benchbox TO ROLE benchbox_role;
GRANT CREATE TABLE ON SCHEMA benchbox.public TO ROLE benchbox_role;
Databricks¶
-- Unity Catalog permissions
GRANT USE CATALOG ON CATALOG benchmarks TO `user@company.com`;
GRANT CREATE SCHEMA ON CATALOG benchmarks TO `user@company.com`;
GRANT USE SCHEMA ON SCHEMA benchmarks.default TO `user@company.com`;
BigQuery¶
# Grant via gcloud
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="user:user@company.com" \
--role="roles/bigquery.dataEditor"
Redshift¶
-- Grant schema access
GRANT CREATE ON DATABASE dev TO benchbox_user;
GRANT ALL ON SCHEMA public TO benchbox_user;
Query Timeouts¶
Symptoms:
QueryTimeoutError: Query exceeded time limit
Statement timeout
Solutions:
Increase query timeout:
# Global
benchbox run --platform snowflake --benchmark tpch \
  --platform-option query_timeout=3600  # 1 hour

# Redshift
benchbox run --platform redshift --benchmark tpch \
  --platform-option statement_timeout=3600000  # ms
Use larger compute resources:
# Snowflake - larger warehouse
benchbox run --platform snowflake --benchmark tpch --scale 10 \
  --platform-option warehouse=LARGE_WH

# Databricks - larger SQL warehouse
benchbox run --platform databricks --benchmark tpch --scale 10 \
  --platform-option http_path=/sql/1.0/warehouses/large_wh_id
Reduce scale factor for testing:
# Start small
benchbox run --platform snowflake --benchmark tpch --scale 0.1
Memory Issues¶
Out of Memory¶
Symptoms:
MemoryError: Unable to allocate
OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space
Solutions by Platform:
DuckDB¶
# Limit memory and enable spilling
benchbox run --platform duckdb --benchmark tpch --scale 10 \
--platform-option memory_limit=8GB \
--platform-option temp_directory=/fast/ssd/tmp
Polars¶
# Enable streaming for large datasets
benchbox run --platform polars-df --benchmark tpch --scale 10 \
--platform-option streaming=true
Spark¶
# Increase executor memory
benchbox run --platform spark --benchmark tpch --scale 10 \
--platform-option executor_memory=8g \
--platform-option driver_memory=4g
Cloud Platforms¶
# Use larger compute tiers
benchbox run --platform snowflake --benchmark tpch --scale 100 \
--platform-option warehouse=X_LARGE_WH
Scale Factor Recommendations¶
| Platform | Max Recommended SF | Notes |
|---|---|---|
| DuckDB | 10-100 | Depends on RAM |
| SQLite | 0.1-1.0 | Not for OLAP |
| Polars | 10-100 | Enable streaming |
| Snowflake | 1000+ | Scale warehouse |
| Databricks | 1000+ | Scale cluster |
| BigQuery | 1000+ | Serverless |
Data Generation Issues¶
dbgen or dsdgen not found¶
Problem: TPC-H or TPC-DS data generation fails with an error indicating that dbgen or dsdgen is not found.
Solution:
BenchBox attempts to compile these tools automatically, but if that fails, you may need to compile them manually.
Navigate to the tools directory:
# For TPC-H
cd _sources/tpc-h/dbgen

# For TPC-DS
cd _sources/tpc-ds/tools
Compile the tools:
make
If you encounter compilation errors, you may need to install a C compiler and other build tools (build-essential on Debian/Ubuntu, Xcode Command Line Tools on macOS).
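The usual commands to install those build tools are:
# Debian/Ubuntu
sudo apt-get install build-essential

# macOS
xcode-select --install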
Slow Data Generation¶
Problem: Data generation is taking a very long time.
Solution:
Use a smaller scale factor: For testing and development, use a small scale factor like 0.01.
Run power-only cycles first: benchbox run --phases generate,load,power lets you warm caches before expanding to throughput tests.
Check disk throughput: Write data to a fast local volume before copying it to network storage. Use the --output flag to point at SSD-backed paths (see the example after this list).
Persist generated data: Reuse existing datasets with --force turned off (default) so future runs skip regeneration.
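For example, a generate-only run that writes a small dataset to a fast local path might look like this (the SSD path is illustrative):
# Generate TPC-H data only, at a small scale factor, onto a fast local SSD
benchbox run --platform duckdb --benchmark tpch --scale 0.01 \
  --phases generate \
  --output /fast/ssd/benchbox-data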
Data Loading Issues¶
File Not Found¶
Symptoms:
FileNotFoundError: Data file not found
No such file or directory
Solutions:
Generate data first:
# Explicit generation
benchbox run --platform duckdb --benchmark tpch --scale 0.1 \
  --phases generate
Check data directory:
ls -la ~/.cache/benchbox/tpch/sf0.1/
Force regeneration:
benchbox run --platform duckdb --benchmark tpch --scale 0.1 \
  --force datagen
Upload Failures¶
Symptoms:
UploadError: Failed to upload file
S3 upload failed
Storage access denied
Solutions:
Cloud Storage Staging¶
# Verify storage access
aws s3 ls s3://your-bucket/benchbox/
# Test write access
aws s3 cp test.txt s3://your-bucket/benchbox/
# Configure staging
benchbox run --platform redshift --benchmark tpch --scale 10 \
--staging-root s3://your-bucket/benchbox/
Snowflake Stages¶
# List stages
snowsql -q "SHOW STAGES;"
# Create a named stage (the user stage @~ exists by default)
snowsql -q "CREATE STAGE IF NOT EXISTS benchbox_stage;"
Databricks Volumes¶
# Check volume permissions
databricks volumes list /Volumes/catalog/schema/
# Create volume
databricks volumes create catalog.schema.benchbox_data
Platform-Specific Issues¶
Snowflake¶
Warehouse Suspended¶
# Resume warehouse
snowsql -q "ALTER WAREHOUSE BENCHMARK_WH RESUME;"
# Set auto-resume
snowsql -q "ALTER WAREHOUSE BENCHMARK_WH SET AUTO_RESUME = TRUE;"
Databricks¶
Cluster Not Running¶
# Start cluster via API
curl -X POST "https://workspace.cloud.databricks.com/api/2.0/clusters/start" \
-H "Authorization: Bearer $DATABRICKS_TOKEN" \
-d '{"cluster_id": "your-cluster-id"}'
# Or use SQL Warehouse (always-on option available)
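To verify the cluster’s current state before (re)starting it, the same REST API can be queried; a sketch using the placeholder workspace URL and cluster ID above:
# Check cluster state (look for "state": RUNNING, TERMINATED, etc.)
curl -s "https://workspace.cloud.databricks.com/api/2.0/clusters/get?cluster_id=your-cluster-id" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN"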
BigQuery¶
Quota Exceeded¶
# Check quotas
gcloud compute project-info describe --project $PROJECT_ID
# Request increase via Console
# BigQuery > Quotas > Request Increase
General Tips¶
Use the --verbose flag: The -v or -vv flag can provide more detailed output to help you diagnose issues.
Check the logs: BenchBox creates log files in the output directory. These can contain valuable information for troubleshooting.
Start small: When testing a new setup, start with a small scale factor (e.g., 0.01) to quickly verify that everything is working correctly.
Getting Help¶
Diagnostic Information¶
When reporting issues, include:
# System info
benchbox --version
python --version
uname -a
# Platform availability
benchbox platforms list
# Full error with traceback
benchbox run --platform <platform> --benchmark tpch --scale 0.01 \
--verbose 2>&1 | tee benchmark_error.log
Resources¶
GitHub Issues - Report bugs
Platform Docs - Platform-specific guides
Configuration Guide - Detailed options