Dry Run Mode¶
The BenchBox dry run feature lets you preview a benchmark before committing to it: validate configurations, extract queries, estimate resource requirements, and troubleshoot setup issues without executing an actual benchmark.
Overview¶
Dry run mode is essential for:
Configuration Validation - Verify all parameters are set correctly
Query Preview - Review SQL queries before execution
Resource Planning - Understand memory and storage requirements
Debugging - Troubleshoot benchmark setup issues
Documentation - Generate examples and query references
Development - Test changes without running full benchmarks
Quick Start¶
Basic CLI Usage¶
# Preview TPC-H benchmark configuration and queries
benchbox run --dry-run ./benchmark_runs/dryrun_previews --platform duckdb --benchmark tpch --scale 0.1
# Preview with tuning configurations
benchbox run --dry-run ./tuned_preview --platform duckdb --benchmark tpcds --scale 0.01 --tuning
# Preview the query primitives benchmark at a tiny scale
benchbox run --dry-run ./systematic_preview --platform duckdb --benchmark primitives --scale 0.001
Seed Control in Dry Run¶
You can control the RNG seed used for parameter generation in TPC Power and Throughput test previews. This is useful at very small scales, where certain seeds can cause dsqgen/qgen to fail to parameterize queries.
# TPC-DS Power Test preview with seed
benchbox run --dry-run ./preview_tpcds_seed7 --platform duckdb --benchmark tpcds --phases power --scale 0.01 --seed 7
# TPC-DS Throughput Test preview with base seed
benchbox run --dry-run ./preview_tpcds_thr9 --platform duckdb --benchmark tpcds --phases throughput --scale 0.01 --seed 9
# TPC-H Power Test preview with seed
benchbox run --dry-run ./preview_tpch_seed5 --platform duckdb --benchmark tpch --phases power --scale 0.01 --seed 5
If a specific seed cannot generate all queries at a tiny scale, the CLI preflight validation will fail fast and report example failures. Try a different seed or a slightly larger scale.
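When a seed fails at a tiny scale, retrying with another candidate is usually the fastest fix. The sketch below automates that search by shelling out to the CLI with the flags documented above; the candidate seed values are arbitrary examples, and it assumes benchbox is on your PATH.
import subprocess

# Probe candidate seeds until the dry-run preflight succeeds.
# The seed values are arbitrary examples, not recommended defaults.
for seed in (1, 3, 5, 7, 11):
    result = subprocess.run(
        ["benchbox", "run", "--dry-run", f"./preview_seed{seed}",
         "--platform", "duckdb", "--benchmark", "tpcds",
         "--phases", "power", "--scale", "0.01", "--seed", str(seed)],
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        print(f"Seed {seed} parameterized all queries")
        break
    print(f"Seed {seed} failed preflight, trying next")
else:
    print("No candidate seed worked; try a slightly larger scale")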
Programmatic Usage¶
from benchbox.cli.dryrun import DryRunExecutor
from benchbox.cli.system import SystemProfiler
from benchbox.cli.database import DatabaseConfig
from benchbox.cli.types import BenchmarkConfig
from pathlib import Path
# Setup dry run
output_dir = Path("./my_dry_run")
dry_run = DryRunExecutor(output_dir)
# Configure components
profiler = SystemProfiler()
system_profile = profiler.get_system_profile()
db_config = DatabaseConfig(
    platform="duckdb",
    connection_params={"database": ":memory:"},
    dialect="duckdb"
)
benchmark_config = BenchmarkConfig(
    name="tpch",
    scale_factor=0.01,
    platform="duckdb"
)
# Execute dry run
result = dry_run.execute_dry_run(
    benchmark_config=benchmark_config,
    system_profile=system_profile,
    database_config=db_config
)
print(f"Extracted {len(result.queries)} queries")
print(f"Generated schema with {len(result.schema_info.get('tables', {}))} tables")
print(f"Memory estimate: {result.resource_estimates.estimated_memory_mb} MB")
Output Structure¶
When you run a dry run, BenchBox creates an output directory:
dry_run_output/
├── summary.json # Complete configuration summary
├── summary.yaml # Human-readable configuration
├── system_profile.json # System information
├── queries/ # Individual SQL query files
│ ├── query_1.sql
│ ├── query_2.sql
│ └── ...
├── schema.sql # Database schema definition
└── resource_estimates.json # Memory and performance estimates
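To sanity-check that a dry run produced everything it should, you can verify this layout programmatically. A minimal sketch, assuming only the file names shown above:
from pathlib import Path

def check_dry_run_output(output_dir: str) -> None:
    """Verify that the expected dry run artifacts exist."""
    root = Path(output_dir)
    for name in ("summary.json", "summary.yaml", "system_profile.json",
                 "schema.sql", "resource_estimates.json"):
        print(f"{name}: {'ok' if (root / name).exists() else 'MISSING'}")
    queries_dir = root / "queries"
    count = len(list(queries_dir.glob("*.sql"))) if queries_dir.is_dir() else 0
    print(f"queries/: {count} SQL files")

check_dry_run_output("./dry_run_output")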
Summary Files¶
summary.json - Complete structured output:
{
"timestamp": "2025-01-15T14:30:22.123456",
"benchmark_config": {
"name": "tpch",
"scale_factor": 0.1,
"platform": "duckdb"
},
"system_profile": {
"os": "Darwin 24.6.0",
"architecture": "arm64",
"memory_gb": 36.0,
"cpu_cores": 12
},
"database_config": {
"platform": "duckdb",
"dialect": "duckdb",
"connection": ":memory:"
},
"queries": {
"1": "SELECT ... FROM lineitem WHERE ...",
"2": "SELECT ... FROM supplier, nation ..."
},
"schema_info": {
"tables": {
"lineitem": {"row_count": 600572, "columns": 16},
"orders": {"row_count": 150000, "columns": 9}
}
},
"resource_estimates": {
"estimated_memory_mb": 256,
"estimated_storage_mb": 100,
"estimated_runtime_minutes": 3.2
}
}
summary.yaml - Human-readable format:
timestamp: 2025-01-15T14:30:22.123456
benchmark_config:
name: tpch
scale_factor: 0.1
platform: duckdb
system_profile:
os: Darwin 24.6.0
architecture: arm64
memory_gb: 36.0
cpu_cores: 12
# ... continues ...
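Because summary.json mirrors the structure shown above, downstream tooling can consume it with nothing but the standard library. A minimal sketch that reads the fields from the example:
import json
from pathlib import Path

summary = json.loads(Path("./dry_run_output/summary.json").read_text())
estimates = summary["resource_estimates"]

# Print the headline numbers from the summary
print(f"Benchmark: {summary['benchmark_config']['name']} "
      f"(scale {summary['benchmark_config']['scale_factor']})")
print(f"Queries:   {len(summary['queries'])}")
print(f"Memory:    ~{estimates['estimated_memory_mb']} MB")
print(f"Storage:   ~{estimates['estimated_storage_mb']} MB")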
Query Files¶
Individual SQL files are saved in the queries/ directory:
queries/query_1.sql:
-- TPC-H Query 1: Pricing Summary Report
-- This query reports the amount of business that was billed, shipped, and returned.
SELECT
l_returnflag,
l_linestatus,
SUM(l_quantity) AS sum_qty,
SUM(l_extendedprice) AS sum_base_price,
SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
AVG(l_quantity) AS avg_qty,
AVG(l_extendedprice) AS avg_price,
AVG(l_discount) AS avg_disc,
COUNT(*) AS count_order
FROM
lineitem
WHERE
l_shipdate <= DATE '1998-12-01' - INTERVAL '90' DAY
GROUP BY
l_returnflag,
l_linestatus
ORDER BY
l_returnflag,
l_linestatus;
Schema File¶
schema.sql - Complete database schema:
-- TPC-H Database Schema
-- Generated by BenchBox Dry Run
CREATE TABLE nation (
n_nationkey INTEGER,
n_name CHAR(25),
n_regionkey INTEGER,
n_comment VARCHAR(152)
);
CREATE TABLE region (
r_regionkey INTEGER,
r_name CHAR(25),
r_comment VARCHAR(152)
);
-- ... continues with all tables ...
-- Indexes (if tuning enabled)
CREATE INDEX idx_lineitem_shipdate ON lineitem(l_shipdate);
CREATE INDEX idx_orders_orderdate ON orders(o_orderdate);
-- ... continues ...
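Since schema.sql is plain DDL, a simple regex is enough to list the generated tables and indexes. A sketch that assumes the CREATE TABLE / CREATE INDEX formatting shown above:
import re
from pathlib import Path

ddl = Path("./dry_run_output/schema.sql").read_text()

# Match the "CREATE TABLE <name>" / "CREATE INDEX <name>" patterns above
tables = re.findall(r"CREATE TABLE (\w+)", ddl, flags=re.IGNORECASE)
indexes = re.findall(r"CREATE INDEX (\w+)", ddl, flags=re.IGNORECASE)
print(f"{len(tables)} tables: {', '.join(tables)}")
print(f"{len(indexes)} indexes (present only when tuning is enabled)")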
Console Output¶
The dry run provides rich, formatted console output:
$ benchbox run --dry-run ./preview --platform duckdb --benchmark tpch --scale 0.1
DRY RUN MODE - Preview Only (No Execution)
Output Directory: ./preview
┌─ System Profile ─┐
│ Component │ Value │
├───────────────┼──────────────────────────┤
│ OS │ Darwin 24.6.0 │
│ Architecture │ arm64 │
│ CPU Cores │ 12 physical, 12 logical │
│ Memory │ 36.0 GB total │
│ Python │ 3.13.5 │
└───────────────┴──────────────────────────┘
┌─ Database Configuration ─┐
│ Parameter │ Value │
├───────────────┼──────────────────────────┤
│ Platform │ duckdb │
│ Connection │ :memory: │
│ Dialect │ duckdb │
│ OLAP Support │ ✅ │
└───────────────┴──────────────────────────┘
┌─ Benchmark Configuration ─┐
│ Parameter │ Value │
├────────────────────┼──────────────────────────┤
│ Name │ TPC-H │
│ Scale Factor │ 0.1 │
│ Estimated Data │ 100 MB │
│ Query Count │ 22 │
│ Complexity │ Medium │
└────────────────────┴──────────────────────────┘
Extracted 22 queries:
✅ Query 1: Pricing Summary Report Query
✅ Query 2: Minimum Cost Supplier Query
✅ Query 3: Shipping Priority Query
✅ Query 4: Order Priority Checking Query
✅ Query 5: Local Supplier Volume Query
... (truncated)
Generated schema with 8 tables:
✅ customer (150,000 rows, 8 columns)
✅ lineitem (600,572 rows, 16 columns)
✅ nation (25 rows, 4 columns)
✅ orders (150,000 rows, 9 columns)
✅ part (20,000 rows, 9 columns)
✅ partsupp (80,000 rows, 5 columns)
✅ region (5 rows, 3 columns)
✅ supplier (1,000 rows, 7 columns)
Resource Estimates:
• Memory Required: ~256 MB
• Storage Required: ~100 MB
• Estimated Runtime: 2-5 minutes
• Queries with Joins: 15/22
• Complex Queries: 8/22
Saved complete preview to: ./preview/
├── summary.json (configuration details)
├── summary.yaml (human-readable config)
├── queries/ (22 SQL files)
├── schema.sql (table definitions)
└── resource_estimates.json
Features¶
Tuning Preview¶
When using --tuning, the dry run includes tuning configuration details:
benchbox run --dry-run ./tuned_preview \
--platform duckdb \
--benchmark tpcds \
--scale 0.1 \
--tuning
Additional output includes:
┌─ Tuning Configuration ─┐
│ Parameter │ Value │
├────────────────────────┼───────────┤
│ Tuning Enabled │ ✅ │
│ Primary Keys │ ✅ │
│ Foreign Keys │ ✅ │
│ Partitioning │ Applied │
│ Clustering │ Applied │
│ Distribution Keys │ Applied │
│ Sort Keys │ Applied │
└────────────────────────┴───────────┘
Applied Tunings:
✅ lineitem: PARTITION BY (l_shipdate), CLUSTER BY (l_orderkey)
✅ orders: PARTITION BY (o_orderdate), CLUSTER BY (o_custkey)
✅ customer: DISTRIBUTE BY (c_nationkey)
✅ supplier: DISTRIBUTE BY (s_nationkey)
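Programmatically, the applied tunings surface on the result object's tuning_config field (see the API reference below). A hedged sketch, assuming tuning_config maps table names to tuning details when --tuning is enabled:
# Continuing the Quick Start programmatic example.
# Assumes result.tuning_config maps table names to tuning details
# (it is None when tuning is disabled).
if result.tuning_config:
    for table, tuning in result.tuning_config.items():
        print(f"{table}: {tuning}")
else:
    print("No tuning configuration applied")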
Cross-Platform Preview¶
Compare configurations across multiple platforms:
# Preview for multiple platforms
for platform in duckdb clickhouse databricks; do
  benchbox run --dry-run ./preview_${platform} \
    --platform ${platform} \
    --benchmark tpch \
    --scale 0.01
done
# Compare extracted queries
diff ./preview_duckdb/queries/query_1.sql ./preview_clickhouse/queries/query_1.sql
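The same comparison works in Python with difflib, which scales better than manual diff calls when checking all 22 queries. A minimal sketch against the preview directories created above:
import difflib
from pathlib import Path

def diff_queries(dir_a: str, dir_b: str) -> None:
    """Summarize differences between matching query files in two previews."""
    for file_a in sorted(Path(dir_a, "queries").glob("*.sql")):
        file_b = Path(dir_b, "queries", file_a.name)
        if not file_b.exists():
            print(f"{file_a.name}: missing in {dir_b}")
            continue
        diff = list(difflib.unified_diff(
            file_a.read_text().splitlines(),
            file_b.read_text().splitlines(),
            fromfile=str(file_a), tofile=str(file_b), lineterm="",
        ))
        status = "identical" if not diff else f"{len(diff)} diff lines"
        print(f"{file_a.name}: {status}")

diff_queries("./preview_duckdb", "./preview_clickhouse")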
Resource Scaling Analysis¶
Test different scale factors to understand resource requirements:
# Test multiple scale factors
for scale in 0.001 0.01 0.1 1.0; do
  benchbox run --dry-run ./scale_${scale} \
    --platform duckdb \
    --benchmark tpch \
    --scale ${scale}

  # Extract memory estimates
  jq '.estimated_memory_mb' ./scale_${scale}/resource_estimates.json
done
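To collate the per-scale estimates into one view, read each resource_estimates.json back in. A sketch assuming the field names shown in the API reference below:
import json
from pathlib import Path

# Scale factors matching the shell loop above
for scale in ("0.001", "0.01", "0.1", "1.0"):
    path = Path(f"./scale_{scale}/resource_estimates.json")
    if not path.exists():
        continue
    est = json.loads(path.read_text())
    print(f"scale {scale:>5}: {est['estimated_memory_mb']:>6} MB memory, "
          f"{est['estimated_storage_mb']:>6} MB storage")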
Integration Patterns¶
Pre-commit Validation¶
Create a git pre-commit hook for benchmark validation:
#!/bin/bash
# .git/hooks/pre-commit
# Validate benchmark changes with dry run
if git diff --cached --name-only | grep -q "benchbox/"; then
  echo "Running benchmark validation..."

  # Test critical benchmarks; both dry runs must succeed
  if benchbox run --dry-run /tmp/validation_tpch \
       --platform duckdb --benchmark tpch --scale 0.001 && \
     benchbox run --dry-run /tmp/validation_primitives \
       --platform duckdb --benchmark primitives --scale 0.001; then
    echo "✅ Benchmark validation passed"
    rm -rf /tmp/validation_*
  else
    echo "❌ Benchmark validation failed"
    exit 1
  fi
fi
Documentation Generation¶
Generate benchmark documentation from dry run output:
# Generate query documentation
benchbox run --dry-run ./docs/tpcds_queries \
--platform duckdb \
--benchmark tpcds \
--scale 0.01
# Create markdown documentation
cat > ./docs/tpcds_benchmark.md << EOF
# TPC-DS Benchmark
Generated from dry run analysis on $(date).
## Queries
EOF
# Add query documentation
for query in ./docs/tpcds_queries/queries/*.sql; do
  echo "### $(basename "$query" .sql)" >> ./docs/tpcds_benchmark.md
  echo '```sql' >> ./docs/tpcds_benchmark.md
  head -20 "$query" >> ./docs/tpcds_benchmark.md
  echo '```' >> ./docs/tpcds_benchmark.md
  echo >> ./docs/tpcds_benchmark.md
done
Configuration Testing¶
Test configurations across environments:
#!/bin/bash
# test_configurations.sh
configurations=(
  "duckdb:tpch:0.01"
  "duckdb:tpcds:0.01"
  "duckdb:primitives:0.001"
  "sqlite:primitives:0.001"
)

for config in "${configurations[@]}"; do
  IFS=':' read -r database benchmark scale <<< "$config"
  echo "Testing $database + $benchmark (scale $scale)..."

  if benchbox run --dry-run "/tmp/test_${database}_${benchmark}" \
       --platform "$database" \
       --benchmark "$benchmark" \
       --scale "$scale"; then
    echo "✅ Configuration valid"
  else
    echo "❌ Configuration failed"
  fi

  rm -rf "/tmp/test_${database}_${benchmark}"
done
Query Validation¶
Run lightweight programmatic checks over the extracted queries:
from pathlib import Path

def validate_queries(dry_run_dir: str, dialect: str = "duckdb"):
    """Validate extracted SQL queries."""
    queries_dir = Path(dry_run_dir) / "queries"
    if not queries_dir.exists():
        return

    print(f"Validating queries for {dialect} dialect...")

    for query_file in queries_dir.glob("*.sql"):
        print(f"  Checking {query_file.name}...", end="")

        # Basic validation
        with open(query_file, "r") as f:
            content = f.read()

        # Check for common issues
        issues = []
        if not content.strip():
            issues.append("Empty query")
        if content.upper().count("SELECT") == 0:
            issues.append("No SELECT statement")
        if content.count("(") != content.count(")"):
            issues.append("Unbalanced parentheses")

        if not issues:
            print(" ✅")
        else:
            print(f" ❌ ({', '.join(issues)})")

# Usage
validate_queries("./my_dry_run", "duckdb")
Troubleshooting¶
Common Issues¶
“No queries extracted”
Check if benchmark name is correct
Verify benchmark supports the specified scale factor
Try a different scale factor (some benchmarks have minimum scales)
“Schema generation failed”
Ensure database platform is supported
Check if benchmark has schema definitions
Try without tuning options first
“Resource estimates unavailable”
Some benchmarks don’t support resource estimation
Check if benchmark has proper metadata
Estimates are approximate and may not be available for all benchmarks
Debugging with Dry Run¶
Configuration Issues:
# Test configuration without execution
benchbox run --dry-run ./debug \
--platform problematic_db \
--benchmark complex_benchmark \
--scale 1.0
# Check the output files for issues
jq '.errors // "No errors"' ./debug/summary.json
Query Issues:
# Preview queries for syntax validation
benchbox run --dry-run ./query_check \
--platform target_platform \
--benchmark tpch
# Validate SQL syntax externally
for query in ./query_check/queries/*.sql; do
  echo "Validating $query..."
  # Use your preferred SQL validator
  sqlfluff lint "$query" --dialect duckdb
done
Platform Compatibility:
# Test platform-specific configurations
benchbox run --dry-run ./platform_test \
--platform clickhouse \
--benchmark tpch \
--scale 0.01
# Check for platform-specific SQL differences
diff ./platform_test/queries/query_1.sql ./reference/duckdb_query_1.sql
Best Practices¶
Development Workflow¶
Start with Dry Run - Always use dry run to validate configuration before executing
Test Scale Factors - Use small scale factors (0.001-0.01) for development
Validate Queries - Check extracted queries for syntax and logic issues
Resource Planning - Use estimates to plan execution environment
Cross-Platform Testing - Test configurations across target platforms
Performance Optimization¶
Resource Estimates - Use memory and storage estimates to right-size environments
Query Analysis - Identify complex queries that may need optimization
Schema Tuning - Review tuning configurations before applying to production
Scaling Validation - Test resource scaling with different scale factors
Production Deployment¶
Configuration Validation - Validate all configurations in staging environment
Resource Provisioning - Use dry run estimates for capacity planning (see the sketch after this list)
Query Review - Review all queries for security and performance
Documentation - Generate documentation from dry run output
Monitoring Setup - Use estimates to configure monitoring thresholds
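For capacity planning, the memory estimate can be checked against the target host before a real run. A hedged sketch using the third-party psutil package (not a BenchBox API); the 2x headroom threshold is an arbitrary example:
import json
from pathlib import Path

import psutil  # third-party: pip install psutil

# Estimated memory from a previous dry run (see Output Structure)
estimate_mb = json.loads(
    Path("./preview/resource_estimates.json").read_text()
)["estimated_memory_mb"]

available_mb = psutil.virtual_memory().available / (1024 * 1024)
headroom = available_mb / estimate_mb

# Require 2x headroom before launching the real benchmark
if headroom < 2.0:
    print(f"Warning: only {headroom:.1f}x headroom "
          f"({available_mb:.0f} MB free vs ~{estimate_mb} MB estimated)")
else:
    print(f"OK: {headroom:.1f}x memory headroom")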
API Reference¶
DryRunExecutor¶
Primary class for executing dry runs.
class DryRunExecutor:
    def __init__(self, output_dir: Optional[Path] = None):
        """Initialize dry run executor.

        Args:
            output_dir: Directory to save dry run output
        """

    def execute_dry_run(
        self,
        benchmark_config: BenchmarkConfig,
        system_profile: SystemProfile,
        database_config: DatabaseConfig
    ) -> DryRunResult:
        """Execute dry run.

        Returns:
            DryRunResult with all preview information
        """
DryRunResult¶
Result object containing all dry run information.
@dataclass
class DryRunResult:
    timestamp: datetime
    benchmark_config: Dict[str, Any]
    system_profile: Dict[str, Any]
    database_config: Dict[str, Any]
    queries: Dict[str, str]
    schema_info: Dict[str, Any]
    resource_estimates: ResourceEstimates
    tuning_config: Optional[Dict[str, Any]]
ResourceEstimates¶
Resource requirement estimates.
@dataclass
class ResourceEstimates:
    estimated_memory_mb: int
    estimated_storage_mb: int
    estimated_runtime_minutes: float
    complexity_score: int
    join_count: int
    subquery_count: int
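Because these result types are plain dataclasses, they serialize cleanly, which is handy for archiving estimates as CI artifacts. A minimal sketch, reusing the result object from the Quick Start example:
import json
from dataclasses import asdict

# asdict() works because ResourceEstimates is a dataclass
with open("estimates_artifact.json", "w") as f:
    json.dump(asdict(result.resource_estimates), f, indent=2)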
The dry run feature is a powerful tool for development, testing, and production planning. Use it extensively to validate configurations, understand resource requirements, and ensure successful benchmark execution.