# Dry Run Mode

Tags: intermediate, guide, cli

The BenchBox dry run feature lets you validate configurations, preview queries, estimate resources, and troubleshoot setup issues without executing an actual benchmark.

## Overview

Dry run mode is essential for:

- **Configuration Validation** - Verify all parameters are set correctly
- **Query Preview** - Review SQL queries before execution
- **Resource Planning** - Understand memory and storage requirements
- **Debugging** - Troubleshoot benchmark setup issues
- **Documentation** - Generate examples and query references
- **Development** - Test changes without running full benchmarks

## Quick Start

### Basic CLI Usage

```bash
# Preview TPC-H benchmark configuration and queries
benchbox run --dry-run ./benchmark_runs/dryrun_previews --platform duckdb --benchmark tpch --scale 0.1

# Preview with tuning configurations
benchbox run --dry-run ./tuned_preview --platform duckdb --benchmark tpcds --scale 0.01 --tuning

# Preview the primitives benchmark suite
benchbox run --dry-run ./systematic_preview --platform duckdb --benchmark primitives --scale 0.001
```

### Seed Control in Dry Run

You can control the RNG seed used for parameter generation in TPC Power/Throughput test previews. This is useful at very small scales where certain seeds may cause dsqgen/qgen to fail parameterization.

```bash
# TPC-DS Power Test preview with seed
benchbox run --dry-run ./preview_tpcds_seed7 --platform duckdb --benchmark tpcds --phases power --scale 0.01 --seed 7

# TPC-DS Throughput Test preview with base seed
benchbox run --dry-run ./preview_tpcds_thr9 --platform duckdb --benchmark tpcds --phases throughput --scale 0.01 --seed 9

# TPC-H Power Test preview with seed
benchbox run --dry-run ./preview_tpch_seed5 --platform duckdb --benchmark tpch --phases power --scale 0.01 --seed 5
```

If a specific seed cannot generate all queries at a tiny scale, the CLI preflight validation will fail fast and report example failures. Try a different seed or a slightly larger scale.
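
For example, a small wrapper script can try several candidate seeds until one parameterizes cleanly (a sketch; it assumes `benchbox` is on your PATH and that the command exits non-zero when preflight validation fails):

```python
import subprocess

# Try a few candidate seeds until the dry run preflight passes.
# Sketch only: assumes `benchbox` is on PATH and returns a non-zero
# exit code when a seed fails parameterization.
for seed in (7, 9, 11, 13):
    result = subprocess.run([
        "benchbox", "run", "--dry-run", f"./preview_seed{seed}",
        "--platform", "duckdb", "--benchmark", "tpcds",
        "--phases", "power", "--scale", "0.01", "--seed", str(seed),
    ])
    if result.returncode == 0:
        print(f"Seed {seed} parameterized successfully")
        break
else:
    print("No candidate seed worked; try a larger scale factor")
```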


### Programmatic Usage

```python
from benchbox.cli.dryrun import DryRunExecutor
from benchbox.cli.system import SystemProfiler
from benchbox.cli.database import DatabaseConfig
from benchbox.cli.types import BenchmarkConfig
from pathlib import Path

# Setup dry run
output_dir = Path("./my_dry_run")
dry_run = DryRunExecutor(output_dir)

# Configure components
profiler = SystemProfiler()
system_profile = profiler.get_system_profile()

db_config = DatabaseConfig(
    platform="duckdb",
    connection_params={"database": ":memory:"},
    dialect="duckdb"
)

benchmark_config = BenchmarkConfig(
    name="tpch",
    scale_factor=0.01,
    platform="duckdb"
)

# Execute dry run
result = dry_run.execute_dry_run(
    benchmark_config=benchmark_config,
    system_profile=system_profile,
    database_config=db_config
)

print(f"Extracted {len(result.queries)} queries")
print(f"Generated schema with {len(result.schema_info.get('tables', {}))} tables")
print(f"Memory estimate: {result.resource_estimates.estimated_memory_mb} MB")

## Output Structure

A dry run creates an output directory with the following layout:

```
dry_run_output/
├── summary.json              # Complete configuration summary
├── summary.yaml              # Human-readable configuration
├── system_profile.json       # System information
├── queries/                  # Individual SQL query files
│   ├── query_1.sql
│   ├── query_2.sql
│   └── ...
├── schema.sql                # Database schema definition
└── resource_estimates.json   # Memory and performance estimates
```
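
A quick way to inspect the output programmatically (a sketch; the key names follow the `summary.json` example below):

```python
import json
from pathlib import Path

# Load the dry run summary and print a few headline numbers.
# Key names follow the summary.json example in this guide.
summary = json.loads(Path("dry_run_output/summary.json").read_text())
print(f"Benchmark: {summary['benchmark_config']['name']}")
print(f"Queries extracted: {len(summary['queries'])}")
print(f"Estimated memory: {summary['resource_estimates']['estimated_memory_mb']} MB")
```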

### Summary Files

`summary.json` - Complete structured output:

```json
{
  "timestamp": "2025-01-15T14:30:22.123456",
  "benchmark_config": {
    "name": "tpch",
    "scale_factor": 0.1,
    "platform": "duckdb"
  },
  "system_profile": {
    "os": "Darwin 24.6.0",
    "architecture": "arm64",
    "memory_gb": 36.0,
    "cpu_cores": 12
  },
  "database_config": {
    "platform": "duckdb",
    "dialect": "duckdb",
    "connection": ":memory:"
  },
  "queries": {
    "1": "SELECT ... FROM lineitem WHERE ...",
    "2": "SELECT ... FROM supplier, nation ..."
  },
  "schema_info": {
    "tables": {
      "lineitem": {"row_count": 600572, "columns": 16},
      "orders": {"row_count": 150000, "columns": 9}
    }
  },
  "resource_estimates": {
    "estimated_memory_mb": 256,
    "estimated_storage_mb": 100,
    "estimated_runtime_minutes": 3.2
  }
}
```

`summary.yaml` - Human-readable format:

```yaml
timestamp: 2025-01-15T14:30:22.123456
benchmark_config:
  name: tpch
  scale_factor: 0.1
  platform: duckdb
system_profile:
  os: Darwin 24.6.0
  architecture: arm64
  memory_gb: 36.0
  cpu_cores: 12
# ... continues ...
```

### Query Files

Individual SQL files are saved in the `queries/` directory:

`queries/query_1.sql`:

```sql
-- TPC-H Query 1: Pricing Summary Report
-- This query reports the amount of business that was billed, shipped, and returned.

SELECT
    l_returnflag,
    l_linestatus,
    SUM(l_quantity) AS sum_qty,
    SUM(l_extendedprice) AS sum_base_price,
    SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    AVG(l_quantity) AS avg_qty,
    AVG(l_extendedprice) AS avg_price,
    AVG(l_discount) AS avg_disc,
    COUNT(*) AS count_order
FROM
    lineitem
WHERE
    l_shipdate <= DATE '1998-12-01' - INTERVAL '90' DAY
GROUP BY
    l_returnflag,
    l_linestatus
ORDER BY
    l_returnflag,
    l_linestatus;
```

### Schema File

`schema.sql` - Complete database schema:

```sql
-- TPC-H Database Schema
-- Generated by BenchBox Dry Run

CREATE TABLE nation (
    n_nationkey  INTEGER,
    n_name       CHAR(25),
    n_regionkey  INTEGER,
    n_comment    VARCHAR(152)
);

CREATE TABLE region (
    r_regionkey  INTEGER,
    r_name       CHAR(25),
    r_comment    VARCHAR(152)
);

-- ... continues with all tables ...

-- Indexes (if tuning enabled)
CREATE INDEX idx_lineitem_shipdate ON lineitem(l_shipdate);
CREATE INDEX idx_orders_orderdate ON orders(o_orderdate);
-- ... continues ...
```

## Console Output

The dry run provides rich, formatted console output:

```
$ benchbox run --dry-run ./preview --platform duckdb --benchmark tpch --scale 0.1

DRY RUN MODE - Preview Only (No Execution)
Output Directory: ./preview

┌─ System Profile ───────────────────────────┐
│ Component     │ Value                      │
├───────────────┼────────────────────────────┤
│ OS            │ Darwin 24.6.0              │
│ Architecture  │ arm64                      │
│ CPU Cores     │ 12 physical, 12 logical    │
│ Memory        │ 36.0 GB total              │
│ Python        │ 3.13.5                     │
└───────────────┴────────────────────────────┘

┌─ Database Configuration ──────────────────┐
│ Parameter     │ Value                     │
├───────────────┼───────────────────────────┤
│ Platform      │ duckdb                    │
│ Connection    │ :memory:                  │
│ Dialect       │ duckdb                    │
│ OLAP Support  │                           │
└───────────────┴───────────────────────────┘

┌─ Benchmark Configuration ─────────────────────┐
│ Parameter          │ Value                    │
├────────────────────┼──────────────────────────┤
│ Name               │ TPC-H                    │
│ Scale Factor       │ 0.1                      │
│ Estimated Data     │ 100 MB                   │
│ Query Count        │ 22                       │
│ Complexity         │ Medium                   │
└────────────────────┴──────────────────────────┘

 Extracted 22 queries:
   Query 1: Pricing Summary Report Query
   Query 2: Minimum Cost Supplier Query
   Query 3: Shipping Priority Query
   Query 4: Order Priority Checking Query
   Query 5: Local Supplier Volume Query
  ... (truncated)

 Generated schema with 8 tables:
   customer (150,000 rows, 8 columns)
   lineitem (600,572 rows, 16 columns)
   nation (25 rows, 4 columns)
   orders (150,000 rows, 9 columns)
   part (20,000 rows, 9 columns)
   partsupp (80,000 rows, 5 columns)
   region (5 rows, 3 columns)
   supplier (1,000 rows, 7 columns)

 Resource Estimates:
   Memory Required: ~256 MB
   Storage Required: ~100 MB
   Estimated Runtime: 2-5 minutes
   Queries with Joins: 15/22
   Complex Queries: 8/22

 Saved complete preview to: ./preview/
  ├── summary.json (configuration details)
  ├── summary.yaml (human-readable config)
  ├── queries/ (22 SQL files)
  ├── schema.sql (table definitions)
  └── resource_estimates.json
```

## Features

### Tuning Preview

When using `--tuning`, the dry run includes tuning configuration details:

```bash
benchbox run --dry-run ./tuned_preview \
  --platform duckdb \
  --benchmark tpcds \
  --scale 0.1 \
  --tuning
```

Additional output includes:

```
┌─ Tuning Configuration ─────────────┐
│ Parameter              │ Value     │
├────────────────────────┼───────────┤
│ Tuning Enabled         │ ✅         │
│ Primary Keys           │ ✅         │
│ Foreign Keys           │ ✅         │
│ Partitioning           │ Applied   │
│ Clustering             │ Applied   │
│ Distribution Keys      │ Applied   │
│ Sort Keys              │ Applied   │
└────────────────────────┴───────────┘

 Applied Tunings:
  ✅ lineitem: PARTITION BY (l_shipdate), CLUSTER BY (l_orderkey)
  ✅ orders: PARTITION BY (o_orderdate), CLUSTER BY (o_custkey)
  ✅ customer: DISTRIBUTE BY (c_nationkey)
  ✅ supplier: DISTRIBUTE BY (s_nationkey)
```
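
You can also pull the applied tunings out of the saved summary (a sketch; it assumes `summary.json` carries a `tuning_config` section mirroring `DryRunResult.tuning_config`):

```python
import json
from pathlib import Path

# Print any tuning configuration captured by the dry run.
# Assumption: summary.json mirrors DryRunResult.tuning_config as a
# mapping of table names to tuning settings.
summary = json.loads(Path("./tuned_preview/summary.json").read_text())
tuning = summary.get("tuning_config")
if tuning:
    for table, settings in tuning.items():
        print(f"{table}: {settings}")
else:
    print("No tuning configuration recorded")
```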

### Cross-Platform Preview

Compare configurations across multiple platforms:

```bash
# Preview for multiple platforms
for platform in duckdb clickhouse databricks; do
  benchbox run --dry-run ./preview_${platform} \
    --platform ${platform} \
    --benchmark tpch \
    --scale 0.01
done

# Compare extracted queries
diff ./preview_duckdb/queries/query_1.sql ./preview_clickhouse/queries/query_1.sql
```
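
To compare every query rather than just the first, a short script using Python's difflib can report which files differ between two previews (a sketch; directory names follow the loop above):

```python
from difflib import unified_diff
from pathlib import Path

# Report which extracted queries differ between two platform previews.
# Directory names follow the preview loop above.
left = Path("./preview_duckdb/queries")
right = Path("./preview_clickhouse/queries")

for query in sorted(left.glob("*.sql")):
    other = right / query.name
    if not other.exists():
        print(f"{query.name}: missing in clickhouse preview")
        continue
    diff = list(unified_diff(
        query.read_text().splitlines(),
        other.read_text().splitlines(),
        lineterm="",
    ))
    if diff:
        print(f"{query.name}: {len(diff)} differing lines")
```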

### Resource Scaling Analysis

Test different scale factors to understand resource requirements:

```bash
# Test multiple scale factors
for scale in 0.001 0.01 0.1 1.0; do
  benchbox run --dry-run ./scale_${scale} \
    --platform duckdb \
    --benchmark tpch \
    --scale ${scale}

  # Extract memory estimates
  cat ./scale_${scale}/resource_estimates.json | jq '.estimated_memory_mb'
done
```
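
To tabulate the estimates across runs (a sketch; it reads the `resource_estimates.json` files produced by the loop above):

```python
import json
from pathlib import Path

# Collect resource estimates from each scale-factor preview into a table.
# Field names follow the ResourceEstimates dataclass documented below.
print(f"{'scale':>8} {'memory_mb':>10} {'storage_mb':>11}")
for scale in ("0.001", "0.01", "0.1", "1.0"):
    estimates_file = Path(f"./scale_{scale}/resource_estimates.json")
    if not estimates_file.exists():
        continue
    est = json.loads(estimates_file.read_text())
    print(f"{scale:>8} {est['estimated_memory_mb']:>10} {est['estimated_storage_mb']:>11}")
```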

## Integration Patterns

### Pre-commit Validation

Create a git pre-commit hook for benchmark validation:

```bash
#!/bin/bash
# .git/hooks/pre-commit
# Validate benchmark changes with dry run

if git diff --cached --name-only | grep -q "benchbox/"; then
  echo "Running benchmark validation..."

  # Test critical benchmarks; fail the commit if either dry run fails
  if benchbox run --dry-run /tmp/validation_tpch \
       --platform duckdb --benchmark tpch --scale 0.001 &&
     benchbox run --dry-run /tmp/validation_primitives \
       --platform duckdb --benchmark primitives --scale 0.001; then
    echo "✅ Benchmark validation passed"
    rm -rf /tmp/validation_*
  else
    echo "❌ Benchmark validation failed"
    exit 1
  fi
fi
```

### Documentation Generation

Generate benchmark documentation from dry run output:

````bash
# Generate query documentation
benchbox run --dry-run ./docs/tpcds_queries \
  --platform duckdb \
  --benchmark tpcds \
  --scale 0.01

# Create markdown documentation
cat > ./docs/tpcds_benchmark.md << EOF
# TPC-DS Benchmark

Generated from dry run analysis on $(date).

## Queries
EOF

# Add query documentation
for query in ./docs/tpcds_queries/queries/*.sql; do
  echo "### $(basename $query .sql)" >> ./docs/tpcds_benchmark.md
  echo '```sql' >> ./docs/tpcds_benchmark.md
  head -20 "$query" >> ./docs/tpcds_benchmark.md
  echo '```' >> ./docs/tpcds_benchmark.md
  echo >> ./docs/tpcds_benchmark.md
done
````

### Configuration Testing

Test configurations across environments:

```bash
#!/bin/bash
# test_configurations.sh

configurations=(
  "duckdb:tpch:0.01"
  "duckdb:tpcds:0.01"
  "duckdb:primitives:0.001"
  "sqlite:primitives:0.001"
)

for config in "${configurations[@]}"; do
  IFS=':' read -r database benchmark scale <<< "$config"

  echo "Testing $database + $benchmark (scale $scale)..."

  if benchbox run --dry-run "/tmp/test_${database}_${benchmark}" \
    --platform "$database" \
    --benchmark "$benchmark" \
    --scale "$scale"; then
    echo "✅ Configuration valid"
  else
    echo "❌ Configuration failed"
  fi

  rm -rf "/tmp/test_${database}_${benchmark}"
done
```

### Query Validation

Run quick sanity checks on the extracted queries (or plug in an external linter):

```python
from pathlib import Path

def validate_queries(dry_run_dir: str, dialect: str = "duckdb"):
    """Validate extracted SQL queries."""

    queries_dir = Path(dry_run_dir) / "queries"
    if not queries_dir.exists():
        return

    print(f"Validating queries for {dialect} dialect...")

    for query_file in queries_dir.glob("*.sql"):
        print(f"  Checking {query_file.name}...", end="")

        # Basic validation
        with open(query_file, 'r') as f:
            content = f.read()

        # Check for common issues
        issues = []
        if not content.strip():
            issues.append("Empty query")
        if content.upper().count('SELECT') == 0:
            issues.append("No SELECT statement")
        if content.count('(') != content.count(')'):
            issues.append("Unbalanced parentheses")

        if not issues:
            print(" ✅")
        else:
            print(f" ❌ ({', '.join(issues)})")

# Usage
validate_queries("./my_dry_run", "duckdb")

## Troubleshooting

### Common Issues

**"No queries extracted"**

- Check that the benchmark name is correct
- Verify the benchmark supports the specified scale factor
- Try a different scale factor (some benchmarks have minimum scales)

**"Schema generation failed"**

- Ensure the database platform is supported
- Check that the benchmark has schema definitions
- Try without tuning options first

**"Resource estimates unavailable"**

- Some benchmarks don't support resource estimation
- Check that the benchmark has proper metadata
- Estimates are approximate and may not be available for all benchmarks; handle a missing file gracefully, as in the sketch below
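
A defensive pattern for consuming estimates (a sketch; the file layout follows the Output Structure section):

```python
import json
from pathlib import Path

# Read resource estimates if present; fall back gracefully when the
# benchmark does not support estimation.
estimates_file = Path("./my_dry_run/resource_estimates.json")
if estimates_file.exists():
    est = json.loads(estimates_file.read_text())
    print(f"Plan for roughly {est['estimated_memory_mb']} MB of memory")
else:
    print("No estimates available; size the environment manually")
```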

### Debugging with Dry Run

**Configuration Issues:**

```bash
# Test configuration without execution
benchbox run --dry-run ./debug \
  --platform problematic_db \
  --benchmark complex_benchmark \
  --scale 1.0

# Check the output files for issues
cat ./debug/summary.json | jq '.errors // "No errors"'
```

**Query Issues:**

```bash
# Preview queries for syntax validation
benchbox run --dry-run ./query_check \
  --platform target_platform \
  --benchmark tpch

# Validate SQL syntax externally
for query in ./query_check/queries/*.sql; do
  echo "Validating $query..."
  # Use your preferred SQL validator
  sqlfluff lint "$query" --dialect duckdb
done
```

**Platform Compatibility:**

```bash
# Test platform-specific configurations
benchbox run --dry-run ./platform_test \
  --platform clickhouse \
  --benchmark tpch \
  --scale 0.01

# Check for platform-specific SQL differences
diff ./platform_test/queries/query_1.sql ./reference/duckdb_query_1.sql
```

## Best Practices

### Development Workflow

1. **Start with Dry Run** - Always use dry run to validate configuration before executing
2. **Test Scale Factors** - Use small scale factors (0.001-0.01) for development
3. **Validate Queries** - Check extracted queries for syntax and logic issues
4. **Resource Planning** - Use estimates to plan the execution environment
5. **Cross-Platform Testing** - Test configurations across target platforms

### Performance Optimization

1. **Resource Estimates** - Use memory and storage estimates to right-size environments
2. **Query Analysis** - Identify complex queries that may need optimization
3. **Schema Tuning** - Review tuning configurations before applying to production
4. **Scaling Validation** - Test resource scaling with different scale factors

### Production Deployment

1. **Configuration Validation** - Validate all configurations in a staging environment
2. **Resource Provisioning** - Use dry run estimates for capacity planning
3. **Query Review** - Review all queries for security and performance
4. **Documentation** - Generate documentation from dry run output
5. **Monitoring Setup** - Use estimates to configure monitoring thresholds

## API Reference

### DryRunExecutor

Primary class for executing dry runs.

```python
class DryRunExecutor:
    def __init__(self, output_dir: Optional[Path] = None):
        """Initialize dry run executor.

        Args:
            output_dir: Directory to save dry run output
        """

    def execute_dry_run(
        self,
        benchmark_config: BenchmarkConfig,
        system_profile: SystemProfile,
        database_config: DatabaseConfig
    ) -> DryRunResult:
        """Execute dry run.

        Returns:
            DryRunResult with all preview information
        """

### DryRunResult

Result object containing all dry run information.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

@dataclass
class DryRunResult:
    timestamp: datetime
    benchmark_config: Dict[str, Any]
    system_profile: Dict[str, Any]
    database_config: Dict[str, Any]
    queries: Dict[str, str]
    schema_info: Dict[str, Any]
    resource_estimates: ResourceEstimates
    tuning_config: Optional[Dict[str, Any]]
```

### ResourceEstimates

Resource requirement estimates.

```python
@dataclass
class ResourceEstimates:
    estimated_memory_mb: int
    estimated_storage_mb: int
    estimated_runtime_minutes: float
    complexity_score: int
    join_count: int
    subquery_count: int
```
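
Putting the pieces together, you might gate execution on the estimates (a sketch; `result` is the `DryRunResult` returned by `execute_dry_run` under Programmatic Usage, and the 32 GB budget is illustrative, not a BenchBox default):

```python
def within_budget(result, memory_budget_mb: int = 32 * 1024) -> bool:
    """Return True if the dry run estimates fit the given memory budget.

    Sketch only: `result` is a DryRunResult; the default budget is an
    illustrative threshold, not a BenchBox default.
    """
    est = result.resource_estimates
    if est.estimated_memory_mb > memory_budget_mb:
        print(f"Needs ~{est.estimated_memory_mb} MB; over budget")
        return False
    print(f"OK to run: ~{est.estimated_runtime_minutes:.1f} minutes expected")
    return True
```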

The dry run feature is a powerful tool for development, testing, and production planning. Use it extensively to validate configurations, understand resource requirements, and ensure successful benchmark execution.