BenchBox Architecture

Tags: concept, intermediate, python-api

High-level overview of BenchBox’s design, components, and extension points.

Design Philosophy

BenchBox is built on three core principles:

  1. Separation of Concerns: Benchmark logic, platform adapters, and query execution are independent layers

  2. Extensibility: New benchmarks and platforms can be added without modifying core code

  3. Standards Compliance: TPC benchmarks follow official specifications for reproducibility

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                       CLI / Python API                       │
│                    (User Entry Points)                       │
└──────────────────┬──────────────────────────────────────────┘
                   │
┌──────────────────┴──────────────────────────────────────────┐
│                   Orchestration Layer                        │
│  ┌──────────────────────────────────────────────────────┐  │
│  │  Run Configuration   │   Query Selection             │  │
│  │  Parameter Generation│   Result Collection           │  │
│  └──────────────────────────────────────────────────────┘  │
└──────────────────┬──────────────────────────────────────────┘
                   │
     ┌─────────────┼─────────────┐
     │             │             │
┌────▼────┐  ┌────▼────┐  ┌────▼────┐
│Benchmark│  │Platform │  │ Results │
│  Layer  │  │ Adapter │  │  Model  │
└─────────┘  └─────────┘  └─────────┘
     │             │             │
┌────▼────────────▼─────────────▼────┐
│       Database Connection          │
│     (Platform-Specific Driver)     │
└────────────────────────────────────┘

Core Components

1. Benchmark Layer

Location: benchbox/core/{benchmark}/

Responsibility: Encapsulates benchmark-specific logic

Key Classes:

  • BaseBenchmark: Abstract base class all benchmarks inherit from

  • {Benchmark}: Implementation class (e.g., TPCH, TPCDS)

  • {Benchmark}Generator: Data generation logic

  • {Benchmark}Queries: Query templates and parameterization

Example:

from benchbox import TPCH

# Benchmark knows:
# - How to generate data (dbgen invocation)
# - How to retrieve queries (with parameter substitution)
# - Schema definitions
# - Validation rules
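
A minimal usage sketch (the constructor argument and method names below appear elsewhere on this page, but exact signatures may differ):

from benchbox import TPCH

# Scale factor 1.0 corresponds to the standard TPC-H SF1 dataset
tpch = TPCH(scale_factor=1.0)

# Generate data if it does not already exist, then fetch a parameterized query
tpch.generate_data()
sql = tpch.get_query("q1")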

See: Custom Benchmarks Guide

2. Platform Adapter Layer

Location: benchbox/platforms/{platform}/

Responsibility: Abstracts database-specific connection and execution logic

Key Classes:

  • {Platform}Adapter: Main adapter class (e.g., DuckDBAdapter)

  • {Platform}Connection: Connection wrapper implementing DatabaseConnection

  • Platform-specific optimizations (bulk loading, query hints)

Example:

from benchbox.platforms.duckdb import DuckDBAdapter

adapter = DuckDBAdapter()

# Adapter handles:
# - Connection management
# - Data loading strategies (COPY, INSERT, external tables)
# - Query execution and error handling
# - Result collection and formatting

Supported Adapters:

  • DuckDB (embedded, local files)

  • ClickHouse (native protocol)

  • Databricks (SQL Warehouse, Unity Catalog)

  • BigQuery (serverless, cloud storage)

  • Snowflake (warehouse, stages)

  • Redshift (serverless, provisioned)

  • SQLite (testing, minimal datasets)

See: Platform Selection Guide

3. Results Model

Location: benchbox/core/results/

Responsibility: Structured representation of benchmark execution results

Key Classes:

  • BenchmarkResults: Main result object with timing, metadata, validation

  • QueryResult: Individual query execution details

  • ExecutionPhases: Setup, power test, throughput test phases

  • ValidationResult: Data quality and correctness checks

Schema:

{
    "benchmark_name": "TPC-H",
    "platform": "DuckDB",
    "scale_factor": 1.0,
    "execution_id": "tpch_1234567890",
    "timestamp": "2025-10-12T10:30:00Z",
    "total_execution_time": 45.2,
    "query_results": [
        {
            "query_id": "q1",
            "execution_time": 2.1,
            "status": "SUCCESS",
            "row_count": 4
        }
    ],
    "execution_phases": { ... },
    "validation_status": "PASSED"
}
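
Because results are persisted as JSON in this shape, they can be inspected with the standard library alone. A short sketch (the file path is illustrative):

import json

# Load a saved results file (path is illustrative)
with open("results/tpch_1234567890.json") as f:
    results = json.load(f)

# Per-query timing summary based on the schema above
for q in results["query_results"]:
    print(f'{q["query_id"]}: {q["execution_time"]:.2f}s ({q["status"]})')

print(f'Total: {results["total_execution_time"]:.1f}s, validation: {results["validation_status"]}')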

See: Result Schema Reference

4. Connection Abstraction

Location: benchbox/core/connection.py

Responsibility: Unified interface for database operations

Key Interface:

from abc import ABC, abstractmethod
from typing import Any

class DatabaseConnection(ABC):
    @abstractmethod
    def execute(self, query: str) -> Any:
        """Execute query, return cursor/result"""

    @abstractmethod
    def fetchall(self, cursor) -> list:
        """Fetch all results from cursor"""

    @abstractmethod
    def close(self) -> None:
        """Close connection"""

All platform adapters implement this interface, enabling benchmark code to remain platform-agnostic.
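
A minimal sketch of what an implementation can look like, reusing the DatabaseConnection ABC defined above and the standard-library sqlite3 driver (SQLite is listed among the supported adapters); the real {Platform}Connection classes in benchbox/platforms/ are more elaborate:

import sqlite3
from typing import Any

class SQLiteConnection(DatabaseConnection):
    """Wraps a sqlite3 connection behind the shared interface (sketch only)."""

    def __init__(self, database: str = ":memory:") -> None:
        self._conn = sqlite3.connect(database)

    def execute(self, query: str) -> Any:
        # Return the cursor so callers can fetch from it later
        return self._conn.execute(query)

    def fetchall(self, cursor) -> list:
        return cursor.fetchall()

    def close(self) -> None:
        self._conn.close()

Because benchmark code only calls execute, fetchall, and close, swapping one {Platform}Connection for another requires no changes in the benchmark layer.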

5. Data Generation

Location: benchbox/core/{benchmark}/generator.py

Responsibility: Create synthetic benchmark data

Mechanisms:

  • TPC Benchmarks: Invoke official C binaries (dbgen, dsdgen, datagen)

  • Custom Benchmarks: Python-based generation using Faker, NumPy, Pandas

  • Output Formats: Parquet (default), CSV, JSON

Example:

# Import paths are illustrative; generators live in benchbox/core/{benchmark}/generator.py
from benchbox.core.tpch.generator import TPCHGenerator
from benchbox.core.coffee_shop.generator import CoffeeShopGenerator

# TPC-H uses official dbgen binary
generator = TPCHGenerator(scale_factor=1.0, output_dir="./data")
file_paths = generator.generate()  # Returns list of .parquet files

# Custom benchmarks use Python
generator = CoffeeShopGenerator(scale_factor=0.001)
file_paths = generator.generate()  # Generates with Faker

See: Data Generation Guide

6. CLI Orchestration

Location: benchbox/cli/

Responsibility: Command-line interface and workflow orchestration

Commands:

# Run benchmark end-to-end
benchbox run --benchmark tpch --platform duckdb --scale 1

# Generate data only
benchbox datagen --benchmark tpcds --scale 0.1

# Dry run (preview queries)
benchbox run --benchmark tpch --dry-run ./output

# Check dependencies
benchbox check-deps --matrix

Orchestrator Flow:

  1. Parse CLI arguments and config files

  2. Validate platform dependencies

  3. Initialize benchmark and platform adapter

  4. Execute benchmark phases (setup, queries, validation)

  5. Collect and save results

  6. Generate reports

See: CLI Quick Start

Data Flow

End-to-End Execution Flow

1. User Command
   ├── benchbox run --benchmark tpch --platform duckdb --scale 1
   └── Parsed by CLI → creates RunConfiguration

2. Benchmark Initialization
   ├── TPCH(scale_factor=1.0)
   ├── Checks if data exists at output_dir
   └── generate_data() if needed

3. Platform Adapter Setup
   ├── DuckDBAdapter()
   ├── Establishes database connection
   └── Creates schema (CREATE TABLE statements)

4. Data Loading Phase
   ├── Platform-specific bulk load (COPY FROM, external tables)
   ├── Validates row counts
   └── Creates indexes/constraints

5. Query Execution Phase
   ├── For each query in benchmark:
   │   ├── Get query text with parameters
   │   ├── Execute via platform adapter
   │   ├── Measure execution time
   │   └── Collect results
   └── Aggregate timing statistics

6. Results Collection
   ├── Create BenchmarkResults object
   ├── Include execution metadata (platform info, system profile)
   ├── Validate results (if validation rules exist)
   └── Save to JSON file

7. Output
   ├── Print summary to console
   ├── Save detailed JSON to output_dir
   └── Optional: Upload to cloud storage
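
As a rough sketch, step 5 of this flow (the query-execution loop) maps onto the APIs described earlier roughly as follows; the real orchestrator also handles retries, phase bookkeeping, and validation:

import time

def run_query_phase(benchmark, conn, query_ids):
    """Execute each query once and record wall-clock timings (sketch only)."""
    results = []
    for qid in query_ids:
        sql = benchmark.get_query(qid)            # query text with parameters
        start = time.perf_counter()
        cursor = conn.execute(sql)                # via the platform adapter
        rows = conn.fetchall(cursor)              # collect results
        results.append({
            "query_id": qid,
            "execution_time": time.perf_counter() - start,
            "status": "SUCCESS",
            "row_count": len(rows),
        })
    return results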

Extension Points

Adding a New Benchmark

  1. Create benchbox/core/{benchmark}/ directory

  2. Implement classes:

    • {Benchmark}BenchmarkImpl(BaseBenchmark)

    • {Benchmark}Generator

    • {Benchmark}Queries

    • {Benchmark}Schema

  3. Register in benchbox/__init__.py
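
A sketch of step 2. The method names generate_data and get_query mirror those used elsewhere on this page, but BaseBenchmark's actual abstract interface and the import paths shown here are assumptions; see the Custom Benchmarks Guide for the real hooks:

# Import paths are illustrative
from benchbox import BaseBenchmark
from benchbox.core.coffee_shop.generator import CoffeeShopGenerator

class CoffeeShopBenchmarkImpl(BaseBenchmark):
    """Custom benchmark following the {Benchmark}BenchmarkImpl layout."""

    def __init__(self, scale_factor: float = 0.001) -> None:
        self.scale_factor = scale_factor

    def generate_data(self) -> list[str]:
        # Python-based generation (Faker) instead of a TPC C binary
        return CoffeeShopGenerator(scale_factor=self.scale_factor).generate()

    def get_query(self, query_id: str, params: dict | None = None) -> str:
        # Normally delegated to a {Benchmark}Queries class; inlined for brevity
        templates = {"q1": "SELECT count(*) FROM orders WHERE order_date >= '{date}'"}
        return templates[query_id].format(**(params or {"date": "2025-01-01"}))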

See: Custom Benchmarks Guide

Adding a New Platform

  1. Create benchbox/platforms/{platform}/ directory

  2. Implement:

    • {Platform}Adapter

    • {Platform}Connection(DatabaseConnection)

  3. Add platform extras to pyproject.toml

  4. Register in benchbox/platforms/__init__.py
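
A simplified sketch of the adapter half of step 2, using DuckDB-style Parquet ingestion as the example load strategy. It reuses a DatabaseConnection such as the SQLiteConnection sketch in section 4; the create_schema and load_table method names are assumptions, not the real adapter API:

from benchbox.core.connection import DatabaseConnection

class ExampleAdapter:
    """Sketch of a {Platform}Adapter: wires a connection to a load strategy."""

    def __init__(self, connection: DatabaseConnection) -> None:
        self.connection = connection

    def create_schema(self, ddl_statements: list[str]) -> None:
        for ddl in ddl_statements:
            self.connection.execute(ddl)

    def load_table(self, table: str, parquet_path: str) -> None:
        # DuckDB can ingest Parquet directly via COPY; other platforms would
        # use INSERT batches or external tables here
        self.connection.execute(f"COPY {table} FROM '{parquet_path}' (FORMAT PARQUET)")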

See: Adding New Platforms

Adding Query Parameter Variants

TPC benchmarks support query variants with different parameter substitutions:

# Get query with random parameters (default)
query = benchmark.get_query("q1")

# Get query with specific parameters
query = benchmark.get_query("q1", params={"date": "1998-09-02", "quantity": 24})

# Generate multiple variants for seed sweep
variants = benchmark.generate_query_variants("q1", count=5, seed_start=42)

See: TPC Patterns Usage

Design Patterns

1. Adapter Pattern

Platform adapters isolate database-specific logic, allowing benchmarks to remain platform-agnostic.

2. Template Method Pattern

BaseBenchmark defines the benchmark execution workflow, with subclasses providing specific implementations.

3. Strategy Pattern

Data loading strategies vary by platform (bulk COPY, INSERT batches, external tables) but implement a common interface.

4. Factory Pattern

get_platform_adapter(name) creates appropriate adapter instances based on configuration.
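
A minimal sketch of how such a factory can be wired; the registry contents and error handling are illustrative, not the actual benchbox implementation:

from importlib import import_module

# Maps configuration names to (module, class); entries are illustrative
_ADAPTERS = {
    "duckdb": ("benchbox.platforms.duckdb", "DuckDBAdapter"),
    "sqlite": ("benchbox.platforms.sqlite", "SQLiteAdapter"),
}

def get_platform_adapter(name: str, **kwargs):
    """Resolve a platform name from configuration to an adapter instance."""
    try:
        module_name, class_name = _ADAPTERS[name]
    except KeyError:
        raise ValueError(f"Unknown platform: {name!r}") from None
    adapter_cls = getattr(import_module(module_name), class_name)
    return adapter_cls(**kwargs)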

Performance Considerations

Data Loading Optimization

  • Parquet Format: Columnar storage for fast analytical queries

  • Bulk Loading: Platform-specific optimizations (DuckDB COPY, ClickHouse INSERT, BigQuery external tables)

  • Parallel Loading: Multi-threaded data ingestion where supported

Query Execution Optimization

  • Connection Pooling: Reuse connections across queries

  • Result Streaming: Fetch results incrementally for large result sets

  • Query Compilation: Leverage platform-specific prepared statements

Memory Management

  • Scale Factor Validation: Prevent OOM by validating scale vs. available memory

  • Streaming Results: Don’t materialize full result sets unless needed

  • Cleanup: Explicit connection and resource cleanup
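
For the streaming point above, a small sketch, assuming the underlying cursor follows the DB-API fetchmany convention (not every adapter exposes this):

def iter_rows(cursor, batch_size: int = 10_000):
    """Yield rows in batches instead of materializing the full result set."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch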

Security Considerations

Credential Management

  • Environment variables for sensitive data

  • Support for platform-specific auth (OAuth, IAM, service accounts)

  • No credentials in configuration files or logs

SQL Injection Prevention

  • Parameterized queries where possible

  • Input validation on query IDs and parameters

  • No direct string concatenation into SQL

Data Privacy

  • Support for synthetic data only (no real customer data)

  • Anonymization support for custom benchmarks

  • Data residency controls for cloud platforms

Testing Architecture

Test Layers

  1. Unit Tests (tests/unit/): Component-level testing

  2. Integration Tests (tests/integration/): Database interaction tests

  3. Live Tests (tests/integration/platforms/): Real cloud platform testing

See: Testing Guide