Python API Reference¶

Complete reference for BenchBox’s Python API, covering benchmarks, platform adapters, results, and utilities.

Overview¶

BenchBox provides a comprehensive Python API for programmatic benchmark execution. The API is organized into several layers:

Benchmark Layer: Abstract interfaces and concrete benchmark implementations
Platform Layer: Database-specific adapters for query execution
Results Layer: Structured result objects and validation
Utilities Layer: Helper functions for common operations

Quick Start¶

Basic benchmark execution:

from benchbox.tpch import TPCH
from benchbox.platforms.duckdb import DuckDBAdapter

# Create benchmark and platform
benchmark = TPCH(scale_factor=0.1)
adapter = DuckDBAdapter()

# Run benchmark
results = benchmark.run_with_platform(adapter)

# Access results
print(f"Completed {results.successful_queries} queries")
print(f"Average time: {results.average_query_time:.3f}s")

API Organization¶

Core APIs¶

Platform Adapters¶

Utilities¶

Performance & Monitoring¶

Common Patterns¶

Data Generation¶

from benchbox.tpcds import TPCDS

# Generate data at specific scale
benchmark = TPCDS(scale_factor=1.0, output_dir="./tpcds_data")
data_files = benchmark.generate_data()

# Reuse generated data
benchmark2 = TPCDS(scale_factor=1.0, output_dir="./tpcds_data")
# Skip regeneration if data exists

Query Access¶

from benchbox.tpch import TPCH

benchmark = TPCH(scale_factor=0.1)

# Get all queries
queries = benchmark.get_queries()

# Get specific query
q1 = benchmark.get_query("q1")

# Get query with parameters
q1_parameterized = benchmark.get_query("q1", params={"date": "1998-09-02"})

Platform Execution¶

from benchbox.clickbench import ClickBench
from benchbox.platforms.clickhouse import ClickHouseAdapter

# Initialize components
benchmark = ClickBench(scale_factor=0.01)
adapter = ClickHouseAdapter(
    host="localhost",
    port=9000,
    database="benchmark"
)

# Run with platform optimizations
results = benchmark.run_with_platform(
    adapter,
    query_subset=["Q1", "Q2", "Q3"]  # Optional filtering
)

Result Analysis¶

from benchbox.core.results.models import BenchmarkResults

# Load results from file
results = BenchmarkResults.from_json_file("results.json")

# Analyze query performance
for qr in results.query_results:
    if qr.status == "SUCCESS":
        print(f"{qr.query_id}: {qr.execution_time:.3f}s")

# Calculate geometric mean
import math
times = [qr.execution_time for qr in results.query_results
         if qr.status == "SUCCESS"]
geomean = math.prod(times) ** (1.0 / len(times))

Cross-Platform Comparison¶

from benchbox.tpch import TPCH
from benchbox.platforms.duckdb import DuckDBAdapter
from benchbox.platforms.clickhouse import ClickHouseAdapter

benchmark = TPCH(scale_factor=1.0)

platforms = {
    "DuckDB": DuckDBAdapter(),
    "ClickHouse": ClickHouseAdapter(host="localhost")
}

results = {}
for name, adapter in platforms.items():
    print(f"Running on {name}...")
    results[name] = benchmark.run_with_platform(adapter)

# Compare performance
for name, result in results.items():
    print(f"{name}: {result.total_execution_time:.2f}s")

Error Handling¶

from benchbox.tpch import TPCH
from benchbox.platforms.duckdb import DuckDBAdapter

try:
    benchmark = TPCH(scale_factor=0.1)
    adapter = DuckDBAdapter()
    results = benchmark.run_with_platform(adapter)

    # Check for query failures
    if results.failed_queries > 0:
        print(f"Warning: {results.failed_queries} queries failed")
        for qr in results.query_results:
            if qr.status == "FAILED":
                print(f"  {qr.query_id}: {qr.error_message}")

except ValueError as e:
    print(f"Configuration error: {e}")
except Exception as e:
    print(f"Execution error: {e}")

Type Hints¶

BenchBox provides comprehensive type hints for IDE support:

from typing import Optional, Dict, Any, List
from benchbox.base import BaseBenchmark
from benchbox.core.results.models import BenchmarkResults

def run_benchmark(
    benchmark: BaseBenchmark,
    adapter,
    config: Optional[Dict[str, Any]] = None
) -> BenchmarkResults:
    """Run benchmark with type-checked parameters."""
    return benchmark.run_with_platform(adapter, **(config or {}))

Configuration Classes¶

Platform adapters accept configuration via constructor parameters:

# DuckDB configuration
from benchbox.platforms.duckdb import DuckDBAdapter

adapter = DuckDBAdapter(
    database_path=":memory:",  # or file path
    memory_limit="4GB",
    thread_limit=4,
    enable_profiling=True
)

# ClickHouse configuration
from benchbox.platforms.clickhouse import ClickHouseAdapter

adapter = ClickHouseAdapter(
    host="localhost",
    port=9000,
    database="benchmark",
    username="default",
    password="",
    settings={
        "max_memory_usage": "8GB",
        "max_threads": 8
    }
)