Platform Comparison Guide

Tags: intermediate, guide, sql-platform

This guide covers how to compare benchmark performance across platforms using the unified benchbox compare command. Compare SQL platforms, DataFrame platforms, or both.

Overview

BenchBox provides a unified comparison interface supporting:

  • SQL Platform Comparisons: DuckDB vs SQLite vs PostgreSQL vs ClickHouse

  • DataFrame Platform Comparisons: Polars vs Pandas vs DataFusion

  • Mixed Comparisons: Compare SQL and DataFrame platforms together

  • File-Based Comparisons: Compare previously saved result files

  • Performance Visualization: Generate charts and reports automatically

Quick Start

List Available Platforms

# List all available platforms
benchbox compare --list-platforms

The output lists SQL and DataFrame platforms, grouped by availability:

SQL Platforms:
  Installed:
    duckdb           (embedded, analytical)
    sqlite           (embedded, transactional)
    postgresql       (server, transactional)
    clickhouse       (server, analytical)

  Cloud (requires configuration):
    snowflake        (cloud, analytical)
    bigquery         (cloud, analytical)
    databricks       (cloud, analytical)

DataFrame Platforms:
  Installed:
    polars-df        (expression, single_node) [lazy, streaming]
    pandas-df        (pandas, single_node) [standard]
    datafusion-df    (expression, single_node) [lazy]

  Not installed:
    pyspark-df       (expression, distributed)
    cudf-df          (pandas, gpu_accelerated)

Compare SQL Platforms

# Compare DuckDB vs SQLite on TPC-H SF 0.01
benchbox compare -p duckdb -p sqlite --scale 0.01

Compare DataFrame Platforms

# Compare Polars vs Pandas
benchbox compare -p polars-df -p pandas-df --scale 0.01

Compare Result Files

# Compare previously saved results
benchbox compare results/duckdb.json results/sqlite.json

Interactive Mode

# Launch interactive wizard
benchbox compare

The interactive wizard guides you through:

  1. Choosing comparison type (platforms or files)

  2. Selecting platforms from an interactive list

  3. Configuring benchmark options

  4. Running the comparison

Command Reference

benchbox compare [OPTIONS] [RESULT_FILES]...

Modes

Mode is automatically detected from arguments:

Mode          Trigger          Description
------------------------------------------------------------
Run Mode      -p/--platform    Run benchmarks across platforms, then compare
File Mode     RESULT_FILES     Compare existing result files
Interactive   No arguments     Launch the interactive wizard
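The dispatch follows a simple precedence; here is a minimal Python sketch of the equivalent logic (illustrative only, not the actual CLI implementation):

def detect_mode(platforms: list[str], result_files: list[str]) -> str:
    """Mirror the CLI's mode detection (illustrative sketch)."""
    if platforms:           # -p/--platform was given at least once
        return "run"
    if result_files:        # positional RESULT_FILES were given
        return "file"
    return "interactive"    # no arguments: launch the wizard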

Platform Selection

Option                        Description
------------------------------------------------------------
-p, --platform TEXT           Platforms to compare (triggers run mode); repeatable
--type [sql|dataframe|auto]   Platform type (default: auto-detect)
--list-platforms              Show available platforms and exit

Benchmark Configuration (run mode)

Option                 Default   Description
------------------------------------------------------------
-b, --benchmark        tpch      Benchmark: tpch, tpcds, ssb, clickbench
-s, --scale FLOAT      0.01      Scale factor
-q, --queries TEXT     all       Comma-separated query IDs (e.g., Q1,Q6,Q10)
--warmup INTEGER       1         Warmup iterations
--iterations INTEGER   3         Benchmark iterations
--timeout FLOAT        300       Per-query timeout (seconds)
--parallel             false     Run platforms in parallel

Output Options

Option                              Default   Description
------------------------------------------------------------
-o, --output PATH                   none      Output directory for results
--format [json|markdown|text|csv]   text      Output format
--generate-charts                   false     Generate visualization charts
--theme [light|dark]                light     Chart theme
--data-dir DIRECTORY                auto      Directory containing benchmark data

Query Plan Options (file mode)

Option                   Default   Description
------------------------------------------------------------
--include-plans          false     Include query plan comparison in output
--plan-threshold FLOAT   0.0       Only show plans with similarity below threshold (0.0-1.0)

Usage Examples

SQL Platform Comparison

Compare embedded SQL databases:

benchbox compare \
  -p duckdb \
  -p sqlite \
  -p datafusion \
  --scale 0.1 \
  --iterations 5

Example output:

SQL Platform Comparison
Platforms: duckdb, sqlite, datafusion
Benchmark: tpch @ SF 0.1
Iterations: 5

Running benchmarks...

============================================================
RESULTS
============================================================

Fastest: duckdb
Slowest: sqlite

Platform        Geomean (ms)    Total (ms)    Success
------------------------------------------------------------
duckdb               45.23        1025.34       100%
datafusion           67.89        1538.67       100%
sqlite              234.56        5318.90       100%

Query Winners:
  Q1: duckdb
  Q2: datafusion
  Q3: duckdb
  ...
============================================================

DataFrame Platform Comparison

Compare DataFrame libraries:

benchbox compare \
  -p polars-df \
  -p pandas-df \
  -p datafusion-df \
  --scale 0.1

Mixed SQL and DataFrame Comparison

Compare across platform types:

benchbox compare \
  -p duckdb \
  -p polars-df \
  --scale 0.1

Note: Mixed comparisons use SQL execution on SQL platforms and the DataFrame API on DataFrame platforms. Results compare execution times for equivalent queries.
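To make "equivalent queries" concrete, here is a simplified TPC-H Q6-style aggregation expressed both ways. This is an illustrative sketch (the file path and trimmed predicate are assumptions), not what BenchBox runs internally:

import polars as pl

# SQL platforms execute the benchmark query as SQL text...
sql = """
SELECT SUM(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_quantity < 24
"""

# ...while DataFrame platforms run an equivalent expression pipeline.
revenue = (
    pl.scan_parquet("lineitem.parquet")  # hypothetical data location
    .filter(pl.col("l_quantity") < 24)
    .select(
        (pl.col("l_extendedprice") * pl.col("l_discount"))
        .sum()
        .alias("revenue")
    )
    .collect()
)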

Cloud Platform Comparison

Compare cloud data warehouses (requires credentials):

benchbox compare \
  -p snowflake \
  -p bigquery \
  -p redshift \
  --benchmark tpch \
  --scale 1 \
  --timeout 600

Generate Reports and Charts

Save results with visualization:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --scale 1 \
  --output ./comparison_results \
  --format markdown \
  --generate-charts \
  --theme dark

Creates:

comparison_results/
├── comparison.md
├── comparison.json
└── charts/
    ├── sql_performance.png
    ├── sql_performance.html
    ├── sql_distribution.png
    └── sql_query_heatmap.png

Compare Result Files

Compare previously saved benchmark results:

# Export results from individual runs
benchbox run --platform duckdb --benchmark tpch --scale 0.1 -o results/duckdb
benchbox run --platform sqlite --benchmark tpch --scale 0.1 -o results/sqlite

# Compare the result files
benchbox compare results/duckdb/results.json results/sqlite/results.json

Specific Query Subset

Run only specific queries for faster iteration:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --queries Q1,Q6,Q14,Q17 \
  --iterations 10

Parallel Execution

Run platforms in parallel for faster comparisons:

benchbox compare \
  -p duckdb \
  -p sqlite \
  -p clickhouse \
  --parallel \
  --iterations 3

JSON Output for Automation

Export results for programmatic analysis:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --format json \
  --output ./results

The JSON output includes:

{
  "config": {
    "platform_type": "sql",
    "benchmark": "tpch",
    "scale_factor": 0.01,
    "iterations": 3
  },
  "results": [
    {
      "platform": "duckdb",
      "platform_type": "sql",
      "query_results": [...],
      "geometric_mean_ms": 45.23,
      "total_time_ms": 1025.34,
      "success_rate": 100.0
    },
    ...
  ],
  "summary": {
    "platforms": ["duckdb", "sqlite"],
    "fastest_platform": "duckdb",
    "slowest_platform": "sqlite",
    "speedup_ratio": 5.18,
    "query_winners": {"Q1": "duckdb", ...}
  }
}
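A short Python sketch for consuming this file in automation (assuming the comparison.json layout created by --output, as shown in the directory tree above; the field names follow the structure shown here):

import json
from pathlib import Path

# Load the exported comparison report (path depends on your --output dir).
report = json.loads(Path("results/comparison.json").read_text())

summary = report["summary"]
print(f"Fastest: {summary['fastest_platform']} "
      f"({summary['speedup_ratio']:.2f}x over {summary['slowest_platform']})")

# Per-platform geometric means, fastest first.
for r in sorted(report["results"], key=lambda r: r["geometric_mean_ms"]):
    print(f"{r['platform']:<12} {r['geometric_mean_ms']:>8.2f} ms "
          f"({r['success_rate']:.0f}% success)")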

Programmatic API

For custom analysis, use the Python API directly:

from benchbox.core.comparison import (
    UnifiedBenchmarkSuite,
    UnifiedBenchmarkConfig,
    PlatformType,
    run_unified_comparison,
)

Quick Comparison

from benchbox.core.comparison import run_unified_comparison

# Run quick comparison
results = run_unified_comparison(
    platforms=["duckdb", "sqlite"],
    scale_factor=0.01,
)

for result in results:
    print(f"{result.platform}: {result.geometric_mean_ms:.2f}ms")

Full Suite with Configuration

from benchbox.core.comparison import (
    UnifiedBenchmarkSuite,
    UnifiedBenchmarkConfig,
    PlatformType,
)

# Configure benchmark
config = UnifiedBenchmarkConfig(
    platform_type=PlatformType.SQL,
    scale_factor=0.1,
    benchmark="tpch",
    query_ids=["Q1", "Q6", "Q10"],
    warmup_iterations=2,
    benchmark_iterations=5,
)

# Create suite
suite = UnifiedBenchmarkSuite(config=config)

# Run comparison
results = suite.run_comparison(platforms=["duckdb", "sqlite"])

# Get summary
summary = suite.get_summary(results)
print(f"Fastest: {summary.fastest_platform}")
print(f"Speedup ratio: {summary.speedup_ratio:.2f}x")

Generate Charts Programmatically

from benchbox.core.comparison import (
    UnifiedBenchmarkSuite,
    UnifiedComparisonPlotter,
)

# Run comparison
suite = UnifiedBenchmarkSuite()
results = suite.run_comparison(["duckdb", "sqlite"])

# Generate charts
plotter = UnifiedComparisonPlotter(results, theme="dark")
exports = plotter.generate_charts(
    output_dir="charts/",
    formats=["png", "html", "svg"],
    dpi=300,
)

for chart_type, paths in exports.items():
    print(f"{chart_type}: {paths}")

Platform Type Detection

from benchbox.core.comparison import detect_platform_type, detect_platform_types

# Single platform
platform_type = detect_platform_type("polars-df")
# Returns: PlatformType.DATAFRAME

# Multiple platforms
detected, inconsistent = detect_platform_types(["duckdb", "sqlite", "polars-df"])
# detected: PlatformType.SQL (majority)
# inconsistent: ["polars-df"]

Platform Categories

SQL Platforms

Category   Platforms                                   Use Case
------------------------------------------------------------
Embedded   DuckDB, SQLite, DataFusion                  Local analysis, testing
Server     PostgreSQL, ClickHouse                      Production workloads
Cloud      Snowflake, BigQuery, Redshift, Databricks   Enterprise, large scale

DataFrame Platforms

Category          Platforms                    Use Case
------------------------------------------------------------
Single Node       Polars, Pandas, DataFusion   In-memory, medium datasets
Distributed       PySpark, Dask, Modin         Large datasets, clusters
GPU Accelerated   cuDF                         CUDA GPU acceleration

Best Practices

1. Start Small

Begin with SF 0.01 to verify everything works:

benchbox compare -p duckdb -p sqlite --scale 0.01

2. Use Multiple Iterations

Reduce variance with multiple iterations:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --iterations 5 \
  --warmup 2

3. Match Your Production Scale

Pick a scale factor close to your production data volume (for TPC-H, SF 1 corresponds to roughly 1 GB of raw data):

Data Size   Recommended SF
--------------------------
< 1 GB      0.01 - 0.1
1-10 GB     0.1 - 1
10-100 GB   1 - 10
> 100 GB    10+

4. Compare Similar Platforms

For meaningful comparisons:

  • Compare platforms with similar deployment models

  • Use same hardware/resources

  • Run at the same scale factor

5. Document Your Results

Use markdown output for documentation:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --scale 1 \
  --format markdown \
  --output ./docs/benchmark_results

Interpreting Results

Geometric Mean

The geometric mean summarizes performance across all queries; lower is better. It is less sensitive to outliers than the arithmetic mean.
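As a quick sketch of why (the sample runtimes below are made up), one slow outlier drags the arithmetic mean far more than the geometric mean:

import math

def geometric_mean_ms(times_ms: list[float]) -> float:
    # Averaging logs avoids overflow when multiplying many runtimes.
    return math.exp(sum(math.log(t) for t in times_ms) / len(times_ms))

times = [40.0, 45.0, 50.0, 400.0]    # one outlier query
print(geometric_mean_ms(times))      # ~77.5 ms
print(sum(times) / len(times))       # 133.75 ms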

Success Rate

Percentage of queries that completed successfully. Production-ready platforms should achieve 100%.

Query Winners

Shows which platform was fastest for each query:

  • Different platforms excel at different query types

  • Aggregation vs joins vs filtering

  • Use for workload-specific decisions
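Deriving winners from per-query timings is a small reduction; a sketch with hypothetical numbers:

# Hypothetical per-platform timings in milliseconds.
timings = {
    "duckdb": {"Q1": 12.3, "Q2": 48.1, "Q3": 9.7},
    "sqlite": {"Q1": 80.4, "Q2": 35.9, "Q3": 61.2},
}

# For each query, pick the platform with the lowest runtime.
queries = next(iter(timings.values())).keys()
winners = {q: min(timings, key=lambda p: timings[p][q]) for q in queries}
print(winners)  # {'Q1': 'duckdb', 'Q2': 'sqlite', 'Q3': 'duckdb'}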

Speedup Ratio

Ratio between the fastest and slowest platform (in the example output above, 234.56 ms / 45.23 ms ≈ 5.2x):

  • ~1.0: platforms are approximately equal

  • Well above 1.0: the fastest platform has a clear advantage

  • Large ratios (5x+) indicate major performance differences

Troubleshooting

Data Not Found

Error: Data directory not found: benchmark_runs/tpch/sf001/data

Generate data first:

benchbox run --platform duckdb --benchmark tpch --scale 0.01

Platform Not Available

Warning: Platform clickhouse not available

Install the required extra:

uv add benchbox --extra clickhouse

Mixed Platform Types

Error: Cannot mix SQL and DataFrame platforms. Use --type to specify.

Either:

  1. Compare only same-type platforms

  2. Explicitly set --type sql or --type dataframe

Memory Issues at Large Scale

For large scale factors:

  • Use streaming-capable platforms (Polars, DuckDB)

  • Reduce concurrent platforms compared

  • Run platforms sequentially (the default; omit --parallel)

Query Plan Comparison

When comparing result files, you can include query plan analysis to understand why performance changed, not just that it changed.

Include Plan Analysis

# Compare files with plan analysis
benchbox compare baseline.json current.json --include-plans

# Only show plans that changed significantly (< 90% similar)
benchbox compare baseline.json current.json --include-plans --plan-threshold 0.9

Plan Comparison Output

The plan comparison section shows:

Column       Description
------------------------------------------------------------
Similarity   Overall plan similarity (0-100%)
Type Δ       Number of operator type mismatches
Prop Δ       Number of property mismatches
Perf Δ       Performance change (positive = slower)
Status       Plan status (Identical, Similar, Different, REGRESSION)

Plan-Correlated Regressions

The most valuable insight comes from plan-correlated regressions: queries where:

  1. The query plan changed (< 100% similarity)

  2. Performance degraded > 20%

These are highlighted prominently because an optimizer change is the likely cause of the regression.
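The highlight rule can be pictured as a simple filter (an illustrative sketch; the field names are assumptions, not the actual result schema):

# Flag queries whose plan changed AND whose runtime degraded by >20%.
def plan_correlated_regressions(entries: list[dict]) -> list[dict]:
    return [
        e for e in entries
        if e["similarity"] < 1.0      # the plan changed
        and e["perf_delta"] > 0.20    # and the query got >20% slower
    ]

entries = [
    {"query": "Q1", "similarity": 1.00, "perf_delta": 0.02},
    {"query": "Q6", "similarity": 0.85, "perf_delta": 0.35},  # flagged
]
print(plan_correlated_regressions(entries))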

Requirements

Plan comparison requires benchmark results captured with --capture-plans:

# Capture plans during benchmark runs
benchbox run --platform duckdb --benchmark tpch --capture-plans -o baseline
benchbox run --platform duckdb --benchmark tpch --capture-plans -o current

# Compare with plan analysis
benchbox compare baseline/results.json current/results.json --include-plans

Migration from compare-dataframes

The compare-dataframes command is deprecated. Migrate to compare:

# OLD
benchbox compare-dataframes -p polars-df -p pandas-df

# NEW
benchbox compare -p polars-df -p pandas-df

# OLD (SQL vs DataFrame)
benchbox compare-dataframes -p polars-df --vs-sql duckdb

# NEW
benchbox compare -p polars-df -p duckdb

Migration from compare-plans

The compare-plans command is deprecated. Migrate to compare --include-plans:

# OLD
benchbox compare-plans --run1 before.json --run2 after.json

# NEW
benchbox compare before.json after.json --include-plans

# OLD with threshold
benchbox compare-plans --run1 before.json --run2 after.json --threshold 0.9

# NEW with threshold
benchbox compare before.json after.json --include-plans --plan-threshold 0.9