Platform Comparison Guide

Tags: intermediate, guide, sql-platform

This guide covers how to compare benchmark performance across platforms using the unified benchbox compare command. Compare SQL platforms, DataFrame platforms, or both.

Overview

BenchBox provides a unified comparison interface supporting:

  • SQL Platform Comparisons: DuckDB vs SQLite vs PostgreSQL vs ClickHouse

  • DataFrame Platform Comparisons: Polars vs Pandas vs DataFusion

  • Mixed Comparisons: Compare SQL and DataFrame platforms together

  • File-Based Comparisons: Compare previously saved result files

  • Performance Visualization: Generate charts and reports automatically

Quick Start

List Available Platforms

# List all available platforms
benchbox compare --list-platforms

The output lists SQL and DataFrame platforms, grouped by availability:

SQL Platforms:
  Installed:
    duckdb           (embedded, analytical)
    sqlite           (embedded, transactional)
    postgresql       (server, transactional)
    clickhouse       (server, analytical)

  Cloud (requires configuration):
    snowflake        (cloud, analytical)
    bigquery         (cloud, analytical)
    databricks       (cloud, analytical)

DataFrame Platforms:
  Installed:
    polars-df        (expression, single_node) [lazy, streaming]
    pandas-df        (pandas, single_node) [standard]
    datafusion-df    (expression, single_node) [lazy]

  Not installed:
    pyspark-df       (expression, distributed)
    cudf-df          (pandas, gpu_accelerated)

Compare SQL Platforms

# Compare DuckDB vs SQLite on TPC-H SF 0.01
benchbox compare -p duckdb -p sqlite --scale 0.01

Compare DataFrame Platforms

# Compare Polars vs Pandas
benchbox compare -p polars-df -p pandas-df --scale 0.01

Compare Result Files

# Compare previously saved results
benchbox compare results/duckdb.json results/sqlite.json

Interactive Mode

# Launch interactive wizard
benchbox compare

The interactive wizard guides you through:

  1. Choosing comparison type (platforms or files)

  2. Selecting platforms from an interactive list

  3. Configuring benchmark options

  4. Running the comparison

Command Reference

benchbox compare [OPTIONS] [RESULT_FILES]...

Modes

Mode is automatically detected from arguments:

Mode          Trigger          Description
------------------------------------------------------------
Run Mode      -p/--platform    Run benchmarks across platforms, then compare
File Mode     RESULT_FILES     Compare existing result files
Interactive   No arguments     Launch the interactive wizard
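The dispatch follows a simple precedence; here is a minimal Python sketch of the equivalent logic (illustrative only, not the actual CLI implementation):

def detect_mode(platforms: list[str], result_files: list[str]) -> str:
    """Mirror the CLI's mode detection (illustrative sketch)."""
    if platforms:           # -p/--platform was given at least once
        return "run"
    if result_files:        # positional RESULT_FILES were given
        return "file"
    return "interactive"    # no arguments: launch the wizard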

Platform Selection

Option                        Description
------------------------------------------------------------
-p, --platform TEXT           Platforms to compare (triggers run mode); repeatable
--type [sql|dataframe|auto]   Platform type (default: auto-detect)
--list-platforms              Show available platforms and exit

Benchmark Configuration (run mode)

Option                 Default   Description
------------------------------------------------------------
-b, --benchmark        tpch      Benchmark: tpch, tpcds, ssb, clickbench
-s, --scale FLOAT      0.01      Scale factor
-q, --queries TEXT     all       Comma-separated query IDs (e.g., Q1,Q6,Q10)
--warmup INTEGER       1         Warmup iterations
--iterations INTEGER   3         Benchmark iterations
--timeout FLOAT        300       Per-query timeout (seconds)
--parallel             false     Run platforms in parallel

Output Options

Option                              Default   Description
------------------------------------------------------------
-o, --output PATH                   none      Output directory for results
--format [json|markdown|text|csv]   text      Output format
--generate-charts                   false     Generate visualization charts
--theme [light|dark]                light     Chart theme
--data-dir DIRECTORY                auto      Directory containing benchmark data

Query Plan Options (file mode)

Option                   Default   Description
------------------------------------------------------------
--include-plans          false     Include query plan comparison in output
--plan-threshold FLOAT   0.0       Only show plans with similarity below threshold (0.0-1.0)

Usage Examples

SQL Platform Comparison

Compare embedded SQL databases:

benchbox compare \
  -p duckdb \
  -p sqlite \
  -p datafusion \
  --scale 0.1 \
  --iterations 5

Example output:

SQL Platform Comparison
Platforms: duckdb, sqlite, datafusion
Benchmark: tpch @ SF 0.1
Iterations: 5

Running benchmarks...

============================================================
RESULTS
============================================================

Fastest: duckdb
Slowest: sqlite

Platform        Geomean (ms)    Total (ms)    Success
------------------------------------------------------------
duckdb               45.23        1025.34       100%
datafusion           67.89        1538.67       100%
sqlite              234.56        5318.90       100%

Query Winners:
  Q1: duckdb
  Q2: datafusion
  Q3: duckdb
  ...
============================================================

DataFrame Platform Comparison

Compare DataFrame libraries:

benchbox compare \
  -p polars-df \
  -p pandas-df \
  -p datafusion-df \
  --scale 0.1

Mixed SQL and DataFrame Comparison

Compare across platform types:

benchbox compare \
  -p duckdb \
  -p polars-df \
  --scale 0.1

Note: Mixed comparisons use SQL execution on SQL platforms and the DataFrame API on DataFrame platforms. Results compare execution times for equivalent queries.
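To make "equivalent queries" concrete, here is a simplified TPC-H Q6-style aggregation expressed both ways. This is an illustrative sketch (the file path and trimmed predicate are assumptions), not what BenchBox runs internally:

import polars as pl

# SQL platforms execute the benchmark query as SQL text...
sql = """
SELECT SUM(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_quantity < 24
"""

# ...while DataFrame platforms run an equivalent expression pipeline.
revenue = (
    pl.scan_parquet("lineitem.parquet")  # hypothetical data location
    .filter(pl.col("l_quantity") < 24)
    .select(
        (pl.col("l_extendedprice") * pl.col("l_discount"))
        .sum()
        .alias("revenue")
    )
    .collect()
)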

Cloud Platform Comparison

Compare cloud data warehouses (requires credentials):

benchbox compare \
  -p snowflake \
  -p bigquery \
  -p redshift \
  --benchmark tpch \
  --scale 1 \
  --timeout 600

Generate Reports and Charts

Save results with visualization:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --scale 1 \
  --output ./comparison_results \
  --format markdown \
  --generate-charts \
  --theme dark

Creates:

comparison_results/
├── comparison.md
├── comparison.json
└── charts/
    ├── sql_performance.png
    ├── sql_performance.html
    ├── sql_distribution.png
    └── sql_query_heatmap.png

Compare Result Files

Compare previously saved benchmark results:

# Export results from individual runs
benchbox run --platform duckdb --benchmark tpch --scale 0.1 -o results/duckdb
benchbox run --platform sqlite --benchmark tpch --scale 0.1 -o results/sqlite

# Compare the result files
benchbox compare results/duckdb/results.json results/sqlite/results.json

Specific Query Subset

Run only specific queries for faster iteration:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --queries Q1,Q6,Q14,Q17 \
  --iterations 10

Parallel Execution

Run platforms in parallel for faster comparisons:

benchbox compare \
  -p duckdb \
  -p sqlite \
  -p clickhouse \
  --parallel \
  --iterations 3

JSON Output for Automation

Export results for programmatic analysis:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --format json \
  --output ./results

The JSON output includes:

{
  "config": {
    "platform_type": "sql",
    "benchmark": "tpch",
    "scale_factor": 0.01,
    "iterations": 3
  },
  "results": [
    {
      "platform": "duckdb",
      "platform_type": "sql",
      "query_results": [...],
      "geometric_mean_ms": 45.23,
      "total_time_ms": 1025.34,
      "success_rate": 100.0
    },
    ...
  ],
  "summary": {
    "platforms": ["duckdb", "sqlite"],
    "fastest_platform": "duckdb",
    "slowest_platform": "sqlite",
    "speedup_ratio": 5.18,
    "query_winners": {"Q1": "duckdb", ...}
  }
}
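A short Python sketch for consuming this file in automation (assuming the comparison.json layout created by --output, as shown in the directory tree above; the field names follow the structure shown here):

import json
from pathlib import Path

# Load the exported comparison report (path depends on your --output dir).
report = json.loads(Path("results/comparison.json").read_text())

summary = report["summary"]
print(f"Fastest: {summary['fastest_platform']} "
      f"({summary['speedup_ratio']:.2f}x over {summary['slowest_platform']})")

# Per-platform geometric means, fastest first.
for r in sorted(report["results"], key=lambda r: r["geometric_mean_ms"]):
    print(f"{r['platform']:<12} {r['geometric_mean_ms']:>8.2f} ms "
          f"({r['success_rate']:.0f}% success)")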

Programmatic API

For custom analysis, use the Python API directly:

from benchbox.core.comparison import (
    UnifiedBenchmarkSuite,
    UnifiedBenchmarkConfig,
    PlatformType,
    run_unified_comparison,
)

Quick Comparison

from benchbox.core.comparison import run_unified_comparison

# Run quick comparison
results = run_unified_comparison(
    platforms=["duckdb", "sqlite"],
    scale_factor=0.01,
)

for result in results:
    print(f"{result.platform}: {result.geometric_mean_ms:.2f}ms")

Full Suite with Configuration

from benchbox.core.comparison import (
    UnifiedBenchmarkSuite,
    UnifiedBenchmarkConfig,
    PlatformType,
)

# Configure benchmark
config = UnifiedBenchmarkConfig(
    platform_type=PlatformType.SQL,
    scale_factor=0.1,
    benchmark="tpch",
    query_ids=["Q1", "Q6", "Q10"],
    warmup_iterations=2,
    benchmark_iterations=5,
)

# Create suite
suite = UnifiedBenchmarkSuite(config=config)

# Run comparison
results = suite.run_comparison(platforms=["duckdb", "sqlite"])

# Get summary
summary = suite.get_summary(results)
print(f"Fastest: {summary.fastest_platform}")
print(f"Speedup ratio: {summary.speedup_ratio:.2f}x")

Generate Charts Programmatically

from benchbox.core.comparison import (
    UnifiedBenchmarkSuite,
    UnifiedComparisonPlotter,
)

# Run comparison
suite = UnifiedBenchmarkSuite()
results = suite.run_comparison(["duckdb", "sqlite"])

# Generate charts
plotter = UnifiedComparisonPlotter(results, theme="dark")
exports = plotter.generate_charts(
    output_dir="charts/",
    formats=["png", "html", "svg"],
    dpi=300,
)

for chart_type, paths in exports.items():
    print(f"{chart_type}: {paths}")

Platform Type Detection

from benchbox.core.comparison import detect_platform_type, detect_platform_types

# Single platform
platform_type = detect_platform_type("polars-df")
# Returns: PlatformType.DATAFRAME

# Multiple platforms
detected, inconsistent = detect_platform_types(["duckdb", "sqlite", "polars-df"])
# detected: PlatformType.SQL (majority)
# inconsistent: ["polars-df"]

Platform Categories

SQL Platforms

Category   Platforms                                   Use Case
------------------------------------------------------------
Embedded   DuckDB, SQLite, DataFusion                  Local analysis, testing
Server     PostgreSQL, ClickHouse                      Production workloads
Cloud      Snowflake, BigQuery, Redshift, Databricks   Enterprise, large scale

DataFrame Platforms

Category          Platforms                    Use Case
------------------------------------------------------------
Single Node       Polars, Pandas, DataFusion   In-memory, medium datasets
Distributed       PySpark, Dask, Modin         Large datasets, clusters
GPU Accelerated   cuDF                         CUDA GPU acceleration

Best Practices

1. Start Small

Begin with SF 0.01 to verify everything works:

benchbox compare -p duckdb -p sqlite --scale 0.01

2. Use Multiple Iterations

Reduce variance with multiple iterations:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --iterations 5 \
  --warmup 2

3. Match Your Production Scale

Pick a scale factor close to your production data volume (for TPC-H, SF 1 corresponds to roughly 1 GB of raw data):

Data Size   Recommended SF
--------------------------
< 1 GB      0.01 - 0.1
1-10 GB     0.1 - 1
10-100 GB   1 - 10
> 100 GB    10+

4. Compare Similar Platforms

For meaningful comparisons:

  • Compare platforms with similar deployment models

  • Use same hardware/resources

  • Run at the same scale factor

5. Document Your Results

Use markdown output for documentation:

benchbox compare \
  -p duckdb \
  -p sqlite \
  --scale 1 \
  --format markdown \
  --output ./docs/benchmark_results

Interpreting Results

Geometric Mean

The geometric mean summarizes performance across all queries; lower is better. It is less sensitive to outliers than the arithmetic mean.
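As a quick sketch of why (the sample runtimes below are made up), one slow outlier drags the arithmetic mean far more than the geometric mean:

import math

def geometric_mean_ms(times_ms: list[float]) -> float:
    # Averaging logs avoids overflow when multiplying many runtimes.
    return math.exp(sum(math.log(t) for t in times_ms) / len(times_ms))

times = [40.0, 45.0, 50.0, 400.0]    # one outlier query
print(geometric_mean_ms(times))      # ~77.5 ms
print(sum(times) / len(times))       # 133.75 ms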

Success Rate

Percentage of queries that completed successfully. Production-ready platforms should achieve 100%.

Query Winners

Shows which platform was fastest for each query:

  • Different platforms excel at different query types

  • Aggregation vs joins vs filtering

  • Use for workload-specific decisions
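Deriving winners from per-query timings is a small reduction; a sketch with hypothetical numbers:

# Hypothetical per-platform timings in milliseconds.
timings = {
    "duckdb": {"Q1": 12.3, "Q2": 48.1, "Q3": 9.7},
    "sqlite": {"Q1": 80.4, "Q2": 35.9, "Q3": 61.2},
}

# For each query, pick the platform with the lowest runtime.
queries = next(iter(timings.values())).keys()
winners = {q: min(timings, key=lambda p: timings[p][q]) for q in queries}
print(winners)  # {'Q1': 'duckdb', 'Q2': 'sqlite', 'Q3': 'duckdb'}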

Speedup Ratio

Ratio between the fastest and slowest platform (in the example output above, 234.56 ms / 45.23 ms ≈ 5.2x):

  • ~1.0: platforms are approximately equal

  • Well above 1.0: the fastest platform has a clear advantage

  • Large ratios (5x+) indicate major performance differences

Troubleshooting

Data Not Found

Error: Data directory not found: benchmark_runs/tpch/sf001/data

Generate data first:

benchbox run --platform duckdb --benchmark tpch --scale 0.01

Platform Not Available

Warning: Platform clickhouse not available

Install the required extra:

uv add benchbox --extra clickhouse

Mixed Platform Types

Error: Cannot mix SQL and DataFrame platforms. Use --type to specify.

Either:

  1. Compare only same-type platforms

  2. Explicitly set --type sql or --type dataframe

Memory Issues at Large Scale

For large scale factors:

  • Use streaming-capable platforms (Polars, DuckDB)

  • Reduce concurrent platforms compared

  • Run platforms sequentially (the default; omit --parallel)

Query Plan Comparison

When comparing result files, you can include query plan analysis to understand why performance changed, not just that it changed.

Include Plan Analysis

# Compare files with plan analysis
benchbox compare baseline.json current.json --include-plans

# Only show plans that changed significantly (< 90% similar)
benchbox compare baseline.json current.json --include-plans --plan-threshold 0.9

Plan Comparison Output

The plan comparison section shows:

Column       Description
------------------------------------------------------------
Similarity   Overall plan similarity (0-100%)
Type Δ       Number of operator type mismatches
Prop Δ       Number of property mismatches
Perf Δ       Performance change (positive = slower)
Status       Plan status (Identical, Similar, Different, REGRESSION)

Plan-Correlated Regressions

The most valuable insight comes from plan-correlated regressions: queries where:

  1. The query plan changed (< 100% similarity)

  2. Performance degraded > 20%

These are highlighted prominently because an optimizer change is the likely cause of the regression.
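The highlight rule can be pictured as a simple filter (an illustrative sketch; the field names are assumptions, not the actual result schema):

# Flag queries whose plan changed AND whose runtime degraded by >20%.
def plan_correlated_regressions(entries: list[dict]) -> list[dict]:
    return [
        e for e in entries
        if e["similarity"] < 1.0      # the plan changed
        and e["perf_delta"] > 0.20    # and the query got >20% slower
    ]

entries = [
    {"query": "Q1", "similarity": 1.00, "perf_delta": 0.02},
    {"query": "Q6", "similarity": 0.85, "perf_delta": 0.35},  # flagged
]
print(plan_correlated_regressions(entries))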

Requirements

Plan comparison requires benchmark results captured with --capture-plans:

# Capture plans during benchmark runs
benchbox run --platform duckdb --benchmark tpch --capture-plans -o baseline
benchbox run --platform duckdb --benchmark tpch --capture-plans -o current

# Compare with plan analysis
benchbox compare baseline/results.json current/results.json --include-plans

Migration from compare-dataframes

The compare-dataframes command is deprecated. Migrate to compare:

# OLD
benchbox compare-dataframes -p polars-df -p pandas-df

# NEW
benchbox compare -p polars-df -p pandas-df

# OLD (SQL vs DataFrame)
benchbox compare-dataframes -p polars-df --vs-sql duckdb

# NEW
benchbox compare -p polars-df -p duckdb

Migration from compare-plans

The compare-plans command is deprecated. Migrate to compare --include-plans:

# OLD
benchbox compare-plans --run1 before.json --run2 after.json

# NEW
benchbox compare before.json after.json --include-plans

# OLD with threshold
benchbox compare-plans --run1 before.json --run2 after.json --threshold 0.9

# NEW with threshold
benchbox compare before.json after.json --include-plans --plan-threshold 0.9