Platform Comparison Guide¶
This guide covers how to compare benchmark performance across platforms using the unified benchbox compare command. Compare SQL platforms, DataFrame platforms, or both.
Overview¶
BenchBox provides a unified comparison interface supporting:
SQL Platform Comparisons: DuckDB vs SQLite vs PostgreSQL vs ClickHouse
DataFrame Platform Comparisons: Polars vs Pandas vs DataFusion
Mixed Comparisons: Compare SQL and DataFrame platforms together
File-Based Comparisons: Compare previously saved result files
Performance Visualization: Generate charts and reports automatically
Quick Start¶
List Available Platforms¶
# List all available platforms
benchbox compare --list-platforms
The output lists SQL and DataFrame platforms, grouped by availability:
SQL Platforms:
Installed:
duckdb (embedded, analytical)
sqlite (embedded, transactional)
postgresql (server, transactional)
clickhouse (server, analytical)
Cloud (requires configuration):
snowflake (cloud, analytical)
bigquery (cloud, analytical)
databricks (cloud, analytical)
DataFrame Platforms:
Installed:
polars-df (expression, single_node) [lazy, streaming]
pandas-df (pandas, single_node) [standard]
datafusion-df (expression, single_node) [lazy]
Not installed:
pyspark-df (expression, distributed)
cudf-df (pandas, gpu_accelerated)
Compare SQL Platforms¶
# Compare DuckDB vs SQLite on TPC-H SF 0.01
benchbox compare -p duckdb -p sqlite --scale 0.01
Compare DataFrame Platforms¶
# Compare Polars vs Pandas
benchbox compare -p polars-df -p pandas-df --scale 0.01
Compare Result Files¶
# Compare previously saved results
benchbox compare results/duckdb.json results/sqlite.json
Interactive Mode¶
# Launch interactive wizard
benchbox compare
The interactive wizard guides you through:
Choosing comparison type (platforms or files)
Selecting platforms from an interactive list
Configuring benchmark options
Running the comparison
Command Reference¶
benchbox compare [OPTIONS] [RESULT_FILES]...
Modes¶
Mode is automatically detected from arguments:
| Mode | Trigger | Description |
|---|---|---|
| Run Mode | One or more -p platform options | Run benchmarks across platforms, then compare |
| File Mode | Result file arguments | Compare existing result files |
| Interactive | No arguments | Launch the interactive wizard |
Platform Selection¶
| Option | Description |
|---|---|
| -p | Platforms to compare (triggers run mode). Repeatable |
| --type | Platform type: sql or dataframe (default: auto-detect) |
| --list-platforms | Show available platforms and exit |
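For example, if auto-detection guesses wrong for a set of platforms, the type can be pinned explicitly (the sql/dataframe values follow the troubleshooting note later in this guide):
# Force DataFrame-type comparison instead of relying on auto-detect
benchbox compare -p polars-df -p pandas-df --type dataframe --scale 0.01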
Benchmark Configuration (run mode)¶
| Option | Default | Description |
|---|---|---|
| --benchmark | tpch | Benchmark: tpch, tpcds, ssb, clickbench |
| --scale | | Scale factor |
| --queries | all | Comma-separated query IDs (e.g., Q1,Q6,Q10) |
| --warmup | | Warmup iterations |
| --iterations | | Benchmark iterations |
| --timeout | | Per-query timeout (seconds) |
| --parallel | false | Run platforms in parallel |
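These options combine freely in run mode; for example, a tighter comparison that warms up twice, times five iterations of three queries, and caps each query at two minutes:
benchbox compare \
  -p duckdb \
  -p sqlite \
  --benchmark tpch \
  --scale 0.1 \
  --queries Q1,Q6,Q10 \
  --warmup 2 \
  --iterations 5 \
  --timeout 120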
Output Options¶
| Option | Default | Description |
|---|---|---|
| --output | none | Output directory for results |
| --format | | Output format |
| --generate-charts | false | Generate visualization charts |
| --theme | | Chart theme |
| | auto | Directory containing benchmark data |
Query Plan Options (file mode)¶
| Option | Default | Description |
|---|---|---|
| --include-plans | false | Include query plan comparison in output |
| --plan-threshold | | Only show plans with similarity below the threshold (0.0-1.0) |
Usage Examples¶
SQL Platform Comparison¶
Compare embedded SQL databases:
benchbox compare \
-p duckdb \
-p sqlite \
-p datafusion \
--scale 0.1 \
--iterations 5
Example output:
SQL Platform Comparison
Platforms: duckdb, sqlite, datafusion
Benchmark: tpch @ SF 0.1
Iterations: 5
Running benchmarks...
============================================================
RESULTS
============================================================
Fastest: duckdb
Slowest: sqlite
Platform Geomean (ms) Total (ms) Success
------------------------------------------------------------
duckdb 45.23 1025.34 100%
datafusion 67.89 1538.67 100%
sqlite 234.56 5318.90 100%
Query Winners:
Q1: duckdb
Q2: datafusion
Q3: duckdb
...
============================================================
DataFrame Platform Comparison¶
Compare DataFrame libraries:
benchbox compare \
-p polars-df \
-p pandas-df \
-p datafusion-df \
--scale 0.1
Mixed SQL and DataFrame Comparison¶
Compare across platform types:
benchbox compare \
-p duckdb \
-p polars-df \
--scale 0.1
Note: Mixed comparisons use SQL execution on SQL platforms and DataFrame API on DataFrame platforms. Results compare execution times for equivalent queries.
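To make "equivalent queries" concrete, here is a minimal stand-alone sketch (plain duckdb and polars, not BenchBox code) of the same filter-and-aggregate answered through both execution paths:
import duckdb
import polars as pl

# The same toy rows loaded into both engines
rows = [(10, 100.0), (30, 300.0), (5, 50.0)]

con = duckdb.connect()
con.execute("CREATE TABLE lineitem (l_quantity INT, l_extendedprice DOUBLE)")
con.executemany("INSERT INTO lineitem VALUES (?, ?)", rows)

# SQL path: filter + aggregate expressed as a query
sql_total = con.execute(
    "SELECT sum(l_extendedprice) FROM lineitem WHERE l_quantity < 24"
).fetchone()[0]

# DataFrame path: the same logic via the Polars expression API
df = pl.DataFrame(rows, schema=["l_quantity", "l_extendedprice"], orient="row")
df_total = df.filter(pl.col("l_quantity") < 24)["l_extendedprice"].sum()

assert sql_total == df_total  # both paths answer the same question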
Cloud Platform Comparison¶
Compare cloud data warehouses (requires credentials):
benchbox compare \
-p snowflake \
-p bigquery \
-p redshift \
--benchmark tpch \
--scale 1 \
--timeout 600
Generate Reports and Charts¶
Save results with visualization:
benchbox compare \
-p duckdb \
-p sqlite \
--scale 1 \
--output ./comparison_results \
--format markdown \
--generate-charts \
--theme dark
Creates:
comparison_results/
├── comparison.md
├── comparison.json
└── charts/
├── sql_performance.png
├── sql_performance.html
├── sql_distribution.png
└── sql_query_heatmap.png
Compare Result Files¶
Compare previously saved benchmark results:
# Export results from individual runs
benchbox run --platform duckdb --benchmark tpch --scale 0.1 -o results/duckdb
benchbox run --platform sqlite --benchmark tpch --scale 0.1 -o results/sqlite
# Compare the result files
benchbox compare results/duckdb/results.json results/sqlite/results.json
Specific Query Subset¶
Run only specific queries for faster iteration:
benchbox compare \
-p duckdb \
-p sqlite \
--queries Q1,Q6,Q14,Q17 \
--iterations 10
Parallel Execution¶
Run platforms in parallel for faster comparisons:
benchbox compare \
-p duckdb \
-p sqlite \
-p clickhouse \
--parallel \
--iterations 3
JSON Output for Automation¶
Export results for programmatic analysis:
benchbox compare \
-p duckdb \
-p sqlite \
--format json \
--output ./results
The JSON output includes:
{
"config": {
"platform_type": "sql",
"benchmark": "tpch",
"scale_factor": 0.01,
"iterations": 3
},
"results": [
{
"platform": "duckdb",
"platform_type": "sql",
"query_results": [...],
"geometric_mean_ms": 45.23,
"total_time_ms": 1025.34,
"success_rate": 100.0
},
...
],
"summary": {
"platforms": ["duckdb", "sqlite"],
"fastest_platform": "duckdb",
"slowest_platform": "sqlite",
"speedup_ratio": 5.18,
"query_winners": {"Q1": "duckdb", ...}
}
}
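A minimal sketch for consuming this JSON in a script or CI job; the path below assumes the file was written as comparison.json inside the --output directory (as in the charts example above), so adjust it to your run:
import json
from pathlib import Path

data = json.loads(Path("results/comparison.json").read_text())

# Headline numbers from the summary block
summary = data["summary"]
print(f"Fastest: {summary['fastest_platform']} "
      f"({summary['speedup_ratio']:.2f}x over {summary['slowest_platform']})")

# Fail the job if any platform had failing queries
for result in data["results"]:
    if result["success_rate"] < 100.0:
        raise SystemExit(f"{result['platform']} had query failures")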
Programmatic API¶
For custom analysis, use the Python API directly:
from benchbox.core.comparison import (
UnifiedBenchmarkSuite,
UnifiedBenchmarkConfig,
PlatformType,
run_unified_comparison,
)
Quick Comparison¶
from benchbox.core.comparison import run_unified_comparison
# Run quick comparison
results = run_unified_comparison(
platforms=["duckdb", "sqlite"],
scale_factor=0.01,
)
for result in results:
print(f"{result.platform}: {result.geometric_mean_ms:.2f}ms")
Full Suite with Configuration¶
from benchbox.core.comparison import (
UnifiedBenchmarkSuite,
UnifiedBenchmarkConfig,
PlatformType,
)
# Configure benchmark
config = UnifiedBenchmarkConfig(
platform_type=PlatformType.SQL,
scale_factor=0.1,
benchmark="tpch",
query_ids=["Q1", "Q6", "Q10"],
warmup_iterations=2,
benchmark_iterations=5,
)
# Create suite
suite = UnifiedBenchmarkSuite(config=config)
# Run comparison
results = suite.run_comparison(platforms=["duckdb", "sqlite"])
# Get summary
summary = suite.get_summary(results)
print(f"Fastest: {summary.fastest_platform}")
print(f"Speedup ratio: {summary.speedup_ratio:.2f}x")
Generate Charts Programmatically¶
from benchbox.core.comparison import (
UnifiedBenchmarkSuite,
UnifiedComparisonPlotter,
)
# Run comparison
suite = UnifiedBenchmarkSuite()
results = suite.run_comparison(["duckdb", "sqlite"])
# Generate charts
plotter = UnifiedComparisonPlotter(results, theme="dark")
exports = plotter.generate_charts(
output_dir="charts/",
formats=["png", "html", "svg"],
dpi=300,
)
for chart_type, paths in exports.items():
print(f"{chart_type}: {paths}")
Platform Type Detection¶
from benchbox.core.comparison import detect_platform_type, detect_platform_types
# Single platform
platform_type = detect_platform_type("polars-df")
# Returns: PlatformType.DATAFRAME
# Multiple platforms
detected, inconsistent = detect_platform_types(["duckdb", "sqlite", "polars-df"])
# detected: PlatformType.SQL (majority)
# inconsistent: ["polars-df"]
Platform Categories¶
SQL Platforms¶
| Category | Platforms | Use Case |
|---|---|---|
| Embedded | DuckDB, SQLite, DataFusion | Local analysis, testing |
| Server | PostgreSQL, ClickHouse | Production workloads |
| Cloud | Snowflake, BigQuery, Redshift, Databricks | Enterprise, large scale |
DataFrame Platforms¶
| Category | Platforms | Use Case |
|---|---|---|
| Single Node | Polars, Pandas, DataFusion | In-memory, medium datasets |
| Distributed | PySpark, Dask, Modin | Large datasets, clusters |
| GPU Accelerated | cuDF | CUDA GPU acceleration |
Best Practices¶
1. Start Small¶
Begin with SF 0.01 to verify everything works:
benchbox compare -p duckdb -p sqlite --scale 0.01
2. Use Multiple Iterations¶
Reduce variance with multiple iterations:
benchbox compare \
-p duckdb \
-p sqlite \
--iterations 5 \
--warmup 2
3. Match Your Production Scale¶
| Data Size | Recommended SF |
|---|---|
| < 1 GB | 0.01 - 0.1 |
| 1-10 GB | 0.1 - 1 |
| 10-100 GB | 1 - 10 |
| > 100 GB | 10+ |
4. Compare Similar Platforms¶
For meaningful comparisons:
Compare platforms with similar deployment models
Use same hardware/resources
Run at the same scale factor
5. Document Your Results¶
Use markdown output for documentation:
benchbox compare \
-p duckdb \
-p sqlite \
--scale 1 \
--format markdown \
--output ./docs/benchmark_results
Interpreting Results¶
Geometric Mean¶
The geometric mean summarizes performance across all queries. Lower is better, and it is less sensitive to outliers than the arithmetic mean.
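As a quick illustration (plain Python, not a BenchBox API), one slow outlier shifts the geometric mean far less than the arithmetic mean:
import math

query_times_ms = [40.0, 50.0, 45.0, 900.0]  # one outlier query

# Geometric mean: nth root of the product, computed via logs for stability
geomean = math.exp(sum(math.log(t) for t in query_times_ms) / len(query_times_ms))
arithmetic = sum(query_times_ms) / len(query_times_ms)

print(f"geometric mean:  {geomean:.1f} ms")     # ~94.9 ms
print(f"arithmetic mean: {arithmetic:.1f} ms")  # 258.8 ms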
Success Rate¶
The percentage of queries that completed successfully. Expect 100% from production-ready platforms.
Query Winners¶
Shows which platform was fastest for each query:
Different platforms excel at different query types (aggregation vs. joins vs. filtering)
Use the per-query breakdown for workload-specific decisions
Speedup Ratio¶
Ratio between fastest and slowest platform:
> 1.0: the faster platform is measurably better
~1.0: the platforms are approximately equal
Large ratios (5x+) indicate major performance differences
For example, the JSON output above reports a speedup_ratio of 5.18, which is roughly sqlite's geometric mean (234.56 ms) divided by duckdb's (45.23 ms).
Troubleshooting¶
Data Not Found¶
Error: Data directory not found: benchmark_runs/tpch/sf001/data
Generate data first:
benchbox run --platform duckdb --benchmark tpch --scale 0.01
Platform Not Available¶
Warning: Platform clickhouse not available
Install the required extra:
uv add benchbox --extra clickhouse
Mixed Platform Types¶
Error: Cannot mix SQL and DataFrame platforms. Use --type to specify.
Either:
Compare only same-type platforms
Explicitly set --type sql or --type dataframe
Memory Issues at Large Scale¶
For large scale factors:
Use streaming-capable platforms (Polars, DuckDB)
Compare fewer platforms at a time
Use --parallel=false for sequential execution
Query Plan Comparison¶
When comparing result files, you can include query plan analysis to understand why performance changed, not just that it changed.
Include Plan Analysis¶
# Compare files with plan analysis
benchbox compare baseline.json current.json --include-plans
# Only show plans that changed significantly (< 90% similar)
benchbox compare baseline.json current.json --include-plans --plan-threshold 0.9
Plan Comparison Output¶
The plan comparison section shows:
| Column | Description |
|---|---|
| Similarity | Overall plan similarity (0-100%) |
| Type Δ | Number of operator type mismatches |
| Prop Δ | Number of property mismatches |
| Perf Δ | Performance change (positive = slower) |
| Status | Plan status (Identical, Similar, Different, REGRESSION) |
Requirements¶
Plan comparison requires benchmark results captured with --capture-plans:
# Capture plans during benchmark runs
benchbox run --platform duckdb --benchmark tpch --capture-plans -o baseline
benchbox run --platform duckdb --benchmark tpch --capture-plans -o current
# Compare with plan analysis
benchbox compare baseline/results.json current/results.json --include-plans
Migration from compare-dataframes¶
The compare-dataframes command is deprecated. Migrate to compare:
# OLD
benchbox compare-dataframes -p polars-df -p pandas-df
# NEW
benchbox compare -p polars-df -p pandas-df
# OLD (SQL vs DataFrame)
benchbox compare-dataframes -p polars-df --vs-sql duckdb
# NEW
benchbox compare -p polars-df -p duckdb
Migration from compare-plans¶
The compare-plans command is deprecated. Migrate to compare --include-plans:
# OLD
benchbox compare-plans --run1 before.json --run2 after.json
# NEW
benchbox compare before.json after.json --include-plans
# OLD with threshold
benchbox compare-plans --run1 before.json --run2 after.json --threshold 0.9
# NEW with threshold
benchbox compare before.json after.json --include-plans --plan-threshold 0.9