Platform Comparison Guide¶
This guide explains how to compare benchmark performance across platforms using the unified `benchbox compare` command. You can compare SQL platforms, DataFrame platforms, or both.
Overview¶
BenchBox provides a unified comparison interface supporting:
SQL Platform Comparisons: DuckDB vs SQLite vs PostgreSQL vs ClickHouse
DataFrame Platform Comparisons: Polars vs Pandas vs DataFusion
Mixed Comparisons: Compare SQL and DataFrame platforms together
File-Based Comparisons: Compare previously saved result files
Performance Visualization: Generate charts and reports automatically
Same Data, Different Engines¶
A key BenchBox design principle: benchmark data is generated once as Parquet files and shared across all platforms. When you compare DuckDB, DataFusion, SQLite, and Polars at the same scale factor, every engine reads from the same Parquet dataset. This eliminates data variability and ensures performance differences reflect the engines themselves.
# All three commands read the same generated Parquet files
benchbox run --platform duckdb --benchmark tpch --scale 1 -o results/duckdb.json
benchbox run --platform datafusion --benchmark tpch --scale 1 -o results/datafusion.json
benchbox run --platform polars-df --benchmark tpch --scale 1 -o results/polars.json
# Compare the results
benchbox compare results/duckdb.json results/datafusion.json results/polars.json
Data is generated on the first run and cached in benchmark_runs/<benchmark>/sf<scale>/data/. Subsequent runs at the same scale factor reuse the cached data automatically.
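This caching convention can be checked from a script before launching a long run. A minimal stdlib sketch (the `cached_data_dir` helper is illustrative, not part of BenchBox, and the exact `sf<scale>` formatting is an assumption):

```python
from pathlib import Path


def cached_data_dir(benchmark: str, scale_dir: str, root: str = "benchmark_runs"):
    """Return the cached Parquet data directory if it exists, else None.

    Follows the benchmark_runs/<benchmark>/sf<scale>/data/ layout described
    above; `scale_dir` is the sf<scale> path component (its exact formatting
    is an assumption here, not a documented API).
    """
    data_dir = Path(root) / benchmark / scale_dir / "data"
    return data_dir if data_dir.is_dir() else None
```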
External Table Mode (--table-mode external)¶
Use --table-mode external when you want engines to query staged files directly instead of materializing into native tables:
# Native (default): materialize into platform-managed tables
benchbox run --platform duckdb --benchmark tpch --scale 1 --table-mode native
# External: register views/external tables over staged files
benchbox run --platform duckdb --benchmark tpch --scale 1 --table-mode external
Key rules:
`native` remains the default.
`external` is incompatible with `--tuning tuned`.
Cloud platforms still upload/stage data in object storage during load; they skip COPY/CTAS materialization in external mode.
Current external-mode mechanisms:
| Platform | External Mechanism |
|---|---|
| DuckDB / MotherDuck | |
| DataFusion / Polars | Direct file scan path (…) |
| Athena | External table registration over S3 Parquet (CTAS bypass) |
| Snowflake | External stage + … |
| BigQuery | |
| Redshift | Spectrum external schema + external table over S3 |
| Databricks | |
| ClickHouse Cloud | |
| Trino / Presto | |
| Azure Synapse | PolyBase external data source/file format + … |
Unsupported: SQLite and PostgreSQL.
Quick Start¶
List Available Platforms¶
# List all available platforms
benchbox compare --list-platforms
Output shows installed SQL and DataFrame platforms:
SQL Platforms:
Installed:
duckdb (embedded, analytical)
sqlite (embedded, transactional)
postgresql (server, transactional)
clickhouse (server, analytical)
Cloud (requires configuration):
snowflake (cloud, analytical)
bigquery (cloud, analytical)
databricks (cloud, analytical)
DataFrame Platforms:
Installed:
polars-df (expression, single_node) [lazy, streaming]
pandas-df (pandas, single_node) [standard]
datafusion-df (expression, single_node) [lazy]
Not installed:
pyspark-df (expression, distributed)
cudf-df (pandas, gpu_accelerated)
Compare SQL Platforms¶
# Compare DuckDB vs SQLite on TPC-H SF 0.01
benchbox compare -p duckdb -p sqlite --scale 0.01
Compare DataFrame Platforms¶
# Compare Polars vs Pandas
benchbox compare -p polars-df -p pandas-df --scale 0.01
Compare Result Files¶
# Compare previously saved results
benchbox compare results/duckdb.json results/sqlite.json
Interactive Mode¶
# Launch interactive wizard
benchbox compare
The interactive wizard guides you through:
Choosing comparison type (platforms or files)
Selecting platforms from an interactive list
Configuring benchmark options
Running the comparison
Command Reference¶
benchbox compare [OPTIONS] [RESULT_FILES]...
Modes¶
Mode is automatically detected from arguments:
| Mode | Trigger | Description |
|---|---|---|
| Run Mode | `-p/--platform` options given | Run benchmarks across platforms, then compare |
| File Mode | Result file arguments given | Compare existing result files |
| Interactive | No arguments | Launch interactive wizard |
Platform Selection¶
| Option | Description |
|---|---|
| `-p, --platform` | Platforms to compare (triggers run mode); repeatable |
| `--type` | Platform type (default: auto-detect) |
| `--list-platforms` | Show available platforms and exit |
Benchmark Configuration (run mode)¶
| Option | Default | Description |
|---|---|---|
| `--benchmark` | | Benchmark: tpch, tpcds, ssb, clickbench |
| `--scale` | | Scale factor |
| `--queries` | all | Comma-separated query IDs (e.g., Q1,Q6,Q10) |
| `--warmup` | | Warmup iterations |
| `--iterations` | | Benchmark iterations |
| `--timeout` | | Per-query timeout (seconds) |
| `--parallel` | false | Run platforms in parallel |
Output Options¶
| Option | Default | Description |
|---|---|---|
| `--output` | none | Output directory for results |
| `--format` | | Output format |
| `--generate-charts` | false | Generate ASCII visualization charts in the output directory |
| `--theme` | | Chart color theme (with `--generate-charts`) |
| | auto | Directory containing benchmark data |
Query Plan Options (file mode)¶
| Option | Default | Description |
|---|---|---|
| `--include-plans` | false | Include query plan comparison in output |
| `--plan-threshold` | | Only show plans with similarity below the threshold (0.0-1.0) |
Usage Examples¶
SQL Platform Comparison¶
Compare embedded SQL databases:
benchbox compare \
-p duckdb \
-p sqlite \
-p datafusion \
--scale 0.1 \
--iterations 5
Example output:
SQL Platform Comparison
Platforms: duckdb, sqlite, datafusion
Benchmark: tpch @ SF 0.1
Iterations: 5
Running benchmarks...
============================================================
RESULTS
============================================================
Fastest: duckdb
Slowest: sqlite
Platform Geomean (ms) Total (ms) Success
------------------------------------------------------------
duckdb 45.23 1025.34 100%
datafusion 67.89 1538.67 100%
sqlite 234.56 5318.90 100%
Query Winners:
Q1: duckdb
Q2: datafusion
Q3: duckdb
...
============================================================
DataFrame Platform Comparison¶
Compare DataFrame libraries:
benchbox compare \
-p polars-df \
-p pandas-df \
-p datafusion-df \
--scale 0.1
Mixed SQL and DataFrame Comparison¶
Compare across platform types:
benchbox compare \
-p duckdb \
-p polars-df \
--scale 0.1
Note: Mixed comparisons use SQL execution on SQL platforms and DataFrame API on DataFrame platforms. Results compare execution times for equivalent queries.
Cloud Platform Comparison¶
Compare cloud data warehouses (requires credentials):
benchbox compare \
-p snowflake \
-p bigquery \
-p redshift \
--benchmark tpch \
--scale 1 \
--timeout 600
Generate Reports and Charts¶
Save results with visualization:
benchbox compare \
-p duckdb \
-p sqlite \
--scale 1 \
--output ./comparison_results \
--format markdown \
--generate-charts \
--theme dark
Creates:
comparison_results/
├── comparison.md
├── comparison.json
└── charts/
├── performance_bar.txt
├── distribution_box.txt
└── query_heatmap.txt
Compare Result Files¶
Compare previously saved benchmark results:
# Export results from individual runs
benchbox run --platform duckdb --benchmark tpch --scale 0.1 -o results/duckdb
benchbox run --platform sqlite --benchmark tpch --scale 0.1 -o results/sqlite
# Compare the result files
benchbox compare results/duckdb/results.json results/sqlite/results.json
Specific Query Subset¶
Run only specific queries for faster iteration:
benchbox compare \
-p duckdb \
-p sqlite \
--queries Q1,Q6,Q14,Q17 \
--iterations 10
Parallel Execution¶
Run platforms in parallel for faster comparisons:
benchbox compare \
-p duckdb \
-p sqlite \
-p clickhouse \
--parallel \
--iterations 3
JSON Output for Automation¶
Export results for programmatic analysis:
benchbox compare \
-p duckdb \
-p sqlite \
--format json \
--output ./results
The JSON output includes:
{
"config": {
"platform_type": "sql",
"benchmark": "tpch",
"scale_factor": 0.01,
"iterations": 3
},
"results": [
{
"platform": "duckdb",
"platform_type": "sql",
"query_results": [...],
"geometric_mean_ms": 45.23,
"total_time_ms": 1025.34,
"success_rate": 100.0
},
...
],
"summary": {
"platforms": ["duckdb", "sqlite"],
"fastest_platform": "duckdb",
"slowest_platform": "sqlite",
"speedup_ratio": 5.18,
"query_winners": {"Q1": "duckdb", ...}
}
}
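Because the report is plain JSON, downstream tooling needs no BenchBox dependency to read it. A minimal sketch that extracts the headline numbers (field names follow the sample structure above):

```python
import json


def summarize(report_json: str) -> str:
    """One-line summary from a `benchbox compare` JSON report."""
    summary = json.loads(report_json)["summary"]
    return (
        f"{summary['fastest_platform']} beats {summary['slowest_platform']} "
        f"by {summary['speedup_ratio']:.2f}x"
    )
```

Feeding the sample report above through this helper yields `duckdb beats sqlite by 5.18x`.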
Programmatic API¶
For custom analysis, use the Python API directly:
from benchbox.core.comparison import (
UnifiedBenchmarkSuite,
UnifiedBenchmarkConfig,
PlatformType,
run_unified_comparison,
)
Quick Comparison¶
from benchbox.core.comparison import run_unified_comparison
# Run quick comparison
results = run_unified_comparison(
platforms=["duckdb", "sqlite"],
scale_factor=0.01,
)
for result in results:
print(f"{result.platform}: {result.geometric_mean_ms:.2f}ms")
Full Suite with Configuration¶
from benchbox.core.comparison import (
UnifiedBenchmarkSuite,
UnifiedBenchmarkConfig,
PlatformType,
)
# Configure benchmark
config = UnifiedBenchmarkConfig(
platform_type=PlatformType.SQL,
scale_factor=0.1,
benchmark="tpch",
query_ids=["Q1", "Q6", "Q10"],
warmup_iterations=2,
benchmark_iterations=5,
)
# Create suite
suite = UnifiedBenchmarkSuite(config=config)
# Run comparison
results = suite.run_comparison(platforms=["duckdb", "sqlite"])
# Get summary
summary = suite.get_summary(results)
print(f"Fastest: {summary.fastest_platform}")
print(f"Speedup ratio: {summary.speedup_ratio:.2f}x")
Generate Charts Programmatically¶
from benchbox.core.visualization import ResultPlotter, export_ascii
from benchbox.core.visualization.ascii.base import ASCIIChartOptions
# Load results from JSON files
plotter = ResultPlotter.from_sources(["results/duckdb.json", "results/sqlite.json"])
# Render individual chart types
from benchbox.core.visualization.ascii import ASCIIBarChart
from benchbox.core.visualization.ascii.bar_chart import BarData
bar_data = [BarData(label=r.platform, value=r.total_time_ms or 0) for r in plotter.results]
chart = ASCIIBarChart(data=bar_data, title="Performance Comparison")
print(chart.render())
# Or use the CLI for automatic chart generation
# benchbox visualize results/duckdb.json results/sqlite.json
Platform Type Detection¶
from benchbox.core.comparison import detect_platform_type, detect_platform_types
# Single platform
platform_type = detect_platform_type("polars-df")
# Returns: PlatformType.DATAFRAME
# Multiple platforms
detected, inconsistent = detect_platform_types(["duckdb", "sqlite", "polars-df"])
# detected: PlatformType.SQL (majority)
# inconsistent: ["polars-df"]
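The majority-vote behavior can be mimicked with a few lines of standalone Python (an illustrative reimplementation, assuming the `-df` suffix marks DataFrame platforms; it is not the library's actual logic):

```python
from collections import Counter


def detect_types(platforms):
    """Majority-vote platform type; returns (majority_type, inconsistent_list).

    Illustrative only: assumes a "-df" suffix marks DataFrame platforms.
    """
    types = {p: ("dataframe" if p.endswith("-df") else "sql") for p in platforms}
    majority, _ = Counter(types.values()).most_common(1)[0]
    inconsistent = [p for p, t in types.items() if t != majority]
    return majority, inconsistent
```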
Platform Categories¶
SQL Platforms¶
| Category | Platforms | Use Case |
|---|---|---|
| Embedded | DuckDB, SQLite, DataFusion | Local analysis, testing |
| Server | PostgreSQL, ClickHouse | Production workloads |
| Cloud | Snowflake, BigQuery, Redshift, Databricks | Enterprise, large scale |
DataFrame Platforms¶
| Category | Platforms | Use Case |
|---|---|---|
| Single Node | Polars, Pandas, DataFusion | In-memory, medium datasets |
| Distributed | PySpark, Dask, Modin | Large datasets, clusters |
| GPU Accelerated | cuDF | CUDA GPU acceleration |
Best Practices¶
1. Start Small¶
Begin with SF 0.01 to verify everything works:
benchbox compare -p duckdb -p sqlite --scale 0.01
2. Use Multiple Iterations¶
Reduce variance with multiple iterations:
benchbox compare \
-p duckdb \
-p sqlite \
--iterations 5 \
--warmup 2
3. Match Your Production Scale¶
| Data Size | Recommended SF |
|---|---|
| < 1 GB | 0.01 - 0.1 |
| 1-10 GB | 0.1 - 1 |
| 10-100 GB | 1 - 10 |
| > 100 GB | 10+ |
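Encoded as code, the table above becomes a quick lookup. An illustrative helper (the name `recommend_scale_factor` and the choice of the lower bound of each range are this guide's, not a BenchBox API):

```python
def recommend_scale_factor(data_gb: float) -> float:
    """Suggest a starting scale factor from production data size in GB,
    per the table above (returns the lower end of each recommended range)."""
    if data_gb < 1:
        return 0.01
    if data_gb < 10:
        return 0.1
    if data_gb < 100:
        return 1.0
    return 10.0
```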
4. Compare Similar Platforms¶
For meaningful comparisons:
Compare platforms with similar deployment models
Use same hardware/resources
Run at the same scale factor
5. Document Your Results¶
Use markdown output for documentation:
benchbox compare \
-p duckdb \
-p sqlite \
--scale 1 \
--format markdown \
--output ./docs/benchmark_results
Interpreting Results¶
Geometric Mean¶
The geometric mean summarizes performance across all queries; lower is better. It is less sensitive to outliers than the arithmetic mean, so a single slow query does not dominate the score.
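Concretely, the geometric mean is the n-th root of the product of per-query times, usually computed via logarithms for numerical stability. A minimal sketch (BenchBox's exact aggregation, e.g. its handling of failed queries, is not shown here):

```python
import math


def geometric_mean(times_ms):
    """Geometric mean of per-query times in ms; lower is better."""
    return math.exp(sum(math.log(t) for t in times_ms) / len(times_ms))
```

For `[10, 10, 10, 1000]` the geometric mean is about 31.6 ms while the arithmetic mean is 257.5 ms, showing how a single outlier dominates the latter.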
Success Rate¶
Percentage of queries that completed successfully; 100% is expected for production-ready platforms.
Query Winners¶
Shows which platform was fastest for each query:
Different platforms excel at different query types
Aggregation vs joins vs filtering
Use for workload-specific decisions
Speedup Ratio¶
Ratio between fastest and slowest platform:
`> 1.0`: the faster platform completed in less time
`~1.0`: the platforms are approximately equal
Large ratios (5x+) indicate major performance differences
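The reported ratio follows directly from the per-platform geometric means. A sketch of the arithmetic (assuming, consistent with the sample output above, that it is slowest divided by fastest):

```python
def speedup_ratio(geomeans_ms):
    """Slowest-over-fastest ratio across per-platform geometric means (ms)."""
    return max(geomeans_ms.values()) / min(geomeans_ms.values())
```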
Troubleshooting¶
Data Not Found¶
Error: Data directory not found: benchmark_runs/tpch/sf001/data
Generate data first:
benchbox run --platform duckdb --benchmark tpch --scale 0.01
Platform Not Available¶
Warning: Platform clickhouse not available
Install the required extra:
uv add benchbox --extra clickhouse
Mixed Platform Types¶
Error: Cannot mix SQL and DataFrame platforms. Use --type to specify.
Either:
Compare only same-type platforms
Explicitly set `--type sql` or `--type dataframe`
Memory Issues at Large Scale¶
For large scale factors:
Use streaming-capable platforms (Polars, DuckDB)
Reduce concurrent platforms compared
Use `--parallel=false` for sequential execution
Query Plan Comparison¶
When comparing result files, you can include query plan analysis to understand why performance changed, not just that it changed.
Include Plan Analysis¶
# Compare files with plan analysis
benchbox compare baseline.json current.json --include-plans
# Only show plans that changed significantly (< 90% similar)
benchbox compare baseline.json current.json --include-plans --plan-threshold 0.9
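The threshold filter is easy to reproduce over raw similarity scores when post-processing results yourself. An illustrative sketch (the dict-of-scores shape here is hypothetical, not BenchBox's schema):

```python
def plans_below_threshold(similarities, threshold):
    """Keep only queries whose plan similarity fell below the threshold."""
    return {q: s for q, s in similarities.items() if s < threshold}
```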
Plan Comparison Output¶
The plan comparison section shows:
| Column | Description |
|---|---|
| Similarity | Overall plan similarity (0-100%) |
| Type Δ | Number of operator type mismatches |
| Prop Δ | Number of property mismatches |
| Perf Δ | Performance change (positive = slower) |
| Status | Plan status (Identical, Similar, Different, REGRESSION) |
Requirements¶
Plan comparison requires benchmark results captured with --capture-plans:
# Capture plans during benchmark runs
benchbox run --platform duckdb --benchmark tpch --capture-plans -o baseline
benchbox run --platform duckdb --benchmark tpch --capture-plans -o current
# Compare with plan analysis
benchbox compare baseline/results.json current/results.json --include-plans