Understanding Benchmark Results¶
Learn how to interpret BenchBox output and what the metrics mean.
Result Overview¶
After running a benchmark, BenchBox produces:
Console summary - Quick overview of timing and validation
JSON results - Detailed metrics for each query
Manifest files - Data generation metadata
Viewing Results¶
# Show latest result summary
benchbox results --limit 1
# Export detailed JSON
benchbox export --last --format json -o results.json
Key Metrics¶
Power Test Timing¶
| Metric | Description |
|---|---|
| Total time | End-to-end benchmark duration |
| Data generation time | Time to generate TPC data |
| Data load time | Time to load data into tables |
| Query execution time | Total query execution time |
Per-Query Metrics¶
Each query result includes:
{
"query_id": "Q1",
"execution_time_ms": 156.4,
"rows_returned": 4,
"status": "SUCCESS",
"validation": {
"expected_rows": 4,
"actual_rows": 4,
"status": "PASS"
}
}
| Field | Meaning |
|---|---|
| execution_time_ms | Query runtime in milliseconds |
| rows_returned | Number of result rows |
| status | SUCCESS, FAILED, or TIMEOUT |
| validation.status | PASS if row count matches expected |
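As a quick illustration, here is a minimal Python sketch that tallies these per-query fields from an exported results file. It assumes the export is a flat JSON array of per-query objects like the one above; if your BenchBox version nests them under a top-level key, adjust accordingly:
import json

# Load an exported results file (see `benchbox export` above).
# Assumes a JSON array of per-query objects.
with open("results.json") as f:
    queries = json.load(f)

total_ms = sum(q["execution_time_ms"] for q in queries)
failed = [q["query_id"] for q in queries if q["status"] != "SUCCESS"]

print(f"{len(queries)} queries, {total_ms:.1f} ms total")
print(f"non-SUCCESS queries: {failed or 'none'}")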
Understanding Validation¶
BenchBox validates query correctness by comparing row counts:
| Status | Meaning | Action |
|---|---|---|
| PASS | Row count matches expected | No action needed |
| FAIL | Row count differs from expected | Check query translation |
| SKIPPED | No expected value available | Normal for some queries |
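Building on the sketch above, you can surface every query whose validation did not pass from the same exported file (again assuming a JSON array of per-query objects):
import json

with open("results.json") as f:
    queries = json.load(f)

# Flag any query whose row count did not match the expected value.
for q in queries:
    v = q.get("validation", {})
    if v.get("status") != "PASS":
        print(q["query_id"], v.get("status"),
              "expected", v.get("expected_rows"),
              "got", v.get("actual_rows"))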
TPC Metrics¶
For TPC-H/TPC-DS, BenchBox calculates official metrics:
QphH (TPC-H) / QphDS (TPC-DS)¶
The Queries per Hour metric measures throughput:
QphH = (SF × 22 × 3600) / T_power
Where:
SF = Scale Factor (0.01, 0.1, 1, 10, etc.)
22 = Number of TPC-H queries
T_power = Power test time in seconds
Higher is better. Compare QphH only at the same scale factor.
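As a concrete example with hypothetical numbers, a TPC-H run at SF=1 whose power test takes 120 seconds works out as follows:
# Hypothetical numbers for illustration only.
sf = 1          # scale factor
n_queries = 22  # TPC-H query count
t_power = 120   # power test time in seconds

qphh = (sf * n_queries * 3600) / t_power
print(qphh)  # 660.0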
Price/Performance¶
Not calculated by BenchBox (requires cost data), but you can derive it:
Price/Performance = (Platform Cost) / QphH
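For example, a platform costing a hypothetical $5,000 that achieves the QphH of 660 from the example above comes out to roughly $7.58 per QphH. Lower is better for this metric.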
Comparing Results¶
# Run on two platforms
benchbox run --platform duckdb --benchmark tpch -o duckdb.json
benchbox run --platform sqlite --benchmark tpch -o sqlite.json
# Compare results
benchbox compare duckdb.json sqlite.json
The comparison shows:
Per-query timing differences
Relative speedup/slowdown
Validation status alignment
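If you prefer to script your own comparison, a minimal sketch might look like this. It assumes each exported file (duckdb.json and sqlite.json from the commands above) is a JSON array of per-query objects with the fields shown earlier:
import json

def load(path):
    # Map query_id -> execution time; assumes a JSON array
    # of per-query objects as in the export shown earlier.
    with open(path) as f:
        return {q["query_id"]: q["execution_time_ms"] for q in json.load(f)}

a, b = load("duckdb.json"), load("sqlite.json")
for qid in sorted(a.keys() & b.keys()):
    speedup = b[qid] / a[qid]  # >1 means duckdb was faster on this query
    print(f"{qid}: duckdb {a[qid]:.1f} ms, sqlite {b[qid]:.1f} ms, {speedup:.2f}x")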
Result File Locations¶
BenchBox stores results in:
benchmark_runs/
├── results/ # JSON result files
│ ├── tpch_duckdb_sf0.01_*.json
│ └── tpch_sqlite_sf0.01_*.json
├── datagen/ # Generated data
│ └── tpch_sf0.01/
│ ├── lineitem.csv
│ ├── orders.csv
│ └── ...
└── manifests/ # Data generation metadata
└── tpch_sf0.01_manifest.json
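Because result files are plain JSON in a predictable directory, a small sketch like the following picks up the most recent one (path layout as shown above; adjust the glob pattern to your runs):
from pathlib import Path

# Newest result file under benchmark_runs/results/ by modification time.
results_dir = Path("benchmark_runs/results")
latest = max(results_dir.glob("*.json"), key=lambda p: p.stat().st_mtime)
print(latest)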
Interpreting Slow Queries¶
If a query is unexpectedly slow:
Check scale factor - Larger data takes longer
Review the query plan - Use the platform’s EXPLAIN
Compare baselines - Run on DuckDB for reference
Check validation - Ensure correct results
# Export query SQL for analysis
benchbox run --dry-run ./analysis --platform duckdb --benchmark tpch
# Queries are in ./analysis/queries/
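Before diving into EXPLAIN plans, it can help to rank queries by runtime so you know where the time goes. A minimal sketch, again assuming an exported JSON array of per-query objects as shown earlier:
import json

with open("results.json") as f:
    queries = json.load(f)

# Print the five slowest queries by execution time.
for q in sorted(queries, key=lambda q: q["execution_time_ms"], reverse=True)[:5]:
    print(q["query_id"], f"{q['execution_time_ms']:.1f} ms")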
Next Steps¶
Comparing Platforms - Run on multiple databases
Result Export Documentation - Full JSON schema
Performance Guide - Optimization tips