Query Plan Analysis¶
BenchBox supports capturing, analyzing, and comparing query execution plans across different database platforms. This feature enables:
Cross-platform comparison: Understand how different databases optimize the same query
Regression detection: Track plan changes between software versions
Optimization analysis: Identify query optimization opportunities
Quick Start¶
# 1. Run benchmark with plan capture
benchbox run --platform duckdb --benchmark tpch --scale 1 --capture-plans
# 2. View a specific plan
benchbox show-plan --run benchmark_runs/latest/results.json --query-id q05
# 3. Compare plans between two runs
benchbox compare-plans \
  --run1 run_before.json \
  --run2 run_after.json \
  --query-id q05
Capturing Query Plans¶
Basic Capture¶
Add the --capture-plans flag to any benchmark run:
benchbox run \
  --platform duckdb \
  --benchmark tpch \
  --scale 1 \
  --capture-plans
This captures the logical query plan for each query executed during the benchmark.
Supported Platforms¶
Currently supported platforms for query plan capture:
| Platform | Parser Status | EXPLAIN Format | Notes |
|---|---|---|---|
| DuckDB | ✓ Stable | Text (box) | Unicode box-drawing characters |
| SQLite | ✓ Stable | Text (tree) | Simple tree format |
| PostgreSQL | ✓ Stable | JSON | Requires PostgreSQL 12+ |
| Redshift | ✓ Beta | Text | Supports XN-prefixed operators |
| DataFusion | ✓ Beta | Text (indent) | Physical plan operators |
Plans are captured using platform-specific EXPLAIN commands and parsed into a unified logical representation.
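For intuition, here is a minimal sketch of what a capture step looks like conceptually, using the duckdb Python package directly. This is not BenchBox's internal parser; it only illustrates that EXPLAIN output is ordinary query output that can be collected and parsed:

import duckdb

con = duckdb.connect()
# EXPLAIN returns (explain_key, explain_value) rows; the value holds the plan text
for key, value in con.execute("EXPLAIN SELECT 42 AS answer").fetchall():
    print(key)    # e.g. 'physical_plan'
    print(value)  # box-drawing plan tree, as noted in the table above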
Performance Impact¶
Plan capture has minimal performance impact:
Adds ~10-50ms per query
Does not affect benchmark timing measurements
Failed plan captures are logged but don’t halt execution
Viewing Plans¶
Tree View (Default)¶
Display a plan as an ASCII tree:
benchbox show-plan \
  --run results.json \
  --query-id q05
Output example:
Query Plan: q05
Platform: duckdb
Cost: 500.25 | Rows: 50
└── Aggregate (aggs=[COUNT(*), SUM(o_totalprice)])
    └── Join (type=inner)
        ├── Filter (filter=o_orderdate > '2023-01-01')
        │   └── Scan (table=orders)
        └── Scan (table=customer)
Summary View¶
Show statistics without the full tree:
benchbox show-plan \
  --run results.json \
  --query-id q05 \
  --format summary
Output example:
Query: q05 (duckdb)
Total Operators: 5
Max Depth: 3
Estimated Cost: 500.25
Estimated Rows: 50
Operator Breakdown:
  Scan: 2
  Filter: 1
  Join: 1
  Aggregate: 1
JSON Export¶
Export plan for programmatic analysis:
benchbox show-plan \
  --run results.json \
  --query-id q05 \
  --format json > plan_q05.json
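The exported JSON follows BenchBox's plan model. The sketch below assumes operator nodes carry operator_type and children keys (mirroring the Python API shown later); adapt the key names to the actual export:

import json

with open('plan_q05.json') as f:
    plan = json.load(f)

def walk(node, depth=0):
    # Assumed keys: 'operator_type' and 'children' (check the real schema)
    print('  ' * depth + str(node.get('operator_type', '<unknown>')))
    for child in node.get('children', []):
        walk(child, depth + 1)

# The root may be nested under a 'logical_root' key; fall back to the document itself
walk(plan.get('logical_root', plan))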
Visualization Options¶
Control tree display:
# Compact view without operator properties
benchbox show-plan --run results.json --query-id q05 --compact --no-properties
# Limit tree depth for very complex plans
benchbox show-plan --run results.json --query-id q05 --max-depth 3
Comparing Plans¶
Compare Single Query¶
Compare the same query between two benchmark runs:
benchbox compare-plans \
  --run1 results_duckdb.json \
  --run2 results_datafusion.json \
  --query-id q05
Output example:
================================================================================
QUERY PLAN COMPARISON
================================================================================
Left: q05 (duckdb)
Right: q05 (datafusion)
Plans are very similar (85.3% similarity)
Similarity Metrics:
  Overall: 85.3%
  Structural: 100.0%
  Operator: 100.0%
  Property: 66.7%
Operators: 5 (left) vs 5 (right)
  Matching: 4
  Property mismatches: 1
Property Differences (1):
  • Join type: inner ≠ hash_join
================================================================================
Compare All Queries¶
Compare all queries from two runs:
benchbox compare-plans \
  --run1 before_optimization.json \
  --run2 after_optimization.json
Output example:
┌───────┬────────────┬──────────┬──────────┬─────────────┬─────────────────────┐
│ Query │ Similarity │ Type Diff│ Prop Diff│ Struct Diff │ Status │
├───────┼────────────┼──────────┼──────────┼─────────────┼─────────────────────┤
│ q01 │ 98.5% │ - │ 1 │ - │ ✓ Nearly Identical │
│ q02 │ 100.0% │ - │ - │ - │ ✓ Nearly Identical │
│ q03 │ 87.2% │ 1 │ 2 │ - │ ≈ Very Similar │
│ q05 │ 45.8% │ 5 │ 3 │ 2 │ ✗ Different │
└───────┴────────────┴──────────┴──────────┴─────────────┴─────────────────────┘
Summary: 4 queries compared
  Nearly Identical (≥95%): 2
  Very Similar (75-95%): 1
  Different (<50%): 1
Regression Detection¶
Show only queries with significant plan changes:
benchbox compare-plans \
  --run1 version_1.2.json \
  --run2 version_1.3.json \
  --threshold 0.9
This shows only queries with <90% similarity, helping identify potential regressions.
JSON Export¶
Export comparison results for further analysis:
benchbox compare-plans \
  --run1 run_a.json \
  --run2 run_b.json \
  --output json > comparison_results.json
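As one hedged example of downstream analysis, the sketch below assumes the export contains a per-query list with query IDs and similarity scores; the field names (queries, similarity, query_id) are illustrative, not a documented schema:

import json

with open('comparison_results.json') as f:
    report = json.load(f)

# Flag queries below the 'Very Similar' band; key names are assumed
for entry in report.get('queries', []):
    if entry.get('similarity', 1.0) < 0.75:
        print(f"{entry.get('query_id')}: {entry.get('similarity', 0):.1%}")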
Understanding Plan Differences¶
Similarity Metrics¶
The comparison engine provides four similarity scores:
Overall Similarity (0-100%)
Weighted average of all metrics
Formula: 40% structural + 40% operator + 20% property
Best indicator of plan similarity
Structural Similarity
Measures tree structure matching
Counts operators at each level
100% = same number of operators at each level
Operator Similarity
Measures operator type matching
Compares Scan, Join, Filter, Aggregate, etc.
100% = all operators have matching types
Property Similarity
Measures property matching when types match
Compares table names, join types, filters, etc.
100% = all properties identical
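As a worked example of the 40/40/20 weighting: a plan pair scoring 90% structural, 80% operator, and 70% property similarity yields an overall score of 82%.

def overall_similarity(structural: float, operator: float, prop: float) -> float:
    """Weighted average: 40% structural + 40% operator + 20% property."""
    return 0.40 * structural + 0.40 * operator + 0.20 * prop

print(f"{overall_similarity(0.90, 0.80, 0.70):.1%}")  # 0.36 + 0.32 + 0.14 -> 82.0%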
Difference Types¶
Type Mismatches
Different operator types at same position
Example: Scan vs IndexScan
Often indicates algorithmic differences
Property Mismatches
Same operator type, different properties
Example: INNER JOIN vs LEFT JOIN
Usually indicates optimizer choices
Structure Mismatches
Different tree structure
Example: Different number of children
Indicates major plan reorganization
Interpretation Guide¶
| Similarity | Interpretation | Common Causes |
|---|---|---|
| ≥95% | Nearly Identical | Minor property changes, equivalent optimizations |
| 75-95% | Very Similar | Different join orders, equivalent algorithms |
| 50-75% | Somewhat Similar | Different optimization strategies, same query |
| <50% | Different | Major algorithmic differences, possibly different queries |
Programmatic Usage¶
Python API¶
Use query plan models and comparison programmatically:
import json

from benchbox.core.results.models import BenchmarkResults
from benchbox.core.query_plans.comparison import compare_query_plans
from benchbox.core.query_plans.visualization import render_plan

# Load results
with open('results.json') as f:
    results = BenchmarkResults.from_dict(json.load(f))

# Get a query execution
query_exec = results.phases['power'].queries[0]
plan = query_exec.query_plan

# Render plan
print(render_plan(plan))

# Compare two plans (plan1 and plan2 obtained the same way from two runs)
comparison = compare_query_plans(plan1, plan2)
print(f"Similarity: {comparison.similarity.overall_similarity:.1%}")
print(f"Type mismatches: {comparison.similarity.type_mismatches}")
Custom Analysis¶
Traverse plan trees programmatically:
# LogicalOperatorType lives in BenchBox's query plan models (import path may vary)
from benchbox.core.query_plans.models import LogicalOperatorType

def count_scans(plan):
    """Count total scan operations in a plan."""
    def count_in_operator(op):
        count = 1 if op.operator_type == LogicalOperatorType.SCAN else 0
        if op.children:
            for child in op.children:
                count += count_in_operator(child)
        return count
    return count_in_operator(plan.logical_root)

# Analyze plans
num_scans = count_scans(query_exec.query_plan)
print(f"Total scans: {num_scans}")
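The same traversal pattern covers other summary statistics. For example, a sketch of the max-depth metric reported by the summary view:

def max_depth(plan):
    """Compute maximum tree depth (the 'Max Depth' shown in summary output)."""
    def depth_of(op):
        if not op.children:
            return 1
        return 1 + max(depth_of(child) for child in op.children)
    return depth_of(plan.logical_root)

print(f"Max depth: {max_depth(query_exec.query_plan)}")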
Plan Fingerprints¶
Use fingerprints for fast plan comparison:
# Check if plans are identical
if plan1.plan_fingerprint == plan2.plan_fingerprint:
    print("Plans are identical")
else:
    print("Plans differ")

# Group queries by plan
plans_by_fingerprint = {}
for query_exec in all_queries:
    fp = query_exec.query_plan.plan_fingerprint
    if fp not in plans_by_fingerprint:
        plans_by_fingerprint[fp] = []
    plans_by_fingerprint[fp].append(query_exec.query_id)

# Find queries with same plan
for fp, query_ids in plans_by_fingerprint.items():
    if len(query_ids) > 1:
        print(f"Queries {query_ids} share the same plan")
Troubleshooting¶
Plan Not Captured¶
Symptom: Warning message “No query plan captured for query”
Causes:
Forgot the --capture-plans flag during the benchmark run
Platform doesn't support plan capture
Parser error (check logs for details)
Solution:
# Ensure --capture-plans is included
benchbox run --platform duckdb --benchmark tpch --scale 1 --capture-plans
# Check which platforms support capture
benchbox platforms
Parser Errors¶
Symptom: Plan capture succeeds but plan is None
Causes:
EXPLAIN output format changed in newer platform version
Complex query with unusual operators
Platform-specific EXPLAIN extensions
Solution:
Check benchmark logs for detailed error messages
File an issue with the EXPLAIN output for investigation
Plan capture failures don’t halt benchmark execution
Performance Issues¶
Symptom: Benchmark runs much slower with --capture-plans
Expected Impact: 10-50ms overhead per query (negligible for most workloads)
If significantly slower:
Check if disk I/O is bottleneck (plan serialization)
Verify platform EXPLAIN performance
Consider capturing plans for subset of queries during development
Memory Usage¶
Symptom: High memory usage with plan capture
Typical Plan Size: 1-10 KB per query in memory, 10-100 KB serialized
For large benchmarks (TPC-DS 99 queries):
Memory: ~10 MB for all plans
Disk: ~10 MB added to results JSON
If memory is constrained, consider running with --phases power to capture fewer queries.
Comparison Shows No Differences¶
Symptom: Plans appear different visually but comparison shows 100% similarity
Cause: Comparison ignores non-structural properties like:
Operator IDs (internal identifiers)
Cost estimates (platform-specific)
Row count estimates
This is intentional - comparison focuses on logical plan structure, not execution details.
To compare costs/estimates, examine the JSON export directly.
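For example, a hedged sketch that diffs cost estimates between two exported plans; the estimated_cost field name is an assumption mirroring the "Estimated Cost" line in the summary view, so check the actual export:

import json

with open('plan_before.json') as f:
    before = json.load(f)
with open('plan_after.json') as f:
    after = json.load(f)

# 'estimated_cost' is an assumed key; verify against the real JSON export
delta = after.get('estimated_cost', 0.0) - before.get('estimated_cost', 0.0)
print(f"Estimated cost change: {delta:+.2f}")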
Best Practices¶
Development Workflow¶
1. Capture baseline: Run the benchmark with --capture-plans and save the results
2. Make changes: Modify queries, update the database, or change configuration
3. Capture new run: Run the same benchmark again with --capture-plans
4. Compare: Use benchbox compare-plans to identify changes
5. Investigate: For significant differences, use show-plan to inspect details
Cross-Platform Analysis¶
# Run same benchmark on different platforms
benchbox run --platform duckdb --benchmark tpch --scale 1 --capture-plans
benchbox run --platform datafusion --benchmark tpch --scale 1 --capture-plans
# Compare plans
benchbox compare-plans \
  --run1 benchmark_runs/duckdb_*/results.json \
  --run2 benchmark_runs/datafusion_*/results.json

# Focus on interesting queries
benchbox compare-plans \
  --run1 benchmark_runs/duckdb_*/results.json \
  --run2 benchmark_runs/datafusion_*/results.json \
  --query-id q05
Regression Testing¶
# Automated regression check
benchbox compare-plans \
  --run1 baseline.json \
  --run2 current.json \
  --threshold 0.95 \
  --output json > regression_report.json
# Check exit code
if [ $? -ne 0 ]; then
  echo "Plan regressions detected!"
  exit 1
fi
Advanced Topics¶
Plan Fingerprints¶
Plans are fingerprinted using SHA256 of the logical structure:
Included: Operator types, table names, join types, filter expressions, aggregations
Excluded: Operator IDs, costs, row estimates, physical operator details
Identical fingerprints guarantee identical logical plans.
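Conceptually, the fingerprint is the SHA256 of a canonical serialization that keeps only the included fields. A sketch under those assumptions, using dict-shaped operators with illustrative key names (not BenchBox's actual implementation):

import hashlib
import json

def fingerprint(node: dict) -> str:
    """Hash only the logical fields; IDs, costs, and estimates are omitted."""
    def normalize(op):
        return {
            'type': op.get('operator_type'),
            'table': op.get('table'),
            'join_type': op.get('join_type'),
            'filter': op.get('filter'),
            'aggregations': op.get('aggregations'),
            'children': [normalize(c) for c in op.get('children', [])],
        }
    canonical = json.dumps(normalize(node), sort_keys=True)
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()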
Comparison Algorithm¶
The comparison engine uses:
Fast path: SHA256 fingerprint comparison (O(1))
Full comparison: BFS tree traversal (O(n×m) where n, m = tree sizes)
Similarity scoring: Multi-dimensional metrics based on operator matching
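A sketch of the BFS step, assuming operators expose operator_type and children as in the Python API above; structural similarity can then compare the per-level operator counts of two plans (an illustration of the idea, not BenchBox's exact scoring code):

from collections import deque

def operators_by_level(root):
    """Collect operator types level by level via BFS."""
    levels = {}
    queue = deque([(root, 0)])
    while queue:
        op, depth = queue.popleft()
        levels.setdefault(depth, []).append(op.operator_type)
        for child in (op.children or []):
            queue.append((child, depth + 1))
    return levels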
Platform-Specific Notes¶
DuckDB:
Uses EXPLAIN output (text format with box-drawing characters)
Captures logical and physical operators
Fast EXPLAIN execution (typically <1ms)
Requires a UTF-8 terminal for proper display
SQLite:
Uses EXPLAIN QUERY PLAN (text format)
Simpler output than DuckDB
Limited cost information
PostgreSQL:
Uses EXPLAIN (FORMAT JSON) for machine-readable output
Provides detailed cost estimates, row counts, and operator properties
Supports all PostgreSQL node types (Seq Scan, Index Scan, Hash Join, etc.)
Requires PostgreSQL 12+ for full JSON format support
Redshift:
Uses text-based EXPLAIN output
Operators prefixed with "XN" (e.g., XN Seq Scan, XN Hash Join)
Includes distribution operators (DS_DIST_INNER, DS_BCAST_INNER, etc.)
Cost and row estimates are parsed from the output
DataFusion:
Uses indentation-based text format from EXPLAIN
Prefers the physical plan (operators ending in "Exec") over the logical plan
Supports EXPLAIN ANALYZE metrics (output_rows, elapsed_compute, etc.)
Common operators: ProjectionExec, FilterExec, HashJoinExec, AggregateExec
Further Reading¶
API Documentation - Programmatic usage
Platform Guide - Platform-specific details
TPC-H Benchmark Guide - Query plan analysis examples
Support¶
For issues or questions:
Check logs in benchmark_runs/ for detailed error messages