Primitives Query Catalog¶
The Read Primitives benchmark queries now live outside Python source in
`benchbox/core/primitives/catalog/queries.yaml`. This keeps the runtime module
small and makes it easier to audit or extend the workload.
File layout¶
```yaml
version: 1
queries:
  - id: aggregation_distinct
    category: aggregation
    sql: |-
      -- Distinct count of high cardinality key on a large table
      SELECT ...
```
- `id` – unique identifier referenced by benchmarks and tests.
- `category` – lower-case grouping used for `get_queries_by_category`; omit to default to the `id` prefix.
- `sql` – literal SQL string (keep the leading newline so comparisons remain stable).
- `description` (optional) – free-form text for future tooling.
- `variants` (optional) – dict mapping dialect names to platform-specific SQL.
- `skip_on` (optional) – list of dialects where this query should be skipped.
Platform-Specific Variants¶
Some queries require dialect-specific SQL due to incompatible syntax or missing features. The catalog supports two approaches:
Query Variants¶
Define alternative SQL for specific platforms using the `variants` field:
```yaml
- id: json_aggregates
  category: json
  sql: |-
    -- Standard SQL version (MySQL syntax)
    SELECT
      JSON_ARRAYAGG(p_name) as part_names,
      JSON_OBJECTAGG(p_partkey, p_retailprice) as part_prices
    FROM part
    LIMIT 100
  variants:
    duckdb: |-
      -- DuckDB uses different function names
      SELECT
        JSON_GROUP_ARRAY(p_name) as part_names,
        JSON_GROUP_OBJECT(p_partkey, p_retailprice) as part_prices
      FROM part
      LIMIT 100
```
When to use variants:

- Platform uses different function names (e.g., `JSON_ARRAYAGG` vs `JSON_GROUP_ARRAY`)
- Platform requires different syntax for the same functionality (e.g., `MATCH ... AGAINST` vs `LIKE`)
- Query needs structural changes but tests the same capability (e.g., a CTE to avoid nested aggregates)
How it works:

- When `get_query(query_id, dialect="duckdb")` is called, the query manager returns the DuckDB variant if it exists.
- If no variant exists for the requested dialect, the base `sql` query is returned.
- Variant SQL goes through the same translation pipeline as base queries.
Skipping Queries¶
Some queries cannot be meaningfully tested on certain platforms due to data limitations or fundamental incompatibilities. Use `skip_on` to skip them explicitly:
```yaml
- id: json_extract_simple
  category: json
  sql: |-
    SELECT
      o_orderkey,
      JSON_EXTRACT(o_comment, '$.priority') as order_priority
    FROM orders
    WHERE JSON_EXTRACT(o_comment, '$.priority') IS NOT NULL
  skip_on: [duckdb]
  description: "TPC-H o_comment contains plain text, not valid JSON"
```
When to use `skip_on`:

- Data quality issues make the query meaningless (e.g., JSON functions on non-JSON data)
- Platform fundamentally lacks the required feature and no reasonable alternative exists
- Query would always fail due to platform limitations, not bugs
How it works:

- When `get_queries(dialect="duckdb")` is called, queries with `skip_on: [duckdb]` are excluded.
- `get_query(query_id, dialect="duckdb")` raises `ValueError` if the query is in the `skip_on` list.
- The benchmark treats this as expected behavior, not a failure.
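A minimal sketch of this filtering and lookup behavior follows. The function names mirror the manager methods described above, but the standalone implementations and the plain-dict catalog are illustrative assumptions, not the real API:

```python
def _skipped(entry: dict, dialect: str) -> bool:
    """True when `dialect` appears in the entry's skip_on list (case-insensitive)."""
    return dialect.lower() in (d.lower() for d in entry.get("skip_on", []))

def get_queries(catalog: list[dict], dialect: str) -> list[dict]:
    """Return all catalog entries that are not skipped for `dialect`."""
    return [q for q in catalog if not _skipped(q, dialect)]

def get_query(catalog: list[dict], query_id: str, dialect: str) -> dict:
    """Look up a single entry; raise ValueError when it is skipped for `dialect`."""
    for q in catalog:
        if q["id"] == query_id:
            if _skipped(q, dialect):
                raise ValueError(f"query {query_id!r} is skipped on {dialect}")
            return q
    raise ValueError(f"unknown query id: {query_id!r}")

catalog = [
    {"id": "json_extract_simple", "sql": "SELECT ...", "skip_on": ["duckdb"]},
    {"id": "aggregation_distinct", "sql": "SELECT COUNT(DISTINCT ...) ..."},
]
print([q["id"] for q in get_queries(catalog, "duckdb")])  # skipped query excluded
```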
Best Practices¶
Prefer variants over `skip_on`:

- Only skip when no reasonable alternative exists
- Most syntax differences can be handled with variants

Keep variants semantically equivalent:

- Variants should test the same database capability
- Results may differ slightly, but behavior should be comparable
- Document why the variant is needed in a comment
Dialect names:

- Use lowercase: `duckdb`, `bigquery`, `snowflake`, `postgres`
- Must match the `dialect` parameter passed to `get_queries()`
- Matching is case-insensitive during query retrieval
Validation:

- Run the benchmark with the variant to ensure it works: `benchbox run --platform duckdb --benchmark read_primitives --scale 1`
- Check the success rate in the results
- Unit tests automatically validate catalog structure
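The kind of structural checks those unit tests can make is sketched below. The `validate_catalog` helper is illustrative, not the actual test code in `tests/unit/primitives/test_query_catalog.py`; it checks the invariants described in this document (unique ids, a non-empty `sql` field, lowercase dialect names):

```python
def validate_catalog(entries: list[dict]) -> list[str]:
    """Return a list of structural problems found in the parsed catalog."""
    problems: list[str] = []
    seen: set[str] = set()
    for entry in entries:
        qid = entry.get("id")
        if not qid:
            problems.append("entry missing id")
            continue
        if qid in seen:
            problems.append(f"duplicate id: {qid}")
        seen.add(qid)
        if not entry.get("sql"):
            problems.append(f"{qid}: missing sql")
        # Both variants keys and skip_on entries name dialects.
        dialects = list(entry.get("variants", {})) + list(entry.get("skip_on", []))
        for dialect in dialects:
            if dialect != dialect.lower():
                problems.append(f"{qid}: dialect {dialect!r} should be lowercase")
    return problems
```

A test can then assert that `validate_catalog(parsed_yaml["queries"])` returns an empty list.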
Example: Complex Variant (timeseries_trend_analysis)¶
```yaml
- id: timeseries_trend_analysis
  category: timeseries
  sql: |-
    -- Standard SQL with nested aggregate in REGR_SLOPE
    SELECT
      DATE_TRUNC('month', o_orderdate) as order_month,
      COUNT(*) as order_count,
      REGR_SLOPE(SUM(o_totalprice), EXTRACT(EPOCH FROM o_orderdate)) as trend
    FROM orders
    GROUP BY DATE_TRUNC('month', o_orderdate)
  variants:
    duckdb: |-
      -- DuckDB doesn't allow nested aggregates, use CTE + window function
      WITH monthly_totals AS (
        SELECT
          DATE_TRUNC('month', o_orderdate) as order_month,
          COUNT(*) as order_count,
          SUM(o_totalprice) as monthly_revenue,
          EXTRACT(EPOCH FROM DATE_TRUNC('month', o_orderdate)) as month_epoch
        FROM orders
        GROUP BY DATE_TRUNC('month', o_orderdate)
      )
      SELECT
        order_month,
        order_count,
        monthly_revenue,
        REGR_SLOPE(monthly_revenue, month_epoch) OVER () as trend
      FROM monthly_totals
      ORDER BY order_month
```
This example shows:

- Structural changes (base query → CTE with window function)
- The same capability tested (linear regression on time series data)
- A clear comment explaining why the variant is needed
Editing workflow¶
1. Update `queries.yaml`, keeping entries ordered and grouped by category. The file is packaged with the library, so every change must live under source control.
2. Validate the catalog locally:

```shell
uv run -- python - <<'PY'
from benchbox.core.primitives.queries import PrimitivesQueryManager

manager = PrimitivesQueryManager()
print(f"queries: {len(manager.get_all_queries())}")
print(f"categories: {manager.get_query_categories()}")
PY
uv run -- python -m pytest tests/unit/primitives/test_query_catalog.py
```

   The test suite exercises catalog parsing, duplicate detection, and category filters.
3. Run the full test suite (`uv run -- python -m pytest`) before committing.
Adding many queries¶
When bulk-loading SQL from external sources, prefer authoring a small helper
script under `_project/` that outputs the YAML structure (see existing history
for examples). Temporary tooling should stay in `_project/` and be removed or
ignored once the catalog has been updated.
Attribution and Lineage¶
Query Sources¶
The Read Primitives catalog combines queries from multiple sources:
1. Apache Impala targeted-perf (majority of queries)¶
Source: https://github.com/apache/impala/tree/master/testdata/workloads/targeted-perf
License: Apache License 2.0
Original format: ~47 .test files covering database primitives
Categories covered: aggregation, broadcast joins, filters, groupby, exchange, orderby, shuffle, string operations
2. Optimizer Sniff Tests (13 optimizer_* queries)¶
Author: Justin Jaffray
Source: https://buttondown.com/jaffray/archive/a-sniff-test-for-some-query-optimizers/
Purpose: Systematic testing of query optimizer capabilities
Tests covered: subquery decorrelation, predicate pushdown, join reordering, constant folding, etc.
Evolution¶
Apache Impala targeted-perf + Optimizer Sniff Tests → BenchBox Read Primitives
Changes from source materials:
- Format conversion: Impala `.test` format + blog concepts → unified YAML catalog
- Naming standardization: dropped the `primitive_` prefix for cleaner query IDs
- Query modernization: added SQL:2016+ features (window functions, QUALIFY, MIN_BY/MAX_BY)
- Category expansion: added JSON, full-text, statistical, and OLAP categories
- Documentation: enhanced with inline comments explaining test purposes
- Optimizer formalization: converted blog post concepts into a systematic 13-query test suite
Licenses¶
Impala-derived content: Apache License 2.0 (Apache Software Foundation)
Optimizer queries: Based on publicly shared concepts (Justin Jaffray)
BenchBox additions: MIT License (Copyright 2026 Joe Harris / BenchBox Project)