Databricks Liquid Clustering
BenchBox provides first-class control over the Databricks clustering strategy, so you can run reproducible comparisons across:
- liquid_clustering
- z_order
- none
Tuning Configuration
Set the strategy under the unified tuning key platform_optimizations:
platform_optimizations:
  databricks_clustering_strategy: liquid_clustering  # liquid_clustering | z_order | none
  liquid_clustering_enabled: true
  liquid_clustering_columns:
    - event_time
    - customer_id
The z_ordering_enabled configuration is also supported.
Precedence Rules
BenchBox resolves Databricks strategy with explicit precedence:
1. liquid_clustering_enabled, or a non-empty liquid_clustering_columns
2. databricks_clustering_strategy
3. z_ordering_enabled
If both liquid clustering and Z-ORDER settings are supplied, liquid clustering wins.
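The precedence order above can be sketched as a small function. This is a minimal illustration, not BenchBox's actual resolver: the function name `resolve_clustering_strategy` is hypothetical, and falling back to `none` when no setting is supplied is an assumption.

```python
from typing import Any, Mapping


def resolve_clustering_strategy(opts: Mapping[str, Any]) -> str:
    """Hypothetical helper that mirrors the documented precedence order."""
    # 1. The liquid clustering flag, or a non-empty column list, wins outright.
    if opts.get("liquid_clustering_enabled") or opts.get("liquid_clustering_columns"):
        return "liquid_clustering"
    # 2. Otherwise an explicitly named strategy is used.
    strategy = opts.get("databricks_clustering_strategy")
    if strategy:
        return str(strategy)
    # 3. Finally, fall back to the Z-ORDER flag.
    if opts.get("z_ordering_enabled"):
        return "z_order"
    # Assumption: with nothing configured, no clustering is applied.
    return "none"


# Both liquid clustering and Z-ORDER are set, so liquid clustering wins.
both = {"liquid_clustering_enabled": True, "z_ordering_enabled": True}
print(resolve_clustering_strategy(both))
```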
CLI Overrides
You can override the strategy at runtime via --platform-option.
benchbox run \
  --platform databricks \
  --benchmark tpch \
  --scale 1 \
  --tuning ./databricks-tuning.yaml \
  --platform-option databricks_clustering_strategy=liquid_clustering \
  --platform-option liquid_clustering_columns=event_time,customer_id
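To make the KEY=VALUE shape of these overrides concrete, here is a rough sketch of how such pairs could be turned into option values. It is not BenchBox's parser: `parse_platform_options` is a hypothetical name, and splitting `liquid_clustering_columns` on commas is an assumption based on the command above.

```python
from typing import Dict, List, Union

OptionValue = Union[str, List[str]]


def parse_platform_options(pairs: List[str]) -> Dict[str, OptionValue]:
    """Hypothetical parser for repeated --platform-option KEY=VALUE arguments."""
    parsed: Dict[str, OptionValue] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        if key == "liquid_clustering_columns":
            # Assumption: the column list is comma-separated.
            parsed[key] = [col.strip() for col in value.split(",") if col.strip()]
        else:
            parsed[key] = value
    return parsed


overrides = parse_platform_options(
    [
        "databricks_clustering_strategy=liquid_clustering",
        "liquid_clustering_columns=event_time,customer_id",
    ]
)
print(overrides)
```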
Switching from Z-ORDER to Liquid Clustering
- Keep existing Z-ORDER configs unchanged if you need strict historical comparability.
- Add databricks_clustering_strategy: liquid_clustering in a new config variant for A/B runs.
- Pin explicit liquid_clustering_columns to avoid accidental drift between runs.
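The steps above can be sketched as a small helper that derives a liquid clustering variant from an existing config without touching the original. It assumes the tuning file has already been loaded into a dictionary with a top-level `platform_optimizations` key; `make_liquid_variant` is a hypothetical name and not part of BenchBox.

```python
import copy
from typing import Any, Dict, List


def make_liquid_variant(base: Dict[str, Any], columns: List[str]) -> Dict[str, Any]:
    """Hypothetical helper: copy a tuning config and switch it to liquid clustering."""
    if not columns:
        raise ValueError("pin at least one clustering column")
    # Deep copy so the Z-ORDER baseline stays unchanged for historical comparisons.
    variant = copy.deepcopy(base)
    opts = variant.setdefault("platform_optimizations", {})
    opts["databricks_clustering_strategy"] = "liquid_clustering"
    # Pin the columns explicitly so repeated runs cluster on the same keys.
    opts["liquid_clustering_columns"] = list(columns)
    return variant


zorder_config = {"platform_optimizations": {"z_ordering_enabled": True}}
liquid_config = make_liquid_variant(zorder_config, ["event_time", "customer_id"])
print(liquid_config)
```

The variant still carries `z_ordering_enabled`, which is harmless under the documented precedence because a non-empty `liquid_clustering_columns` takes priority.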
A/B Comparison With Z-ORDER
Z-ORDER run:
benchbox run --platform databricks --benchmark tpch --scale 1 --tuning ./databricks-zorder.yaml
Liquid clustering run:
benchbox run --platform databricks --benchmark tpch --scale 1 --tuning ./databricks-liquid.yaml
Compare outputs:
benchbox compare <zorder.json> <liquid.json>
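If you script the two runs, a thin wrapper keeps every argument except the tuning file identical. The sketch below only assembles the `benchbox run` commands shown above. The helper names are hypothetical, and `dry_run` defaults to printing the commands rather than executing them.

```python
import subprocess
from typing import List


def build_run_command(tuning_path: str) -> List[str]:
    """Build the benchbox run command, varying only the tuning file."""
    return [
        "benchbox", "run",
        "--platform", "databricks",
        "--benchmark", "tpch",
        "--scale", "1",
        "--tuning", tuning_path,
    ]


def run_ab(tuning_paths: List[str], dry_run: bool = True) -> List[List[str]]:
    """Run (or just print) one benchmark command per tuning file."""
    commands = [build_run_command(path) for path in tuning_paths]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the A/B sequence if a run fails.
            subprocess.run(cmd, check=True)
    return commands


commands = run_ab(["./databricks-zorder.yaml", "./databricks-liquid.yaml"])
```

Call `run_ab(..., dry_run=False)` to execute the runs, then compare the resulting files with `benchbox compare`.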
Result Metadata
Databricks platform metadata includes:
- databricks_clustering_strategy
- liquid_clustering_enabled
- liquid_clustering_columns_config
- liquid_clustering_operations
- z_order_operations
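To confirm which strategy a run actually used, you can pull these keys out of a result file. This page does not describe where the metadata sits in the result JSON, so the sketch below searches the whole document for the key names. The helper name and the sample structure are hypothetical.

```python
import json
from typing import Any, Dict, Optional

CLUSTERING_KEYS = (
    "databricks_clustering_strategy",
    "liquid_clustering_enabled",
    "liquid_clustering_columns_config",
    "liquid_clustering_operations",
    "z_order_operations",
)


def collect_clustering_metadata(
    node: Any, found: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Walk nested dicts and lists, keeping the first value seen for each key."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key in CLUSTERING_KEYS:
                found.setdefault(key, value)
            else:
                collect_clustering_metadata(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_clustering_metadata(item, found)
    return found


# Hypothetical result layout, used only to exercise the helper.
sample = json.loads(
    '{"platform": {"metadata": {'
    '"databricks_clustering_strategy": "liquid_clustering", '
    '"liquid_clustering_columns_config": ["event_time", "customer_id"]}}}'
)
metadata = collect_clustering_metadata(sample)
print(metadata)
```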