PrestoDB Platform Guide
PrestoDB is Meta’s distributed SQL query engine (the original Presto project). BenchBox provides a dedicated PrestoDB adapter that intentionally diverges from the Trino adapter to respect protocol, driver, and dialect differences between the two forks.
Overview
Type: Distributed SQL query engine
Common Use Cases: Legacy Presto clusters, Meta ecosystem deployments, Presto-compatible managed services
BenchBox Extra: uv add benchbox --extra presto
Driver-only install: python -m pip install presto-python-client
SQL Dialect Notes (SQLGlot)
PrestoDB uses the presto dialect identifier (adapter.get_target_dialect() returns "presto").
Function differences vs Trino include:
LOG argument order (base-first), distinct from Trino's override
TRIM, JSON_QUERY, and LISTAGG parsing
Array and aggregation transforms (e.g., ArraySum, GroupConcat, Merge) use Presto-specific rewrites
Use this adapter when you need PrestoDB-specific query rendering; use the Trino adapter for Trino/Starburst workloads.
Quick Start
Install the Presto extra (includes presto-python-client and cloudpathlib):
uv add benchbox --extra presto
from benchbox.platforms.presto import PrestoAdapter
from benchbox import TPCH
adapter = PrestoAdapter(
    host="presto-coordinator.example.com",
    port=8080,
    catalog="hive",
    schema="default",
    username="presto",
    # password="secret",  # Optional basic auth
)
benchmark = TPCH(scale_factor=1.0)
results = benchmark.run_with_platform(adapter)
print(f"Completed in {results.duration_seconds:.2f}s")
CLI Usage
# Minimal PrestoDB run (assumes presto-python-client installed)
benchbox run --platform presto --benchmark tpch --scale 1.0 \
    --host presto-coordinator.example.com \
    --catalog hive --schema default --username presto
Configuration Highlights
Connection: host, port (default 8080), catalog, schema
Authentication: Basic auth via presto-python-client (--username/--password); Kerberos can be layered later if needed
Protocol: http_scheme auto-selects https when a password is provided; SSL verification is configurable via verify_ssl or ssl_cert_path
Dialect: Hard-coded to "presto" to avoid Trino behavior
Session Properties: Can be set via session_properties mapping (applied after connect with SET SESSION)
Data Loading: Uses batch INSERT statements with small batches tuned for PrestoDB; supports memory or Hive catalogs
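The sketch below illustrates three of the behaviors listed above: choosing the HTTP scheme from the presence of a password, turning a session_properties mapping into SET SESSION statements, and splitting rows into small multi-row INSERT batches. It is a standalone illustration only, not BenchBox's implementation. The helper names (resolve_http_scheme, sql_literal, session_statements, batched_inserts), the batch size, and the sample property values are all hypothetical.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List, Mapping, Optional, Sequence


def resolve_http_scheme(password: Optional[str], http_scheme: Optional[str] = None) -> str:
    """Honor an explicit scheme; otherwise pick https only when a password is set."""
    if http_scheme:
        return http_scheme
    return "https" if password else "http"


def sql_literal(value: Any) -> str:
    """Render a Python value as a SQL literal (sketch: None, bool, numbers, strings only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # test bool before int, since bool subclasses int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"  # double embedded single quotes


def session_statements(props: Mapping[str, Any]) -> List[str]:
    """Build one SET SESSION statement per property, in mapping order."""
    return [f"SET SESSION {name} = {sql_literal(value)}" for name, value in props.items()]


def batched_inserts(
    table: str,
    columns: Sequence[str],
    rows: Iterable[Sequence[Any]],
    batch_size: int = 2,  # hypothetical value, chosen only to keep the demo short
) -> Iterator[str]:
    """Yield multi-row INSERT statements holding at most batch_size rows each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    column_list = ", ".join(columns)
    row_iter = iter(rows)
    while True:
        batch = list(islice(row_iter, batch_size))
        if not batch:
            return
        values = ", ".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in batch
        )
        yield f"INSERT INTO {table} ({column_list}) VALUES {values}"


# Example usage with made-up inputs.
scheme = resolve_http_scheme(password="secret")
statements = session_statements({"query_max_run_time": "30m"})
inserts = list(
    batched_inserts(
        "nation",
        ["n_nationkey", "n_name"],
        [(0, "ALGERIA"), (1, "ARGENTINA"), (2, "BRAZIL")],
    )
)
print(scheme)
for sql in statements + inserts:
    print(sql)
```

In a real run, the generated statements would be executed through a cursor on the open connection. Interpolating literals into SQL text as done here is only reasonable for trusted benchmark data.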
When to Choose PrestoDB vs Trino vs Athena
PrestoDB: Existing Presto deployments, Meta ecosystem, or environments pinned to Presto-only features.
Trino: Recommended default for modern clusters and Starburst Enterprise.
Athena: AWS-managed, serverless query service of Presto/Trino lineage.