Starburst Platform¶
Starburst Galaxy is a managed Trino service providing serverless distributed SQL query execution. BenchBox supports Starburst as a first-class platform that inherits Trino’s SQL dialect and federated query capabilities.
Features¶
Managed Trino - No cluster management required
Trino dialect - Inherits SQL dialect from Trino
Federated queries - Query data across multiple sources
Built-in catalogs - Pre-configured data connectors
HTTPS only - Connections are always encrypted
Multiple table formats - Iceberg, Hive, Delta Lake support
Quick Start¶
# Install Trino driver
uv add trino
# Set credentials
export STARBURST_HOST=my-cluster.trino.galaxy.starburst.io
export STARBURST_USER=joe@example.com/accountadmin
export STARBURST_PASSWORD=your-password
# Run benchmark
benchbox run --platform starburst --benchmark tpch --scale 1.0
Authentication¶
Starburst Galaxy uses a unique username format that combines your email with a role:
username = email/role
# Example: joe@example.com/accountadmin
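The `email/role` convention can be illustrated with a small helper. This is only a sketch: `build_galaxy_username` is a hypothetical name, not a BenchBox function, and it assumes that a value already containing `/` carries its role.

```python
from typing import Optional


def build_galaxy_username(email: str, role: Optional[str] = None) -> str:
    """Return a Galaxy-style ``email/role`` username (hypothetical helper)."""
    # Assumption: a "/" in the value means the role is already included,
    # so it is returned unchanged.
    if "/" in email or not role:
        return email
    return f"{email}/{role}"


print(build_galaxy_username("joe@example.com", "accountadmin"))
# joe@example.com/accountadmin
```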
Configuration Methods¶
Environment Variables (recommended):
export STARBURST_HOST=my-cluster.trino.galaxy.starburst.io
export STARBURST_USER=joe@example.com/accountadmin
export STARBURST_PASSWORD=your-password
# Alternative: keep the role separate instead of embedding it in STARBURST_USER
export STARBURST_USER=joe@example.com
export STARBURST_ROLE=accountadmin # Appended to the username automatically
# Optional: default catalog
export STARBURST_CATALOG=tpch_sf1
benchbox run --platform starburst --benchmark tpch --scale 1.0
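When wrapping BenchBox in a script, it can help to check the variables above before launching a run. The following is a minimal sketch, not BenchBox's own loader: `load_starburst_env` is a hypothetical name, and the `role` dictionary key is an assumption (the other keys mirror the `--platform-option` names used on this page).

```python
import os
from typing import Mapping, Optional

# Environment variable names come from the examples above.
REQUIRED = {
    "STARBURST_HOST": "host",
    "STARBURST_USER": "username",
    "STARBURST_PASSWORD": "password",
}
# The "role" key is an assumption made for this sketch.
OPTIONAL = {"STARBURST_ROLE": "role", "STARBURST_CATALOG": "catalog"}


def load_starburst_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect Starburst settings from the environment, failing on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(sorted(missing)))
    config = {key: env[name] for name, key in REQUIRED.items()}
    # Optional variables are included only when they are set and non-empty.
    config.update({key: env[name] for name, key in OPTIONAL.items() if env.get(name)})
    return config
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test.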
CLI Options:
benchbox run --platform starburst --benchmark tpch --scale 1.0 \
--platform-option host=my-cluster.trino.galaxy.starburst.io \
--platform-option username=joe@example.com/accountadmin \
--platform-option password=your-password \
--platform-option catalog=tpch_sf1
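For scripted runs, the same command line can be assembled programmatically. This sketch only builds the argument list from the flags shown above; `build_benchbox_command` is a hypothetical helper, not part of BenchBox.

```python
import shlex
from typing import List


def build_benchbox_command(benchmark: str, scale: float, **options: str) -> List[str]:
    """Build a `benchbox run` argument list for the starburst platform."""
    argv = [
        "benchbox", "run",
        "--platform", "starburst",
        "--benchmark", benchmark,
        "--scale", str(scale),
    ]
    # Each keyword argument becomes one --platform-option key=value pair.
    for key, value in options.items():
        argv += ["--platform-option", f"{key}={value}"]
    return argv


command = build_benchbox_command(
    "tpch", 1.0,
    host="my-cluster.trino.galaxy.starburst.io",
    catalog="tpch_sf1",
)
print(shlex.join(command))
```

The list can be handed to `subprocess.run(command, check=True)`. Arguments passed this way are typically visible in process listings, so the password is probably better supplied through `STARBURST_PASSWORD`.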
Configuration Options¶
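| Option | Environment Variable | Required | Default | Description |
|---|---|---|---|---|
| `host` | `STARBURST_HOST` | Yes | - | Galaxy cluster hostname |
| `username` | `STARBURST_USER` | Yes | - | User email or email/role |
| `password` | `STARBURST_PASSWORD` | Yes | - | Password or API key |
|  | `STARBURST_ROLE` | No | - | Role (appended to username if not included) |
| `catalog` | `STARBURST_CATALOG` | No | - | Default catalog |
|  |  | No |  | HTTPS port |
| `schema` | - | No |  | Default schema |
| `table_format` | - | No | `iceberg` | Table format (see Table Formats) |
| `verify_ssl` | - | No |  | SSL certificate verification |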
Usage Examples¶
Basic Benchmark¶
# TPC-H at scale factor 1
benchbox run --platform starburst --benchmark tpch --scale 1.0
# TPC-DS at scale factor 10
benchbox run --platform starburst --benchmark tpcds --scale 10.0
With Specific Catalog¶
benchbox run --platform starburst --benchmark tpch --scale 1.0 \
--platform-option catalog=my_catalog \
--platform-option schema=benchmark_data
With Table Format¶
# Use Iceberg tables (default)
benchbox run --platform starburst --benchmark tpch --scale 1.0 \
--platform-option table_format=iceberg
# Use Delta Lake tables
benchbox run --platform starburst --benchmark tpch --scale 1.0 \
--platform-option table_format=delta
Python API¶
from benchbox import TPCH
from benchbox.platforms.starburst import StarburstAdapter
# Initialize adapter
adapter = StarburstAdapter(
    host="my-cluster.trino.galaxy.starburst.io",
    username="joe@example.com/accountadmin",
    password="your-password",
    catalog="tpch_catalog",
    schema="benchmark_data",
    table_format="iceberg",
)
# Load and run benchmark
benchmark = TPCH(scale_factor=1.0)
benchmark.generate_data()
adapter.load_benchmark(benchmark)
results = adapter.run_benchmark(benchmark)
Architecture¶
In BenchBox, the Starburst platform inherits from the Trino platform, which means:
SQL Dialect: Uses Trino’s SQL dialect for query translation
Connector Syntax: Same catalog.schema.table naming convention
Session Properties: Trino session properties are supported
from benchbox.core.platform_registry import PlatformRegistry
# Check platform family
family = PlatformRegistry.get_platform_family("starburst")
# Returns: "trino"
# Check inheritance
parent = PlatformRegistry.get_inherited_platform("starburst")
# Returns: "trino"
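The `catalog.schema.table` naming convention mentioned above can be made concrete with a small helper that builds a fully qualified, quoted name. This is an illustrative sketch with hypothetical function names. It relies on the standard SQL rule, which Trino follows, that identifiers are delimited with double quotes and embedded quotes are doubled.

```python
def quote_identifier(name: str) -> str:
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified_table_name(catalog: str, schema: str, table: str) -> str:
    """Join catalog, schema, and table into one quoted dotted name."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, table))


print(qualified_table_name("my_catalog", "benchmark_data", "lineitem"))
# "my_catalog"."benchmark_data"."lineitem"
```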
Table Formats¶
Starburst Galaxy supports multiple table formats:
| Format | Description | Use Case |
|---|---|---|
|  | In-memory tables | Fast testing, small data |
|  | Hive format | Compatibility with Hive ecosystem |
| `iceberg` | Apache Iceberg | Production analytics, ACID transactions |
| `delta` | Delta Lake | Databricks ecosystem integration |
# Iceberg (recommended for analytics)
benchbox run --platform starburst --benchmark tpch --scale 1.0 \
--platform-option table_format=iceberg
Comparison: Starburst vs Trino¶
| Feature | Starburst Galaxy | Self-Hosted Trino |
|---|---|---|
| Deployment | Cloud managed | Self-hosted cluster |
| Authentication | Password/API key | Configurable |
| SSL | Always HTTPS | Configurable |
| Scaling | Automatic | Manual |
| Catalogs | Pre-configured | Manual setup |
| Cost | Pay-per-use | Infrastructure cost |
| Best For | Quick start, production | Full control, customization |
When to Use Starburst¶
Use Starburst when:
You need managed Trino without cluster management
You run federated queries across multiple data sources
You use Iceberg or Delta Lake table formats
You want built-in catalog management
Use self-hosted Trino instead when:
You need full control over cluster configuration
Cost optimization is critical
You have existing Trino infrastructure
You need specific Trino plugins/connectors
Troubleshooting¶
Authentication Failed (401)¶
Starburst Galaxy authentication failed.
Solutions:
Verify username format: email/role (e.g., joe@example.com/accountadmin)
Check password is correct
Verify credentials at galaxy.starburst.io
Connection Refused¶
Cannot connect to Starburst Galaxy at {host}:{port}
Solutions:
Verify host is correct: {cluster-name}.trino.galaxy.starburst.io
Check network connectivity
Verify no firewall blocking port 443
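The host and firewall checks above can be approximated from Python using only the standard library. This is a diagnostic sketch, not part of BenchBox: both function names are hypothetical, and the characters allowed in a cluster name are an assumption.

```python
import re
import socket

# Derived from the documented {cluster-name}.trino.galaxy.starburst.io form.
# Assumption: cluster names contain only letters, digits, and hyphens.
GALAXY_HOST = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*\.trino\.galaxy\.starburst\.io$")


def looks_like_galaxy_host(host: str) -> bool:
    """Return True if the value is a bare Galaxy hostname (no scheme or path)."""
    return GALAXY_HOST.match(host) is not None


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A `False` result from `can_reach` for a correctly formed hostname points toward a network or firewall problem rather than a credentials problem.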
SSL Certificate Error¶
SSL certificate error connecting to Starburst Galaxy.
Solutions:
Check network proxy settings
Verify SSL certificate chain
Use --platform-option verify_ssl=false (not recommended for production)
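Before disabling verification, it may be worth checking what the TLS handshake reports. The sketch below uses Python's standard `ssl` module with default verification. `check_tls` is a hypothetical helper, and port 443 is assumed from the firewall note earlier in this section.

```python
import socket
import ssl
from typing import Tuple


def check_tls(host: str, port: int = 443, timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt a verified TLS handshake and report the outcome."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version() or "unknown"
    except ssl.SSLCertVerificationError as exc:
        return False, f"certificate verification failed: {exc.verify_message}"
    except OSError as exc:
        # Includes other TLS errors, refused connections, and timeouts.
        return False, f"connection or TLS error: {exc}"
```

A verification failure that appears only behind a corporate proxy often indicates TLS interception, in which case the proxy's CA certificate needs to be trusted locally.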
Catalog Not Found¶
Catalog 'my_catalog' does not exist
Solutions:
Create the catalog in Starburst Galaxy console
Use an existing catalog: check available catalogs in the console
Omit catalog option to use the default
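Available catalogs can also be listed from code with Trino's `SHOW CATALOGS` statement. The following sketch assumes the `trino` Python client installed in Quick Start and HTTPS on port 443. `list_catalogs` and `connect_to_galaxy` are hypothetical helpers, not BenchBox functions.

```python
from typing import Any, List


def list_catalogs(connection: Any) -> List[str]:
    """Return the catalog names visible to a DB-API connection."""
    cursor = connection.cursor()
    cursor.execute("SHOW CATALOGS")
    return [row[0] for row in cursor.fetchall()]


def connect_to_galaxy(host: str, username: str, password: str) -> Any:
    """Open a connection with the trino client (imported lazily)."""
    from trino.auth import BasicAuthentication
    from trino.dbapi import connect

    return connect(
        host=host,
        port=443,  # assumption: default HTTPS port
        user=username,
        http_scheme="https",
        auth=BasicAuthentication(username, password),
    )
```

Calling `list_catalogs(connect_to_galaxy(host, username, password))` with the credentials from the Authentication section returns the names that are valid values for the catalog option.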