Snowflake Platform¶
Snowflake is a cloud-native data warehouse with automatic scaling, multi-cluster compute, and separation of storage and compute. BenchBox provides comprehensive support for Snowflake benchmarking across all cloud providers (AWS, Azure, GCP).
Features¶
Auto-scaling compute - Virtual warehouses scale independently
Multi-cluster warehouses - Concurrent query capacity
Zero-copy cloning - Instant database clones for testing
Time travel - Historical data access
Tri-cloud support - AWS, Azure, and GCP
Prerequisites¶
Snowflake account (trial or paid)
Virtual warehouse with appropriate size
Database with CREATE TABLE permissions
Internal or external stage for data loading
Installation¶
# Install Snowflake connector
pip install snowflake-connector-python
# Or via BenchBox extras
pip install "benchbox[snowflake]"
Configuration¶
Environment Variables (Recommended)¶
export SNOWFLAKE_ACCOUNT=your_account_identifier
export SNOWFLAKE_USER=your_username
export SNOWFLAKE_PASSWORD=your_password
export SNOWFLAKE_WAREHOUSE=your_warehouse
export SNOWFLAKE_DATABASE=BENCHBOX
export SNOWFLAKE_SCHEMA=PUBLIC
Interactive Setup¶
benchbox platforms setup --platform snowflake
CLI Options¶
benchbox run --platform snowflake --benchmark tpch --scale 1.0 \
--platform-option account=your_account \
--platform-option user=your_user \
--platform-option password=your_password \
--platform-option warehouse=COMPUTE_WH \
--platform-option database=BENCHBOX
Platform Options¶
Option |
Default |
Description |
|---|---|---|
|
(env) |
Snowflake account identifier |
|
(env) |
Username for authentication |
|
(env) |
Password for authentication |
|
(env) |
Virtual warehouse name |
|
(auto) |
Database name |
|
PUBLIC |
Schema name |
|
(default) |
Snowflake role |
|
(auto) |
Stage for data loading |
|
user |
Stage type: user, table, external |
Authentication Methods¶
Password Authentication¶
# Basic password auth
benchbox run --platform snowflake --benchmark tpch \
--platform-option account=xy12345.us-east-1 \
--platform-option user=benchbox_user \
--platform-option password=secure_password
Key Pair Authentication¶
# Generate key pair
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
# Upload public key to Snowflake
# ALTER USER benchbox_user SET RSA_PUBLIC_KEY='MII...';
# Use key pair
benchbox run --platform snowflake --benchmark tpch \
--platform-option account=xy12345.us-east-1 \
--platform-option user=benchbox_user \
--platform-option private_key_path=/path/to/rsa_key.p8
SSO/Browser Authentication¶
benchbox run --platform snowflake --benchmark tpch \
--platform-option account=xy12345.us-east-1 \
--platform-option user=user@company.com \
--platform-option authenticator=externalbrowser
Usage Examples¶
Basic Benchmark¶
# TPC-H at scale factor 1
benchbox run --platform snowflake --benchmark tpch --scale 1.0 \
--warehouse BENCHMARK_WH
With Tuning¶
# Apply clustering and optimizations
benchbox run --platform snowflake --benchmark tpch --scale 10.0 \
--warehouse LARGE_WH \
--tuning tuned
Python API¶
from benchbox import TPCH
from benchbox.platforms.snowflake import SnowflakeAdapter
adapter = SnowflakeAdapter(
account="xy12345.us-east-1",
user="benchbox_user",
password="secure_password",
warehouse="COMPUTE_WH",
database="BENCHBOX",
)
benchmark = TPCH(scale_factor=1.0)
benchmark.generate_data()
adapter.load_benchmark(benchmark)
results = adapter.run_benchmark(benchmark)
Warehouse Sizing¶
Warehouse Size |
Credits/Hour |
Recommended Scale |
|---|---|---|
X-Small |
1 |
SF 0.01-0.1 |
Small |
2 |
SF 0.1-1.0 |
Medium |
4 |
SF 1.0-10.0 |
Large |
8 |
SF 10.0-100.0 |
X-Large |
16 |
SF 100.0+ |
Performance Features¶
Clustering Keys¶
BenchBox applies clustering keys with --tuning tuned:
ALTER TABLE lineitem CLUSTER BY (l_shipdate);
ALTER TABLE orders CLUSTER BY (o_orderdate);
Result Caching¶
BenchBox disables result caching for accurate benchmarks:
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
Query Tagging¶
Queries are tagged for tracking:
ALTER SESSION SET QUERY_TAG = 'benchbox:tpch:power:Q1';
Data Loading¶
User Stage (Default)¶
Files uploaded to user stage, then loaded via COPY INTO:
# Auto-detected, no configuration needed
benchbox run --platform snowflake --benchmark tpch --scale 1.0
External Stage (S3/Azure/GCS)¶
For large scale factors, use external staging:
# Configure external stage
benchbox run --platform snowflake --benchmark tpch --scale 100.0 \
--staging-root s3://bucket/benchbox/ \
--platform-option stage_type=external \
--platform-option external_stage=@my_s3_stage
Cost Optimization¶
Auto-Suspend¶
Configure warehouses to auto-suspend:
ALTER WAREHOUSE BENCHMARK_WH SET AUTO_SUSPEND = 60;
Multi-Cluster Control¶
For throughput tests:
benchbox run --platform snowflake --benchmark tpch --scale 1.0 \
--phases throughput \
--platform-option max_cluster_count=4 \
--platform-option scaling_policy=standard
Troubleshooting¶
Authentication Failed¶
# Verify account identifier format
# Should be: <account_locator>.<region>.<cloud> or <orgname>-<account_name>
# Examples: xy12345.us-east-1, xy12345.us-east-1.aws, myorg-myaccount
# Test connection
snowsql -a xy12345.us-east-1 -u benchbox_user
Warehouse Not Found¶
# List available warehouses
snowsql -q "SHOW WAREHOUSES;"
# Create warehouse if needed
snowsql -q "CREATE WAREHOUSE BENCHMARK_WH WITH WAREHOUSE_SIZE = 'SMALL';"
Insufficient Permissions¶
-- Grant required permissions
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE benchbox_role;
GRANT CREATE DATABASE ON ACCOUNT TO ROLE benchbox_role;
GRANT CREATE SCHEMA ON DATABASE benchbox TO ROLE benchbox_role;
Data Loading Timeout¶
# Use larger warehouse or external stage for big datasets
benchbox run --platform snowflake --benchmark tpch --scale 100.0 \
--warehouse LARGE_WH \
--staging-root s3://bucket/benchbox/