TSBS DevOps Benchmark
Overview
The Time Series Benchmark Suite (TSBS) DevOps benchmark simulates infrastructure monitoring workloads typical of DevOps and observability platforms. Based on the official TSBS implementation by Timescale, this benchmark generates realistic time-series data representing CPU, memory, disk, and network metrics from a fleet of monitored hosts.
The benchmark is ideal for evaluating time-series databases, OLAP systems handling temporal data, and infrastructure monitoring solutions.
Key Features
Realistic metrics - CPU, memory, disk I/O, and network statistics
Host metadata - Tags for region, datacenter, service, team
Diurnal patterns - Realistic daily usage patterns
Configurable scale - From 10 hosts to thousands
18 DevOps queries - Common monitoring and alerting patterns
Multiple dialects - Standard SQL, ClickHouse, TimescaleDB, InfluxDB support
Data Model
The TSBS DevOps benchmark uses a dimensional model with a tags table and four metric tables:
Tables
| Table | Purpose | Rows at SF=1 |
|---|---|---|
| tags | Host metadata and dimensions | 100 |
| cpu | CPU usage metrics per timestamp | ~864,000 |
| mem | Memory metrics per timestamp | ~864,000 |
| disk | Disk I/O metrics per device | ~1,728,000 |
| net | Network metrics per interface | ~1,728,000 |
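The row counts above follow from simple arithmetic. The sketch below reproduces them under two assumptions that are inferred from the numbers rather than stated by the benchmark: a 10-second sampling interval, and two disk devices and two network interfaces per host. The function name `rows_per_table` is hypothetical.

```python
SECONDS_PER_DAY = 86_400

def rows_per_table(num_hosts, days, interval_seconds=10, devices_per_host=2):
    """Estimate rows per table (interval and device count are assumptions)."""
    intervals = days * SECONDS_PER_DAY // interval_seconds
    per_host_metric = num_hosts * intervals  # one row per host per interval
    return {
        "tags": num_hosts,
        "cpu": per_host_metric,
        "mem": per_host_metric,
        # disk and net emit one row per device/interface per interval
        "disk": per_host_metric * devices_per_host,
        "net": per_host_metric * devices_per_host,
    }

counts = rows_per_table(num_hosts=100, days=1)
print(counts["cpu"], counts["disk"])  # 864000 1728000
```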
cpu Table
| Column | Type | Description |
|---|---|---|
| time | TIMESTAMP | Measurement timestamp (PK) |
| hostname | VARCHAR | Host identifier (PK) |
| | DOUBLE | CPU % in user space |
| | DOUBLE | CPU % in kernel space |
| | DOUBLE | CPU % idle |
| | DOUBLE | CPU % nice priority |
| | DOUBLE | CPU % waiting for I/O |
| | DOUBLE | CPU % hardware interrupts |
| | DOUBLE | CPU % software interrupts |
| | DOUBLE | CPU % stolen by hypervisor |
| | DOUBLE | CPU % running guest VMs |
| | DOUBLE | CPU % guest nice priority |
mem Table
| Column | Type | Description |
|---|---|---|
| time | TIMESTAMP | Measurement timestamp (PK) |
| hostname | VARCHAR | Host identifier (PK) |
| | BIGINT | Total memory bytes |
| | BIGINT | Available memory bytes |
| | BIGINT | Used memory bytes |
| | BIGINT | Free memory bytes |
| | BIGINT | Cached memory bytes |
| | BIGINT | Buffered memory bytes |
| | DOUBLE | Memory usage percent |
| | DOUBLE | Available memory percent |
disk Table
| Column | Type | Description |
|---|---|---|
| time | TIMESTAMP | Measurement timestamp (PK) |
| hostname | VARCHAR | Host identifier (PK) |
| | VARCHAR | Disk device name (PK) |
| | BIGINT | Total read operations |
| | BIGINT | Total write operations |
| | BIGINT | Read time in milliseconds |
| | BIGINT | Write time in milliseconds |
| | INTEGER | Current I/O operations |
net Table
| Column | Type | Description |
|---|---|---|
| time | TIMESTAMP | Measurement timestamp (PK) |
| hostname | VARCHAR | Host identifier (PK) |
| | VARCHAR | Network interface name (PK) |
| | BIGINT | Bytes received |
| | BIGINT | Bytes sent |
| | BIGINT | Packets received |
| | BIGINT | Packets sent |
| | BIGINT | Receive errors |
| | BIGINT | Send errors |
| | BIGINT | Dropped incoming packets |
| | BIGINT | Dropped outgoing packets |
Query Categories
The benchmark includes 18 queries organized into categories:
Single Host Queries
Metrics for individual hosts over time ranges:
single-host-12-hr: CPU usage for one host over 12 hours
single-host-1-hr: Detailed CPU for one host over 1 hour
Aggregation Queries
Cross-host aggregations:
cpu-max-all-1-hr: Maximum CPU across all hosts (1 hour)
cpu-max-all-8-hr: Maximum CPU across all hosts (8 hours)
GroupBy Queries
Time-bucketed aggregations:
double-groupby-1-hr: CPU grouped by host and minute
double-groupby-5-min: Fine-grained CPU grouping
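To make the "grouped by host and minute" shape concrete, here is a minimal sketch that runs a double group-by against an in-memory SQLite table. SQLite is only a stand-in engine, the column name `usage_user` is an assumption, and the SQL is illustrative rather than the query text the benchmark actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cpu (time TEXT, hostname TEXT, usage_user REAL)")
conn.executemany(
    "INSERT INTO cpu VALUES (?, ?, ?)",
    [
        ("2024-01-01 00:00:00", "host_0", 10.0),
        ("2024-01-01 00:00:10", "host_0", 30.0),
        ("2024-01-01 00:01:00", "host_0", 50.0),
        ("2024-01-01 00:00:00", "host_1", 80.0),
    ],
)

# Truncate each timestamp to its minute, then average per (host, minute).
groupby_sql = """
    SELECT hostname,
           strftime('%Y-%m-%d %H:%M:00', time) AS minute,
           AVG(usage_user) AS avg_usage_user
    FROM cpu
    GROUP BY hostname, minute
    ORDER BY hostname, minute
"""
buckets = conn.execute(groupby_sql).fetchall()
conn.close()

for row in buckets:
    print(row)
```

Time-series engines usually offer a dedicated bucketing function in place of `strftime`, which is part of what the dialect-specific queries account for.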
Threshold Queries
Alert-style threshold filters:
high-cpu-1-hr: Hosts with CPU > 90%
high-cpu-12-hr: Sustained high CPU hosts
low-memory-hosts: Hosts with available memory < 10%
net-errors: Hosts with network errors
Memory Queries
Memory-specific analytics:
mem-by-host-1-hr: Memory statistics per host
Disk Queries
Disk I/O analytics:
disk-iops-1-hr: Read/write operations per host
disk-latency: Average disk latency analysis
Network Queries
Network throughput analytics:
net-throughput-1-hr: Bytes sent/received per host
Combined Queries
Cross-metric correlation:
resource-utilization: Combined CPU and memory per host
Lastpoint Queries
Most recent values (common in dashboards):
lastpoint: Most recent metrics per host
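A lastpoint query has to find the newest row for every host. One portable way to express that is a join against each host's maximum timestamp, sketched here on an in-memory SQLite table. The engine, the `usage_user` column name, and the SQL itself are assumptions for illustration, not the benchmark's own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cpu (time TEXT, hostname TEXT, usage_user REAL)")
conn.executemany(
    "INSERT INTO cpu VALUES (?, ?, ?)",
    [
        ("2024-01-01 00:00:00", "host_0", 10.0),
        ("2024-01-01 00:00:10", "host_0", 12.5),
        ("2024-01-01 00:00:00", "host_1", 70.0),
    ],
)

# Inner query: newest timestamp per host. Outer join: fetch that full row.
lastpoint_sql = """
    SELECT c.hostname, c.time, c.usage_user
    FROM cpu AS c
    JOIN (
        SELECT hostname, MAX(time) AS max_time
        FROM cpu
        GROUP BY hostname
    ) AS latest
      ON c.hostname = latest.hostname AND c.time = latest.max_time
    ORDER BY c.hostname
"""
latest_rows = conn.execute(lastpoint_sql).fetchall()
conn.close()

print(latest_rows)
```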
Tag-filtered Queries
Filtering by host metadata:
by-region: Metrics filtered by cloud region
by-service: Metrics grouped by service
Usage Examples
Basic Benchmark Setup
```python
from benchbox import TSBSDevOps

# Initialize TSBS DevOps benchmark (SF=1 = 100 hosts, 1 day)
tsbs = TSBSDevOps(scale_factor=1.0, output_dir="tsbs_data")

# Generate time-series data
data_files = tsbs.generate_data()

# Get all queries
queries = tsbs.get_queries()
print(f"Generated {len(queries)} TSBS queries")

# Get specific query
cpu_query = tsbs.get_query("cpu-max-all-1-hr")
print(cpu_query)
```
Custom Configuration
```python
from benchbox import TSBSDevOps

# Configure specific hosts and duration
tsbs_custom = TSBSDevOps(
    scale_factor=0.5,
    output_dir="tsbs_custom",
    num_hosts=50,         # Override: 50 hosts
    duration_days=7,      # Override: 7 days of data
    interval_seconds=60,  # 1-minute intervals
)
data_files = tsbs_custom.generate_data()
```
DuckDB Integration
```python
import duckdb
from benchbox import TSBSDevOps

# Initialize and generate data
tsbs = TSBSDevOps(scale_factor=0.1, output_dir="tsbs_small")
data_files = tsbs.generate_data()

# Create DuckDB connection and schema
conn = duckdb.connect("tsbs.duckdb")
schema_sql = tsbs.get_create_tables_sql(dialect="duckdb")
for stmt in schema_sql.split(";"):
    if stmt.strip():
        conn.execute(stmt)

# Load data
for table_name, file_path in tsbs.tables.items():
    conn.execute(f"""
        INSERT INTO {table_name}
        SELECT * FROM read_csv('{file_path}', header=true, auto_detect=true)
    """)

# Run queries
for query_id in ["cpu-max-all-1-hr", "high-cpu-1-hr", "lastpoint"]:
    query_sql = tsbs.get_query(query_id)
    result = conn.execute(query_sql).fetchall()
    print(f"{query_id}: {len(result)} rows")

conn.close()
```
TimescaleDB Integration
```python
from benchbox import TSBSDevOps

tsbs = TSBSDevOps(scale_factor=1.0)

# Get TimescaleDB-optimized schema with hypertables
schema_sql = tsbs.get_create_tables_sql(
    dialect="timescale",
    time_partitioning=True,
)
print(schema_sql)
# Includes: SELECT create_hypertable('cpu', 'time', ...)
```
ClickHouse Integration
```python
from benchbox import TSBSDevOps

tsbs = TSBSDevOps(scale_factor=1.0)

# Get ClickHouse-optimized schema
schema_sql = tsbs.get_create_tables_sql(
    dialect="clickhouse",
    time_partitioning=True,
)
print(schema_sql)
# Includes: ENGINE = MergeTree() ORDER BY (...) PARTITION BY toYYYYMMDD(time)
```
InfluxDB Integration
InfluxDB 3.x uses FlightSQL for SQL queries and Line Protocol for data ingestion. BenchBox handles this automatically via the InfluxDB adapter.
```python
from benchbox.platforms.influxdb import InfluxDBAdapter
from benchbox import TSBSDevOps

# Initialize TSBS DevOps benchmark
tsbs = TSBSDevOps(scale_factor=0.1, output_dir="tsbs_influx")
data_files = tsbs.generate_data()

# Create InfluxDB adapter (Core/OSS mode)
adapter = InfluxDBAdapter(
    mode="core",
    host="localhost",
    port=8086,
    token="your-influxdb-token",
    database="benchmarks",
    ssl=False,
)

# Create connection
conn = adapter.create_connection()

# InfluxDB auto-creates schema from Line Protocol writes
# Load data (converts CSV to Line Protocol)
row_counts, load_time, metadata = adapter.load_data(tsbs, conn, tsbs.output_dir)
print(f"Loaded {metadata['total_rows']:,} rows in {load_time:.2f}s")

# Get InfluxDB-compatible queries (uses DataFusion SQL)
for query_id in ["cpu-max-all-1-hr", "high-cpu-1-hr", "lastpoint"]:
    query_sql = tsbs.get_query(query_id, dialect="influxdb")
    exec_time, row_count, _ = adapter.execute_query(conn, query_sql, query_id)
    print(f"{query_id}: {row_count} rows in {exec_time:.3f}s")

adapter.close_connection(conn)
```
InfluxDB Cloud mode:
```python
from benchbox.platforms.influxdb import InfluxDBAdapter

# InfluxDB Cloud (Serverless/Dedicated/Clustered)
adapter = InfluxDBAdapter(
    mode="cloud",
    host="us-east-1-1.aws.cloud2.influxdata.com",
    token="your-cloud-token",
    org="your-org",
    database="benchmarks",
)
```
Key InfluxDB Considerations:
Line Protocol: Data is loaded via InfluxDB’s native Line Protocol format for optimal ingest performance
Schema auto-creation: Tables (measurements) are auto-created on first write
SQL via FlightSQL: Queries use standard SQL (powered by Apache DataFusion)
Tags vs Fields: hostname becomes a tag (indexed), metrics become fields
No DELETE: InfluxDB Core doesn’t support deletes; use retention policies instead
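To illustrate the tags-versus-fields split, the following sketch formats one row as a Line Protocol line: the measurement name, then tags, then fields, then a nanosecond timestamp. The helper names and the field names `usage_user` and `usage_system` are hypothetical, and this is not the converter BenchBox uses internally. It handles numeric fields only.

```python
from datetime import datetime, timezone

def escape_tag_value(value):
    """Backslash-escape commas, equals signs, and spaces in a tag value."""
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value

def to_line_protocol(measurement, tags, fields, timestamp):
    """Format one row as a Line Protocol line (numeric fields only)."""
    tag_part = "".join(
        f",{key}={escape_tag_value(val)}" for key, val in sorted(tags.items())
    )
    field_part = ",".join(
        # Integers carry an "i" suffix; everything else is written as a float.
        f"{key}={val}i" if isinstance(val, int) else f"{key}={float(val)}"
        for key, val in fields.items()
    )
    ts_ns = int(timestamp.timestamp()) * 1_000_000_000  # whole seconds -> ns
    return f"{measurement}{tag_part} {field_part} {ts_ns}"

line = to_line_protocol(
    "cpu",
    {"hostname": "host_0"},
    {"usage_user": 12.5, "usage_system": 3.0},
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(line)
# cpu,hostname=host_0 usage_user=12.5,usage_system=3.0 1704067200000000000
```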
Scale Factor Guidelines
| Scale Factor | Hosts | Duration | CPU Rows | Total Rows | Use Case |
|---|---|---|---|---|---|
| 0.01 | 10 | 1 day | ~86K | ~430K | Quick testing |
| 0.1 | 10 | 1 day | ~86K | ~430K | Development |
| 1.0 | 100 | 1 day | ~864K | ~5M | Standard benchmark |
| 10.0 | 1000 | 10 days | ~86M | ~500M | Performance testing |
| 100.0 | 1000 | 100 days | ~864M | ~5B | Large scale testing |
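The Hosts, Duration, and CPU Rows columns are consistent with a simple rule: 100 hosts per unit of scale factor, clamped to the range 10 to 1000, and one day per unit of scale factor with a one-day minimum, sampled every 10 seconds. The sketch below encodes that rule. It is inferred from the table only, so the actual mapping inside BenchBox may differ, and `scale_parameters` is a hypothetical name.

```python
def scale_parameters(scale_factor, interval_seconds=10):
    """Derive (hosts, days, cpu_rows) from a scale factor.

    Assumed rule, inferred from the guideline table rather than from the
    library: hosts = 100 * SF clamped to [10, 1000], days = SF with a
    minimum of 1, one CPU row per host per interval.
    """
    hosts = min(1000, max(10, round(100 * scale_factor)))
    days = max(1, round(scale_factor))
    cpu_rows = hosts * days * (86_400 // interval_seconds)
    return hosts, days, cpu_rows

for sf in (0.01, 0.1, 1.0, 10.0, 100.0):
    hosts, days, cpu_rows = scale_parameters(sf)
    print(f"SF={sf}: {hosts} hosts, {days} day(s), {cpu_rows:,} CPU rows")
```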
Data Generation Patterns
The generator creates realistic data with:
Diurnal CPU patterns: Higher usage during business hours (9am-5pm)
Memory growth: Gradual memory increase with periodic GC drops
Disk I/O bursts: 5% chance of 10x burst per interval
Network errors: Rare errors (~0.1%) and drops (~0.2%)
Tag distributions: 75% Linux, 50% production, balanced regions
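Two of these patterns are easy to picture in code. The sketch below is one way to produce a diurnal CPU curve and occasional disk bursts. The cosine shape, the 13:00 peak (the midpoint of the 9am-5pm window), and the baseline values are all assumptions chosen for illustration. Only the 5% burst probability and the 10x burst factor come from the list above. It is not the generator's actual code.

```python
import math
import random

def diurnal_cpu(hour, rng, base=20.0, amplitude=30.0, noise=5.0):
    """CPU % that peaks at 13:00 and bottoms out at 01:00, plus Gaussian noise."""
    # Cosine centered on 13:00; maps to 0..1 after the (x + 1) / 2 rescale.
    daily = math.cos((hour - 13.0) / 24.0 * 2.0 * math.pi)
    value = base + amplitude * (daily + 1.0) / 2.0 + rng.gauss(0.0, noise)
    return min(100.0, max(0.0, value))  # clamp to a valid percentage

def disk_ops(rng, baseline=100, burst_probability=0.05, burst_factor=10):
    """Return baseline ops, multiplied by burst_factor with 5% probability."""
    if rng.random() < burst_probability:
        return baseline * burst_factor
    return baseline

rng = random.Random(42)  # fixed seed so runs are repeatable
day = [diurnal_cpu(hour, rng, noise=0.0) for hour in range(24)]
bursts = sum(disk_ops(rng) > 100 for _ in range(10_000))

print(f"01:00 -> {day[1]:.1f}%, 13:00 -> {day[13]:.1f}%")
print(f"burst intervals: {bursts} of 10000")
```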
Performance Characteristics
Query Performance Patterns
Single Host Queries:
Bottleneck: Time range filtering
Optimization: Index on (hostname, time)
Typical performance: Fast (milliseconds)
Aggregation Queries:
Bottleneck: Full scan of time range
Optimization: Columnar storage, vectorized execution
Typical performance: Medium (seconds)
GroupBy Queries:
Bottleneck: Hash aggregation memory
Optimization: Pre-aggregation, materialized views
Typical performance: Medium to slow
Threshold Queries:
Bottleneck: Filtering efficiency
Optimization: Bloom filters, sparse indexes
Typical performance: Fast with good indexes
Lastpoint Queries:
Bottleneck: Finding max timestamp per group
Optimization: Specialized last-value indexes
Typical performance: Critical for dashboards
Best Practices
Data Generation
Match your monitoring interval - Use realistic intervals (10s, 30s, 60s)
Scale hosts appropriately - Test with expected fleet size
Consider retention - Duration affects storage testing
Query Optimization
Partition by time - Essential for time-series databases
Index on hostname - For single-host query performance
Pre-aggregate - Materialized views for dashboards
Time-Series Database Tips
Use native types - TIMESTAMPTZ, DateTime64
Enable compression - Time-series compresses well
Consider downsampling - For long-term storage
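Downsampling replaces raw points with one aggregate per time bucket. A minimal pure-Python sketch, assuming points arrive as (epoch seconds, value) pairs and that a per-bucket average is the aggregate you want (the function name is hypothetical):

```python
from collections import defaultdict

def downsample(points, bucket_seconds=60):
    """Average (epoch_seconds, value) points into fixed-width time buckets."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [total, count]
    for ts, value in points:
        bucket = ts - ts % bucket_seconds  # floor to the bucket boundary
        sums[bucket][0] += value
        sums[bucket][1] += 1
    return [(bucket, total / count) for bucket, (total, count) in sorted(sums.items())]

raw = [(0, 10.0), (10, 20.0), (20, 30.0), (60, 40.0), (70, 60.0)]
print(downsample(raw))  # [(0, 20.0), (60, 50.0)]
```

In a real deployment this is usually done inside the database, for example with a materialized view, so that dashboards read the small aggregate table instead of the raw one.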
External Resources
TSBS GitHub Repository - Original implementation
TimescaleDB Documentation - Time-series optimization
InfluxDB Line Protocol - Time-series data format