Velox Docker Dev Workflow
The Apache Gluten Velox bundle jar is Linux-only - there are no prebuilt jars for macOS or Windows, and native builds on those hosts are not supported upstream. The docker/velox/ image is therefore the primary local development path on macOS and Windows, and the recommended reproducible path on Linux.
This page covers prerequisites, build steps, both Docker workflows, arch selection, memory sizing, CI integration, and troubleshooting. See Velox Platform for benchmark configuration and adapter options, and Velox Jar Setup for tarball URLs and SHA verification.
Prerequisites
| Requirement | Notes |
|---|---|
| Docker Desktop ≥ 4.x (macOS/Windows) or Docker Engine ≥ 24 (Linux) | |
| Docker Compose V2 (docker compose) | Included with Docker Desktop; on Linux: |
| ~12 GB free disk | Image is ~3 GB; add headroom for build cache and benchmark data |
| Internet access at build time | The Dockerfile fetches the Gluten tarball from |
You do not need Java, PySpark, or the Gluten jar on the host. Everything is inside the image.
Image Overview
The image (docker/velox/Dockerfile) layers on top of apache/spark:4.0.2-scala2.13-java17-python3-ubuntu:
1. Downloads and SHA-512-verifies the official Apache Gluten 1.6.0 release tarball.
2. Extracts gluten-velox-bundle-spark4.0_2.13-linux_amd64-1.6.0.jar to /opt/gluten.jar.
3. Builds a thin ZstdJniCodec jar so Spark can read .zst-compressed benchmark data.
4. Installs BenchBox from the local source tree with uv pip install ... benchbox[velox].
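The verification in the first step can be sketched in shell as follows. This is a minimal illustration, not the contents of docker/velox/Dockerfile: the helper name verify_sha512 is hypothetical, and it assumes sha512sum from GNU coreutils is on the PATH.

```shell
# verify_sha512 FILE EXPECTED_HEX
# Hypothetical helper: succeeds only when FILE's SHA-512 digest equals EXPECTED_HEX.
verify_sha512() {
    # sha512sum prints "<hex digest>  <file name>"; keep only the digest.
    actual_digest="$(sha512sum "$1" | cut -d' ' -f1)"
    [ "$actual_digest" = "$2" ]
}
```

A build step could then abort on a mismatch with something like verify_sha512 "$tarball" "$expected" || exit 1, where both variables are placeholders for the downloaded file and the published digest.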
The entrypoint (docker/velox/entrypoint.sh) supports three modes: connect (Spark-Connect server), run (one-shot benchmark), and shell (interactive debugging).
Building the Image
Run the build commands from the project root - the build context must include the full BenchBox source tree:
# Quick dev build (single arch, no push)
docker build \
    --platform linux/amd64 \
    -f docker/velox/Dockerfile \
    -t benchbox-velox:dev .

# Verify the build: the BenchBox Velox adapter should import inside the image
docker run --rm benchbox-velox:dev python3 -c \
    "from benchbox.platforms.velox import VeloxAdapter; print('import OK')"
Distribution Build (docker buildx)
# --push uploads to the registry named in the tag; prefix the tag with your registry/namespace
docker buildx build \
    --platform linux/amd64 \
    -f docker/velox/Dockerfile \
    -t benchbox-velox:1.6.0 \
    --push .
Build Arguments
| ARG | Default | Description |
|---|---|---|
| | | Gluten release to download |
| TARGETARCH | (auto-detected by buildx) | |
Workflow A - Connect Mode
The host runs benchbox; the container runs the Gluten-enabled Spark-Connect server. This is the most flexible workflow: you get the full host BenchBox CLI, local result files, and a clean separation between the client and the Spark+Velox backend.
# 1. Start the server (detached)
cd docker/velox
docker compose up -d velox-connect

# 2. Wait for the health check to pass (~60-90 s on a cold JVM)
docker compose ps velox-connect        # watch Status become "healthy"
docker compose logs -f velox-connect   # tail logs during startup

# 3. Run benchbox on the host
benchbox run --platform velox \
    --platform-option deployment=remote \
    --platform-option endpoint=sc://localhost:50051 \
    --benchmark tpch --scale 1.0

# 4. Stop the server when done
docker compose down velox-connect
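Step 2 can be scripted instead of watched by hand. The sketch below polls the container's health status until it reports healthy or a timeout expires. The function names health_status and wait_healthy, the 5 s poll interval, and the 120 s default timeout are assumptions chosen for illustration; the only Docker commands used are docker compose ps -q and docker inspect, and it must run from the directory that holds the compose file.

```shell
# Print the Docker health status ("starting", "healthy", "unhealthy") of a compose service.
health_status() {
    container_id="$(docker compose ps -q "$1")"
    docker inspect --format '{{.State.Health.Status}}' "$container_id"
}

# wait_healthy SERVICE [TIMEOUT_SECONDS]
# Poll every 5 s; return 0 once the service is healthy, 1 if the timeout elapses first.
wait_healthy() {
    timeout_s="${2:-120}"
    waited_s=0
    until [ "$(health_status "$1")" = "healthy" ]; do
        [ "$waited_s" -ge "$timeout_s" ] && return 1
        sleep 5
        waited_s=$(( waited_s + 5 ))
    done
    return 0
}
```

A script could call wait_healthy velox-connect 180 between steps 1 and 3 and only start the benchmark when it succeeds. Recent Compose V2 releases also accept docker compose up --wait, which blocks until services are running or healthy and may make the helper unnecessary.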
Data Path Contract
The Spark server runs inside the container and reads files by their host-side absolute paths (BenchBox sends paths over gRPC, not file contents). The compose file bind-mounts $BENCHBOX_DATA_DIR at the same absolute path inside the container, so host paths resolve identically server-side.
Host:      /Users/joe/Developer/BenchBox/benchmark_runs/tpch_sf1/lineitem.parquet
Container: /Users/joe/Developer/BenchBox/benchmark_runs/tpch_sf1/lineitem.parquet
           └── same path, mounted :ro
If your data lives outside ./benchmark_runs, set BENCHBOX_DATA_DIR to the absolute path:
BENCHBOX_DATA_DIR=/mnt/benchdata docker compose up -d velox-connect
# Then run benchbox so the paths it sends are under /mnt/benchdata/
The mount is read-only (:ro), so Spark cannot write under it. Spark’s managed table warehouse is therefore redirected to /tmp/spark-warehouse inside the container.
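Because the server only sees paths under the bind mount, a pre-flight check on the host can catch a mismatch before Spark reports it. The helper below is an illustrative sketch (the name path_under_data_dir is hypothetical); it performs a plain string-prefix test and does not resolve symlinks or relative paths.

```shell
# path_under_data_dir PATH DATA_DIR
# Return 0 when the absolute PATH lies inside DATA_DIR, 1 otherwise.
path_under_data_dir() {
    data_dir="${2%/}"    # drop one trailing slash so /mnt/data/ equals /mnt/data
    case "$1" in
        "$data_dir"/*) return 0 ;;
        *)             return 1 ;;
    esac
}
```

For example, path_under_data_dir /mnt/benchdata/tpch_sf1/lineitem.parquet "$BENCHBOX_DATA_DIR" succeeds when BENCHBOX_DATA_DIR is /mnt/benchdata and fails for a file under /mnt/other.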
Workflow B - All-in-One Runner
Run BenchBox entirely inside the container using an in-process (local) Gluten session. This is simpler for one-shot benchmarks, CI jobs, and situations where you don’t want to keep a server running.
cd docker/velox

# TPC-H SF 0.01 smoke test
docker compose run --rm velox-runner \
    --benchmark tpch --scale 0.01

# TPC-H SF 1, specific queries
docker compose run --rm velox-runner \
    --benchmark tpch --scale 1.0 --queries Q1,Q6,Q9,Q17

# TPC-DS SF 10 (increase memory - see sizing below)
VELOX_OFFHEAP=24g SPARK_DRIVER_MEM=8g \
    docker compose run --rm velox-runner \
    --benchmark tpcds --scale 10.0
The entrypoint translates run [args] into:
benchbox run --platform velox \
    --platform-option deployment=local \
    --platform-option gluten_jar_path=/opt/gluten.jar \
    --platform-option offheap_size=${VELOX_OFFHEAP} \
    [args]
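To see what a given invocation expands to, the translation can be mimicked with a small function. This is a sketch built from the template above, not a copy of docker/velox/entrypoint.sh; the name expand_run_args is hypothetical, and the function only prints the command instead of executing it.

```shell
# expand_run_args [ARGS...]
# Print the benchbox command that `run [args]` expands to, following the template above.
expand_run_args() {
    echo benchbox run --platform velox \
        --platform-option deployment=local \
        --platform-option gluten_jar_path=/opt/gluten.jar \
        --platform-option "offheap_size=${VELOX_OFFHEAP}" \
        "$@"
}
```

With VELOX_OFFHEAP=24g set, expand_run_args --benchmark tpcds --scale 10.0 prints the full command line with offheap_size=24g followed by the two benchmark options.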
Arch Selection
Apache Gluten 1.6.0 publishes an amd64-only release jar in the official Spark 4.0 tarball. The checked-in Dockerfile and compose file default to linux/amd64.
| Host | Supported? | Notes |
|---|---|---|
| Intel Linux | Yes, natively | Full benchmark validity |
| Intel macOS | Yes, via Docker | Natively |
| Apple Silicon (M1/M2/M3) | Smoke only | |
| Windows (Docker Desktop, Linux containers) | Smoke only | Same as Apple Silicon: emulated |
Do not run timing-sensitive benchmarks on emulated linux/amd64. The SIMD paths Velox relies on are emulated, which defeats the purpose of the benchmark.
To force a specific platform:
VELOX_DOCKER_PLATFORM=linux/amd64 docker compose up -d velox-connect
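A script can guard timing runs with a check on the host CPU architecture. The sketch below is only an illustration: the function name host_is_amd64 is hypothetical, and it looks solely at uname -m, so it flags Apple Silicon and other arm64 hosts but says nothing about other emulation setups.

```shell
# host_is_amd64 [ARCH]
# Return 0 when ARCH (default: output of `uname -m`) is an x86_64 spelling, 1 otherwise.
host_is_amd64() {
    case "${1:-$(uname -m)}" in
        x86_64|amd64) return 0 ;;
        *)            return 1 ;;
    esac
}
```

A wrapper might run host_is_amd64 || echo "emulated linux/amd64: smoke tests only" >&2 before launching a benchmark.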
Memory Sizing
Velox allocates native memory from a pool separate from the JVM heap. Both must fit within the container’s memory limit.
| Variable | Default | Controls |
|---|---|---|
| VELOX_OFFHEAP | | |
| SPARK_DRIVER_MEM | | |
Total container memory ≈ VELOX_OFFHEAP + SPARK_DRIVER_MEM + ~1 GB overhead.
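As a worked example of the formula, the helper below adds the two settings and the ~1 GB overhead. It is a sketch: the name container_mem_gb is hypothetical, and it assumes both values are whole gigabytes written with a g suffix, as in the examples on this page.

```shell
# container_mem_gb OFFHEAP DRIVER_MEM
# Print the approximate container memory in GB: off-heap + driver heap + 1 GB overhead.
container_mem_gb() {
    echo $(( ${1%g} + ${2%g} + 1 ))
}
```

container_mem_gb 16g 8g prints 25, and container_mem_gb 4g 2g prints 7.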
Recommended starting points by scale factor:
| Scale Factor | VELOX_OFFHEAP | SPARK_DRIVER_MEM | Total |
|---|---|---|---|
| SF 0.01-0.1 (smoke) | | | ~7 GB |
| SF 1 | | | ~13 GB |
| SF 10 | | | ~25 GB |
| SF 100 | | | ~49 GB |
If Docker Desktop has a memory cap (Settings → Resources), make sure it exceeds the total. Insufficient off-heap memory causes OOM errors or silently forces Velox to fall back to JVM execution.
VELOX_OFFHEAP=16g SPARK_DRIVER_MEM=8g docker compose up -d velox-connect
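Whether Docker has enough memory can be checked before starting the container. The sketch below reads the memory available to the Docker daemon from docker info and compares it with a required total. The function names are hypothetical, and the daemon's memory is rounded down to whole GiB, so treat the result as approximate.

```shell
# Print the memory available to the Docker daemon, in whole GiB (rounded down).
docker_mem_gib() {
    mem_bytes="$(docker info --format '{{.MemTotal}}')"
    echo $(( $mem_bytes / 1073741824 ))
}

# docker_has_memory REQUIRED_GB
# Return 0 when the daemon reports more memory than REQUIRED_GB.
docker_has_memory() {
    [ "$(docker_mem_gib)" -gt "$1" ]
}
```

For the command above (16g off-heap plus 8g driver heap plus overhead), docker_has_memory 25 || echo "raise the Docker memory limit" >&2 would warn when the cap is too low.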
Environment Variables
| Variable | Default | Description |
|---|---|---|
| | | Image tag used by compose |
| VELOX_DOCKER_PLATFORM | linux/amd64 | Docker platform for compose services |
| | | Gluten version (Dockerfile build arg) |
| VELOX_OFFHEAP | | Off-heap memory budget for Velox |
| SPARK_DRIVER_MEM | | JVM driver heap |
| | | Host port exposed for the Spark-Connect server |
| BENCHBOX_DATA_DIR | | Bind-mounted at the same absolute path inside the container |
CI Integration
For CI pipelines where Docker is available, the all-in-one runner is the simplest integration:
# GitHub Actions example
- name: Build Velox image
  run: |
    docker build \
      --platform linux/amd64 \
      -f docker/velox/Dockerfile \
      -t benchbox-velox:ci .
- name: Smoke test (TPC-H SF 0.01)
  run: |
    docker run --rm \
      -e VELOX_OFFHEAP=4g \
      -e SPARK_DRIVER_MEM=2g \
      benchbox-velox:ci \
      run --benchmark tpch --scale 0.01 --queries Q1,Q6
The tests/integration/platforms/test_velox_live.py integration tests are gated behind the live_integration marker and expect to run inside the image:
docker run --rm benchbox-velox:ci \
    python -m pytest tests/integration/platforms/test_velox_live.py \
    -m live_integration -q
Troubleshooting
Container starts but never reaches “healthy”
The healthcheck probes :50051 every 5 s with a 60 s start window. Spark Connect startup takes 30-90 s on a cold JVM. Check logs before concluding failure:
docker compose logs -f velox-connect
Common causes: insufficient memory (JVM crash), missing SPARK_VERSION file (build issue), or the Gluten jar not found at /opt/gluten.jar.
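Two of these causes can be checked directly. The sketch below tests for the jar inside an image without going through the entrypoint, and prints the last log lines of a service. The function names and the default of 50 lines are assumptions for illustration; the commands used are docker run --entrypoint and docker compose logs --tail.

```shell
# image_has_gluten_jar IMAGE
# Return 0 when /opt/gluten.jar exists as a regular file inside IMAGE.
image_has_gluten_jar() {
    docker run --rm --entrypoint test "$1" -f /opt/gluten.jar
}

# recent_logs SERVICE [LINES]
# Print the last LINES (default 50) log lines of a compose service.
recent_logs() {
    docker compose logs --tail "${2:-50}" "$1"
}
```

For an image built with the dev tag, image_has_gluten_jar benchbox-velox:dev confirms the jar is in place, and recent_logs velox-connect shows whether the JVM exited during startup.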
FileNotFoundException in Spark logs (connect mode)
The client sent a path that doesn’t exist inside the container. The bind-mount must cover the data directory at the same absolute path. See the Data Path Contract section above.
velox_active: false after connection
The Gluten plugin didn’t activate. In connect mode this usually means the pre-started server wasn’t configured with Gluten - confirm the server was started via this compose file, not a plain Spark server. In local mode, check that --platform-option gluten_jar_path=… points to a valid jar file on Linux.
Off-heap OOM / excessive JVM fallback
Increase VELOX_OFFHEAP. Insufficient off-heap causes Velox to either crash or force RowToColumnar inserts before every operator, eliminating the acceleration benefit.
ERROR: TARGETARCH=arm64 is unsupported
The image was built with --platform linux/arm64. Rebuild with --platform linux/amd64 instead; the checked-in Dockerfile only supports the official amd64 release jar.
Apple Silicon: benchmark timings look wrong
You are running linux/amd64 under emulation (QEMU or Rosetta, depending on your Docker Desktop settings). The timings are not representative of native Velox performance. Use a native x86_64 Linux host for benchmark runs.
See Also
Velox Platform - adapter options, CLI reference, usage examples
Velox Jar Setup - tarball URLs, SHA verification, known version table