Velox Docker Dev Workflow

Tags: guide, velox, gluten, docker, local-dev

The Apache Gluten Velox bundle jar is Linux-only - there are no prebuilt jars for macOS or Windows, and native builds on those hosts are not supported upstream. The docker/velox/ image is therefore the primary local development path on macOS and Windows, and the recommended reproducible path on Linux.

This page covers prerequisites, build steps, both Docker workflows, arch selection, memory sizing, CI integration, and troubleshooting. See Velox Platform for benchmark configuration and adapter options, and Velox Jar Setup for tarball URLs and SHA verification.

Prerequisites

| Requirement | Notes |
| --- | --- |
| Docker Desktop ≥ 4.x (macOS/Windows) or Docker Engine ≥ 24 (Linux) | Docker Compose V2 (docker compose, not docker-compose) required |
| docker buildx | Included with Docker Desktop; on Linux: docker buildx install |
| ~12 GB free disk | Image is ~3 GB; add headroom for build cache and benchmark data |
| Internet access at build time | The Dockerfile fetches the Gluten tarball from downloads.apache.org during docker build |

You do not need Java, PySpark, or the Gluten jar on the host. Everything is inside the image.

Image Overview

The image (docker/velox/Dockerfile) layers on top of apache/spark:4.0.2-scala2.13-java17-python3-ubuntu:

  1. Downloads and SHA-512-verifies the official Apache Gluten 1.6.0 release tarball.

  2. Extracts gluten-velox-bundle-spark4.0_2.13-linux_amd64-1.6.0.jar to /opt/gluten.jar.

  3. Builds a thin ZstdJniCodec jar so Spark can read .zst-compressed benchmark data.

  4. Installs BenchBox from the local source tree with uv pip install ... benchbox[velox].
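Steps 1-2 amount to a fetch-and-verify layer along these lines. This is an illustrative sketch, not the checked-in Dockerfile: the download URL is elided and the GLUTEN_SHA512 build arg is an assumption.

```dockerfile
ARG GLUTEN_VERSION=1.6.0
ARG GLUTEN_SHA512          # hypothetical: expected checksum passed in at build time
RUN curl -fsSL -o /tmp/gluten-bin.tar.gz \
      "https://downloads.apache.org/.../apache-gluten-${GLUTEN_VERSION}-....tar.gz" \
    # Fail the build if the tarball does not match the pinned SHA-512
 && echo "${GLUTEN_SHA512}  /tmp/gluten-bin.tar.gz" | sha512sum -c - \
 && tar -xzf /tmp/gluten-bin.tar.gz -C /tmp \
 && cp /tmp/*/gluten-velox-bundle-spark4.0_2.13-linux_amd64-${GLUTEN_VERSION}.jar /opt/gluten.jar
```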

The entrypoint (docker/velox/entrypoint.sh) supports three modes: connect (Spark-Connect server), run (one-shot benchmark), and shell (interactive debugging).
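The three-mode dispatch can be sketched as a shell case statement. This is a simplified sketch only: the real entrypoint.sh execs the command rather than printing it, and the connect-mode invocation shown here is an assumption.

```shell
# Simplified sketch of docker/velox/entrypoint.sh mode dispatch.
# build_cmd MODE [ARGS...] prints the command the entrypoint would run.
build_cmd() {
  mode="$1"; shift
  case "$mode" in
    connect)
      # Assumed invocation: start a Gluten-enabled Spark-Connect server
      echo "start-connect-server.sh --jars /opt/gluten.jar" ;;
    run)
      # One-shot benchmark: local deployment with the bundled Gluten jar
      echo "benchbox run --platform velox --platform-option deployment=local --platform-option gluten_jar_path=/opt/gluten.jar $*" ;;
    shell)
      echo "/bin/bash" ;;
    *)
      echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

build_cmd run --benchmark tpch --scale 0.01
```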

Building the Image

Run all commands from the project root - the build context must include the full BenchBox source tree:

# Quick dev build (single arch, no push)
docker build \
  --platform linux/amd64 \
  -f docker/velox/Dockerfile \
  -t benchbox-velox:dev .

# Verify the build and confirm Velox loads
docker run --rm benchbox-velox:dev python3 -c \
  "from benchbox.platforms.velox import VeloxAdapter; print('import OK')"

Distribution Build (docker buildx)

# The tag must include your registry prefix for --push to succeed
docker buildx build \
  --platform linux/amd64 \
  -f docker/velox/Dockerfile \
  -t benchbox-velox:1.6.0 \
  --push .

Build Arguments

| ARG | Default | Description |
| --- | --- | --- |
| GLUTEN_VERSION | 1.6.0 | Gluten release to download |
| TARGETARCH | (auto-detected by buildx) | amd64 or arm64; only amd64 is supported by the official jar |

Workflow A - Connect Mode

The host runs benchbox; the container runs the Gluten-enabled Spark-Connect server. This is the most flexible workflow: you get the full host BenchBox CLI, local result files, and a clean separation between the client and the Spark+Velox backend.

# 1. Start the server (detached)
cd docker/velox
docker compose up -d velox-connect

# 2. Wait for the health check to pass (~60-90 s on a cold JVM)
docker compose ps velox-connect       # watch Status become "healthy"
docker compose logs -f velox-connect  # tail logs during startup

# 3. Run benchbox on the host
benchbox run --platform velox \
  --platform-option deployment=remote \
  --platform-option endpoint=sc://localhost:50051 \
  --benchmark tpch --scale 1.0

# 4. Stop the server when done
docker compose down velox-connect

Data Path Contract

The Spark server runs inside the container and reads files by their host-side absolute paths (BenchBox sends paths over gRPC, not file contents). The compose file bind-mounts $BENCHBOX_DATA_DIR at the same absolute path inside the container, so host paths resolve identically server-side.

Host:      /Users/joe/Developer/BenchBox/benchmark_runs/tpch_sf1/lineitem.parquet
Container: /Users/joe/Developer/BenchBox/benchmark_runs/tpch_sf1/lineitem.parquet
           └── same path, mounted :ro

If your data lives outside ./benchmark_runs, set BENCHBOX_DATA_DIR to the absolute path:

BENCHBOX_DATA_DIR=/mnt/benchdata docker compose up -d velox-connect
# Then run benchbox so the paths it sends are under /mnt/benchdata/

The mount is read-only (:ro). Spark’s managed table warehouse is redirected to /tmp/spark-warehouse inside the container.
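In compose terms, the contract boils down to a volume entry along these lines. This is a sketch, not the checked-in compose file; default handling for a relative ./benchmark_runs path is omitted.

```yaml
services:
  velox-connect:
    volumes:
      # Same absolute path on both sides, mounted read-only, so paths the
      # client sends over gRPC resolve identically inside the container.
      - ${BENCHBOX_DATA_DIR}:${BENCHBOX_DATA_DIR}:ro
```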

Workflow B - All-in-One Runner

Run BenchBox entirely inside the container using an in-process (local) Gluten session. Simpler for one-shot benchmarks, CI jobs, and situations where you don’t want to keep a server running.

cd docker/velox

# TPC-H SF 0.01 smoke test
docker compose run --rm velox-runner \
  --benchmark tpch --scale 0.01

# TPC-H SF 1, specific queries
docker compose run --rm velox-runner \
  --benchmark tpch --scale 1.0 --queries Q1,Q6,Q9,Q17

# TPC-DS SF 10 (increase memory - see sizing below)
VELOX_OFFHEAP=24g SPARK_DRIVER_MEM=8g \
docker compose run --rm velox-runner \
  --benchmark tpcds --scale 10.0

The entrypoint translates run [args] into:

benchbox run --platform velox \
  --platform-option deployment=local \
  --platform-option gluten_jar_path=/opt/gluten.jar \
  --platform-option offheap_size=${VELOX_OFFHEAP} \
  [args]

Arch Selection

Apache Gluten 1.6.0 publishes an amd64-only release jar in the official Spark 4.0 tarball. The checked-in Dockerfile and compose file default to linux/amd64.

| Host | Supported? | Notes |
| --- | --- | --- |
| Intel Linux | Yes, natively | Full benchmark validity |
| Intel macOS | Yes, via Docker | Native linux/amd64; benchmark timings valid |
| Apple Silicon (M1/M2/M3) | Smoke only | linux/amd64 runs under Rosetta emulation; timings are invalid for benchmarking. Use a native x86_64 Linux host for real numbers. |
| Windows (Docker Desktop, Linux containers) | Smoke only | Same as Apple Silicon: emulated linux/amd64; timings invalid |

Do not run timing-sensitive benchmarks on emulated linux/amd64. The SIMD paths Velox relies on are emulated, which defeats the purpose of the benchmark.

To force a specific platform:

VELOX_DOCKER_PLATFORM=linux/amd64 docker compose up -d velox-connect
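The table above can be encoded as a pre-flight check. This is a hypothetical helper, not something BenchBox ships; it only classifies the combinations listed on this page.

```shell
# Hypothetical guard: classify whether timings from a given host arch /
# Docker platform combination are benchmark-valid.
check_arch() {
  host_arch="$1"   # e.g. output of `uname -m` on the host
  platform="$2"    # Docker platform string
  case "$host_arch:$platform" in
    x86_64:linux/amd64)                    echo "native: timings valid" ;;
    arm64:linux/amd64|aarch64:linux/amd64) echo "emulated: smoke tests only" ;;
    *)                                     echo "unsupported combination" ;;
  esac
}

check_arch "$(uname -m)" "${VELOX_DOCKER_PLATFORM:-linux/amd64}"
```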

Memory Sizing

Velox allocates native memory from a pool separate from the JVM heap. Both must fit within the container’s memory limit.

| Variable | Default | Controls |
| --- | --- | --- |
| VELOX_OFFHEAP | 8g | spark.memory.offHeap.size - Velox's native pool |
| SPARK_DRIVER_MEM | 4g | spark.driver.memory - JVM heap |

Total container memory ≈ VELOX_OFFHEAP + SPARK_DRIVER_MEM + ~1 GB overhead.
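These variables map onto standard Spark properties. In spark-defaults form the budget looks roughly like the following (values are this page's defaults; note that spark.memory.offHeap.enabled must be true for the off-heap pool to exist at all):

```properties
spark.memory.offHeap.enabled   true
# VELOX_OFFHEAP
spark.memory.offHeap.size      8g
# SPARK_DRIVER_MEM
spark.driver.memory            4g
```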

Recommended starting points by scale factor:

| Scale Factor | VELOX_OFFHEAP | SPARK_DRIVER_MEM | Total |
| --- | --- | --- | --- |
| SF 0.01-0.1 (smoke) | 4g | 2g | ~7 GB |
| SF 1 | 8g | 4g | ~13 GB |
| SF 10 | 16g | 8g | ~25 GB |
| SF 100 | 32g | 16g | ~49 GB |

If Docker Desktop has a memory cap (Settings → Resources), make sure it exceeds the total. Insufficient off-heap causes OOM errors or silent fallback to JVM execution.

VELOX_OFFHEAP=16g SPARK_DRIVER_MEM=8g docker compose up -d velox-connect
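The sizing rule (off-heap + driver heap + ~1 GB overhead) can be sanity-checked with a small helper. This is a sketch only; it handles whole-gigabyte values like 8g and nothing else.

```shell
# total_mem_gb OFFHEAP DRIVER -> minimum container memory in GB.
# Assumes whole-gigabyte inputs like "8g"; the +1 is JVM/OS overhead.
total_mem_gb() {
  off="${1%g}"
  drv="${2%g}"
  echo $(( off + drv + 1 ))
}

total_mem_gb 8g 4g    # SF 1 row -> 13
```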

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| VELOX_IMAGE_TAG | dev | Image tag used by compose |
| VELOX_DOCKER_PLATFORM | linux/amd64 | Docker platform for compose services |
| GLUTEN_VERSION | 1.6.0 | Gluten version (Dockerfile build arg) |
| VELOX_OFFHEAP | 8g | Off-heap memory budget for Velox |
| SPARK_DRIVER_MEM | 4g | JVM driver heap |
| SPARK_CONNECT_PORT | 50051 | Host port exposed for the Spark-Connect server |
| BENCHBOX_DATA_DIR | ./benchmark_runs | Bind-mounted at the same absolute path inside the container |

CI Integration

For CI pipelines where Docker is available, the all-in-one runner is the simplest integration:

# GitHub Actions example
- name: Build Velox image
  run: |
    docker build \
      --platform linux/amd64 \
      -f docker/velox/Dockerfile \
      -t benchbox-velox:ci .

- name: Smoke test (TPC-H SF 0.01)
  run: |
    docker run --rm \
      -e VELOX_OFFHEAP=4g \
      -e SPARK_DRIVER_MEM=2g \
      benchbox-velox:ci \
      run --benchmark tpch --scale 0.01 --queries Q1,Q6

The tests/integration/platforms/test_velox_live.py integration tests are gated behind the live_integration marker and expect to run inside the image:

docker run --rm benchbox-velox:ci \
  python -m pytest tests/integration/platforms/test_velox_live.py \
  -m live_integration -q

Troubleshooting

Container starts but never reaches “healthy”

The healthcheck probes :50051 every 5 s with a 60 s start window. Spark Connect startup takes 30-90 s on a cold JVM. Check logs before concluding failure:

docker compose logs -f velox-connect

Common causes: insufficient memory (JVM crash), missing SPARK_VERSION file (build issue), or the Gluten jar not found at /opt/gluten.jar.

FileNotFoundException in Spark logs (connect mode)

The client sent a path that doesn’t exist inside the container. The bind-mount must cover the data directory at the same absolute path. See the Data Path Contract section above.

velox_active: false after connection

The Gluten plugin didn’t activate. In connect mode this usually means the pre-started server wasn’t configured with Gluten - confirm the server was started via this compose file, not a plain Spark server. In local mode, check that --platform-option gluten_jar_path=… points to a valid jar file on Linux.

Off-heap OOM / excessive JVM fallback

Increase VELOX_OFFHEAP. Insufficient off-heap causes Velox to either crash or force RowToColumnar inserts before every operator, eliminating the acceleration benefit.

ERROR: TARGETARCH=arm64 is unsupported

The build was invoked with --platform linux/arm64. Rebuild with --platform linux/amd64; the checked-in Dockerfile only supports the official amd64 release jar.

Apple Silicon: benchmark timings look wrong

You are running linux/amd64 under Rosetta emulation. The timings are not representative of native Velox performance. Use a native x86_64 Linux host for benchmark runs.

See Also