docker: add NVIDIA/AMD GPU overlays via COMPOSE_FILE (#254)

Opt-in overlays under docker/ that pass the host GPU into the odysseus
container. Pick one in .env:

    COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
    COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml

Non-GPU users are unaffected (no default merge). README now points at
the overlays instead of the old ad-hoc `gpus: all` suggestion.

Each overlay header notes that it only exposes the GPU devices; the
slim image still needs vLLM / llama-cpp-python / etc. installed via
Cookbook -> Dependencies before models can serve on GPU.

Tested on Arch + Docker 29.5.1 + RTX 4090:

    docker compose exec odysseus nvidia-smi -L
    GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-...)

Cookbook hardware scan reports the 24 GB GPU and recommends GPU-fit
models. `docker compose config` validates cleanly for all three
COMPOSE_FILE variants (base, +nvidia, +amd).

Builds on the structure proposed in #91 by @krllus with the path / docs
fixes from the review on that PR. Closes #163.

Co-authored-by: krllus <krllus@users.noreply.github.com>
2026-06-01 15:00:09 +10:00
parent 2537b80f88
commit 4c0aadbb5e
4 changed files with 78 additions and 3 deletions
--- a/.env.example
+++ b/.env.example
@@ -123,3 +123,22 @@ SEARXNG_INSTANCE=http://localhost:8080
 # Empty/local/localhost runs scripts on the app host. Set to an SSH host alias
 # if you intentionally want scheduled scripts to run remotely.
 # ODYSSEUS_SCRIPT_HOST=localhost
+
+# ============================================================
+# GPU support (Docker Compose)
+# ============================================================
+# Pass the host GPU into the odysseus container. Default (unset) = CPU.
+# COMPOSE_FILE is a native `docker compose` feature: a colon-separated
+# list of files merged left-to-right. Pick ONE GPU line below, or leave
+# all commented for CPU.
+#
+# NVIDIA (requires nvidia-container-toolkit + `nvidia-ctk runtime
+# configure --runtime=docker` on the host):
+# COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+#
+# AMD ROCm (requires ROCm drivers on the host):
+# COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+#
+# These overlays only expose the GPU devices. The slim Odysseus image
+# still needs CUDA/ROCm userspace via Cookbook -> Dependencies (vLLM,
+# llama-cpp-python, etc.) before models can actually serve on GPU.
--- a/README.md
+++ b/README.md
@@ -73,9 +73,18 @@ serve engines and Python CLIs are stored in `./data/local`, mounted as

 After downloading a model, open **Cookbook -> Serve**, pick the cached model,
 and launch it. When the server answers `/v1/models`, Odysseus adds it to the
-chat model picker automatically. For NVIDIA GPUs in Docker, install the NVIDIA
-Container Toolkit and add `gpus: all` to the `odysseus` service if `nvidia-smi`
-is not visible inside the container.
+chat model picker automatically. For NVIDIA / AMD GPUs in Docker, install
+the host runtime (NVIDIA Container Toolkit or ROCm drivers) and enable the
+matching overlay via `COMPOSE_FILE` in `.env`:
+
+```bash
+# NVIDIA
+COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+# AMD ROCm
+COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+```
+
+Verify with `docker compose exec odysseus nvidia-smi -L` (or `rocm-smi`).

 The default Docker image is intentionally slim. For Python-based serve engines,
 use **Cookbook -> Dependencies** to install vLLM, SGLang, llama-cpp-python, or
--- a/docker/gpu.amd.yml
+++ b/docker/gpu.amd.yml
@@ -0,0 +1,18 @@
+# AMD ROCm GPU overlay. Enable by setting COMPOSE_FILE in .env:
+#   COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+#
+# Requires ROCm drivers on the host (kfd + DRI devices). The host user
+# running Docker must be in the `video` and `render` groups.
+#
+# This overlay only passes the host GPU through to the container.
+# The slim Odysseus image does not bundle ROCm userspace or inference
+# engines — install ROCm-compatible builds of vLLM / llama-cpp-python
+# via Cookbook -> Dependencies (or pip) before serving GPU models.
+services:
+  odysseus:
+    devices:
+      - /dev/kfd
+      - /dev/dri
+    group_add:
+      - video
+      - render
--- a/docker/gpu.nvidia.yml
+++ b/docker/gpu.nvidia.yml
@@ -0,0 +1,29 @@
+# NVIDIA GPU overlay. Enable by setting COMPOSE_FILE in .env:
+#   COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+#
+# Requires the NVIDIA Container Toolkit on the host.
+#   Arch:    sudo pacman -S nvidia-container-toolkit
+#   Debian:  sudo apt install nvidia-container-toolkit
+#   Fedora:  sudo dnf install nvidia-container-toolkit
+# Then:
+#   sudo nvidia-ctk runtime configure --runtime=docker
+#   sudo systemctl restart docker
+# Verify with:
+#   docker info | grep -i nvidia
+#
+# This overlay only passes the host GPU through to the container.
+# The slim Odysseus image does not bundle CUDA userspace or inference
+# engines — install vLLM / llama-cpp-python / SGLang via
+# Cookbook -> Dependencies (or pip) before serving GPU models.
+services:
+  odysseus:
+    environment:
+      - NVIDIA_VISIBLE_DEVICES=all
+      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
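
Note on checking a `COMPOSE_FILE` value: because the overlays are selected purely by a colon-separated path list in `.env`, a mistyped path is the most likely way to get this wrong. The sketch below is a hypothetical helper, not part of this commit; the function name `check_compose_files` is an assumption made for illustration. It splits the value on `:` (the default separator on Linux and macOS) and reports whether each listed file exists relative to the current directory. It does not validate the YAML itself; `docker compose config` remains the real check.

```shell
# Hypothetical helper (not part of this commit): split a COMPOSE_FILE value
# on ":" and report, for each listed path, whether the file exists.
# Returns 0 if every file exists, 1 if any is missing.
check_compose_files() {
    missing=0
    old_ifs=$IFS
    IFS=:
    set -f                      # disable globbing while the value is split
    for file in $1; do
        if [ -f "$file" ]; then
            printf 'ok       %s\n' "$file"
        else
            printf 'missing  %s\n' "$file"
            missing=1
        fi
    done
    set +f
    IFS=$old_ifs
    return "$missing"
}
```

Run from the repository root, `check_compose_files "docker-compose.yml:docker/gpu.nvidia.yml"` prints one line per file and returns non-zero if any path is missing, which makes it usable as a quick guard before `docker compose up`.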