diff --git a/.env.example b/.env.example
index dfbbe6d..5add859 100644
--- a/.env.example
+++ b/.env.example
@@ -123,3 +123,22 @@ SEARXNG_INSTANCE=http://localhost:8080
 # Empty/local/localhost runs scripts on the app host. Set to an SSH host alias
 # if you intentionally want scheduled scripts to run remotely.
 # ODYSSEUS_SCRIPT_HOST=localhost
+
+# ============================================================
+# GPU support (Docker Compose)
+# ============================================================
+# Pass the host GPU into the odysseus container. Default (unset) = CPU.
+# COMPOSE_FILE is a native `docker compose` feature: a colon-separated
+# list of files merged left-to-right. Pick ONE GPU line below, or leave
+# all commented for CPU.
+#
+# NVIDIA (requires nvidia-container-toolkit + `nvidia-ctk runtime
+# configure --runtime=docker` on the host):
+# COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+#
+# AMD ROCm (requires ROCm drivers on the host):
+# COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+#
+# These overlays only expose the GPU devices. The slim Odysseus image
+# still needs CUDA/ROCm userspace via Cookbook -> Dependencies (vLLM,
+# llama-cpp-python, etc.) before models can actually serve on GPU.
diff --git a/README.md b/README.md
index 99e3e26..4c6042b 100644
--- a/README.md
+++ b/README.md
@@ -73,9 +73,18 @@ serve engines and Python CLIs are stored in `./data/local`, mounted as
 
 After downloading a model, open **Cookbook -> Serve**, pick the cached model,
 and launch it. When the server answers `/v1/models`, Odysseus adds it to the
-chat model picker automatically. For NVIDIA GPUs in Docker, install the NVIDIA
-Container Toolkit and add `gpus: all` to the `odysseus` service if `nvidia-smi`
-is not visible inside the container.
+chat model picker automatically. For NVIDIA / AMD GPUs in Docker, install
+the host runtime (NVIDIA Container Toolkit or ROCm drivers) and enable the
+matching overlay via `COMPOSE_FILE` in `.env`:
+
+```bash
+# NVIDIA
+COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+# AMD ROCm
+COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+```
+
+Verify with `docker compose exec odysseus nvidia-smi -L` (or `rocm-smi`).
 
 The default Docker image is intentionally slim. For Python-based serve engines,
 use **Cookbook -> Dependencies** to install vLLM, SGLang, llama-cpp-python, or
diff --git a/docker/gpu.amd.yml b/docker/gpu.amd.yml
new file mode 100644
index 0000000..6a0ac39
--- /dev/null
+++ b/docker/gpu.amd.yml
@@ -0,0 +1,18 @@
+# AMD ROCm GPU overlay. Enable by setting COMPOSE_FILE in .env:
+# COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+#
+# Requires ROCm drivers on the host (kfd + DRI devices). The host user
+# running Docker must be in the `video` and `render` groups.
+#
+# This overlay only passes the host GPU through to the container.
+# The slim Odysseus image does not bundle ROCm userspace or inference
+# engines — install ROCm-compatible builds of vLLM / llama-cpp-python
+# via Cookbook -> Dependencies (or pip) before serving GPU models.
+services:
+  odysseus:
+    devices:
+      - /dev/kfd
+      - /dev/dri
+    group_add:
+      - video
+      - render
diff --git a/docker/gpu.nvidia.yml b/docker/gpu.nvidia.yml
new file mode 100644
index 0000000..32f7fb2
--- /dev/null
+++ b/docker/gpu.nvidia.yml
@@ -0,0 +1,29 @@
+# NVIDIA GPU overlay. Enable by setting COMPOSE_FILE in .env:
+# COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+#
+# Requires the NVIDIA Container Toolkit on the host.
+# Arch: sudo pacman -S nvidia-container-toolkit
+# Debian: sudo apt install nvidia-container-toolkit
+# Fedora: sudo dnf install nvidia-container-toolkit
+# Then:
+# sudo nvidia-ctk runtime configure --runtime=docker
+# sudo systemctl restart docker
+# Verify with:
+# docker info | grep -i nvidia
+#
+# This overlay only passes the host GPU through to the container.
+# The slim Odysseus image does not bundle CUDA userspace or inference
+# engines — install vLLM / llama-cpp-python / SGLang via
+# Cookbook -> Dependencies (or pip) before serving GPU models.
+services:
+  odysseus:
+    environment:
+      - NVIDIA_VISIBLE_DEVICES=all
+      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
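
Reviewer note: a mistyped overlay path in `COMPOSE_FILE` is an easy mistake with this setup, so a small pre-flight check may help. The sketch below is not part of the patch. It splits a `COMPOSE_FILE` value the way the `.env.example` comment describes (a separator-delimited list, in merge order) and reports which listed files are absent. The helper names `split_compose_files` and `missing_compose_files` are hypothetical. The separator handling assumes Compose's documented `COMPOSE_PATH_SEPARATOR` variable, with the platform default (`:` on Linux/macOS, `;` on Windows) as the fallback.

```python
import os
from pathlib import Path


def split_compose_files(value, separator=None):
    """Split a COMPOSE_FILE-style value into file paths, preserving merge order.

    Hypothetical helper, not part of Odysseus or Docker Compose.
    """
    if separator is None:
        # Assumption: honor COMPOSE_PATH_SEPARATOR if set, otherwise fall back
        # to the platform path separator (":" on Linux/macOS, ";" on Windows).
        separator = os.environ.get("COMPOSE_PATH_SEPARATOR") or os.pathsep
    # Drop empty segments so a trailing separator does not yield a bogus path.
    return [part for part in value.split(separator) if part]


def missing_compose_files(value, root=".", separator=None):
    """Return the listed compose files that do not exist under root."""
    base = Path(root)
    return [
        name
        for name in split_compose_files(value, separator)
        if not (base / name).is_file()
    ]


if __name__ == "__main__":
    # Check whatever COMPOSE_FILE is set in the environment, relative to the
    # current directory; with it unset, only docker-compose.yml is checked.
    configured = os.environ.get("COMPOSE_FILE", "docker-compose.yml")
    for name in missing_compose_files(configured):
        print(f"missing compose file: {name}")
```

Run from the repository root, this prints nothing when every listed file exists. `docker compose config` can then be used to inspect the merged result that Compose itself produces.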