odysseus/docker/gpu.nvidia.yml
Alexandre Teixeira 8455b88643 Improve Docker GPU setup diagnostics (#705)
* Improve Docker GPU setup diagnostics

Add a Docker GPU preflight script for NVIDIA users. The script is
read-only by default, checks host NVIDIA drivers, Docker availability,
and container GPU passthrough, and prints actionable next steps.

Add explicit opt-in modes to print install commands, install NVIDIA
Container Toolkit on Ubuntu/Debian, and enable the NVIDIA Compose overlay
in .env after passthrough is verified.

Document common NVIDIA Docker failure modes, ignore generated .env
backups, and clarify that Cookbook can only detect GPUs exposed to the
Odysseus container.

* Clarify Docker GPU diagnostic limits
2026-06-02 12:30:40 +09:00


# NVIDIA GPU overlay. Enable by setting COMPOSE_FILE in .env:
#   COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
#
# Use scripts/check-docker-gpu.sh to diagnose GPU passthrough, optionally
# install the NVIDIA Container Toolkit (Ubuntu/Debian), and write COMPOSE_FILE
# to .env. The script is read-only by default — it installs nothing and never
# edits .env unless explicitly asked.
#
# Requires the NVIDIA Container Toolkit on the host.
#   Arch:    sudo pacman -S nvidia-container-toolkit
#   Debian:  sudo apt install nvidia-container-toolkit
#   Fedora:  sudo dnf install nvidia-container-toolkit
# Then:
#   sudo nvidia-ctk runtime configure --runtime=docker
#   sudo systemctl restart docker
# Verify with:
#   docker info | grep -i nvidia
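#
# The `docker info` check only confirms that the nvidia runtime is registered.
# A minimal end-to-end passthrough check, as a sketch (assumes the host can
# pull the public `ubuntu` image; nvidia-smi is injected by the toolkit):
#   docker run --rm --gpus all ubuntu nvidia-smi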
#
# This overlay only passes the host GPU through to the container.
# The slim Odysseus image does not bundle CUDA userspace or inference
# engines — install vLLM / llama-cpp-python / SGLang via
# Cookbook -> Dependencies (or pip) before serving GPU models.
services:
  odysseus:
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
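
# Once the overlay is enabled and the stack is up, the GPU should be visible
# from inside the service. A hedged sketch (assumes the `odysseus` container
# is running and that the "utility" capability above exposes nvidia-smi):
#   docker compose up -d
#   docker compose exec odysseus nvidia-smi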