Diagnose vLLM device detection failure with actionable suggestion (#778)

Adds a diagnosis pattern for the 'Failed to infer device type' error
vLLM raises when no CUDA or ROCm GPU is found (e.g. systems with only
integrated or Intel Xe graphics). The existing pattern only caught
'No CUDA GPUs are available' which fires later in startup; this new
entry catches the earlier device-probe failure and the NVML/amdsmi
library-not-found messages that precede it.

Surfaces in the Cookbook serve card as: "vLLM could not find a supported
GPU — switch to llama.cpp or Ollama" instead of a raw Python traceback.

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Dustin
2026-06-01 20:30:07 -07:00
committed by GitHub
parent 385c3c3cf3
commit bd3204fe96

View File

@@ -148,6 +148,15 @@ def setup_cookbook_routes() -> APIRouter:
"No GPUs are visible to the serve process.",
[{"label": "clear Cookbook GPU selection or choose available GPUs", "op": "settings", "field": "gpus", "value": ""}],
),
(
r"Failed to infer device type|NVML Shared Library Not Found|No module named 'amdsmi'|platform is not available",
"vLLM could not find a supported GPU (CUDA or ROCm). "
"This machine may have integrated or unsupported graphics only.",
[
{"label": "switch to llama.cpp (CPU/Metal, works without a discrete GPU)", "op": "manual"},
{"label": "switch to Ollama (CPU/Metal, works without a discrete GPU)", "op": "manual"},
],
),
(
r"vllm.*command not found|No module named vllm|ERROR: vLLM is not installed",
"vLLM is not installed or not in PATH on this server.",