4 Commits

Author SHA1 Message Date
Alexandre Teixeira
3b292403dc fix(tests): accept verify in endpoint HTTP mocks
Updates endpoint/model-route test HTTP mocks to accept the verify keyword argument passed by endpoint probing code. Restores one focused part of the Python CI baseline tracked in #2580.
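
A minimal sketch of the kind of mock change described above, assuming the probing code passes verify through to its HTTP call. The names FakeResponse, make_fake_get and probe_models are hypothetical and are not taken from the repository:

```python
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""

    def __init__(self, status_code=200, payload=None):
        self.status_code = status_code
        self._payload = payload if payload is not None else {}

    def json(self):
        return self._payload


def make_fake_get(payload, calls):
    """Build a fake GET that tolerates the verify keyword argument."""

    def fake_get(url, headers=None, timeout=None, verify=True, **kwargs):
        # Record what the code under test passed so the test can assert on it.
        calls.append({"url": url, "verify": verify})
        return FakeResponse(200, payload)

    return fake_get


def probe_models(http_get, base_url, verify=True):
    """Hypothetical probe: fetch the model list, forwarding verify."""
    response = http_get(f"{base_url}/v1/models", timeout=5.0, verify=verify)
    return response.json()


calls = []
fake_get = make_fake_get({"data": [{"id": "m1"}]}, calls)
result = probe_models(fake_get, "http://localhost:8080", verify=False)
```

A mock declared as `def fake_get(url, timeout=None)` would raise TypeError on the same call, which is the failure mode this kind of fix removes.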
2026-06-04 17:53:18 +01:00
Michael Gerber
e392be0d65 fix: Cookbook local GGUF serving inside Docker (#1264)
* fix: Cookbook local GGUF serving inside Docker

Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint.

Fixes included here:

- use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub
- fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts
- preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal

Before this change, Docker local serves could fail in several ways:

- Cookbook pointed llama.cpp at the wrong GGUF path
- generated serve runner scripts crashed before launch with a shell syntax error
- successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1

This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration
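
The loopback-preservation rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: resolve_model_url, port_is_reachable and the in_docker flag are hypothetical names, and reachability is assumed to mean a successful TCP connect from inside the container.

```python
import socket
from urllib.parse import urlsplit, urlunsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def port_is_reachable(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def resolve_model_url(url, in_docker, reachable=port_is_reachable):
    """Keep a loopback URL when its port answers locally; otherwise
    rewrite the host to host.docker.internal (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname
    if not in_docker or host not in LOOPBACK_HOSTS:
        return url  # nothing to rewrite
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if reachable(host, port):
        return url  # the server runs inside this container
    # Fall back to the Docker host. Userinfo, if any, is not preserved here.
    netloc = "host.docker.internal"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Passing the reachability check as a parameter keeps the function testable without opening sockets, for example `resolve_model_url(url, True, reachable=lambda host, port: True)`.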

* test: add test for docker-local endpoint rewrites
2026-06-03 02:08:09 +09:00
ghreprimand
1fda906407 Fix Cookbook container-local model endpoints (#1223)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 00:09:48 +09:00
Collin
f8e3bfeaff Add endpoint probing behavior tests
ROADMAP "Backend → more tests around endpoint probing and provider setup".
TestSetupProbeSafety already covers _probe_endpoint's keyed/unkeyed curated
fallback; this adds the rest of the probe surface, with httpx faked the same
way (no network):

- _probe_endpoint: OpenAI {"data"} vs native Ollama {"models"} list parsing,
  the /api/tags fallback for Ollama builds lacking /v1/models, and the
  no-models-found result.
- _ping_endpoint (previously untested): 2xx reachable, auth failure (reached
  but not reachable), the /login-redirect "that's Odysseus, not a model
  server" trap, generic redirects, transport errors, and the native Ollama
  /api/version fallback.
- _probe_single_model (previously untested): ok/fail/timeout status mapping,
  dict/string upstream error extraction, and OpenAI vs Anthropic request
  routing (x-api-key, /v1/messages, tool schema).
- _classify_endpoint: the Tailscale CGNAT 100.64.0.0/10 local range and its
  boundaries.
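
Two of the behaviors listed above can be illustrated with a short sketch. The helper names extract_model_ids and in_tailscale_cgnat are hypothetical and do not mirror the internals of _probe_endpoint or _classify_endpoint; only the payload shapes ({"data"} for OpenAI-style lists, {"models"} for native Ollama) and the 100.64.0.0/10 range come from the commit message.

```python
import ipaddress

# Carrier-grade NAT range that Tailscale assigns addresses from.
TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def in_tailscale_cgnat(host):
    """Return True if host is an IP literal inside 100.64.0.0/10."""
    try:
        return ipaddress.ip_address(host) in TAILSCALE_CGNAT
    except ValueError:
        return False  # not an IP literal, e.g. a hostname


def extract_model_ids(payload):
    """Pull model names from an OpenAI-style or native Ollama listing."""
    data = payload.get("data")
    if isinstance(data, list):
        # OpenAI-style: {"data": [{"id": "..."}]}
        return [m["id"] for m in data if isinstance(m, dict) and "id" in m]
    models = payload.get("models")
    if isinstance(models, list):
        # Native Ollama: {"models": [{"name": "..."}]}
        return [m["name"] for m in models if isinstance(m, dict) and "name" in m]
    return []  # corresponds to the no-models-found case
```

The range runs from 100.64.0.0 through 100.127.255.255, so boundary tests would check those two addresses as inside and 100.63.255.255 and 100.128.0.0 as outside.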
2026-06-02 20:42:48 +09:00