odysseus

Author	SHA1	Message	Date
spooky	f9e1d38cc2	fix: diagnose vllm serve runtime issues (#1198)	2026-06-05 11:03:04 +01:00
Lucas Daniel	f5d834b0c5	fix(cookbook): surface backend diagnosis when serve fails in background (#1636) * refactor(cookbook): move _diagnose_serve_output to module level in cookbook_helpers Extracts the nested _diagnose_serve_output function from inside setup_cookbook_routes() and moves it to module level in cookbook_helpers.py, alongside the other helper functions it logically belongs with. No behaviour change — the function is now importable directly for testing and by other callers without going through the route factory closure. * fix(cookbook): surface backend diagnosis when serve fails in background The background poll (_pollBackgroundStatus) already received `diagnosis` and `cmd` from /api/cookbook/tasks/status but discarded both. When a serve job died while the Cookbook modal was closed, reopening it showed only a red error badge with no context. - Persist live.diagnosis into task._backendDiagnosis in localStorage so it survives modal close/reopen and page refresh - Persist live.cmd into task.payload._cmd for agent-spawned tasks so the crash report includes the actual command - After _renderRunningTab(), walk rendered cards and call _showDiagnosis() for any that have a stored _backendDiagnosis but no panel yet - In _renderTaskCard(), use _backendDiagnosis as a fallback when the client-side _terminalServeDiagnosis() finds nothing * test(cookbook): add coverage for _diagnose_serve_output error patterns 10 tests verifying the 16 serve-failure patterns: - CUDA OOM, port-in-use, vLLM missing, gated model - Traceback fallback fires without startup success marker - Traceback suppressed when server actually started - Clean/empty output returns None - trust-remote-code and no-GGUF patterns	2026-06-05 09:52:07 +01:00
pewdiepie-archdaemon	9112861d8e	cookbook agent debug loop: persistent log files, auto-adopt orphan tmux, Codex/Claude skill parity Three converging fixes so the chat agent + external Codex/Claude skills can actually debug a crashed serve instead of staring at a post-crash neofetch banner: * Serves now `tee` to /tmp/odysseus-tmux/SESSION.log on the host running them. Runner saves fds 3/4 before the tee and restores them right before `exec ${SHELL}`, so the post-crash interactive zsh banner does NOT pollute the log file. * `tail_serve_output` (chat agent) and `/api/codex/cookbook/output/{sid}` (Codex+Claude skills) both prefer the persistent log file over the tmux pane. Pane is fallback for sessions predating the tee runner. Default tail bumped 150 -> 400. * `list_served_models` "recent log" snippet seeks to the Traceback line instead of showing the last 6 lines (which was always the bash prompt). Cookbook auto-adoption sweep on `/api/cookbook/tasks/status`: every 20s (rate-limited) the cookbook SSHes each configured server, finds `serve-` / `cookbook-` tmux sessions running an actual model process (vllm/python/llama-server/etc., filtered via `pane_current_command`), and writes them into state.tasks. So when the agent falls back to raw ssh+tmux, the session appears in the Cookbook UI on the next poll. `serve_model` error path now reads `data["detail"]` in addition to `data["error"]` so the FastAPI HTTPException message ("Invalid characters in cmd") actually reaches the agent instead of being swallowed as a generic "Serve failed". Tool description updated to warn against `cd …`/`source …`/`&&` prefixes. Intent-without-action supervisor in agent_loop: when the model writes "Let me tail the output" / "I'll check the logs" / "Let me investigate" and ends the turn without emitting a tool call, the loop injects a sharp system nudge ("You said you would X — DO IT NOW") and continues. 
Capped at 2 nudges per chat so a model that genuinely cannot use the tool does not pin the loop. Codex/Claude skill parity: adds `/cookbook/cached`, `/cookbook/presets`, `/cookbook/preset/{name}`, `/cookbook/adopt` so external agents have the same surface as the chat agent. SKILL.md docs + odysseus_api.py wrapper updated for both bundles. `adopt_served_model` promoted to the always-on tool set so the agent has a documented fallback when serve_model rejects a cmd. Also various cookbook UI tweaks accumulated alongside the above (cookbook.js, cookbookRunning.js, cookbookServe.js, cookbook-diagnosis.js, settings.js, style.css).	2026-06-04 23:27:18 +09:00
hawktuahs	3d8c364689	[Bash] Fix Windows cookbook background tasks (#676) * Fix Windows cookbook background tasks * Add Windows Cookbook reliability follow-ups	2026-06-04 04:30:01 +01:00
Shaw	b10e6bc870	fix(cookbook): install llama-cpp-python[server] so llama.cpp serving works (#730) (#1338) The llama.cpp serve auto-install built a bare `llama-cpp-python` in the Linux source-build fallback and the Termux path, but the serve command runs `python3 -m llama_cpp.server`, which needs the `[server]` extra. Because the "already installed?" guard only checks `import llama_cpp` (a bare install satisfies it), the missing extra was never added, so serving crashed with `ModuleNotFoundError: No module named 'starlette_context'` (issue #730). - Request the `[server]` extra in both the Termux direct install and the Linux Python-bindings fallback (the Windows path already used `[server]`). - Shell-quote the package spec in `_pip_install_fallback_chain` via `shlex.quote` so the `[server]` brackets aren't treated as a bash glob; plain names unaffected. Tests: tests/test_cookbook_helpers.py gains extras-quoting coverage and a serve-runner regression guard.	2026-06-03 14:24:26 +09:00
Alexandre Teixeira	1c2ec288dd	Check cudart before llama.cpp CUDA build (#1466)	2026-06-03 14:23:55 +09:00
lekt8	ffb8fd16bc	Disable pip cache for Cookbook dependency installs (off the home disk) (#1477) Cookbook dependency installs (vLLM and friends) build large wheels; pip's default cache lives under $HOME/.cache/pip, so on a small home filesystem the build dies mid-way with "[Errno 28] No space left on device" (issue #1219) and the dependency ends up "installed" but unusable (issue #1459). Add `--no-cache-dir` to the dependency pip-install command (the maintainer's suggested PIP_CACHE_DIR= workaround, made the default) via a small _pip_install_no_cache() helper applied at the install chokepoint. Consistent with the existing --no-cache-dir on the llama-cpp-python build. Idempotent; non-pip-install serve commands are untouched. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 14:23:49 +09:00
ghreprimand	6f001af2a3	Add a 'Rebuild llama.cpp' Cookbook action to force a fresh GPU build (#1787) The serve bootstrap builds llama-server from source only when it is missing from PATH, so a host that first compiled CPU-only (no nvcc present at build time) reuses that CPU-only binary on every later serve and never gets a GPU build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to force a rebuild. Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally or on the selected remote server) so the next serve recompiles and picks up CUDA/HIP if a toolchain is now present. It installs and downloads nothing. - routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth) - routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses the existing SSH plumbing for remote hosts) - static/js/cookbook.js: header button + handler honoring the deps server selector - tests: cover the command shape and a clean run on a fresh HOME Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to re-trigger the build). Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-03 13:28:19 +09:00
Michael Gerber	e392be0d65	fix: Cookbook local GGUF serving inside Docker (#1264) * fix: Cookbook local GGUF serving inside Docker Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint. Fixes included here: - use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub - fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts - preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal Before this change, Docker local serves could fail in several ways: - Cookbook pointed llama.cpp at the wrong GGUF path - generated serve runner scripts crashed before launch with a shell syntax error - successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1 This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration * test: add test for docker-local endpoint rewrites	2026-06-03 02:08:09 +09:00
red person	028a39b42c	Fix local Cookbook dependency installs in venvs (#1082)	2026-06-02 22:39:02 +09:00
ooovenenoso	bd2fa82c1e	Cookbook: prefer ROCm for native llama.cpp bootstrap Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>	2026-06-02 20:59:44 +09:00
Ernest Hysa	996a2027dd	Cookbook: surface pip install failures in logs _pip_install_fallback_chain silently discarded pip stderr via 2>/dev/null on every attempt. When pip failed (network error, venv mismatch, disk full), the wrapper exited 0 and the Cookbook UI showed the download as running — the silent-failure mode from #354. Extract _pip_install_attempt() which wraps each pip invocation in a bash -c subshell that captures output to a temp file, prints tail -5 on failure, cleans up, and exits with pip's real exit code. This avoids the | tail pipefail masking (the first blocker on #363) while surfacing the last 5 lines of pip output in the tmux log so users can see what went wrong. Both local wrapper and remote SSH runner use the same helper through _pip_install_fallback_chain, so the fix is symmetric.	2026-06-02 20:34:52 +09:00
spooky	8b3c0d8ad4	feat: select cached gguf artifacts for serve (#891)	2026-06-02 12:32:40 +09:00
lolwuttav	c99193041a	fix(cookbook): default Ollama serve to loopback (#872)	2026-06-02 12:27:04 +09:00
Tatlatat	9a1893760d	fix(cookbook): skip pip --user fallback inside virtualenvs (#388) (#889) The dependency-install fallback chain unconditionally ran 'pip install --user', which fails inside a virtualenv (and as root in LXC/containers) with 'Can not perform a --user install. User site-packages are not visible in this virtualenv.' — even though the function's docstring already noted --user is invalid in venvs. Guard the --user fallback with a venv check so it only runs outside a venv (where --user is actually valid for PEP-668 system Pythons). Derive the venv probe interpreter from the install command (python for 'pip', python3 for 'pip3'/'python3 -m pip') so the check runs in pip's own environment. System PEP-668 installs keep the --user fallback; venv/LXC-root installs no longer hit the --user error. Updated the unit test for the new chain. Closes #388	2026-06-02 12:23:20 +09:00
hawktuahs	a2f6183c4a	Fix cookbook pip installs in venvs (#723)	2026-06-02 11:31:59 +09:00
pewdiepie-archdaemon	96618b01c0	Polish task UI slash commands and Ollama serving	2026-06-02 09:36:03 +09:00
pewdiepie-archdaemon	ab0a480f30	Show Ollama models in Cookbook Serve	2026-06-02 07:38:45 +09:00
ooovenenoso	5e47e69e99	Allow serving cached local llama.cpp models Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>	2026-06-01 23:10:08 +09:00
Yizreel Schwartz Sipahutar	42380a8693	Keep Cookbook POSIX paths stable on Windows hosts	2026-06-01 23:08:39 +09:00
pewdiepie-archdaemon	f2d55f8726	Fix cached GGUF model metadata in Cookbook Serve	2026-06-01 22:46:54 +09:00
pewdiepie-archdaemon	e5b927597e	Fix Cookbook serve exit code reporting	2026-06-01 22:41:25 +09:00
spooky	15822e91ff	fix: keep serve preflight errors visible (#398)	2026-06-01 22:40:06 +09:00
John Chaplin	f1817fd560	Add macOS Apple Silicon Cookbook support * Add Apple Silicon (Metal) GPU detection and unified-memory fit tuning hardware.py detects Apple Silicon locally and over SSH, reporting backend=metal, the chip name, and a RAM-scaled fraction of unified memory as the usable GPU budget. fit.py gains an M1-M4 memory-bandwidth table for realistic tok/s and drops vLLM-only formats (AWQ/GPTQ/FP8) that can't be served on Metal. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 32ac81dbc680361463a088dae867d555d5a79c3b) * Generate macOS/Metal serve commands and surface the Metal GPU cookbook_routes.py adds a macOS serve path (Ollama, Metal-aware llama.cpp build using `sysctl hw.ncpu` instead of `nproc`, and a clear error if vLLM is attempted). The frontend defaults Metal serving to llama.cpp and offers llama.cpp/Ollama instead of vLLM/SGLang. The odysseus-cookbook CLI's `gpus` command reports the Metal GPU via sysctl/vm_stat. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 4ba01ce25d256ae032029898f361c824a34fcd4b) * Add launchd LaunchAgent for macOS (systemd equivalent) com.odysseus.ui.plist + install-service-macos.sh run Odysseus at login and restart on crash, the macOS counterpart to odysseus-ui.service. The installer auto-fills paths from the venv, so there's no hand-editing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 3d4b6b2c7b8b31af32201ed278115df9a559dea9) * Document macOS install (brew, Ollama, AirPlay port, launchd) README + setup.py cover the Homebrew / Apple Silicon path: brew install python@3.11 tmux ollama, Metal serving via Ollama/llama.cpp, the launchd service, and the macOS AirPlay Receiver conflict on ports 7000/5000. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 8dc9a3578a1726f070ed9f75c0958ae291a6d966) * Add downloadable macOS launcher app builder build-macos-app.sh generates dist/Odysseus.app and a drag-to-Applications dist/Odysseus.dmg. The app starts the local server from this repo's venv and opens the UI in a chrome-less app window (Chromium --app mode, falling back to the default browser). It's a launcher wrapper — it drives the venv rather than bundling Python — so the install path is baked in at build time. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 7927940c3810ee34640803b198d334a6ac93474d) * Harden macOS Cookbook support: hide MLX, fix Metal build cache Builds on the adopted PR #213 macOS/Metal work with two fixes and tests: - fit.py: always drop MLX-quantized models. Odysseus only generates serve commands for llama.cpp/Ollama (Metal) and vLLM/SGLang (CUDA); MLX needs the mlx_lm runtime and the catalog's MLX repos ship no GGUF alternative, so they were surfaced on Apple Silicon but could never be served. - cookbook_routes.py (macOS branch only): `rm -rf build` before configure so a poisoned CMakeCache from a prior failed CUDA attempt can't make every later build fail; explicit -DCMAKE_BUILD_TYPE=Release; a clear "brew install cmake" hint if cmake is missing. Linux/CUDA path unchanged. - tests/test_hwfit_macos.py: MLX hidden on metal, MLX still hidden on CUDA (regression guard), Metal detection on Apple Silicon, and skipped on Linux/Intel (proves non-macOS detection is untouched). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Propagate unified_memory flag and document macOS GPU/Docker caveat - hardware.py: detect_system now carries the unified_memory flag from GPU detection into the system dict (it was set by _detect_apple_silicon / AMD-APU detection but dropped during result assembly, so the API always reported null). Lets callers distinguish unified from discrete VRAM. 
- README: prominent warning that Docker on Apple Silicon can't reach the Metal GPU (runs a Linux VM) — Cookbook must run natively for GPU serving; fix stale text that said Cookbook recommends MLX models (now hidden as unservable). - test: detect_system propagates unified_memory. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Put Odysseus's venv bin on PATH for cookbook runners Native (non-Docker) installs run from a virtualenv whose bin holds the `hf` CLI and `python3` the cookbook download/serve tmux scripts shell out to. Those scripts start in a fresh login shell with the venv NOT activated, so on a native macOS install `hf download` failed with "hf: command not found" — and the `pip --user` self-heal missed because macOS has no bare `pip` command. - cookbook_helpers.py: _local_tooling_path_export() — pure helper returning a PATH export for the running interpreter's bin dir (escaped for double quotes). - cookbook_routes.py: download + serve runners prepend that dir on local runs (gated off SSH/Windows); swap the `pip` install fallbacks to `python3 -m pip`. - tests: helper output for normal and spaced paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Document macOS llama.cpp serving prerequisites Clarify the two serving paths on Apple Silicon: the recommended zero-build route (brew install llama.cpp ships a Metal llama-server Cookbook finds on PATH), and the from-source fallback, which requires cmake + Xcode Command Line Tools. Without those the build is skipped and serving silently degrades to a slow CPU build, so new users now know to install them (or use the prebuilt) up front. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Recommend only GGUF-servable models on Metal Apple Silicon's only serving engines are llama.cpp and Ollama, both GGUF-only (vLLM/SGLang are CUDA/ROCm and don't run on macOS). 
The catalog tags raw safetensors repos with a default Q4_K_M quant, so the fit-ranking was recommending ~397/501 models that have no GGUF and fail to serve on Metal with "No GGUF found" (e.g. microsoft/Phi-mini-MoE-instruct). Drop any model without a real GGUF (is_gguf/gguf_sources) on Apple Silicon — subsumes the previous AWQ/GPTQ/FP8 special-case into one rule. On CUDA these stay visible since vLLM serves safetensors directly. Metal recommendations go 501 -> 104, all actually servable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Remove macOS launchd LaunchAgent (cherry-picked extra) Drop the launchd service from the PR #213 cherry-picks: the install-service-macos.sh installer, the com.odysseus.ui.plist template, and the README section documenting them. Tangential to the core Cookbook/Metal support and not wanted. The build-macos-app.sh launcher is kept. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Add one-command macOS quick start (start-macos.sh) Running Odysseus natively on a Mac previously meant ~7 manual terminal steps (brew deps, venv, activate, pip, setup.py, uvicorn with the right port) — not friendly for a generic macOS user, and the native run is required because Docker on macOS can't reach the Metal GPU. - start-macos.sh: installs Homebrew deps (python@3.11, tmux, prebuilt Metal llama.cpp), creates the venv, installs requirements, runs setup, and launches on a non-AirPlay port (7860). Idempotent; re-run to start again. - README: the Apple Silicon section now leads with this one-command quick start and the clickable .app, with engine/port/manual details folded into a collapsible block. Added a pointer at the top of the manual-install section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * macOS quick start: auto-open browser when ready The "open this URL" line scrolled out of view as uvicorn kept logging after it, so users missed it. 
Now start-macos.sh waits (in the background) until the server accepts connections, prints a boxed "ready" banner at that point (i.e. after the startup burst, not before), and opens the URL in the default browser automatically. Skippable with ODYSSEUS_NO_OPEN=1 for headless/SSH use. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Don't assume/force a specific Python version on macOS The README claimed "system Python is 3.9" — a machine-specific generalization that's often wrong (macOS ships no recent Python by default; many users already have 3.11+). Make it generic, and make start-macos.sh detect an existing Python 3.11+ and use it, only installing python@3.11 when none is found instead of forcing it on top of the user's Python. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Align start-macos.sh venv path with build-macos-app.sh start-macos.sh created the environment in .venv/, but build-macos-app.sh and the manual install steps use venv/ — so the clickable .app wouldn't reuse the quick-start's environment and would rebuild a second one. Use venv/ everywhere. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * README: state clearly that MLX is unsupported on Apple Silicon Odysseus has no mlx_lm runtime; it serves GGUF (llama.cpp/Ollama) and CUDA (vLLM/SGLang) only. MLX-only models can't run on a Mac and are hidden from Cookbook — make that explicit in both the quick start and the details. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * start-macos.sh: build the venv with an arm64 Python on Apple Silicon A clean-room run surfaced this: with a universal2/x86 Python (e.g. the python.org installer under /usr/local), the venv's compiled extensions install as arm64 but get loaded as x86_64 when launched from the .app bundle, so it crashes with "incompatible architecture (have arm64, need x86_64)". The terminal run happened to work only because a universal binary defaults to arm64 there. 
On Apple Silicon, look only under /opt/homebrew (arm64-only) for the build Python, and install Homebrew's python@3.11 if none is present — so the venv is arm64-only and launches correctly from both the terminal and the .app. Intel and non-mac paths are unchanged. Verified end-to-end in a clean clone: .app now boots on Metal with no arch error. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Address dev-exp review: macOS setup robustness + doc/UX fixes From the voltagent dev-exp review of the branch: - README: fix broken anchor links (the em-dash heading produced a slug the links didn't match); simplify the heading to a stable slug. - cookbook_routes.py: add /opt/homebrew/bin and /usr/local/bin to the serve PATH so a brew-installed llama-server/ollama is found instead of falling back to a slow source build. - start-macos.sh: guard against an empty Python path; fail fast with a clear message on port-in-use; ERR trap with a "safe to re-run" message; show pip progress (drop --quiet on the slow requirements install); stop the background browser-opener cleanly on exit/Ctrl+C (no orphaned poller). - setup.py: bind hint to 127.0.0.1; suppress the manual run-hint when launched by start-macos.sh (ODYSSEUS_SKIP_RUN_HINT) so the URL isn't contradictory. - build-macos-app.sh: the .app only opens the browser once the server is actually ready (not after the readiness timeout). - cookbookServe.js: drop "Diffusers" from the Metal backend picker — diffusion_server.py is CUDA-only, so it was an unservable option on macOS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: yunggilja <yunggilja@gmail.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 14:59:19 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00
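
Note on 9a1893760d (skip pip --user fallback inside virtualenvs): the guard could look roughly like the sketch below. It is a simplification under stated assumptions. `in_virtualenv` and `pip_fallback_chain` are hypothetical names, and the check runs in the current interpreter, whereas the commit describes probing the interpreter derived from the install command.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter lives inside a venv.

    In a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the base installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_fallback_chain(package: str, in_venv: bool) -> list[str]:
    """Build the ordered install attempts (hypothetical helper).

    The --user attempt is appended only outside a venv, because pip
    rejects --user installs when user site-packages are not visible.
    """
    chain = [f"python3 -m pip install {package}"]
    if not in_venv:
        chain.append(f"python3 -m pip install --user {package}")
    return chain


if __name__ == "__main__":
    for attempt in pip_fallback_chain("vllm", in_virtualenv()):
        print(attempt)
```

Passing `in_venv` as a parameter keeps the chain builder a pure function, so both branches can be unit-tested without creating a real virtualenv.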

25 Commits
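
Note on b10e6bc870 (install llama-cpp-python[server]): the extras quoting that commit describes can be illustrated with a minimal sketch. `build_pip_install` is a hypothetical helper written for this note, not the repository's `_pip_install_fallback_chain`; only `shlex.quote` is the real standard-library call.

```python
import shlex


def build_pip_install(package_spec: str) -> str:
    """Return a bash-safe pip install command (hypothetical helper).

    shlex.quote wraps the spec in single quotes only when it contains
    characters the shell could interpret, such as the [ ] of an extra.
    """
    return f"python3 -m pip install {shlex.quote(package_spec)}"


if __name__ == "__main__":
    # Brackets are quoted so bash does not treat them as a glob pattern.
    print(build_pip_install("llama-cpp-python[server]"))
    # A plain package name passes through unchanged.
    print(build_pip_install("llama-cpp-python"))
```

Run as a script, this prints `python3 -m pip install 'llama-cpp-python[server]'` followed by `python3 -m pip install llama-cpp-python`, which matches the commit's remark that plain names are unaffected.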