odysseus

Author	SHA1	Message	Date
red person	e68d0448b8	Parse all AMD GPU check args (#1586 )	2026-06-03 08:56:48 +09:00
red person	db3a5c17b0	Reject backup output inside data dir (#1587 )	2026-06-03 08:38:27 +09:00
Afonso Coutinho	19b6cbac12	fix: skills CLI summary crashes on a non-string description (#1595 )	2026-06-03 08:37:05 +09:00
Afonso Coutinho	258fe455eb	fix: research CLI summary crashes on a non-string query (#1596 )	2026-06-03 08:36:57 +09:00
Afonso Coutinho	667c663668	fix: gallery CLI image serialization crashes on a non-string prompt (#1598 )	2026-06-03 08:36:51 +09:00
Afonso Coutinho	0d88c9989e	fix: mcp CLI _serialize crashes when stored env JSON is a list (#1609 )	2026-06-03 08:35:09 +09:00
red person	abbc073429	Reject invalid preset CLI stores (#1395 )	2026-06-03 03:59:05 +09:00
Afonso Coutinho	8852c7ea4a	fix: claim_ownerless actually claims ownerless documents (was a no-op self-update) (#1288 )	2026-06-03 01:38:38 +09:00
spooky	18a445ba22	docs: add AMD Docker GPU preflight (#1168 )	2026-06-02 22:54:08 +09:00
tanmayraut45	4e440a9fd5	Hwfit: estimate params from config.json fallback `add_hwfit_models.py` infers `parameter_count` and `parameters_raw` by regexing the HF repo name for a `<num>B` token, optionally with an `-A<num>B` MoE active-param suffix. Repos that don't encode a size in their name at all (e.g. `zai-org/GLM-4.5`, where the "4.5" is a version not a parameter count) fall through to the safetensors element-count path. That path works for unquantized FP16 / BF16 repos but is brittle in two cases the catalog hits often: 1. Author-bulk runs (`AUTHORS = ["cyankiwi"]`) pull pre-quantized AWQ / GPTQ / MLX repos. The safetensors metadata stores the packed I32 tensors and a per-dtype `parameters` map, which the script unpacks via a per-quant pack factor. When the upload doesn't populate that map (older repos, custom shards), `st.total` is used raw and the parameter count is off by 4-8x. 2. Repos where the safetensors block is absent from `model_info()` entirely. The current code returns `None` and silently drops the model, which then has to be added to `EXTRA_REPOS` by hand with a literal `parameter_count` string. Both are exactly what the issue calls out — the regex / safetensors combo can't size GLM-4.5 by itself because the name has no `<num>B` and the upstream repo's safetensors block doesn't carry a usable param total either. Add a config.json fallback in front of the safetensors path: - `_fetch_config_json(repo_id)` downloads `config.json` via `hf_hub_download` (so the standard HF on-disk cache handles deduplication across runs, no extra cache layer needed). Network / 404 / gated-repo errors return `None` and the caller proceeds to the safetensors fallback. An in-process `_CONFIG_CACHE` dedupes the base-model vs. source-repo lookups within a single run. - `_params_from_config(cfg)` first honours explicit `num_parameters` / `n_params` / `total_params` fields when present. Otherwise it sums embeddings + attention (GQA-aware via `num_key_value_heads` and `head_dim`) + dense MLP (`3 * hidden_size * intermediate_size`, covering SwiGLU / GeGLU). For MoE configs it picks up both naming conventions in the wild — `num_experts` / `num_experts_per_tok` (Qwen3-MoE) and `n_routed_experts` / `n_shared_experts` (GLM-4-MoE, DeepSeek-V3) — uses `moe_intermediate_size`, and respects `first_k_dense_replace` so the first N layers stay dense. Active parameters come out as `num_experts_per_tok + n_shared_experts` of the routed experts, which matches how each architecture reports its active count. - In `_entry_from_modelinfo`, try config.json on the source repo first (works for unquantized models) and then on the `base_model:` parent (covers AWQ / GPTQ children whose own config is just a quantization manifest). Both lookups run only when regex + override + base_model tag all failed, so the normal author-bulk run still resolves sizes from names without touching the Hub. Spot-checks against the three architecture families this script actually pulls — within ~5% of the documented param counts, which is well inside the `parameter_count` rounding (one decimal of "B") and the `min_vram_gb` downstream bucket: Qwen2.5-7B-Instruct 7.62B (HF card: 7.6B) Qwen3-30B-A3B 30.5B / 3.34B active (card: 30.5B / 3.3B) GLM-4.5 352.7B / 33.6B active (card: 355B / 32B) The safetensors path is unchanged and remains the last resort, so repos with neither a parsable name nor a fetchable config.json behave exactly as before. Closes #955.	2026-06-02 20:33:25 +09:00
spooky	cd4f496cb4	Fix native Cookbook quant classification	2026-06-02 13:07:20 +09:00
Alexandre Teixeira	8455b88643	Improve Docker GPU setup diagnostics (#705 ) * Improve Docker GPU setup diagnostics Add a Docker GPU preflight script for NVIDIA users. The script is read-only by default, checks host NVIDIA drivers, Docker availability, and container GPU passthrough, and prints actionable next steps. Add explicit opt-in modes to print install commands, install NVIDIA Container Toolkit on Ubuntu/Debian, and enable the NVIDIA Compose overlay in .env after passthrough is verified. Document common NVIDIA Docker failure modes, ignore generated .env backups, and clarify that Cookbook can only detect GPUs exposed to the Odysseus container. * Clarify Docker GPU diagnostic limits	2026-06-02 12:30:40 +09:00
pewdiepie-archdaemon	966b53df77	Improve Cookbook serve diagnostics and recommendations	2026-06-02 12:15:47 +09:00
ghreprimand	491a8a5480	Harden backup restore tar extraction Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 05:55:03 +09:00
Sirsyorrz	9955f5bc95	Fix VRAM estimates for pre-quantized HF repos The Cookbook fit scanner was reporting impossibly low VRAM requirements for some pre-quantized models — e.g. cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit shown as 7.1 GB ('perfect' on a 12 GB card) when the real load is ~40 GB. Root cause is in the catalog builder. When _entry_from_modelinfo falls back to safetensors metadata for the parameter count, it stored safetensors.total directly. For pre-quantized repos that figure reflects packed element counts: AWQ/GPTQ-Int4 pack 8x 4-bit weights into one I32, AWQ-8bit/GPTQ-Int8/FP8 pack 4x. The catalog therefore recorded ~1/8 of the real parameter count, and min_vram_gb = packed * bpp double-applied the quantization. Fix the safetensors fallback: * prefer the per-dtype parameters dict when available and unpack only the I32/I64 entries (the F16/BF16 scale/zero tensors and embeddings are already at their real element counts) * fall back to total * pack_factor when only total is exposed Patch the catalog entries that were affected by the old fallback so the fit ratings reflect reality without waiting for a full catalog rebuild: * cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit 11.4B -> 79.7B (40.8 GB VRAM) * stelterlab/Qwen3-Coder-30B-A3B-Instruct-AWQ 4.6B -> 30.5B * stelterlab/NVIDIA-Nemotron-3-Nano-30B-A3B-AWQ 5.1B -> 30.5B * warshanks/Qwen3-8B-abliterated-AWQ 2.2B -> 8.2B * QuantTrio/sarvam-30b-AWQ 7B -> 30B * QuantTrio/sarvam-105b-AWQ 19B -> 105B Closes #377.	2026-06-01 18:32:58 +09:00
pewdiepie-archdaemon	0888a3b3e6	Add native Windows compatibility layer	2026-06-01 15:09:47 +09:00
John Chaplin	f1817fd560	Add macOS Apple Silicon Cookbook support * Add Apple Silicon (Metal) GPU detection and unified-memory fit tuning hardware.py detects Apple Silicon locally and over SSH, reporting backend=metal, the chip name, and a RAM-scaled fraction of unified memory as the usable GPU budget. fit.py gains an M1-M4 memory-bandwidth table for realistic tok/s and drops vLLM-only formats (AWQ/GPTQ/FP8) that can't be served on Metal. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 32ac81dbc680361463a088dae867d555d5a79c3b) * Generate macOS/Metal serve commands and surface the Metal GPU cookbook_routes.py adds a macOS serve path (Ollama, Metal-aware llama.cpp build using `sysctl hw.ncpu` instead of `nproc`, and a clear error if vLLM is attempted). The frontend defaults Metal serving to llama.cpp and offers llama.cpp/Ollama instead of vLLM/SGLang. The odysseus-cookbook CLI's `gpus` command reports the Metal GPU via sysctl/vm_stat. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 4ba01ce25d256ae032029898f361c824a34fcd4b) * Add launchd LaunchAgent for macOS (systemd equivalent) com.odysseus.ui.plist + install-service-macos.sh run Odysseus at login and restart on crash, the macOS counterpart to odysseus-ui.service. The installer auto-fills paths from the venv, so there's no hand-editing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 3d4b6b2c7b8b31af32201ed278115df9a559dea9) * Document macOS install (brew, Ollama, AirPlay port, launchd) README + setup.py cover the Homebrew / Apple Silicon path: brew install python@3.11 tmux ollama, Metal serving via Ollama/llama.cpp, the launchd service, and the macOS AirPlay Receiver conflict on ports 7000/5000. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 8dc9a3578a1726f070ed9f75c0958ae291a6d966) * Add downloadable macOS launcher app builder build-macos-app.sh generates dist/Odysseus.app and a drag-to-Applications dist/Odysseus.dmg. The app starts the local server from this repo's venv and opens the UI in a chrome-less app window (Chromium --app mode, falling back to the default browser). It's a launcher wrapper — it drives the venv rather than bundling Python — so the install path is baked in at build time. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 7927940c3810ee34640803b198d334a6ac93474d) * Harden macOS Cookbook support: hide MLX, fix Metal build cache Builds on the adopted PR #213 macOS/Metal work with two fixes and tests: - fit.py: always drop MLX-quantized models. Odysseus only generates serve commands for llama.cpp/Ollama (Metal) and vLLM/SGLang (CUDA); MLX needs the mlx_lm runtime and the catalog's MLX repos ship no GGUF alternative, so they were surfaced on Apple Silicon but could never be served. - cookbook_routes.py (macOS branch only): `rm -rf build` before configure so a poisoned CMakeCache from a prior failed CUDA attempt can't make every later build fail; explicit -DCMAKE_BUILD_TYPE=Release; a clear "brew install cmake" hint if cmake is missing. Linux/CUDA path unchanged. - tests/test_hwfit_macos.py: MLX hidden on metal, MLX still hidden on CUDA (regression guard), Metal detection on Apple Silicon, and skipped on Linux/Intel (proves non-macOS detection is untouched). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Propagate unified_memory flag and document macOS GPU/Docker caveat - hardware.py: detect_system now carries the unified_memory flag from GPU detection into the system dict (it was set by _detect_apple_silicon / AMD-APU detection but dropped during result assembly, so the API always reported null). Lets callers distinguish unified from discrete VRAM. - README: prominent warning that Docker on Apple Silicon can't reach the Metal GPU (runs a Linux VM) — Cookbook must run natively for GPU serving; fix stale text that said Cookbook recommends MLX models (now hidden as unservable). - test: detect_system propagates unified_memory. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Put Odysseus's venv bin on PATH for cookbook runners Native (non-Docker) installs run from a virtualenv whose bin holds the `hf` CLI and `python3` the cookbook download/serve tmux scripts shell out to. Those scripts start in a fresh login shell with the venv NOT activated, so on a native macOS install `hf download` failed with "hf: command not found" — and the `pip --user` self-heal missed because macOS has no bare `pip` command. - cookbook_helpers.py: _local_tooling_path_export() — pure helper returning a PATH export for the running interpreter's bin dir (escaped for double quotes). - cookbook_routes.py: download + serve runners prepend that dir on local runs (gated off SSH/Windows); swap the `pip` install fallbacks to `python3 -m pip`. - tests: helper output for normal and spaced paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Document macOS llama.cpp serving prerequisites Clarify the two serving paths on Apple Silicon: the recommended zero-build route (brew install llama.cpp ships a Metal llama-server Cookbook finds on PATH), and the from-source fallback, which requires cmake + Xcode Command Line Tools. Without those the build is skipped and serving silently degrades to a slow CPU build, so new users now know to install them (or use the prebuilt) up front. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Recommend only GGUF-servable models on Metal Apple Silicon's only serving engines are llama.cpp and Ollama, both GGUF-only (vLLM/SGLang are CUDA/ROCm and don't run on macOS). The catalog tags raw safetensors repos with a default Q4_K_M quant, so the fit-ranking was recommending ~397/501 models that have no GGUF and fail to serve on Metal with "No GGUF found" (e.g. microsoft/Phi-mini-MoE-instruct). Drop any model without a real GGUF (is_gguf/gguf_sources) on Apple Silicon — subsumes the previous AWQ/GPTQ/FP8 special-case into one rule. On CUDA these stay visible since vLLM serves safetensors directly. Metal recommendations go 501 -> 104, all actually servable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Remove macOS launchd LaunchAgent (cherry-picked extra) Drop the launchd service from the PR #213 cherry-picks: the install-service-macos.sh installer, the com.odysseus.ui.plist template, and the README section documenting them. Tangential to the core Cookbook/Metal support and not wanted. The build-macos-app.sh launcher is kept. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Add one-command macOS quick start (start-macos.sh) Running Odysseus natively on a Mac previously meant ~7 manual terminal steps (brew deps, venv, activate, pip, setup.py, uvicorn with the right port) — not friendly for a generic macOS user, and the native run is required because Docker on macOS can't reach the Metal GPU. - start-macos.sh: installs Homebrew deps (python@3.11, tmux, prebuilt Metal llama.cpp), creates the venv, installs requirements, runs setup, and launches on a non-AirPlay port (7860). Idempotent; re-run to start again. - README: the Apple Silicon section now leads with this one-command quick start and the clickable .app, with engine/port/manual details folded into a collapsible block. Added a pointer at the top of the manual-install section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * macOS quick start: auto-open browser when ready The "open this URL" line scrolled out of view as uvicorn kept logging after it, so users missed it. Now start-macos.sh waits (in the background) until the server accepts connections, prints a boxed "ready" banner at that point (i.e. after the startup burst, not before), and opens the URL in the default browser automatically. Skippable with ODYSSEUS_NO_OPEN=1 for headless/SSH use. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Don't assume/force a specific Python version on macOS The README claimed "system Python is 3.9" — a machine-specific generalization that's often wrong (macOS ships no recent Python by default; many users already have 3.11+). Make it generic, and make start-macos.sh detect an existing Python 3.11+ and use it, only installing python@3.11 when none is found instead of forcing it on top of the user's Python. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Align start-macos.sh venv path with build-macos-app.sh start-macos.sh created the environment in .venv/, but build-macos-app.sh and the manual install steps use venv/ — so the clickable .app wouldn't reuse the quick-start's environment and would rebuild a second one. Use venv/ everywhere. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * README: state clearly that MLX is unsupported on Apple Silicon Odysseus has no mlx_lm runtime; it serves GGUF (llama.cpp/Ollama) and CUDA (vLLM/SGLang) only. MLX-only models can't run on a Mac and are hidden from Cookbook — make that explicit in both the quick start and the details. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * start-macos.sh: build the venv with an arm64 Python on Apple Silicon A clean-room run surfaced this: with a universal2/x86 Python (e.g. the python.org installer under /usr/local), the venv's compiled extensions install as arm64 but get loaded as x86_64 when launched from the .app bundle, so it crashes with "incompatible architecture (have arm64, need x86_64)". The terminal run happened to work only because a universal binary defaults to arm64 there. On Apple Silicon, look only under /opt/homebrew (arm64-only) for the build Python, and install Homebrew's python@3.11 if none is present — so the venv is arm64-only and launches correctly from both the terminal and the .app. Intel and non-mac paths are unchanged. Verified end-to-end in a clean clone: .app now boots on Metal with no arch error. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Address dev-exp review: macOS setup robustness + doc/UX fixes From the voltagent dev-exp review of the branch: - README: fix broken anchor links (the em-dash heading produced a slug the links didn't match); simplify the heading to a stable slug. - cookbook_routes.py: add /opt/homebrew/bin and /usr/local/bin to the serve PATH so a brew-installed llama-server/ollama is found instead of falling back to a slow source build. - start-macos.sh: guard against an empty Python path; fail fast with a clear message on port-in-use; ERR trap with a "safe to re-run" message; show pip progress (drop --quiet on the slow requirements install); stop the background browser-opener cleanly on exit/Ctrl+C (no orphaned poller). - setup.py: bind hint to 127.0.0.1; suppress the manual run-hint when launched by start-macos.sh (ODYSSEUS_SKIP_RUN_HINT) so the URL isn't contradictory. - build-macos-app.sh: the .app only opens the browser once the server is actually ready (not after the readiness timeout). - cookbookServe.js: drop "Diffusers" from the Metal backend picker — diffusion_server.py is CUDA-only, so it was an unservable option on macOS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: yunggilja <yunggilja@gmail.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-01 14:59:19 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00

18 Commits