odysseus

Author	SHA1	Message	Date
Alexandre Teixeira	ea1079e1df	docs: fix stale documentation references (#1769)	2026-06-03 13:23:21 +09:00
Lucas Daniel	12fd8b6570	fix(group): show all user-created personas in the participant selector (#1770) _getCharacterList() had two bugs that silently dropped every user-created persona from the group participant picker: 1. The /api/presets/templates endpoint returns a JSON array directly, but the code read `data.templates` (always undefined). The forEach over `data.templates || []` iterated over an empty array every time, so no user templates were ever added. 2. Even if the array had been read correctly, the `t.isCharacter` guard would have filtered them all out — user templates are saved by presets.js without that flag, which is only present on built-in PROMPT_TEMPLATES entries. Fix: accept both the direct-array and the {templates:[]} shapes, drop the isCharacter guard (user_templates are personas by definition), and use the correct field name (system_prompt, not prompt) so the character prompt actually reaches the group chat. Fixes #1656	2026-06-03 13:23:14 +09:00
Afonso Coutinho	694647375c	fix: signature delimiter fold misses self-closing <br/> breaks (#1774)	2026-06-03 13:22:46 +09:00
Lucas Daniel	1d99429ba0	fix(cookbook): prevent auto-retry from restarting user-stopped downloads (#1778 ) Two related bugs in the Cookbook task lifecycle: 1. "Stop all" fired kills via .click() inside a synchronous forEach but showed the success toast immediately after — the toast appeared before any of the async kill requests had been sent, giving the user false confidence the tasks were stopped. 2. The download auto-retry logic (triggered when DOWNLOAD_FAILED appears in the task output) had no way to distinguish a network interruption from a deliberate user stop. A download stopped via "Stop all" or the individual Stop button could be silently restarted up to two times by the background monitor. Fix: persist _userStopped: true to localStorage at the moment the user clicks Stop (individually) or Stop all. The auto-retry guard checks this flag before relaunching the download. The flag is written BEFORE the kill requests fire so there is no window where the monitor can race. Fixes #1458	2026-06-03 13:22:39 +09:00
Afonso Coutinho	c3bf32d1b1	fix: monthly schedule label shows 21th/22th/31th (ordinal suffix for days >20) (#1577)	2026-06-03 08:57:47 +09:00
Paulo Victor Cordeiro	bd0845e081	fix: guard sp.destroy() in _loadScheduled against null spinner (#1495) When the scheduled folder is opened with cached data, sp is null (the loading spinner is skipped). _loadScheduled receives null and calls sp.destroy() unconditionally, crashing with TypeError.	2026-06-03 08:12:47 +09:00
Paulo Victor Cordeiro	dc3421c34e	fix: return sorted model list on first call in group chat (#1484) Both _getModels() and getAllModels() store the sorted copy in a cache variable but return the original unsorted array on first invocation. Subsequent calls return the cache (sorted), causing inconsistent model picker ordering on first render.	2026-06-03 08:12:37 +09:00
lekt8	0e6cbd8315	Drop GPU-only flags from the CPU-only (-ngl 0) serve command (#1433 ) A CPU-only llama.cpp serve config still emitted --flash-attn on and exported GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 (independent toggles, often left on by an Auto profile), so the command mixed "zero GPU layers" with CUDA/flash-attn and failed to start (issue #1291). Gate both on a _cpuOnly check (ngl == 0). GPU serving is unchanged — the gate only affects the ngl=0 path. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 04:26:15 +09:00
lekt8	57abe69173	Let the output "x" delete work when no model/session exists (#1431 ) deleteMessage() bailed at `if (!sessionId) return;`, so the "x" on an output shown before a model/API was selected did nothing — there's no session yet (issue #1428). The session id is only needed for the server-side delete; without one (or with no persisted message ids) we now fall through to removing the DOM, so the "x" always at least dismisses the bubble. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 04:20:48 +09:00
lekt8	8450cee02a	Surface upload failures instead of silently dropping the files (#1425 ) uploadPending() read `data.files` from /api/upload without checking `res.ok`, so a non-OK response (429 rate limit, 413 too large, …) was swallowed: the pending files vanished and the chat sent with no attachments and no feedback — part of why the model "didn't even see them" in #1346. Check res.ok; on failure show the server's reason via a toast and keep the pending files so the attach strip re-renders for a retry (matching the existing "restored on error" comment that the code never actually honored). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 04:12:23 +09:00
red person	1ecd113808	Keep presets loading with bad local state (#1417)	2026-06-03 04:09:28 +09:00
lekt8	4d1829add0	Clear the composer draft when entering the New Chat / welcome state (#1408 ) Clicking "New chat" (the brand/welcome navigation path) left the previous session's unsent draft in the composer (issue #1343). The direct model-picker path (createDirectChat) already cleared it, but the welcome path did not. Clear `#message` in chatRenderer.showWelcomeScreen() — the shared entry point for that state — resetting its autosized height and dispatching an `input` event so the send button / autosize listeners update. Switching between existing sessions loads them directly and does not call showWelcomeScreen, so genuine drafts are not erased. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 04:07:31 +09:00
red person	5fd71f68e8	Keep group chat session cache loading (#1418)	2026-06-03 04:05:40 +09:00
lekt8	77b63ed942	Keep Cookbook download-failure toasts visible long enough to read (#1412) The Cookbook download path showed its error toasts with the default ~1.2s duration, so an actionable message like "tmux is required for Cookbook background downloads/serves … install it with your OS package manager" vanished before it could be read (issue #1355). The serve path already uses multi-second durations. Give the three "Download failed" toasts a 9s duration to match. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 03:48:25 +09:00
LittleLlama	50a486b608	fix(cookbook): add NVFP4 to quantization picker dropdown (#1378) Fixes #1328	2026-06-03 03:26:43 +09:00
Michael Gerber	e392be0d65	fix: Cookbook local GGUF serving inside Docker (#1264) * fix: Cookbook local GGUF serving inside Docker Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint. Fixes included here: use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub; fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts; preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal. Before this change, Docker local serves could fail in several ways: Cookbook pointed llama.cpp at the wrong GGUF path; generated serve runner scripts crashed before launch with a shell syntax error; successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1. This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration * test: add test for docker-local endpoint rewrites	2026-06-03 02:08:09 +09:00
Paulo Victor Cordeiro	5452bc96b1	fix: markdown table renders separator row as visible data (#1252) * fix: markdown table renders separator row as visible data The alignment separator (|---|---|) at row index 1 was rendered as a <td> row with dashes as cell content. Skip it and only open <tbody> at that point, so tables render as header + data without the garbage separator row in between. * test: add regression test for table separator row rendering Verifies that the markdown table renderer skips the separator row (|---|---|) instead of rendering it as a visible data row. Also updates the test harness to handle the splitTableRow import.	2026-06-03 01:59:05 +09:00
Paulo Victor Cordeiro	9c68ceafeb	fix: use cached blob URL in _createChip to prevent memory leak (#1266) _createChip called URL.createObjectURL directly, bypassing the _getPreviewUrl/_revokePreviewUrl cache. Each re-render of the attachment strip leaked blob URLs that were never revoked.	2026-06-03 01:55:59 +09:00
Afonso Coutinho	3137ee4946	fix: theme color parsing breaks on #rgb shorthand hex (#1213) * refactor: add pure hexToRgb helper that handles #rgb shorthand * fix: handle #rgb shorthand hex in theme color parsing * test: hexToRgb expands shorthand and rejects invalid input	2026-06-03 00:30:03 +09:00
ghreprimand	1fda906407	Fix Cookbook container-local model endpoints (#1223) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-03 00:09:48 +09:00
spooky	37f5635f8f	feat: show serve runtime readiness (#1209)	2026-06-03 00:01:00 +09:00
ghreprimand	e72b9a8a95	Fix stale deleted sessions in sidebar (#1203) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 23:52:22 +09:00
red person	258e6fc0d4	fix(ui): allow manual prompt bar resize (#1201)	2026-06-02 23:43:53 +09:00
red person	69ab350919	fix(ui): keep minimized windows above composer (#1197)	2026-06-02 23:31:09 +09:00
Zarl-prog	b89141679f	fix(cookbook): scroll serve panel into view when expanded (#1180) (#1191)	2026-06-02 23:21:35 +09:00
spooky	f667667da3	fix: distinguish external cookbook runtimes (#1188)	2026-06-02 23:20:00 +09:00
spooky	5b87e69221	feat: add vllm kv cache dtype option (#1185)	2026-06-02 23:17:16 +09:00
ghreprimand	7b43fa9372	Improve calendar event text contrast (#1184) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 23:14:52 +09:00
pewdiepie-archdaemon	ff93a6c63b	Polish email and cookbook flows	2026-06-02 22:42:07 +09:00
Afonso Coutinho	15a2662119	fix: markdown tables drop empty cells and misalign columns (#1164) * refactor: extract splitTableRow helper for markdown tables * fix: keep empty interior cells in markdown tables to preserve columns * test: splitTableRow keeps empty interior cells	2026-06-02 22:41:27 +09:00
Kenny Van de Maele	68efa8ee53	Fix docked-modal close: chat stays offset / reopen overlaps / no animation (#1158 ) Docking a modal to a window edge pushes the chat aside (body padding via right-dock-active + --right-dock-w). Three problems on close/reopen: 1. Chat stayed offset after closing a docked modal. The close-watcher only reacted to the `.hidden` class or DOM removal, but the draggable modals (calendar, plan, workspace, document, …) close via inline `display:none`. Watch the `style` attribute too and treat `display:none` as closed. 2. Reopening a previously-docked singleton modal floated it off to the side, overlapping the chat. The reused element kept its docked inline geometry. Clear the content's inline position/size on close so it reopens at its CSS default (centered). 3. Undock wasn't animated. The transition lived on `.right/left-dock-active`, so removing the class dropped the transition with it and padding snapped to 0. Move the transition to the base `body` so the push animates both ways. Files: static/js/modalSnap.js, static/style.css. Checks: node --check static/js/modalSnap.js; verified in-browser (dock → close → chat animates back; reopen → centered, no overlap).	2026-06-02 22:38:20 +09:00
Robin Fröhlich	096468a29f	fix: persist and display multimodal messages (image/audio attachments) (#1159 ) Multimodal content (list of {type, text/image_url} blocks) couldn't be stored in the DB Text column, causing silent persist failures. On reload the frontend fell back to String() on the array, rendering [object Object],[object Object] in the chat. - Serialize list content as JSON in _persist_message() - Deserialize back to list in _db_to_session() via _parse_msg_content() - Extract text parts from multimodal arrays in sessions.js instead of String() coercion	2026-06-02 22:37:48 +09:00
3ASiC	521848da75	fix(ui): don't submit chat message on Enter during IME composition (#1091 ) CJK and other IME users confirm a candidate from the input-method popup by pressing Enter. The chat composer and the in-place message editor each bind a keydown handler that treats Enter (without Shift) as "submit", but they did not exclude the composition state. Pressing Enter to accept an IME candidate therefore sent the half-composed text (e.g. a stray "ce's") instead of just confirming the candidate. These textareas intentionally hijack Enter to submit (Enter sends, Shift+Enter inserts a newline), which bypasses the browser's native form submission and the IME guard that comes with it, so the guard has to be re-added explicitly. Add '&& !e.isComposing' to the three Enter-to-submit handlers: static/app.js (the main composer's button-submit path and its send/new-chat path) and static/js/chat.js (the editor for an already-sent message). Normal Enter (isComposing false) still submits; Shift+Enter still inserts a newline. Tested: node --check on both files; manually verified with a Chinese IME that pressing Enter to pick a candidate no longer sends, and a message is sent only after composition ends.	2026-06-02 22:32:50 +09:00
Nikita Rozanov	119075f368	Research: add configurable run timeout Surfaces the research_run_timeout_seconds setting (added in #783) in Settings → Research as a "Max Time" field, and lets 0 disable the wall-clock cap entirely for long deep-research runs. - settings.py: document that 0 disables the cap; default stays 1800s. - research_handler.py: resolve 0 (or negative) to no timeout (asyncio.wait_for timeout=None); other values stay bounded to [60, 86400] as before. - index.html / settings.js: "Max Time" input bound to research_run_timeout_seconds, validated to {0} ∪ [60, 86400], with copy making explicit that 0 = no limit (unbounded model/API cost). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 20:57:57 +09:00
Kenny Van de Maele	cfb7ec1c71	Accessibility: add labels and toggle states * Accessibility: ARIA labels and toggle states Screen readers couldn't name several icon-only controls or tell whether the tool toggles were on. This adds accessible names and exposes toggle state, with no behavior or layout change. - Icon-only buttons get aria-label: web/shell tool toggles, the "more tools" overflow button (+ aria-haspopup), and the color-reset buttons. - Unlabeled inputs/selects get aria-label: memory + skills search, model-picker search, memory sort, theme font/density selects, and the new-memory / skill (title, when-to-use, how, tags) fields, which only had a visual floating label. - Toggle state via aria-pressed, kept in sync at the existing .active write sites: web/shell toggles (setupToggle) and the Agent/Chat mode buttons (initModeToggle). Static aria-pressed added in the markup so the attribute exists before JS runs. Scope: first slice of the ROADMAP accessibility pass. Focus-visible/contrast, reduced-motion, and modal dialog roles/focus-trap are left for follow-ups. Checks: node --check static/app.js. No Python touched. * Accessibility: mark chat log busy while streaming The chat log is an aria-live="polite" region, so streaming a response token-by-token made screen readers announce every partial update — noisy and unreadable. Set aria-busy="true" on #chat-history while a response streams and back to "false" in the stream's finally block. Assistive tech then waits for the settled message and announces it once. Checks: node --check static/js/chat.js.	2026-06-02 20:55:05 +09:00
ooovenenoso	fb0f8484d7	Sessions: confirm chat delete actions - confirm sidebar/session-list chat deletes - confirm library chat menu deletes - confirm archived chat permanent deletes	2026-06-02 20:43:34 +09:00
Shaw	8115cb01a2	Models: allow API keys for local endpoints Self-hosted endpoints on a LAN are sometimes protected by an API key. The admin "Local" add/test form only sent base_url (+ model_type), so such an endpoint could not be added — it just errored out — even though the backend POST /api/model-endpoints and /model-endpoints/test already accept an optional api_key form field (the cloud "API" form already uses it). Adds an optional masked "API key" input (adm-epLocalApiKey) to the Local form and wires it into the local Test and Add handlers, sending api_key only when filled (an empty value is omitted so we never send a blank Bearer). The field is cleared after a successful add, matching the cloud form. Tested: tests/test_local_endpoint_api_key_js.py extracts the two click handlers and runs them under node with mocked DOM/FormData/fetch, asserting api_key is sent when the field is filled and omitted when blank, plus that the input exists as a password field. `node --check static/js/admin.js` passes. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 20:36:54 +09:00
mist	0b0be3c339	Email: recognize forwarded message dividers `_ORIG_RE` (and its JS mirror `_TALON_ORIG_RE`) already recognised the Japanese forward marker `転送` alongside the "Original Message" delimiters, but not the English "Forwarded message" one. So Gmail-style forwards — including the ones Odysseus itself emits (`---------- Forwarded message ----------`, static/js/emailInbox.js) — were not treated as a quote boundary: - with a following Outlook From:/Date: header block, the divider line leaked into the level-0 reply bubble as noise; - with only the divider marking the forward (no header block), the body was not split into turns at all. Add `Forwarded\s+message` to the same `[-_=]{3,}`-delimited alternation in both the server-side parser and the JS mirror, so forward dividers are consumed as an attribution boundary like "----- Original Message -----". Locale variants of "Forwarded message" can follow the existing pattern. Tests cover both manifestations plus a negative control (the bare words "forwarded message" without `[-_=]{3,}` delimiters must not split). Checks: python -m pytest tests/test_forwarded_message_divider.py (3 passed), python -m py_compile src/email_thread_parser.py, node --check static/js/emailLibrary/utils.js, git diff --check.	2026-06-02 20:32:56 +09:00
tanmayraut45	b5747e3979	Sessions: ignore list keydown while typing The list keyboard handler (_onSessionListKeydown) treats Backspace and Delete as "delete the focused session". When the user double-clicks a chat to rename it, an <input class="session-rename-input"> is mounted inside the .list-item row. Backspace on the input bubbles up to the list container, the handler walks closest('.list-item[data-session-id]') from e.target, finds the parent row and DELETEs the session via the API — so a single typo correction nukes the whole conversation. Bail out at the top of the handler when e.target is an INPUT, TEXTAREA, or contentEditable element. Arrow / Enter / Delete navigation still works for rows themselves (the row is the focused element then, not the input). Mirrors the guard pattern already used in ui.js, notes.js, tasks.js, calendar.js, emailLibrary.js and galleryEditor.js. Closes #1007.	2026-06-02 20:30:16 +09:00
ghreprimand	431b98525b	Email: persist bulk read state to provider Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 20:28:01 +09:00
mechramc	9d0a18a5b5	Email: add explicit SMTP security mode	2026-06-02 13:15:06 +09:00
danielxb	5268a546bc	Model picker: group models by provider Rebased on current main. Integrates with the new Recent/Favorites system — provider groups appear below Recent and Favorites in browse mode for large catalogs (>12 models). Changes: - Models grouped by canonical provider with collapsible sections - Chevron animation consistent with sidebar sections - Domino cascade on expand (only on just-opened group) - Provider display names (deepseek-ai -> DeepSeek, meta -> Llama, etc.) - Alias merging (meta + meta-llama -> one Llama group) - Search includes provider display names for filtering - Collapsed state persists in localStorage - No screenshot binary committed Co-authored-by: danielxb <5981902+danielxb@users.noreply.github.com>	2026-06-02 13:14:22 +09:00
spooky	cd4f496cb4	Fix native Cookbook quant classification	2026-06-02 13:07:20 +09:00
Stephen Yue	d46c406bd8	Fix Cookbook fit column sorting The Fit column shared the Score column's sort key, so clicking the Fit header sorted by Score instead of by hardware fit. There was also no fit option in the hidden sort <select> and no fit branch in the client-side comparator. - Give the Fit column its own sort key (fit). - Add a fit option to the sort select (kept Score as the default so first-load ordering is unchanged). - Sort by the categorical fit_level rank (perfect > good > marginal > too_tight), tie-broken by score, honoring the ascending/descending toggle. Fixes #842 Co-authored-by: SabixMaru <285860855+SabixMaru@users.noreply.github.com>	2026-06-02 13:05:53 +09:00
Juan Pablo Jiménez	eda99360d1	Fix Cookbook dependency install completion state * Fix Cookbook dependency install completion state Mark Cookbook dependency installs as complete when the background runner exits successfully, even when HuggingFace-specific download markers are absent. * Add focused regression coverage for cookbook dependency completion. Keep the fix narrowly scoped while carrying env_path through dependency tasks and locking the completion reconciliation behavior with targeted tests.	2026-06-02 12:59:29 +09:00
Boody	97528be0f4	Add custom web search result count * fixed confusing credentials prompt * fix(setup): return status from create_default_admin function * fix(setup): initialize admin creation status in main function * fix(setup): enhance admin creation feedback and status handling * Enhance admin user login messages with conditional feedback based on creation status * Refine admin user creation feedback messages for clarity and actionability and formatted code * Add fallback error message for admin creation failure in setup script * Add run script for Uvicorn with dotenv integration * Refactor server runner to use argparse for host and port configuration * Remove captured output print statement from server runner * Fix server runner to ensure cross-platform compatibility and improve log handling * Remove run.py script to match main repo * feat: add custom option for search result count in settings * fix: enforce minimum and maximum values for custom search result count	2026-06-02 12:55:15 +09:00
spooky	0f3280ee05	Expose advanced llama.cpp serve controls	2026-06-02 12:46:16 +09:00
Zeus-Deus	19a4f823a4	Rename Character copy to Persona Issue #234: the "Character" tab and its "Style of response" label made it unclear that this is where a system prompt is set. Rename the user-facing labels for clarity: - "Character" tab + section heading -> "Persona" - "Style of response" -> "System prompt" - supporting strings: select placeholder, name placeholder, button/title text, toasts, confirm/notice text, the chat-bar indicator tooltip, the settings visibility toggle, and the assistant personality picker ("Characters" optgroup -> "Personas"). Used "Persona" rather than the issue's suggested "Preset" because the app already has a distinct, user-facing "Presets" concept (built-in presets like Code Analyze/Brainstorm/Reason, shown as their own group in the assistant picker). "Persona" matches what this tab actually creates -- a named persona with its own memories -- without colliding with that term. Internal identifiers (element IDs, data-chartab attributes, function names) and the character_name backend field are intentionally left unchanged so existing saved presets and JS wiring keep working.	2026-06-02 12:42:15 +09:00
Collin	c90a7a19a5	Add dialog accessibility semantics Screen readers got no signal that a dialog opened — not one modal carried role="dialog" — and several close buttons had no accessible name. - The 6 static tool windows (Brain, Theme, Prompt, Rename session, Cookbook, Settings) now carry role="dialog" + an accessible name. They are dockable, tiling windows, so they are non-modal dialogs (intentionally no aria-modal). - The four unlabelled close buttons (theme, prompt, cookbook, settings) get an aria-label so they no longer read as just "heavy multiplication x". - styledConfirm / styledPrompt ARE blocking modals: they get role="dialog" + aria-modal="true" + aria-labelledby/aria-describedby, and now manage focus — restore focus to the triggering element on close and trap Tab within the dialog (they already moved focus in on open). tests/test_dialog_aria.py pins the roles, labels, and focus management.	2026-06-02 12:41:25 +09:00
Leo	6fca7e86b7	Cookbook serve profiles and engine filter * Cookbook: Engine filter + intelligent hardware-computed serve profiles Two related Cookbook serving improvements for accurate, hardware-aware model serving (especially on consumer GPUs that can only run GGUF/llama.cpp). Engine filter - New "Engine" dropdown (All / llama.cpp / vLLM / SGLang) beside the quant picker. Pure client-side view filter over the fetched list via the same _detectBackend() the serve commands use, so what you filter to is exactly what would launch. Re-renders from cache (no refetch). Empty-state message + the instant-cache-paint path account for it too. Intelligent serve profiles (Quality / Balanced / Speed) - services/hwfit/profiles.py: compute_serve_profiles() turns detected VRAM + model size into concrete llama.cpp flags (n_gpu_layers, n_cpu_moe, cache-type, context). Encodes the by-hand tuning: a too-big MoE offloads experts to CPU instead of failing; a model that fits stays fully on GPU; quant tracks profile intent; vision models keep image-encoder headroom. Reuses models.py VRAM math so filtering and serving agree on what fits. Pure/deterministic (no t/s claims — partial-offload speed isn't reliably predictable; fit is what's computed). - /api/hwfit/profiles endpoint returns the profiles + the model's trained context limit, with loose name matching (strips org/ prefix, -GGUF suffix, quant tag) so a local GGUF folder name resolves to its catalog entry. - _buildServeCmd (llama.cpp) now emits --n-cpu-moe / --flash-attn / --cache-type-k/v when set, with llama-cpp-python fallback equivalents. It previously only set -ngl/-c, which is why it OOM'd or ran slow. - Serve panel: profile chips that fill the fields on click, plus CPU-MoE / KV Cache / Flash Attn fields. Context is clamped to the model's trained limit (and an absolute 1M sanity ceiling) on type/blur/profile-load and at launch — fixes a crash where a stale 256k/16M preset + quantized KV cache caused an amdgpu ErrorDeviceLost. 
Tests: tests/test_serve_profiles.py (7) — offload vs full-GPU fit, never exceed VRAM, context cap, launchable flags, vision headroom, no-GPU empty. Checks: py_compile + node --check pass; pytest test_serve_profiles + test_hwfit_amd green; verified live on an RDNA4 box (gfx1200) — Balanced lands ~ncm18 q4 128k, matching hand-tuning. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Cookbook: make column-header sorting discoverable (incl. Newest) Sorting in Cookbook is via clickable column headers (pewds' design), but the headers had no visual cue that they're interactive — so sorting in general, and the Newest sort on the Model header specifically, was undiscoverable. - Style sortable headers as interactive: pointer cursor, hover underline, and the active sort column bolded/highlighted. There was no CSS for .hwfit-sortable / .hwfit-sort-active at all; this helps every existing sort, not just Newest. - The Model column header sorts by release_date (newest first), reusing the existing header-click sort wiring and the "newest" SORT_KEY. No new sort control — uses the existing column-header paradigm. Checks: node --check passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Cookbook serve profiles: keep the on-disk file's quant fixed (don't propose Q6/Q2) In the Serve tab the model is a specific GGUF file already on disk, so its quant can't change — but the profiles were suggesting "Quality · Q6_K" / "Speed · Q2_K" as if you could re-quantize it. That's meaningless when serving a fixed file. - compute_serve_profiles gains serve_weights_gb / serve_quant. When set (SERVE mode), the quant is locked to the file's and profiles differ only in the real serving knobs — n_cpu_moe, KV-cache type, context. _weights_gb / _cpu_moe_for_budget use the file's actual size instead of a quant-derived estimate. DOWNLOAD mode (no override) still varies the quant to show download options. 
- /api/hwfit/profiles accepts serve_weights_gb & serve_quant. - The Serve panel parses the file's size (from m.size "20.6 GB") and quant (from the repo/file name) and passes them, so profiles match what's actually served. Result for a 20.6 GB Q4_K_M file: all three profiles stay Q4_K_M and differ by KV/ctx/offload (Quality q8 KV 128k ncm21, Balanced q4 128k ncm17, Speed q4 32k ncm15) — no nonsensical quant changes. Tests: test_serve_mode_keeps_fixed_quant. Full serve-profile suite green (9). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Cookbook serve: Vision toggle (auto-find mmproj) + live VRAM/RAM-spillover monitor Two serve-panel additions: 1. Vision toggle. A "Vision" checkbox that serves the model with its multimodal projector so it can read images. The mmproj path is resolved at runtime (find mmproj-.gguf next to the model), so dropping an mmproj file in the model folder makes the toggle just work; `--mmproj … --image-max-tokens 1024` (native) / `--clip_model_path` (llama-cpp-python) only when on + found. 2. Live GPU-memory monitor. A readout that polls /api/cookbook/gpus every 4s while the panel is open and shows VRAM used/total/%, free, and — crucially on a discrete card — RAM spillover (AMD gtt_used_mb), with a plain-language health hint: green/healthy, amber/tight, red/"spilled to RAM — slow (raise CPU MoE or lower context)". Surfaces gtt_used_mb from the gpus endpoint (previously read for total only and discarded for 'used'). Lets you see at a glance whether a config fits VRAM (fast) or is paging to system RAM over PCIe (slow) instead of guessing. Checks: node --check + py_compile pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 12:34:42 +09:00
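
Note on c3bf32d1b1 (ordinal suffix for days >20): the log only records the symptom (21th/22th/31th), not the patch itself. The snippet below is a minimal sketch of one common way to pick an English ordinal suffix; the helper names `ordinalSuffix` and `formatDayOfMonth` are hypothetical and are not taken from the repository.

```javascript
// Hypothetical helpers for illustration only; not the repository's code.
function ordinalSuffix(day) {
  const lastTwo = day % 100;
  // 11, 12 and 13 always take "th", even though they end in 1, 2 and 3.
  if (lastTwo >= 11 && lastTwo <= 13) return "th";
  // Otherwise the last digit decides, which is what covers 21, 22, 23 and 31.
  switch (day % 10) {
    case 1: return "st";
    case 2: return "nd";
    case 3: return "rd";
    default: return "th";
  }
}

function formatDayOfMonth(day) {
  return `${day}${ordinalSuffix(day)}`;
}

console.log([1, 2, 3, 11, 12, 13, 21, 22, 23, 31].map(formatDayOfMonth).join(" "));
// prints "1st 2nd 3rd 11th 12th 13th 21st 22nd 23rd 31st"
```

Checking the last two digits before the last digit is what keeps the 11 to 13 exception from being swallowed by the general rule.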

153 Commits
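
Note on 3137ee4946 (theme color parsing on #rgb shorthand): the commit names a pure `hexToRgb` helper that expands shorthand and rejects invalid input, but its signature and return shape are not shown in this log. The sketch below assumes it takes a string and returns `{ r, g, b }` or `null`; treat those details as illustrative guesses rather than the repository's actual API.

```javascript
// Sketch only: the return shape ({ r, g, b } or null) is an assumption.
function hexToRgb(hex) {
  if (typeof hex !== "string") return null;
  let value = hex.trim().replace(/^#/, "");
  // Expand #rgb shorthand to #rrggbb by doubling each hex digit.
  if (/^[0-9a-fA-F]{3}$/.test(value)) {
    value = value.split("").map((ch) => ch + ch).join("");
  }
  // Anything that is not exactly six hex digits by now is rejected.
  if (!/^[0-9a-fA-F]{6}$/.test(value)) return null;
  const n = parseInt(value, 16);
  return { r: (n >> 16) & 255, g: (n >> 8) & 255, b: n & 255 };
}

console.log(hexToRgb("#0af")); // same channels as "#00aaff"
console.log(hexToRgb("#12"));  // invalid length, so null
```

Returning `null` for bad input, rather than throwing, lets a caller fall back to a default theme color without a try/catch; whether the real helper does the same is not stated in the commit message.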