Backend (services/hwfit + routes):
- rank_models picks visible set by REQUESTED column, not always score —
sorting by Param now shows highest-param models PERIOD (incl. too_tight).
- New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang
cannot serve them); default non-prequantized to BF16 on 2+ GPUs.
- AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so
FP8 wins when both fit.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above
M2.5 on equal composite score; >=100B integers not misread as versions.
- /api/cookbook/hf-latest no longer drops models without an "NB" pattern in
the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered).
- Cached-model scan: atexit flushes models JSON even if the script is
killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s.
- KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New
"stalled" status for shells <1 MB with no .incomplete files.
- /api/cookbook/state POST guard: rejects "done" download tasks lacking
DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
shard is N<total — stops stale tabs from poisoning persisted state.
- hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization
Q4_K_M -> BF16 (it is the native base, not a quant).
Frontend (static/js):
- Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/
50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score
on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit
column header tagged with active budget (RAM / GPU / N GPU).
- Foldable Download admin-card: the Download h2 is the chevron trigger;
state persists in localStorage.
- Download card surfaces destination dir (Dir: <path>). Same dir on running
task row, font/color matched to uptime (9px Fira Code muted, opacity .4).
- Serve panel ctx text input always resets to model max on open. Sub-MB
cached models show with red "download stalled" badge.
- Bulk-select Cancel + Delete reset the Select button label on exit.
- Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/
required; bare "Download complete" no longer marks the task done after
the first config file. Clear button now sends tmux kill-session too.
True overall % for multi-shard downloads: ((N-1)+frac)/total instead of
hf_transfer per-shard aggregate.
- Diagnosis card simplified: removed fold toggle, copy button, dismiss X.
Suggestion font matches message body (12px).
- HF token field flashes green check + "Saved" on save.
- Cached scan no longer counts stalled rows as downloaded in Scan/Download.
CSS:
- dep Install button width pinned to 76px to match Installed split.
- task-sub row +1px; task-status badge gets margin-right 8px.
- Ctx slider styled like gallery editor sliders (thin pill rail, red thumb).
- Bulk-select cancel button top -3px -> -5px.
SkillsManager.update_skill walks every SKILL.md on disk and matches by
slug only; the 'owner' key in its scalar_keys whitelist meant a caller
could pass updates={'owner': 'attacker', 'description': 'pwned'} and the
first matching file on disk got silently re-owned. Two users with the
same slug under different category directories (which is supported by
the on-disk layout <category>/<name>/SKILL.md) could each stomp the
other's skill via the manage_skills tool or the in-process callers in
tool_implementations.py (edit, patch, publish, delete).
update_skill and delete_skill now require the caller's owner and only
match a file whose parsed owner field matches. The default of None
means 'no scope' and only matches ownerless skills, so an unsafe call
without an explicit owner is now a no-op. 'owner' is also removed from
scalar_keys so the updates dict cannot be used to reassign ownership
even when the manager is called from an in-process path that didn't
supply the owner argument.
The in-process callers in tool_implementations.py are updated to pass
owner=owner (which was already in scope at every call site) so the
HTTP and agent paths both go through the scoped check. The HTTP route
at routes/skills_routes.py:1499 was already owner-scoped via
sm.load(owner=user); the fix brings the in-process path up to the
same standard.
After a successful password change, revoke all browser sessions for the
same user except the one that submitted the request. This prevents stale
sessions on other devices from remaining valid after credentials are
updated.
Keep API-token behavior unchanged. The current browser session is
preserved so the user can continue from the tab that changed the
password.
Add focused regression tests for preserving the current session, revoking
other sessions, persisting revocation, and avoiding revocation when the
current password is incorrect.
`core.middleware.require_admin` grants admin to any request whose
`request.state.current_user == "internal-tool"` — the sentinel meant only
for the in-process tool-loopback path. But the normal cookie auth path
(app.py) sets `current_user` to the raw username, and neither `create_user`
nor the signup route reserved that name. As a result an account literally
named "internal-tool" was silently treated as admin by every
`require_admin`-gated route. With self-service signup enabled this is an
anonymous -> admin privilege escalation.
Reserve the full synthetic-owner set the codebase already special-cases —
"internal-tool", "api", "demo", "system" (see `_SYNTHETIC_OWNERS` in
routes/assistant_routes.py and the matching guards in src/task_scheduler.py
and routes/research_routes.py). "api" collides with the bearer-token owner
sentinel; "demo"/"system" would leave a real account denied an assistant
and inconsistently owner-scoped.
Refuse to create or rename into any reserved name (case/space-normalized),
and reject empty usernames while we're here. Adds a regression test.
Co-authored-by: Claude <noreply@anthropic.com>
`mdToHtml` deliberately stashes literal <details> blocks and <a> tags from
the source text *before* the global HTML-escape pass and restores them
verbatim into the string callers assign to `innerHTML` (e.g. chatRenderer's
`b.innerHTML = ...processWithThinking(text)`). Nothing scrubbed those
fragments, so message/agent content containing
`<details><img src=x onerror=...></details>` or
`<a href="javascript:..." onmouseover=...>` executed arbitrary script in
the authenticated page.
Route both stashed fragments through `sanitizeAllowedHtml()`, which parses
them in an inert <template> (no resource loads, no script execution),
removes script-capable elements, and strips event-handler attributes plus
javascript:/vbscript:/data: URL schemes. Hardening details:
- Compare tag names case-insensitively and drop the SVG/MathML foreign-
content roots. An SVG-namespaced <script> has the lower-case tagName
'script', so an HTML-only upper-case check would miss it — a real bypass.
- Sanitize to a fixpoint (re-parse + re-clean until stable) to blunt
mutation-XSS, where re-serializing/re-parsing reshapes the tree.
Benign anchors and <details> blocks are preserved unchanged.
Verified under jsdom against the obvious vectors plus mutation-XSS probes
(svg/math-namespaced <script>, foreignObject, ns-confusion, comment
breakout, template smuggling): no script/iframe element, event handler, or
javascript:/data: URL survives, and benign markup is kept.
Co-authored-by: Claude <noreply@anthropic.com>
Require admin access before serving provider discovery data from
GET /api/providers. This prevents normal authenticated users from
triggering provider discovery or receiving cached provider host data.
Keep GET /api/models available to normal users and leave the existing
admin-only GET /api/discover behavior unchanged.
Add a focused regression test to ensure unauthorized callers cannot
trigger discovery and cannot receive cached provider data.
The synchronous llm_call() runs in FastAPI's threadpool (sync route
handlers such as POST /sessions/auto-sort), while llm_call_async() runs
on the event loop. Both mutate the module-level _response_cache,
_host_fails and _dead_hosts dicts, so these are touched from multiple OS
threads concurrently. Two races result:
- _set_cached_response() snapshots 64 keys then deletes them with
`del _response_cache[key]`; if another thread evicts the same key
first, the del raises KeyError mid-eviction. Switched to
pop(key, None).
- _mark_host_dead() does get()+1+set() on _host_fails with no lock, so
concurrent connect failures lose increments and a genuinely dead host
can stay under its cooldown threshold. Guarded the host-health maps
with a threading.Lock (also applied to _is_host_dead / _clear_host_dead
for consistent reads).
Adds tests/test_llm_core_concurrency.py with deterministic regression
tests (phantom snapshot key for the eviction race; a slow-read dict that
forces the lost-update window for the counter). Both fail on the
unpatched code and pass with the fix.
The email reader folds quoted history into <details> summaries via
`_foldSummary()` (static/js/emailLibrary/signatureFold.js), which builds a
sender/date "meta" chip into the summary HTML and assigns it to innerHTML.
The server-side thread parser (`_extract_quote_meta`,
src/email_thread_parser.py) strips tags but then un-escapes HTML entities
and preserves `<...>` patterns, and that raw meta reaches `_foldSummary`
unescaped via `_renderTurnsFromServer` (`t.meta`) — so an inbound email
whose quoted attribution contains `From: <img src=x onerror=...>`
runs script when the victim merely opens the message (stored XSS).
Make `_foldSummary` the single escaping chokepoint: escape `primary` and
`subMeta` with the module's existing `_esc`. The client-side
`_extractQuoteMeta` previously pre-escaped its output, and every consumer
of it routes through `_foldSummary`, so drop that now-redundant escaping to
avoid double-encoding (e.g. "Ben & Jerry" -> "Ben &amp; Jerry").
Verified (jsdom): server-raw and client-extracted malicious metas yield 0
live elements and 0 event-handler attributes; benign "Ben & Jerry" renders
single-escaped.
Co-authored-by: Claude <noreply@anthropic.com>
verify_password() and create_session() both call .strip().lower() on
the incoming username, but _load() stored keys verbatim from auth.json.
Any mixed-case key (e.g. written by manual edit or a future migration)
would never match, producing a permanent 'Invalid credentials' error.
Fix: lowercase all keys at load time so the in-memory dict always
matches what the login path expects.
Fixes#423
When running Odysseus in Docker and connecting to a local LLM on the host machine (e.g. `llama.cpp` or `Ollama`), the standard endpoint `http://host.docker.internal` is used to breach the container network.
Because `host.docker.internal` was missing from `_LOCAL_HOSTS`, Odysseus incorrectly treated local self-hosted models as cloud APIs. This triggered the fallback behavior where actual API-reported context limits were being ignored and overridden by hardcoded fallbacks in `KNOWN_CONTEXT_WINDOWS`.
**Changes**
- Added `"host.docker.internal"` to the `_LOCAL_HOSTS` whitelist in `src/model_context.py` so that Dockerized deployments correctly trust and respect the context limits of locally hosted models.
**Checks Ran**
- [x] Syntax check (`python -m py_compile src/model_context.py`)
- [x] Tested manually in Docker (`docker compose up -d --build`) on a Windows host using `llama-server`. The correct API context length is now correctly reported in the UI instead of falling back to the 131k hardcode.
Gemma models (gemma-2/3/4) support OpenAI-style function calling, but
"gemma" was missing from the _model_supports_tools heuristic in
stream_agent_loop(). On a non-allowlisted endpoint (e.g. a self-hosted
OpenAI-compatible server), a Gemma-backed agent therefore never receives
native tool schemas and falls back to the prompt-text tool-call
convention — which Gemma does not follow. The result is that tool calls
are emitted as raw text and never execute.
Add "gemma" to the capability keyword list alongside the other
tool-capable families.
Co-authored-by: 2revoemag <2revoemag@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Illustration assets for the PR: login submit-button contrast and the
sidebar keyboard focus ring, before vs after. Whitelist docs/ subfolder
images in .gitignore so curated screenshots are tracked.
First incremental pass at issue #86, focused on the universal entry
points and primary navigation. All changes verified in-browser with the
axe-core engine (0 violations on the surfaces below) plus manual keyboard
testing, on both desktop (1280px) and mobile (390px).
Login / first-run setup (static/login.html)
- Add a real <h1>, wrap content in <main> + <footer> landmarks.
- Mark the decorative boat SVG aria-hidden.
- Errors now use role="alert" so screen readers announce them.
- "Remember me" checkbox is keyboard-focusable (was display:none) with an
accessible name and a focus ring; dynamic 2FA field gets a linked label.
- Darken the brand-red submit button so white text clears WCAG AA 4.5:1
(was ~3.2:1); add visible :focus-visible rings.
App shell (static/index.html, static/style.css)
- Remove invalid role="region" from the <main> chat container (it was
overriding the implicit main landmark).
- Add a persistent, visually-hidden <h1> inside <main> so the page always
exposes one logical level-1 heading — works even on mobile where the
sidebar (with the visible brand) is hidden off-canvas.
- Add a reusable .a11y-visually-hidden utility.
- Raise chat-title, model-picker, settings-helper and notes text contrast
above 4.5:1 (were 2.8-3.9:1).
Keyboard nav + dialogs (static/js/a11y.js - new)
- Make the click-only <div> sidebar navigation (New Chat, Search, Brain,
Calendar, Compare, Cookbook, Deep Research, Gallery, Library, Notes,
Tasks, Theme, account) focusable and Enter/Space-activatable, announced
as buttons (skipping role=button where a nested control would create a
nested-interactive violation). Visible focus ring reused from existing
.list-item:focus-visible.
- Upgrade modals (.modal-content and the docked .notes-pane) to labelled
role="dialog" + aria-modal, and normalise their title to heading level 2
so heading order stays valid. A MutationObserver covers runtime-rendered
rows and modals.
Decorative background canvases (static/js/theme.js)
- Mark all 7 bg-effect canvases aria-hidden.
Notes & Tasks (static/js/notes.js, static/js/tasks.js)
- Label the icon-only Note/To-do toggle pills (fixes a critical
button-name issue) and track aria-pressed state.
- Improve Notes header-button + empty-state contrast.
- Give the Tasks sort <select> an accessible name (fixes a critical
select-name issue).
Remaining data-dense tool modals (Tasks cards, Calendar, Gallery, Email,
Cookbook, Compare, Deep Research) still have muted-text contrast to polish
and are the next incremental step, per the issue's own guidance.
Replace the flat dump of every model in the chat-input picker with a
quick-switch. Opening the picker now shows a search box, an auto-tracked
Recent list (last 5 picks), and a manual Favorites list instead of every
available model crammed into a 280px dropdown. With large catalogs
(e.g. OpenRouter's 350+ models) this was unusable as both a quick-switch
and a browser.
- Recent: each pick is recorded most-recent-first (capped at 5) under a
new odysseus-model-recent key, so the next open has it one click away.
- Favorites: an inline star on every row toggles favorite state and
writes the existing odysseus-model-favorites key, so the sidebar Models
section stays in sync. The star toggles only — it never picks the model.
- Search filters a flat list across the whole catalog; favorited rows
keep their filled star while filtered.
- Small catalogs (<=12 models) still list everything in browse mode so
tiny installs aren't forced to search for a model.
- Touch friendly: stars are always visible (no hover-reveal) and tap
targets grow on narrow screens.
No changes to sidebar visibility defaults.
Closes#399