Commit Graph

146 Commits

Author SHA1 Message Date
SurprisedDuck
7268c49992 Make LLM host health maps thread-safe
The synchronous llm_call() runs in FastAPI's threadpool (sync route
handlers such as POST /sessions/auto-sort), while llm_call_async() runs
on the event loop. Both mutate the module-level _response_cache,
_host_fails and _dead_hosts dicts, so these are touched from multiple OS
threads concurrently. Two races result:

- _set_cached_response() snapshots 64 keys then deletes them with
  `del _response_cache[key]`; if another thread evicts the same key
  first, the del raises KeyError mid-eviction. Switched to
  pop(key, None).
- _mark_host_dead() does get()+1+set() on _host_fails with no lock, so
  concurrent connect failures lose increments and a genuinely dead host
  can stay under its cooldown threshold. Guarded the host-health maps
  with a threading.Lock (also applied to _is_host_dead / _clear_host_dead
  for consistent reads).

Adds tests/test_llm_core_concurrency.py with deterministic regression
tests (phantom snapshot key for the eviction race; a slow-read dict that
forces the lost-update window for the counter). Both fail on the
unpatched code and pass with the fix.
2026-06-02 05:54:23 +09:00
ooovenenoso
cd6041477c Refresh local model context after restart
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-02 05:54:06 +09:00
Prakhya
a96593a99b Improve Ollama endpoint error messages 2026-06-02 05:53:50 +09:00
SurprisedDuck
7a830e504d Escape email fold summary metadata
The email reader folds quoted history into <details> summaries via
`_foldSummary()` (static/js/emailLibrary/signatureFold.js), which builds a
sender/date "meta" chip into the summary HTML and assigns it to innerHTML.
The server-side thread parser (`_extract_quote_meta`,
src/email_thread_parser.py) strips tags but then un-escapes HTML entities
and preserves `<...>` patterns, and that raw meta reaches `_foldSummary`
unescaped via `_renderTurnsFromServer` (`t.meta`) — so an inbound email
whose quoted attribution contains `From: &lt;img src=x onerror=...&gt;`
runs script when the victim merely opens the message (stored XSS).

Make `_foldSummary` the single escaping chokepoint: escape `primary` and
`subMeta` with the module's existing `_esc`. The client-side
`_extractQuoteMeta` previously pre-escaped its output, and every consumer
of it routes through `_foldSummary`, so drop that now-redundant escaping to
avoid double-encoding (e.g. "Ben & Jerry" -> "Ben &amp;amp; Jerry").

Verified (jsdom): server-raw and client-extracted malicious metas yield 0
live elements and 0 event-handler attributes; benign "Ben & Jerry" renders
single-escaped.

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-02 05:50:53 +09:00
Yatsuiii
63d93ff211 Normalize stored usernames on auth load
verify_password() and create_session() both call .strip().lower() on
the incoming username, but _load() stored keys verbatim from auth.json.
Any mixed-case key (e.g. written by manual edit or a future migration)
would never match, producing a permanent 'Invalid credentials' error.

Fix: lowercase all keys at load time so the in-memory dict always
matches what the login path expects.

Fixes #423
2026-06-02 05:50:36 +09:00
Afonso Coutinho
5da662441c Validate slash command time minutes
* fix: reject hour > 23 in 'today/tomorrow' reminder time parsing

* fix: reject minute > 59 in reminder time parsing
2026-06-02 05:50:19 +09:00
Elle
d885c70462 Treat Docker host gateway as local
When running Odysseus in Docker and connecting to a local LLM on the host machine (e.g. `llama.cpp` or `Ollama`), the standard endpoint `http://host.docker.internal` is used to breach the container network. 

Because `host.docker.internal` was missing from `_LOCAL_HOSTS`, Odysseus incorrectly treated local self-hosted models as cloud APIs. This triggered the fallback behavior where actual API-reported context limits were being ignored and overridden by hardcoded fallbacks in `KNOWN_CONTEXT_WINDOWS`.

**Changes**
- Added `"host.docker.internal"` to the `_LOCAL_HOSTS` whitelist in `src/model_context.py` so that Dockerized deployments correctly trust and respect the context limits of locally hosted models.

**Checks Ran**
- [x] Syntax check (`python -m py_compile src/model_context.py`)
- [x] Tested manually in Docker (`docker compose up -d --build`) on a Windows host using `llama-server`. The correct API context length is now correctly reported in the UI instead of falling back to the 131k hardcode.
2026-06-02 05:49:59 +09:00
2revoemag
3ef88fc7ff Recognize Gemma as tool-capable
Gemma models (gemma-2/3/4) support OpenAI-style function calling, but
"gemma" was missing from the _model_supports_tools heuristic in
stream_agent_loop(). On a non-allowlisted endpoint (e.g. a self-hosted
OpenAI-compatible server), a Gemma-backed agent therefore never receives
native tool schemas and falls back to the prompt-text tool-call
convention — which Gemma does not follow. The result is that tool calls
are emitted as raw text and never execute.

Add "gemma" to the capability keyword list alongside the other
tool-capable families.

Co-authored-by: 2revoemag <2revoemag@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-06-02 05:49:43 +09:00
pewdiepie-archdaemon
e03491664a Stabilize security regression tests 2026-06-02 05:48:59 +09:00
Collin
70a71f603c Scope email calendar extraction to account owner
The email auto-calendar pass (settings.email_auto_calendar / the
extract_email_events task) scans recently received mail and lets an LLM
create / update / cancel calendar events. Two problems made it a cross-tenant,
remotely triggerable hole:

1. No owner scoping. _auto_summarize_pass(account_id=None) fans out over EVERY
   enabled account of EVERY user. For each message it fetched an upcoming-events
   snapshot with NO owner filter (all tenants' events) and handed those uids +
   titles to the extraction LLM, then executed the model's ops via
   do_manage_calendar(...) with owner=None. do_manage_calendar only filters by
   owner when owner is not None, so create/update/delete ran across ALL users'
   calendars. Net: every user's event titles/times were disclosed to the model,
   and the model could cancel/move/duplicate any tenant's events by uid.

2. No prompt-injection wrapping. The raw email From/Subject/body were
   interpolated straight into an instruction-shaped extraction prompt (unlike
   the chat path, which wraps external text via src/prompt_security). Anyone
   who can email a user whose instance has auto-calendar enabled could inject
   operations: create attacker-controlled "meeting" events (the path even
   auto-harvests URLs from the body into the event location/description — a
   phishing primitive) or cancel/modify the victim's real events, with zero
   human in the loop.

Fix:
- Add core.database.get_upcoming_events(owner) and use it for the snapshot, so
  the LLM only ever sees the processed account owner's events.
- Look up the EmailAccount owner in _auto_summarize_pass_single and pass owner=
  to every do_manage_calendar call, so create/update/delete are scoped to that
  user (owner=None stays the single-user / legacy escape hatch).
- Tell the extraction model the email is untrusted data and not to follow
  instructions inside it (defense-in-depth against injection).

Add tests/test_calendar_owner_scope.py: get_upcoming_events returns only the
given owner's events (and everything when owner is None). Fails against the old
unscoped query.
2026-06-01 23:12:32 +09:00
Collin
11c2931efb Run auth password work off the event loop
* fix: run bcrypt off the event loop in auth routes

The auth routes are async, but each bcrypt call ran synchronously on the event
loop. bcrypt (checkpw/hashpw) is intentionally CPU-expensive (~100-300 ms), so
every login / signup / setup / change-password froze the single event loop for
that window, stalling all other in-flight requests (chat streams, polling, ...).

/api/auth/login is the worst case: it is reachable unauthenticated, runs bcrypt
twice (verify_password, then create_session re-verifies), and is rate-limited
only per-IP. A burst of login attempts serializes the whole server — cheap
DoS amplification.

Offload the bcrypt-bearing AuthManager calls (setup, signup/create_user,
login's verify_password + create_session, change_password) via
asyncio.to_thread, matching how the codebase already offloads blocking work
(e.g. src/builtin_actions._run_subprocess, email summarize). The event loop
stays responsive while bcrypt runs on a worker thread.

Add tests/test_auth_event_loop.py: asserts login runs verify_password and
create_session on a worker thread, not the loop thread. Fails if those calls
are awaited inline again.

* test: isolate auth event-loop test from heavy core/* import chain

The regression test imported routes.auth_routes, which pulls in
core.auth and so triggers core/__init__.py — transitively importing
src.llm_core (hangs at import under the project venv) and the SQLAlchemy
declarative models (metaclass error on a bare core.database import / under
the conftest sqlalchemy stubs). Reported by the maintainer: collection
failed on system Python and hung under the venv.

Stub core.auth/core.database before the import, mirroring the existing
_ensure_stub pattern in test_auth_regressions.py and test_null_owner_gates.py.
AuthManager is only a type hint here and the handler is exercised with a
MagicMock, so no real core machinery is needed. Test now imports cleanly
and passes in <0.3s without bcrypt/sqlalchemy installed.
2026-06-01 23:12:12 +09:00
kanaru-dev
a51a1fc4fc Deep-scrub secrets from public settings
/api/auth/settings is auth-exempt (the frontend + the pre-login page read it for
keybinds/TTS prefs), so non-admin and unauthenticated callers get a scrubbed
copy. The previous scrub only blanked TOP-LEVEL string values whose key matched a
short suffix list — so a secret nested under a non-secret parent key, or stored
under a key outside the list, would leak. A real exposure when the app is
reachable over a Cloudflare tunnel / reverse proxy.

- src/settings_scrub.py: NEW stdlib-only module with the scrub helpers (deep/
  recursive; broadened secret-key patterns). Kept separate from auth_routes so it
  imports + unit-tests WITHOUT pulling the FastAPI / auth / database chain
  (addresses review: the test no longer fails at collection on the DB import).
- routes/auth_routes.py: import scrub_settings from the module.
- tests/test_settings_scrub.py: import the tiny module directly.

Ran: pytest tests/test_settings_scrub.py (8 passed); verified the test pulls no
db/auth modules into sys.modules; py_compile routes/auth_routes.py.

Co-authored-by: Kanaru92 <107661007+Kanaru92@users.noreply.github.com>
2026-06-01 23:11:50 +09:00
Konstantinos Grontis
35f11f2edc Fix sidebar text clipping on Windows 2026-06-01 23:11:19 +09:00
Ernest Hysa
47a6b510e1 Preserve system messages during context compaction
The context compactor computed split_point against convo_msgs (system
messages filtered out) but applied it directly to session.history which
includes the system messages. After compaction, the original system
prompt was dropped and replaced by an off-by-N slice of the full history.

This silently dropped the system prompt (preset, persona, RAG context)
from every compacted session — the model would lose persona, RAG, and
preset guidance on the next turn after a long conversation.

The split in maybe_compact does:
  convo_msgs = [m for m in messages if m['role'] != 'system']
  split_point = len(convo_msgs) // 2
so split_point is indexed against the system-stripped list. But the
helper _update_session_history took (session, split_point, summary) and
did session.history[split_point:]. session.history is the full list
including the leading system messages, so this dropped the first
system_msg_count messages.

Fix: pass system_msg_count=len(system_msgs) into _update_session_history
and use session.history[system_msg_count + split_point:] as the recent
slice, with session.history[:system_msg_count] prepended to preserve
persona/preset/RAG system messages.

Validated: tests/test_compactor_data_loss.py both tests now pass (were
failing). tests/test_context_compactor.py 12 pre-existing tests still
pass.

Symptom was: post-compaction history = [summary] + assistant_1 + user_2
+ assistant_2 (system_A was lost).

Co-authored-by: Ernest Hysa <ernest@example.com>
2026-06-01 23:10:58 +09:00
ooovenenoso
5e47e69e99 Allow serving cached local llama.cpp models
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-01 23:10:08 +09:00
Afonso Coutinho
9b1acf6612 Fix year extraction in research queries
* fix: extract full year in research query entities, not just the century

* fix: same year capture-group bug in the services search copy

* test: research query extracts the full year
2026-06-01 23:09:41 +09:00
Areon Lundkvist
f853a3fc67 Harden streaming deltas against null payloads 2026-06-01 23:09:17 +09:00
Mikael A
e7d61c724f Let calendar handle Escape while open 2026-06-01 23:08:57 +09:00
Yizreel Schwartz Sipahutar
42380a8693 Keep Cookbook POSIX paths stable on Windows hosts 2026-06-01 23:08:39 +09:00
Steven French
4bbf82c2ab Fix macOS launcher Python path usage 2026-06-01 23:08:20 +09:00
Strahil Peykov
370fe6b501 Warn when localhost auth bypass is enabled 2026-06-01 23:08:01 +09:00
LittleLlama
74dedcad37 Remove duplicate tool index startup warmup
get_tool_index() calls index_builtin_tools() on first init
(src/tool_index.py:469-470), and _warmup_tool_index then calls it
explicitly right after. Every cold boot embeds all 58 built-in tools
twice and double-upserts them into the ChromaDB collection.

The remaining get_tools_for_query call still pre-warms the query path.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 23:07:42 +09:00
pewdiepie-archdaemon
7711e14f90 Polish email reply and task controls 2026-06-01 23:02:25 +09:00
spooky
033852ab14 fix: require GGUF sources for llama downloads (#368) 2026-06-01 22:47:47 +09:00
pewdiepie-archdaemon
f2d55f8726 Fix cached GGUF model metadata in Cookbook Serve 2026-06-01 22:46:54 +09:00
pewdiepie-archdaemon
743c074b2e Harden Cookbook package SSH probe 2026-06-01 22:44:34 +09:00
pewdiepie-archdaemon
e5b927597e Fix Cookbook serve exit code reporting 2026-06-01 22:41:25 +09:00
spooky
15822e91ff fix: keep serve preflight errors visible (#398) 2026-06-01 22:40:06 +09:00
spooky
4b72dd407b fix: report serve dependency readiness (#412) 2026-06-01 22:39:36 +09:00
red person
39cec53284 Normalize setup admin username (#448) 2026-06-01 22:38:56 +09:00
Duarte Antunes
448401a0fc Harden PDF document markers against cross-owner upload access (#445)
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 22:38:14 +09:00
red person
b2e8d692a4 Scope personal RAG uploads by owner (#446) 2026-06-01 22:36:53 +09:00
red person
d36896c5f7 Gate image editor AI endpoints by privilege (#447) 2026-06-01 22:35:24 +09:00
william-napitupulu
758a1824c7 Update Styles.css (#463)
Small update to the styles that bothered me, i noticed in the window/modal for calendar when editing a day the time icons had a mask that overlapped the icon.  I simply added 'background-image: none' prop to it/
2026-06-01 22:34:24 +09:00
red person
e1102585bf Fix chat stream recovery and PDF library indexing (#468) 2026-06-01 22:33:35 +09:00
Filip
92a81480f7 feat: allow memory import without session (#493) 2026-06-01 22:32:17 +09:00
Dr-Shadow
7be4ece224 Allow to customize the render GID to match the one on the host (#515) 2026-06-01 22:31:33 +09:00
Carlos Arroyo
00320972dc fix: CUDA/GPU detection for vLLM and llama.cpp in Docker (#479)
Two bugs caused GPU inference to silently fall back to CPU inside the
Odysseus Docker container even when the GPU was correctly passed through.

## entrypoint.sh — CUDA_HOME detection only covered CUDA 13.x wheels

The nvcc glob only searched
vidia/cu13, which matches the

vidia-nvcc-cu13 pip wheel layout. CUDA 12.x wheels install nvcc to

vidia/cuda_nvcc/bin/nvcc (nvidia-cuda-nvcc-cu12) or
vidia/cu12
(nvidia-nvcc-cu12) — completely different paths. The glob found nothing,
so CUDA_HOME was never set.

Worse, VLLM_USE_FLASHINFER_SAMPLER=0 was inside the same if-block, so it
was never set either. vLLM then tried to JIT-compile the FlashInfer
sampler at startup, failed with 'Could not find nvcc', and crashed — even
though the GPU was fully visible to the container.

Fix: expand the search to also check nvidia/cu12 and nvidia/cuda_nvcc.
Move VLLM_USE_FLASHINFER_SAMPLER=0 to an unconditional export after the
loop (it is sampler-only, no impact on the attention path, and the correct
setting for any container where CUDA headers may be incomplete).

## cookbook_routes.py — llama.cpp Linux source build silently fell back to CPU

The cmake invocation was:
  cmake -B build -DGGML_CUDA=ON 2>/dev/null || cmake -B build

2>/dev/null suppressed all configure errors. When nvcc is absent (the
slim base image has no CUDA toolkit — intentional), cmake fails silently,
then the || fallback re-runs without -DGGML_CUDA=ON. A CPU-only binary is
produced with no warning. Additionally, a stale CMakeCache.txt from the
failed CUDA attempt was reused (no rm -rf build), poisoning the next
configure run. The macOS branch already did rm -rf build for exactly this
reason; the Linux branch did not.

Fix: before cmake, detect pip-installed nvcc across the same three path
patterns as entrypoint.sh and expose it via CUDA_HOME/PATH. If nvcc is
found, run a clean CUDA build with full error visibility. If not, fall
back to a CPU build with an explicit warning telling the user how to get
a GPU build (install vLLM via Cookbook -> Dependencies, which brings the
CUDA wheels including nvcc, then re-launch).

## .env.example — document Windows COMPOSE_FILE separator

Added a comment showing the semicolon separator required on Windows
Docker Desktop alongside the existing colon-separator (Linux) example.
2026-06-01 22:30:51 +09:00
Alexander Kenley
3c6b084f08 Secure by default uplift (#511)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 22:30:07 +09:00
roxsand12
766ddcaa99 fix: add _setup_lock to prevent race condition in first-run setup (#508) 2026-06-01 22:29:03 +09:00
Sanjay Davis
508fabcb3b Restore dependency refresh after install AND persist safe download mode on retries. (#499) 2026-06-01 22:28:06 +09:00
Afonso Coutinho
c38932e6c6 fix: deep research discards valid sources mentioning cookies/copyright (#481)
* fix: drop over-broad 'cookie'/'copyright' low-quality markers

* fix: detect cookie/copyright boilerplate via phrases, not bare words

* test: keep research findings that merely mention cookies or copyright
2026-06-01 22:26:37 +09:00
Alexander Kenley
07d92556a3 Fix visual report chapter navigation (#505)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 22:26:13 +09:00
vidvuds
6ad617931d Fix import-review list not scrolling in Brain modal (#509)
The memory import-review list (.memory-suggestions) is shown inside the
overflow:hidden .admin-card but, unlike the sibling .memory-list, it had
no scroll bounding of its own (no flex:1 / min-height:0 / overflow-y).
A long review list therefore grew past the card and was clipped, leaving
lower entries and their controls unreachable with no usable scroll area.

Give .memory-suggestions the same flex:1 + min-height:0 + overflow-y:auto
bounding the memories list already uses so the review list scrolls
internally within the modal. Pin the review header (the title and the
save all / back controls) with position:sticky so they stay visible while
the items scroll under them, and add a small scrollbar gutter so the bar
does not sit flush against the item cards.

Fixes #455
2026-06-01 22:25:16 +09:00
Cosmin Enache
04fd963394 Fix duplicate compare modal on repeated clicks (#491)
Co-authored-by: cosminae <cosmin.e@annavas.io>
2026-06-01 22:24:27 +09:00
Afonso Coutinho
1eff46579a fix: ChromaDB unreachable blocks app startup for 30-60s (#326) (#476)
* fix: fail fast when ChromaDB is unreachable instead of blocking startup

* fix: only cache the ChromaDB client after a successful heartbeat

* test: cover ChromaDB fast-fail preflight and no-cache-on-failure
2026-06-01 22:22:41 +09:00
Jamieson O'Reilly
171c29dcf3 Fix email-thread HTML injection, attachment path traversal, and missing authz (#475)
Hardens issues found in a security review of the current tree (separate from
the cookbook SSH PR):

- Email thread rendering (static/js/emailLibrary.js): the flat read path runs
  inbound HTML through the allowlist sanitizer, but the two threaded paths
  (_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected
  server-parsed `body_html` raw into the DOM. A crafted inbound email could
  inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS
  if a deployment relaxes the script CSP). Now sanitized on all paths.

- Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the
  on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with
  user-controlled folder/uid and no containment, so a folder like `../../tmp`
  could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a
  single safe segment and asserts containment.

- Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats,
  /api/rag/stats, /api/test/youtube, /api/test-research relied only on the
  global session check (any logged-in user). Now require_admin-gated.

- Defense-in-depth HTML escaping: session HTML export escapes the session name
  (routes/session_routes.py); the MCP OAuth page escapes the reflected Host
  header / server_id (routes/mcp_routes.py).

- Internal-tool token now compared with secrets.compare_digest (constant time)
  in core/middleware.py and app.py.

Adds regression tests in tests/test_security_regressions.py.
2026-06-01 22:20:17 +09:00
Abhinav
9e8de43f25 fix: clear session headers on endpoint deletion (#477) 2026-06-01 22:19:54 +09:00
pewdiepie-archdaemon
5ed9b74cd0 Polish email tasks and window controls 2026-06-01 20:56:46 +09:00
red person
5c390d6b3e Fix sidebar brand text clipping (#362) 2026-06-01 19:04:08 +09:00