Commit Graph

412 Commits

Author SHA1 Message Date
lekt8
87babb58d5 fix: SSRF hardening for the custom embedding endpoint URL (#132) (#1206)
POST /api/embeddings/endpoint takes a user-supplied URL and immediately
makes an outbound httpx request to it with no validation. The admin gate
added earlier (PR #80) closed the unauthenticated-access part of #132; this
addresses the remaining request: validate the URL before fetching it.

Odysseus is local-first, so pointing the embedding endpoint at a loopback or
LAN server (local vLLM / llama.cpp / Ollama) is a normal setup — a blanket
private-IP block would break the primary use case. So the guard:

  - always rejects non-HTTP(S) schemes (file://, gopher://, ftp:// …),
  - always rejects the link-local range (169.254.0.0/16, incl. the cloud
    instance-metadata 169.254.169.254 exfil vector) plus multicast /
    reserved / unspecified, and IPv4-mapped-IPv6 forms of the above,
  - keeps loopback/LAN allowed by default, and
  - adds EMBEDDING_BLOCK_PRIVATE_IPS=true for full SSRF lockdown on exposed
    multi-tenant deployments.

Logic lives in src/url_safety.py (stdlib only, resolver injectable) so it is
unit-testable without real DNS; the route calls it before the health-check
request. Covered by tests/test_url_safety.py (8 cases).
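
A minimal sketch of the guard described above, using only the standard library. The function name, return values, and resolver signature are illustrative assumptions, not the actual contents of src/url_safety.py.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _default_resolver(host):
    """Resolve a hostname to IP address strings with the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def check_endpoint_url(url, block_private=False, resolver=_default_resolver):
    """Return None when the URL is acceptable, else a short rejection reason.

    Hypothetical helper: `block_private` stands in for the
    EMBEDDING_BLOCK_PRIVATE_IPS setting, `resolver` is injectable for tests.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return "scheme not allowed"
    host = parts.hostname
    if not host:
        return "missing host"
    try:
        # Literal IP in the URL: no DNS lookup needed.
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            addresses = [ipaddress.ip_address(a) for a in resolver(host)]
        except (OSError, ValueError):
            return "host does not resolve"
    if not addresses:
        return "host does not resolve"
    for addr in addresses:
        # Judge IPv4-mapped IPv6 (::ffff:a.b.c.d) by the embedded IPv4 address.
        mapped = getattr(addr, "ipv4_mapped", None)
        if mapped is not None:
            addr = mapped
        # ::1 sits inside a range that ipaddress reports as reserved, so
        # loopback is excluded from the reserved test explicitly.
        if (
            addr.is_link_local
            or addr.is_multicast
            or addr.is_unspecified
            or (addr.is_reserved and not addr.is_loopback)
        ):
            return "blocked address range"
        if block_private and (addr.is_loopback or addr.is_private):
            return "private address blocked"
    return None
```

With the defaults, a loopback URL such as http://127.0.0.1:11434/v1 passes, while http://169.254.169.254/ is rejected regardless of the lockdown flag.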

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 23:46:33 +09:00
red person
258e6fc0d4 fix(ui): allow manual prompt bar resize (#1201) 2026-06-02 23:43:53 +09:00
red person
42ae905df7 fix(models): clear deleted endpoint fallback refs (#1207) 2026-06-02 23:41:04 +09:00
red person
cc6e43da44 Report provider-specific search API keys correctly (#1202)
* fix(search): report provider-specific API keys

* fix(search): include provider env keys in status
2026-06-02 23:37:15 +09:00
lekt8
f2f437f4a8 feat: add /api/ready readiness probe (DB, data dir, local-first) (#1200)
/api/health is a liveness ping. This adds /api/ready as a readiness /
integrity self-check that returns 503 unless every critical subsystem is
whole, so an orchestrator (Docker/Compose/k8s) can gate traffic on real
readiness rather than mere process liveness:

  - database: opens a connection and runs SELECT 1
  - data_dir: confirms the data directory exists and is writable
  - local_first: reports whether storage stays on the host (informational;
    a remote database is a valid deployment, so it never fails readiness)

The check logic lives in src/readiness.py so it is unit-testable in
isolation; the route is a thin wrapper. Covered by tests/test_readiness.py.
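
The three checks could be sketched roughly as follows. Every name here (check_database, check_data_dir, readiness, sqlite_connector) and the response shape are assumptions for illustration; the real src/readiness.py may differ.

```python
import os
import sqlite3
import tempfile


def sqlite_connector(path):
    """Hypothetical connection factory; the real app may use another driver."""
    return lambda: sqlite3.connect(path)


def check_database(connect):
    """Open a connection through the supplied factory and run SELECT 1."""
    try:
        conn = connect()
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
        return {"ok": True}
    except Exception as exc:  # any failure means "not ready"
        return {"ok": False, "error": str(exc)}


def check_data_dir(path):
    """Confirm the directory exists and that a file can be created in it."""
    if not os.path.isdir(path):
        return {"ok": False, "error": "data directory missing"}
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return {"ok": True}
    except OSError as exc:
        return {"ok": False, "error": str(exc)}


def readiness(connect, data_dir, database_is_local):
    """Return (http_status, body); 503 unless every critical check passes."""
    checks = {
        "database": check_database(connect),
        "data_dir": check_data_dir(data_dir),
        # Informational only: a remote database never fails readiness.
        "local_first": {"ok": True, "local": bool(database_is_local)},
    }
    ready = all(checks[name]["ok"] for name in ("database", "data_dir"))
    return (200 if ready else 503), {"ready": ready, "checks": checks}
```

Keeping the checks as plain functions that take their dependencies as arguments is what lets a test exercise them without a running server.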

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 23:33:22 +09:00
red person
76a7685105 fix(models): clear stale speech endpoint settings (#1196) 2026-06-02 23:32:01 +09:00
red person
69ab350919 fix(ui): keep minimized windows above composer (#1197) 2026-06-02 23:31:09 +09:00
red person
0db441b191 fix(ui): contain email split divider (#1194) 2026-06-02 23:28:24 +09:00
Mayank Ukey
f96edfe5ca fix: deepseek-r1 on Ollama returns HTTP 400 when tool schemas are sent (#1169)
* fix: exclude deepseek from local tool-calling keyword list

deepseek-r1 on Ollama returns HTTP 400 when tool schemas are sent.
The cloud API (api.deepseek.com) is already caught by the _API_HOSTS
check, so the generic 'deepseek' keyword match was only causing false
positives for local Ollama-served models.

* fix: add model no-tools blocklist and regression tests for deepseek-r1

The previous fix removed 'deepseek' from the keyword allow-list, but
_is_api_model is still True for localhost endpoints because 'localhost'
appears in _API_HOSTS — so the keyword change had no effect for Ollama.

Proper fix: add an explicit _model_no_tools blocklist ('deepseek-r1')
that overrides the endpoint URL check. The endpoint's supports_tools DB
flag still takes priority either way (True forces tools on, False forces
them off), so users can override per-endpoint when needed.

Also refined the deepseek allow-list: 'deepseek-v' and 'deepseek-chat'
cover the cloud models (v2, v3, chat) that do support tools, without
matching deepseek-r1 variants.

13 regression tests cover:
- deepseek-r1 on localhost/docker: no tools (was HTTP 400)
- deepseek-v3/chat on api.deepseek.com: tools enabled (no regression)
- endpoint_supports=True/False overrides both lists
- qwen/llama on localhost: unaffected
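
The precedence described above (DB flag, then blocklist, then host check, then keyword list) might look like this. The set contents and the function name should_send_tools are assumptions chosen to mirror the commit text, not the project's actual lists.

```python
from urllib.parse import urlsplit

# Illustrative values only; the real lists live in the agent loop module.
API_HOSTS = {"api.deepseek.com", "api.openai.com", "localhost"}
MODEL_NO_TOOLS = ("deepseek-r1",)
TOOL_KEYWORDS = ("deepseek-v", "deepseek-chat", "qwen", "llama")


def should_send_tools(model, endpoint_url, endpoint_supports=None):
    """Decide whether native tool schemas are sent to this model."""
    # 1. An explicit per-endpoint flag wins in both directions.
    if endpoint_supports is not None:
        return bool(endpoint_supports)
    name = model.lower()
    # 2. The blocklist overrides the endpoint URL check.
    if any(blocked in name for blocked in MODEL_NO_TOOLS):
        return False
    # 3. Known API hosts are assumed to accept tool schemas.
    host = (urlsplit(endpoint_url).hostname or "").lower()
    if host in API_HOSTS:
        return True
    # 4. Otherwise fall back to the keyword allow-list.
    return any(keyword in name for keyword in TOOL_KEYWORDS)
```

Checking the blocklist before the host lookup is the point of the second fix: with 'localhost' in the host list, a keyword-only change never gets a chance to say no.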
2026-06-02 23:22:57 +09:00
Zarl-prog
b89141679f fix(cookbook): scroll serve panel into view when expanded (#1180) (#1191) 2026-06-02 23:21:35 +09:00
spooky
f667667da3 fix: distinguish external cookbook runtimes (#1188) 2026-06-02 23:20:00 +09:00
PrabinDevkota
6b7dd4ea28 fix(auth): case-insensitive owner migration on username rename (#1183)
Use func.lower() when updating SQL owner columns, match prefs keys
case-insensitively, and normalize session usernames before comparing
during rename. Prevents silently skipping legacy mixed-case owner data.

Fixes #1165
2026-06-02 23:18:15 +09:00
spooky
5b87e69221 feat: add vllm kv cache dtype option (#1185) 2026-06-02 23:17:16 +09:00
ghreprimand
7b43fa9372 Improve calendar event text contrast (#1184)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 23:14:52 +09:00
Ernest Hysa
c12ae79c42 fix(tools): strict path confinement with sensitive-subpath deny list (#1072)
Rework read_file / write_file confinement after review feedback:

- Remove $HOME from default allow roots. Only project data/ and system
  temp dirs are allowed out of the box.
- Add a sensitive-subpath deny list (.ssh, .gnupg, shell rc files,
  .env, .netrc, SSH key filenames). Checked BEFORE allowlist so it
  blocks even when a broader root is configured.
- Add "tool_path_extra_roots" setting for opt-in broader access.
- Sensitive subpaths remain blocked regardless of configured roots.

Tests: 24 cases covering /etc/shadow, ~/.ssh/authorized_keys,
symlink into .ssh, traversal, shell rc files, key filenames,
extra roots, and dispatch-level end-to-end.
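
A rough sketch of deny-before-allow confinement. The function name and the exact deny-list entries are assumptions; only the ordering (resolve, deny list, then allow roots) is taken from the description above.

```python
from pathlib import Path

# Hypothetical deny list; the real one covers more names.
SENSITIVE_DIRS = {".ssh", ".gnupg"}
SENSITIVE_FILES = {
    ".env", ".netrc", ".bashrc", ".zshrc", ".profile",
    "id_rsa", "id_ed25519",
}


def is_path_allowed(path, allowed_roots):
    """Return True only if the resolved path is inside an allowed root
    and does not touch a sensitive directory or filename."""
    # resolve() follows symlinks and collapses "..", so traversal and
    # symlink tricks are judged by where they actually land.
    resolved = Path(path).resolve()
    # Deny list runs first, so it still applies under a broad root.
    if SENSITIVE_DIRS.intersection(resolved.parts):
        return False
    if resolved.name in SENSITIVE_FILES:
        return False
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

Comparing against resolved.parents rather than a string prefix avoids treating /data-backup as being inside /data.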
2026-06-02 23:13:30 +09:00
Shaw
16f7feee0a fix(hwfit): honor manual "metal" backend in the hardware simulator (#1090)
The Cookbook's manual hardware simulator ("what if I had this setup") let users
pick a backend, but _apply_manual_hardware only accepted cuda/rocm/cpu_x86/
cpu_arm and silently coerced anything else to cuda. So selecting Apple/Metal
simulated a CUDA box instead — and ranked safetensors-only repos a Mac can't
serve, even though the rest of hwfit (services.hwfit.fit, the serve-command
generation) already supports Metal as GGUF-only via llama.cpp/Ollama.

Add "metal" to the accepted backends (now a named _MANUAL_BACKENDS set, kept a
subset of what fit.py understands) and set unified_memory=True for it — Apple
Silicon shares one memory pool with the GPU — while clearing that flag for the
discrete (cuda/rocm) and CPU backends. _apply_manual_hardware is lifted to
module scope so it is directly unit-testable; both route call sites are
unchanged.

Adds tests/test_hwfit_manual_backend.py, including an end-to-end check that a
simulated Metal box only recommends GGUF-servable models.
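
As an illustration of the backend handling, assuming a dict-shaped hardware record and keeping the old fallback to cuda for unknown values (both are assumptions, as is the function name):

```python
MANUAL_BACKENDS = {"cuda", "rocm", "cpu_x86", "cpu_arm", "metal"}


def apply_manual_backend(hardware, backend):
    """Return a copy of `hardware` with the simulated backend applied."""
    chosen = backend if backend in MANUAL_BACKENDS else "cuda"
    updated = dict(hardware)
    updated["backend"] = chosen
    # Only Apple Silicon shares one memory pool between CPU and GPU;
    # the flag is cleared for discrete GPUs and CPU-only backends.
    updated["unified_memory"] = chosen == "metal"
    return updated
```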

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:12:34 +09:00
red person
c7ddfd7dd2 Use shared IMAP timeout for account tests (#1088) 2026-06-02 23:11:04 +09:00
ghreprimand
21b6f9344e Normalize native select option theming (#1178)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 23:09:15 +09:00
RosenTomov
37356d8e3e Discover LM Studio via host/port scanning and native-API fingerprint (#1126)
Scan port 1234 and any custom port from LM_STUDIO_URL, add the LM_STUDIO_URL host to the discovery sweep alongside the Ollama env vars, and tag each discovered endpoint with its provider by fingerprinting the native /api/v1/models response (entries carrying key + architecture). Documents LM_STUDIO_URL in .env.example.
2026-06-02 23:04:58 +09:00
Jordan Urbs
c0c1ceb36d Treat Venice as a tool-capable SOTA cloud provider (#1173)
Follow-up to the Venice provider PR. Wire api.venice.ai into the three
host allowlists so Venice behaves like the other paid OpenAI-compatible
clouds:

- agent_loop: add api.venice.ai to _API_HOSTS so the agent sends native
  OpenAI tool-call schemas (Venice supports function calling) instead of
  degrading to fenced-block parsing.
- teacher_escalation: add api.venice.ai to _SOTA_HOSTS so the escalation
  loop stays OFF for Venice (it's a paid top-tier API; no need to add
  teacher-model latency).
- webhook_routes: add venice to KNOWN_PROVIDERS so the sync chat webhook
  can auto-resolve base_url from provider=venice.

Tests: tests/test_venice_hosts.py pins tool-host matching + SOTA
classification for Venice; py_compile on touched modules.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-02 23:03:46 +09:00
Mayank Ukey
3799dc102f fix: ICS export — escape X-WR-CALNAME and honour is_utc on DTSTART/DTEND (#1174)
Two bugs in the export_ics path:

1. X-WR-CALNAME was written raw: calendar names containing commas,
   semicolons or backslashes produced invalid ICS (RFC 5545 §3.3.11
   requires those characters to be escaped as \, \; and \\).
   Fix: wrap cal.name in the existing _ics_escape() helper, which is
   already used for SUMMARY, DESCRIPTION, and LOCATION on the lines
   immediately below.

2. DTSTART and DTEND on non-all-day events always emitted the naive
   ISO string (e.g. 20260602T100000) regardless of CalendarEvent.is_utc.
   Consumers treat a naive datetime as floating/local time, so UTC
   events imported into Google Calendar or Apple Calendar shifted by
   the user's timezone offset.  Fix: append 'Z' when is_utc is True,
   matching the pattern already used by the serialise_event() helper
   at line 408.
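
Both fixes can be sketched in a few lines. The helper names below are stand-ins (the real helper is _ics_escape), and the newline handling is an assumption based on RFC 5545 TEXT escaping.

```python
from datetime import datetime


def ics_escape(text):
    """Escape a TEXT value per RFC 5545 section 3.3.11."""
    return (
        text.replace("\\", "\\\\")  # backslash first, or later escapes get doubled
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\r\n", "\\n")
        .replace("\n", "\\n")
    )


def ics_datetime(value, is_utc):
    """Format a DTSTART/DTEND value; a trailing Z marks UTC, none means floating."""
    stamp = value.strftime("%Y%m%dT%H%M%S")
    return stamp + "Z" if is_utc else stamp


calname_line = "X-WR-CALNAME:" + ics_escape("Work, Home; Misc")
dtstart_line = "DTSTART:" + ics_datetime(datetime(2026, 6, 2, 10, 0, 0), True)
```

Here calname_line becomes X-WR-CALNAME:Work\, Home\; Misc and dtstart_line becomes DTSTART:20260602T100000Z.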
2026-06-02 23:02:28 +09:00
RosenTomov
a493fb49b0 Use LM Studio-reported vision capability for image passthrough (#1130)
Read a model's capabilities.vision flag from LM Studio's native /api/v1/models so vision finetunes whose names lack a vision keyword still receive images, falling back to the name heuristic when the endpoint doesn't report it. The probe is short-TTL cached and restricted to local/LAN hosts, so remote/cloud endpoints are never contacted.
2026-06-02 23:01:04 +09:00
spooky
18a445ba22 docs: add AMD Docker GPU preflight (#1168) 2026-06-02 22:54:08 +09:00
Shaw
4e769d537c fix(cookbook): detect llama-cpp-python via its real distribution name (#1020) (#1167)
The Cookbook → Dependencies tab reported llama-cpp-python[server] as "not
installed" even when it was installed and usable for serving. The local check
looked up distribution metadata as pkg["name"].replace("_", "-") — for the
import name `llama_cpp` that yields "llama-cpp", but the module ships in the
`llama-cpp-python` distribution. importlib.metadata.version("llama-cpp") then
raised PackageNotFoundError and the package was marked missing (the import
itself succeeds, which is why serving still worked).

Derive the distribution name from the package's declared pip spec instead
(stripping [extras] and version markers), falling back to the munged import
name only when no pip spec is declared. New _pip_dist_name() helper.

Adds tests/test_cookbook_package_detection.py covering the llama_cpp mapping,
extras/marker stripping, plain names, the no-pip-spec fallback, and that the
route wires the helper in (guarding against the exact regression).
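
One way the name derivation could work, as a sketch: the function below is a hypothetical stand-in for _pip_dist_name(), and the split pattern is an assumption about which characters end a distribution name in a pip spec.

```python
import re


def pip_dist_name(import_name, pip_spec=None):
    """Derive the distribution name used for importlib.metadata lookups."""
    if pip_spec:
        # Cut at the first extras bracket, version operator, marker, or space.
        name = re.split(r"[\[<>=!~;\s]", pip_spec.strip(), maxsplit=1)[0]
        if name:
            return name
    # No pip spec declared: fall back to the munged import name.
    return import_name.replace("_", "-")
```

For the case in this commit, pip_dist_name("llama_cpp", "llama-cpp-python[server]") yields "llama-cpp-python", which is the name importlib.metadata.version() needs.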
2026-06-02 22:52:37 +09:00
ghreprimand
06a3468967 Surface deep research probe errors (#1086)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 22:51:25 +09:00
Tatlatat
dc8a882f1f fix(rag): use a stable hash for document IDs so dedup survives restarts (#1098)
add_document() and add_documents_batch() derive the persistent ChromaDB
document id from Python's built-in hash():

    doc_id = f"doc_{hash(text) % 10**16}"

str hashing is randomized per process (hash randomization is on by default unless PYTHONHASHSEED pins the seed), so
the same document text gets a different doc_id on every restart. The dedup
check right after — self._collection.get(ids=[doc_id]) — therefore misses
on restart, and identical documents are re-embedded and re-added as
duplicates each time the app restarts, bloating the vector store and
skewing retrieval.

Derive the id from a stable hashlib.sha256 of the text via a shared
_generate_doc_id() helper, used by both add paths so they agree.

tests/test_rag_vector_id_stability.py runs _generate_doc_id in subprocesses
under PYTHONHASHSEED=0/1/random and asserts the id is identical across all
of them (and differs for different text). Fails before this change.
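
A minimal sketch of a process-independent id. The function name mirrors _generate_doc_id(), but truncating the digest to 16 hex characters is an assumption made here for readability.

```python
import hashlib


def generate_doc_id(text):
    """Derive a document id that is identical in every process and run."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"doc_{digest[:16]}"
```

Unlike the built-in hash(), hashlib.sha256 depends only on the input bytes, so the dedup lookup finds the same id after a restart.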
2026-06-02 22:42:23 +09:00
pewdiepie-archdaemon
ff93a6c63b Polish email and cookbook flows 2026-06-02 22:42:07 +09:00
Afonso Coutinho
15a2662119 fix: markdown tables drop empty cells and misalign columns (#1164)
* refactor: extract splitTableRow helper for markdown tables

* fix: keep empty interior cells in markdown tables to preserve columns

* test: splitTableRow keeps empty interior cells
2026-06-02 22:41:27 +09:00
Povilas Kirna
6063fc51e0 docs: add THREAT_MODEL.md (#1111) 2026-06-02 22:40:37 +09:00
Léo
aeabd0e7f2 Load .env in start-macos.sh for APP_PORT and APP_BIND (#1008)
* Load .env in start-macos.sh for APP_PORT and APP_BIND

Parses .env at startup (consistent with how app.py reads it via
python-dotenv) so APP_PORT and APP_BIND are honoured without having
to retype them on the command line every run.

Resolution order: shell env (ODYSSEUS_PORT / ODYSSEUS_HOST) → .env
(APP_PORT / APP_BIND) → built-in defaults. Existing ODYSSEUS_* shell
overrides are fully preserved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Document .env support for APP_PORT and APP_BIND in macOS section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-02 22:39:30 +09:00
red person
028a39b42c Fix local Cookbook dependency installs in venvs (#1082) 2026-06-02 22:39:02 +09:00
Kenny Van de Maele
68efa8ee53 Fix docked-modal close: chat stays offset / reopen overlaps / no animation (#1158)
Docking a modal to a window edge pushes the chat aside (body padding via
right-dock-active + --right-dock-w). Three problems on close/reopen:

1. Chat stayed offset after closing a docked modal. The close-watcher only
   reacted to the `.hidden` class or DOM removal, but the draggable modals
   (calendar, plan, workspace, document, …) close via inline `display:none`.
   Watch the `style` attribute too and treat `display:none` as closed.

2. Reopening a previously-docked singleton modal floated it off to the side,
   overlapping the chat. The reused element kept its docked inline geometry.
   Clear the content's inline position/size on close so it reopens at its CSS
   default (centered).

3. Undock wasn't animated. The transition lived on `.right/left-dock-active`,
   so removing the class dropped the transition with it and padding snapped to
   0. Move the transition to the base `body` so the push animates both ways.

Files: static/js/modalSnap.js, static/style.css.
Checks: node --check static/js/modalSnap.js; verified in-browser (dock → close
→ chat animates back; reopen → centered, no overlap).
2026-06-02 22:38:20 +09:00
Robin Fröhlich
096468a29f fix: persist and display multimodal messages (image/audio attachments) (#1159)
Multimodal content (list of {type, text/image_url} blocks) couldn't be
stored in the DB Text column, causing silent persist failures. On reload
the frontend fell back to String() on the array, rendering
[object Object],[object Object] in the chat.

- Serialize list content as JSON in _persist_message()
- Deserialize back to list in _db_to_session() via _parse_msg_content()
- Extract text parts from multimodal arrays in sessions.js instead of
  String() coercion
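
The round trip could be sketched like this. All three function names are illustrative, and the "starts with [" detection rule is an assumption, not necessarily what _parse_msg_content() does.

```python
import json


def serialize_content(content):
    """Store multimodal block lists as JSON text; plain strings pass through."""
    if isinstance(content, list):
        return json.dumps(content)
    return content


def parse_content(stored):
    """Turn stored text back into a block list when it looks like one."""
    if isinstance(stored, str) and stored.lstrip().startswith("["):
        try:
            parsed = json.loads(stored)
        except json.JSONDecodeError:
            return stored
        # Only accept a list of dict blocks; anything else stays plain text.
        if isinstance(parsed, list) and all(isinstance(p, dict) for p in parsed):
            return parsed
    return stored


def text_of(content):
    """Extract the readable text from plain or multimodal content."""
    if isinstance(content, list):
        return "".join(p.get("text", "") for p in content if p.get("type") == "text")
    return content
```

A caveat of this detection rule: a user message that is itself a valid JSON array of objects would be read back as blocks, so a real implementation may want an explicit marker column instead.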
2026-06-02 22:37:48 +09:00
red person
6bfe824eb4 Document self-host system requirements (#945) 2026-06-02 22:37:10 +09:00
Afonso Coutinho
5b12bf3f55 fix: ICS export doesn't escape commas/semicolons in event fields (#1161)
* fix: escape SUMMARY/LOCATION per RFC 5545 in ICS export

* fix: escape commas/semicolons in ICS DESCRIPTION, not just newlines

* test: ICS export escapes commas, semicolons, backslashes, newlines
2026-06-02 22:36:12 +09:00
Afonso Coutinho
2e2da2aefe fix: extract_statistics drops large numbers and trailing % signs (#1153)
* fix: extract_statistics misses comma-less numbers and drops trailing %

* fix: same extract_statistics number/percent bug in services copy

* test: extract_statistics captures full numbers and percent signs
2026-06-02 22:35:30 +09:00
Afonso Coutinho
2b2943a7b7 fix: extract_quotes accepts mismatched opening/closing quotes (#1113)
* fix: only extract quotes whose closing quote matches the opening one

* fix: same mismatched-quote bug in the services search copy

* test: extract_quotes requires matching open/close quotes
2026-06-02 22:34:52 +09:00
Hayk Arzumanyan
5236a62de1 fix: make landing page footer reachable past scroll-snap (#1118)
scroll-snap-type: y mandatory (docs/index.html:28) forces the viewport to
always rest on a snap point. The footer is far shorter than a viewport, so
scrolling down past the last min-height:100vh section snaps back to that
section's start and the footer can never settle in view. Switch the snap
type to 'proximity' so sections still snap when the user is near them but
the footer (and any sub-viewport tail) is freely reachable.

Fixes #8

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 22:33:17 +09:00
3ASiC
521848da75 fix(ui): don't submit chat message on Enter during IME composition (#1091)
CJK and other IME users confirm a candidate from the input-method popup by pressing Enter. The chat composer and the in-place message editor each bind a keydown handler that treats Enter (without Shift) as "submit", but they did not exclude the composition state. Pressing Enter to accept an IME candidate therefore sent the half-composed text (e.g. a stray "ce's") instead of just confirming the candidate.

These textareas intentionally hijack Enter to submit (Enter sends, Shift+Enter inserts a newline), which bypasses the browser's native form submission and the IME guard that comes with it, so the guard has to be re-added explicitly.

Add '&& !e.isComposing' to the three Enter-to-submit handlers: static/app.js (the main composer's button-submit path and its send/new-chat path) and static/js/chat.js (the editor for an already-sent message). Normal Enter (isComposing false) still submits; Shift+Enter still inserts a newline.

Tested: node --check on both files; manually verified with a Chinese IME that pressing Enter to pick a candidate no longer sends, and a message is sent only after composition ends.
2026-06-02 22:32:50 +09:00
ghreprimand
c075abce5d Search: consolidate core and provider implementations
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 21:02:26 +09:00
Leo
de92bbe47a Cookbook fit: steer consumer AMD to GGUF recommendations
* Cookbook fit: consumer-AMD GGUF recommendations + accurate estimates (core logic)

Split of #746 — the estimate/ranking MATH only, so it can be reviewed with tests
first (UI changes follow separately). Backend files only: no static/js here.

services/hwfit/fit.py, services/hwfit/hardware.py:
- Recommend GGUF/llama.cpp on consumer AMD (RDNA, gfx10/11/12) instead of
  formats that don't run on consumer Radeon — vLLM-only AWQ/GPTQ/FP8 AND
  vendor-specific NVFP4 (NVIDIA) / MLX (Apple). Datacenter Instinct (CDNA) and
  CUDA are left untouched.
- More accurate speed estimates across more GPUs (adds RDNA bandwidth data).
- Detect AMD/RDNA GPUs (gpu_family from rocminfo) so fit/serve can branch on it.

tests/test_hwfit_amd.py: AMD recommendation path, quant/bit matching, estimate
realism, gfx RDNA-vs-CDNA classification.

Rebased onto current main (analyze_model gained a scoring_use_case param there;
kept it). Vision detection intentionally NOT added here — main already ships a
"Vision" type filter + multimodal use-case handling; duplicating it was dropped.

Checks: py_compile clean; pytest tests/test_hwfit_amd.py + hwfit/serve suites
= 28 passed; full suite 0 new failures vs main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Tests: assert NVFP4/MLX/FP8 formats are filtered on consumer RDNA

Backs the #972 claim with an explicit regression: no NVIDIA NVFP4, Apple MLX,
or vLLM-only FP8/AWQ/GPTQ repos are recommended on a consumer Radeon, and guards
against vacuity by asserting such repos exist in the catalog.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 21:01:42 +09:00
red person
fd89d098a1 Chat: use cached endpoint model ids before probing 2026-06-02 21:00:58 +09:00
red person
5029c8570e Chat: prefer active model for new desktop chats 2026-06-02 21:00:50 +09:00
ooovenenoso
bd2fa82c1e Cookbook: prefer ROCm for native llama.cpp bootstrap
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-02 20:59:44 +09:00
Robin Fröhlich
3c6ae3713e Models: add Z.AI coding endpoint and GLM vision detection 2026-06-02 20:59:17 +09:00
SurprisedDuck
934bca9e48 Providers: omit temperature for OpenAI reasoning models
* fix: omit temperature for OpenAI reasoning models (o1/o3/o4/gpt-5)

These models only accept the default temperature; sending any explicit
value (even 0.0) returns HTTP 400 "Only the default (1) value is
supported". This broke two paths:

- Endpoint probing in _probe_single_model hardcodes temperature: 0.0, so
  a perfectly valid o3/gpt-5 endpoint is reported as failing in the
  Model Endpoints health check.
- Chat/stream payloads send temperature unconditionally, so a non-default
  temperature preset 400s on these models.

The code already special-cases the same model family for
max_completion_tokens, so this adds a sibling _restricts_temperature()
helper and omits the field for those models, letting the API use its
required default. gpt-4.5 is intentionally excluded (not a reasoning
model; accepts temperature normally).

Adds tests/test_llm_core_temperature.py covering the predicate and the
synchronous payload builder.

* fix: also omit temperature for reasoning models on the direct-POST paths

The first commit only covered llm_call/llm_call_async/stream_llm and the
endpoint probe. Email auto-summary, urgency-less spam classification, the
email reply-summary endpoint, and gallery vision tagging build their
OpenAI payloads inline and POST them directly (requests/httpx), bypassing
llm_core — so a reasoning model configured there would still 400 on the
temperature field. These sites already branch on _uses_max_completion_tokens,
so they're the same class; added the matching _restricts_temperature guard.

gallery_routes also gains the max_completion_tokens branch it was missing,
so gpt-5 vision tagging works end to end.

Note: email_pollers urgency scoring goes through llm_call_async and was
already covered.
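
A sketch of the predicate and how a payload builder might use it. The matching rule (exact name or name followed by a hyphen) and both function names are assumptions; the real _restricts_temperature() may match differently.

```python
_REASONING_PREFIXES = ("o1", "o3", "o4", "gpt-5")


def restricts_temperature(model):
    """True for model names assumed to accept only the default temperature."""
    name = model.lower().rsplit("/", 1)[-1]  # drop any provider prefix
    return any(
        name == prefix or name.startswith(prefix + "-")
        for prefix in _REASONING_PREFIXES
    )


def build_payload(model, messages, temperature=None):
    """Build a chat payload, omitting temperature where it would cause a 400."""
    payload = {"model": model, "messages": messages}
    if temperature is not None and not restricts_temperature(model):
        payload["temperature"] = temperature
    return payload
```

Omitting the key entirely, rather than sending 1, lets the API apply whatever default it requires.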
2026-06-02 20:58:33 +09:00
Nikita Rozanov
119075f368 Research: add configurable run timeout
Surfaces the research_run_timeout_seconds setting (added in #783) in
Settings → Research as a "Max Time" field, and lets 0 disable the
wall-clock cap entirely for long deep-research runs.

- settings.py: document that 0 disables the cap; default stays 1800s.
- research_handler.py: resolve 0 (or negative) to no timeout
  (asyncio.wait_for timeout=None); other values stay bounded to
  [60, 86400] as before.
- index.html / settings.js: "Max Time" input bound to
  research_run_timeout_seconds, validated to {0} ∪ [60, 86400], with
  copy making explicit that 0 = no limit (unbounded model/API cost).
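
The timeout resolution described in the list above, as a small sketch. The function names are hypothetical; the bounds and the meaning of 0 come from the commit text.

```python
import asyncio

MIN_TIMEOUT_SECONDS = 60
MAX_TIMEOUT_SECONDS = 86400


def resolve_timeout(seconds):
    """Map the setting to an asyncio timeout: 0 or negative means no cap."""
    if seconds <= 0:
        return None
    return max(MIN_TIMEOUT_SECONDS, min(MAX_TIMEOUT_SECONDS, seconds))


async def run_with_cap(coro, seconds):
    """Await `coro` under the resolved wall-clock cap (None disables it)."""
    return await asyncio.wait_for(coro, timeout=resolve_timeout(seconds))
```

asyncio.wait_for treats timeout=None as "wait indefinitely", which is why no separate code path is needed for the unlimited case.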

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 20:57:57 +09:00
Tushar-Projects
c3228f8b59 Background tasks: respect active session model fallback 2026-06-02 20:57:42 +09:00
Deniz
537b4bcff7 macOS app: force native arm64 uvicorn on Apple Silicon 2026-06-02 20:56:53 +09:00
Georgiy
34c81e5b16 Auth: use require_user for remaining guarded routes 2026-06-02 20:55:50 +09:00