odysseus

Author	SHA1	Message	Date
Shreyas S Joshi	7504fedb17	fix: surface reasoning_content when content is empty (thinking models) (#1233 ) Thinking models served via llama.cpp without --reasoning-format none (e.g. Qwen3, DeepSeek-R1) route all tokens into reasoning_content and return content="". Two call paths were silently broken: - llm_call / llm_call_async (non-streaming): hard-keyed data["choices"][0]["message"]["content"] raises KeyError or returns empty string, discarding the entire response. - stream_agent_loop end-of-round fallback: when full_response is empty but round_reasoning has content, the existing code replaced the response with the generic empty-response error message, discarding all reasoning tokens that were correctly accumulated during streaming. Fix: in both non-streaming paths use msg.get("content") or msg.get("reasoning_content") or "". In the streaming fallback, surface round_reasoning as the answer before falling through to the error path.	2026-06-03 01:41:24 +09:00
Afonso Coutinho	257f7ee7b2	fix: manage_tasks create handles an explicit null prompt without crashing (#1290 )	2026-06-03 01:40:21 +09:00
Afonso Coutinho	8852c7ea4a	fix: claim_ownerless actually claims ownerless documents (was a no-op self-update) (#1288 )	2026-06-03 01:38:38 +09:00
nickorlabs	c39d8db12a	fix(agent): make context-budget hard_max configurable via agent_input_token_hard_max setting (#1273 ) Completes the reviewer requirement from PR #1190 review that was carried over but not implemented in #1230: "The hard max is a function-local constant. For this setting, the ceiling should be configurable or at least represented as a named setting/default with tests." (review on #1190) #1230 shipped the adaptive auto-derivation but left `DEFAULT_HARD_MAX = 200_000` as a hardcoded module constant in src/context_budget.py. Admins on premium APIs with large context windows (kimi-k2 / minimax-m3 at 1M, etc.) can use their full window today only by setting `agent_input_token_budget` explicitly, which then takes them off the adaptive auto-path entirely. ## What this PR changes - src/settings.py: register `agent_input_token_hard_max` in DEFAULT_SETTINGS, default 200_000 (matches `DEFAULT_HARD_MAX`). Inline comment documents the no-op semantics in the explicit branch. - src/agent_loop.py: read the setting at the call site and pass it as the `hard_max` kwarg of `compute_input_token_budget`. Defensive parsing: missing / non-int / zero values fall back to `DEFAULT_HARD_MAX`, so a misconfig cannot silently zero the budget. - src/tool_implementations.py: three friendly aliases for `manage_settings`: - "hard max" -> agent_input_token_hard_max - "token budget cap" -> agent_input_token_hard_max - "input budget cap" -> agent_input_token_hard_max Plus the existing "token budget" -> agent_input_token_budget keeps a matching shorter alias "input budget". - tests/test_context_budget.py: 6 new tests on top of the existing 6: - hard_max raises the auto ceiling (1M ctx + raised cap -> 85% of ctx) - hard_max lowers the auto ceiling (128K ctx + 50K cap -> 50K) - hard_max has no effect on the explicit branch - DEFAULT_SETTINGS contains the new key - manage_settings aliases are registered - the live get_setting path returns the override value, and malformed values fall back per the agent_loop defensive parsing. 12 passed in 0.04s. No changes to the pure helper signature or semantics; #1230's behavior is the default when the new setting is unset. ## How it lets users drop the explicit override Before this PR, on a 1M-context model: agent_input_token_budget = 900_000 (explicit) -> 900K [user override] agent_input_token_budget = <unset> (auto) -> 200K [HARD_MAX] After this PR, same model: agent_input_token_budget = <unset> agent_input_token_hard_max = 900_000 -> min(1M * 0.85, 900K) = 850K [auto, no override needed] The explicit-override path keeps working unchanged for users who prefer it.	2026-06-03 01:36:57 +09:00
Afonso Coutinho	3505a5ff27	fix: list_emails honors unresponded_only without requiring unread_only (#1287 )	2026-06-03 01:35:00 +09:00
Afonso Coutinho	926a4c59cb	fix: 2FA bypassed when enabled but TOTP secret is missing (fail-open) (#1286 ) * fix: fail closed when 2FA is enabled but the TOTP secret is missing * test: totp_verify fails closed when secret missing, passes when 2FA off	2026-06-03 01:26:47 +09:00
Afonso Coutinho	65751186bd	fix: merging consecutive user messages corrupts multimodal (image) content (#1277 ) * fix: preserve multimodal content blocks when merging consecutive user messages * test: consecutive user-message merge keeps multimodal image blocks	2026-06-03 01:21:57 +09:00
Afonso Coutinho	83aa35b83e	fix: owner-less document query passes bare False to SQLAlchemy filter() (#1281 ) * fix: use SQL false() for owner-less document query (filter(False) raises in SQLAlchemy 2.x) * test: owner-less document query doesn't pass a bare False to filter	2026-06-03 01:20:43 +09:00
Afonso Coutinho	a3b3dbafde	fix: uploaded files with no extension become permanently unresolvable (#1275 ) * fix: accept extensionless upload ids so files like Dockerfile resolve * test: upload id validation accepts extensionless ids	2026-06-03 01:16:30 +09:00
Afonso Coutinho	f62d6ea3d7	fix: research query misclassifies 'whatsapp'/'however' as questions (#1247 ) * fix: detect question words as whole words, not prefixes * fix: same question-word prefix bug in the services search copy * test: question-word detection rejects prefix lookalikes	2026-06-03 01:10:06 +09:00
Afonso Coutinho	311f226d44	fix: calendar check-in digest drops events 7-8 days out (#1249 ) * fix: close 1-day gap in calendar digest windows (events ~7-8 days out) * test: calendar digest windows are contiguous and cover 7-8 day events	2026-06-03 01:03:58 +09:00
Paulo Victor Cordeiro	44e0259163	fix: fire-reminder endpoint crashes with NameError on _gcu (#1250 ) dispatch_reminder call on line 699 references _gcu(request) which is never defined. The local helper wrapping get_current_user is _owner. Every POST to /api/notes/fire-reminder raises NameError and returns 500.	2026-06-03 01:02:25 +09:00
red person	aa420e2060	Ignore stale duplicate upload rows (#1256 )	2026-06-03 00:59:01 +09:00
Afonso Coutinho	a04553013d	fix: Anthropic responses with multiple text blocks lose all but the first (#1255 ) * fix: concatenate all Anthropic text blocks, not just the first * test: Anthropic response parsing concatenates text blocks	2026-06-03 00:57:20 +09:00
red person	a901992d03	Ignore non-object vault config (#1258 )	2026-06-03 00:55:04 +09:00
Shreyas S Joshi	b29c200801	fix(mcp): invalidate tool prompt cache on connect/disconnect/error (#1235 ) * fix(mcp): invalidate tool prompt cache on connect/disconnect/error get_tool_descriptions_for_prompt cached its result keyed only on (disabled_map, len(_tools)). If a server reconnects with the same tool count (or transitions to error state), the cache was never busted — the agent received stale tool descriptions for the new connection state. Add a _generation counter incremented on every structural change (successful connect, disconnect, connection error) and include it in the cache key. * test(mcp): regression test for _generation cache invalidation	2026-06-03 00:49:29 +09:00
ghreprimand	77320b617f	Fix owner-scoped skill updates (#1240 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-03 00:42:56 +09:00
Afonso Coutinho	35fa022e2e	fix: email pre-retrieval ignores contacts (reads non-existent email/phone keys) (#1241 ) * fix: match known email senders against the contact 'emails' list * fix: build contact-match snippets from emails/phones lists	2026-06-03 00:39:31 +09:00
Afonso Coutinho	3137ee4946	fix: theme color parsing breaks on #rgb shorthand hex (#1213 ) * refactor: add pure hexToRgb helper that handles #rgb shorthand * fix: handle #rgb shorthand hex in theme color parsing * test: hexToRgb expands shorthand and rejects invalid input	2026-06-03 00:30:03 +09:00
Afonso Coutinho	203c4d83df	fix: search analytics crashes recording when the JSON file predates a counter (#1224 ) * refactor: single _default_analytics() instead of duplicated default dicts * fix: merge analytics defaults so an old/partial file doesn't KeyError on record * test: analytics load merges defaults; record survives a partial file	2026-06-03 00:26:37 +09:00
lekt8	975fd42e32	fix: rank recency by UTC, not local time (#1116 ) (#1234 ) src/search/ranking.py computed result age as `(datetime.now() - dt).days`, where `dt` is parsed from a UTC-style published date with no timezone. Using local `datetime.now()` skewed the age by the host's UTC offset (off-by-up-to-a-day near boundaries), and was a latent crash: once neighbouring code becomes timezone-aware the naive/aware subtraction raises TypeError (the landmine called out in #1116). Recency is now measured against naive UTC. The scoring is also lifted out of the rank_search_results closure into a module-level, time-injectable `recency_score` so it's unit-testable, and `_utcnow_naive()` avoids `datetime.utcnow()` (deprecated since Python 3.12). Covered by tests/test_search_ranking_recency.py (5 cases); the existing tests/test_search_ranking.py still passes. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 00:18:15 +09:00
lekt8	8c376d2b0e	feat: adapt agent_input_token_budget to the model context window (#1170 ) (#1230 ) The agent soft-trims input context to `agent_input_token_budget` (default 6000). The old computation `min(context_length or budget, budget)` made the 6000 default a hard ceiling for every model, so 128K/1M context models were silently capped at 6000 input tokens — now that num_ctx is sent correctly (#1056), this was the last barrier to actually using a long context window. This derives the default budget from the model's discovered context window (~85%, capped at a generous hard max) while honouring an explicit user setting exactly (clamped to the window). When the window is unknown it falls back to the previous value, so behaviour is unchanged for that case. - src/context_budget.py: pure `compute_input_token_budget()` (unit-testable) - src/settings.py: `is_setting_overridden()` to tell an explicit user value from the merged default (load_settings merges DEFAULT_SETTINGS, so equality alone can't distinguish them) - src/agent_loop.py: use the helper in the soft-trim path Covered by tests/test_context_budget.py (6 cases). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 00:13:53 +09:00
ghreprimand	1fda906407	Fix Cookbook container-local model endpoints (#1223 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-03 00:09:48 +09:00
spooky	37f5635f8f	feat: show serve runtime readiness (#1209 )	2026-06-03 00:01:00 +09:00
ghreprimand	e72b9a8a95	Fix stale deleted sessions in sidebar (#1203 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 23:52:22 +09:00
lekt8	87babb58d5	fix: SSRF hardening for the custom embedding endpoint URL (#132 ) (#1206 ) POST /api/embeddings/endpoint takes a user-supplied URL and immediately makes an outbound httpx request to it with no validation. The admin gate added earlier (PR #80) closed the unauthenticated-access part of #132; this addresses the remaining request: validate the URL before fetching it. Odysseus is local-first, so pointing the embedding endpoint at a loopback or LAN server (local vLLM / llama.cpp / Ollama) is a normal setup — a blanket private-IP block would break the primary use case. So the guard: - always rejects non-HTTP(S) schemes (file://, gopher://, ftp:// …), - always rejects the link-local range (169.254.0.0/16, incl. the cloud instance-metadata 169.254.169.254 exfil vector) plus multicast / reserved / unspecified, and IPv4-mapped-IPv6 forms of the above, - keeps loopback/LAN allowed by default, and - adds EMBEDDING_BLOCK_PRIVATE_IPS=true for full SSRF lockdown on exposed multi-tenant deployments. Logic lives in src/url_safety.py (stdlib only, resolver injectable) so it is unit-testable without real DNS; the route calls it before the health-check request. Covered by tests/test_url_safety.py (8 cases). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 23:46:33 +09:00
red person	258e6fc0d4	fix(ui): allow manual prompt bar resize (#1201 )	2026-06-02 23:43:53 +09:00
red person	42ae905df7	fix(models): clear deleted endpoint fallback refs (#1207 )	2026-06-02 23:41:04 +09:00
red person	cc6e43da44	Report provider-specific search API keys correctly (#1202 ) * fix(search): report provider-specific API keys * fix(search): include provider env keys in status	2026-06-02 23:37:15 +09:00
lekt8	f2f437f4a8	feat: add /api/ready readiness probe (DB, data dir, local-first) (#1200 ) /api/health is a liveness ping. This adds /api/ready as a readiness / integrity self-check that returns 503 unless every critical subsystem is whole, so an orchestrator (Docker/Compose/k8s) can gate traffic on real readiness rather than mere process liveness: - database: opens a connection and runs SELECT 1 - data_dir: confirms the data directory exists and is writable - local_first: reports whether storage stays on the host (informational; a remote database is a valid deployment, so it never fails readiness) The check logic lives in src/readiness.py so it is unit-testable in isolation; the route is a thin wrapper. Covered by tests/test_readiness.py. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 23:33:22 +09:00
red person	76a7685105	fix(models): clear stale speech endpoint settings (#1196 )	2026-06-02 23:32:01 +09:00
red person	69ab350919	fix(ui): keep minimized windows above composer (#1197 )	2026-06-02 23:31:09 +09:00
red person	0db441b191	fix(ui): contain email split divider (#1194 )	2026-06-02 23:28:24 +09:00
Mayank Ukey	f96edfe5ca	fix: deepseek-r1 on Ollama returns HTTP 400 when tool schemas are sent (#1169 ) * fix: exclude deepseek from local tool-calling keyword list deepseek-r1 on Ollama returns HTTP 400 when tool schemas are sent. The cloud API (api.deepseek.com) is already caught by the _API_HOSTS check, so the generic 'deepseek' keyword match was only causing false positives for local Ollama-served models. * fix: add model no-tools blocklist and regression tests for deepseek-r1 The previous fix removed 'deepseek' from the keyword allow-list, but _is_api_model is still True for localhost endpoints because 'localhost' appears in _API_HOSTS — so the keyword change had no effect for Ollama. Proper fix: add an explicit _model_no_tools blocklist ('deepseek-r1') that overrides the endpoint URL check. The endpoint's supports_tools DB flag still takes priority either way (True forces tools on, False forces them off), so users can override per-endpoint when needed. Also refined the deepseek allow-list: 'deepseek-v' and 'deepseek-chat' cover the cloud models (v2, v3, chat) that do support tools, without matching deepseek-r1 variants. 13 regression tests cover: - deepseek-r1 on localhost/docker: no tools (was HTTP 400) - deepseek-v3/chat on api.deepseek.com: tools enabled (no regression) - endpoint_supports=True/False overrides both lists - qwen/llama on localhost: unaffected	2026-06-02 23:22:57 +09:00
Zarl-prog	b89141679f	fix(cookbook): scroll serve panel into view when expanded (#1180 ) (#1191 )	2026-06-02 23:21:35 +09:00
spooky	f667667da3	fix: distinguish external cookbook runtimes (#1188 )	2026-06-02 23:20:00 +09:00
PrabinDevkota	6b7dd4ea28	fix(auth): case-insensitive owner migration on username rename (#1183 ) Use func.lower() when updating SQL owner columns, match prefs keys case-insensitively, and normalize session usernames before comparing during rename. Prevents silently skipping legacy mixed-case owner data. Fixes #1165	2026-06-02 23:18:15 +09:00
spooky	5b87e69221	feat: add vllm kv cache dtype option (#1185 )	2026-06-02 23:17:16 +09:00
ghreprimand	7b43fa9372	Improve calendar event text contrast (#1184 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 23:14:52 +09:00
Ernest Hysa	c12ae79c42	fix(tools): strict path confinement with sensitive-subpath deny list (#1072 ) Rework read_file / write_file confinement after review feedback: - Remove $HOME from default allow roots. Only project data/ and system temp dirs are allowed out of the box. - Add a sensitive-subpath deny list (.ssh, .gnupg, shell rc files, .env, .netrc, SSH key filenames). Checked BEFORE allowlist so it blocks even when a broader root is configured. - Add "tool_path_extra_roots" setting for opt-in broader access. - Sensitive subpaths remain blocked regardless of configured roots. Tests: 24 cases covering /etc/shadow, ~/.ssh/authorized_keys, symlink into .ssh, traversal, shell rc files, key filenames, extra roots, and dispatch-level end-to-end.	2026-06-02 23:13:30 +09:00
Shaw	16f7feee0a	fix(hwfit): honor manual "metal" backend in the hardware simulator (#1090 ) The Cookbook's manual hardware simulator ("what if I had this setup") let users pick a backend, but _apply_manual_hardware only accepted cuda/rocm/cpu_x86/ cpu_arm and silently coerced anything else to cuda. So selecting Apple/Metal simulated a CUDA box instead — and ranked safetensors-only repos a Mac can't serve, even though the rest of hwfit (services.hwfit.fit, the serve-command generation) already supports Metal as GGUF-only via llama.cpp/Ollama. Add "metal" to the accepted backends (now a named _MANUAL_BACKENDS set, kept a subset of what fit.py understands) and set unified_memory=True for it — Apple Silicon shares one memory pool with the GPU — while clearing that flag for the discrete (cuda/rocm) and CPU backends. _apply_manual_hardware is lifted to module scope so it is directly unit-testable; both route call sites are unchanged. Adds tests/test_hwfit_manual_backend.py, including an end-to-end check that a simulated Metal box only recommends GGUF-servable models. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 23:12:34 +09:00
red person	c7ddfd7dd2	Use shared IMAP timeout for account tests (#1088 )	2026-06-02 23:11:04 +09:00
ghreprimand	21b6f9344e	Normalize native select option theming (#1178 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 23:09:15 +09:00
RosenTomov	37356d8e3e	Discover LM Studio via host/port scanning and native-API fingerprint (#1126 ) Scan port 1234 and any custom port from LM_STUDIO_URL, add the LM_STUDIO_URL host to the discovery sweep alongside the Ollama env vars, and tag each discovered endpoint with its provider by fingerprinting the native /api/v1/models response (entries carrying key + architecture). Documents LM_STUDIO_URL in .env.example.	2026-06-02 23:04:58 +09:00
Jordan Urbs	c0c1ceb36d	Treat Venice as a tool-capable SOTA cloud provider (#1173 ) Follow-up to the Venice provider PR. Wire api.venice.ai into the three host allowlists so Venice behaves like the other paid OpenAI-compatible clouds: - agent_loop: add api.venice.ai to _API_HOSTS so the agent sends native OpenAI tool-call schemas (Venice supports function calling) instead of degrading to fenced-block parsing. - teacher_escalation: add api.venice.ai to _SOTA_HOSTS so the escalation loop stays OFF for Venice (it's a paid top-tier API; no need to add teacher-model latency). - webhook_routes: add venice to KNOWN_PROVIDERS so the sync chat webhook can auto-resolve base_url from provider=venice. Tests: tests/test_venice_hosts.py pins tool-host matching + SOTA classification for Venice; py_compile on touched modules. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-02 23:03:46 +09:00
Mayank Ukey	3799dc102f	fix: ICS export — escape X-WR-CALNAME and honour is_utc on DTSTART/DTEND (#1174 ) Two bugs in the export_ics path: 1. X-WR-CALNAME was written raw: calendar names containing commas, semicolons or backslashes produced invalid ICS (RFC 5545 §3.3.11 requires those characters to be escaped as \, \; and \\). Fix: wrap cal.name in the existing _ics_escape() helper, which is already used for SUMMARY, DESCRIPTION, and LOCATION on the lines immediately below. 2. DTSTART and DTEND on non-all-day events always emitted the naive ISO string (e.g. 20260602T100000) regardless of CalendarEvent.is_utc. Consumers treat a naive datetime as floating/local time, so UTC events imported into Google Calendar or Apple Calendar shifted by the user's timezone offset. Fix: append 'Z' when is_utc is True, matching the pattern already used by the serialise_event() helper at line 408.	2026-06-02 23:02:28 +09:00
RosenTomov	a493fb49b0	Use LM Studio-reported vision capability for image passthrough (#1130 ) Read a model's capabilities.vision flag from LM Studio's native /api/v1/models so vision finetunes whose names lack a vision keyword still receive images, falling back to the name heuristic when the endpoint doesn't report it. The probe is short-TTL cached and restricted to local/LAN hosts, so remote/cloud endpoints are never contacted.	2026-06-02 23:01:04 +09:00
spooky	18a445ba22	docs: add AMD Docker GPU preflight (#1168 )	2026-06-02 22:54:08 +09:00
Shaw	4e769d537c	fix(cookbook): detect llama-cpp-python via its real distribution name (#1020 ) (#1167 ) The Cookbook → Dependencies tab reported llama-cpp-python[server] as "not installed" even when it was installed and usable for serving. The local check looked up distribution metadata as pkg["name"].replace("_", "-") — for the import name `llama_cpp` that yields "llama-cpp", but the module ships in the `llama-cpp-python` distribution. importlib.metadata.version("llama-cpp") then raised PackageNotFoundError and the package was marked missing (the import itself succeeds, which is why serving still worked). Derive the distribution name from the package's declared pip spec instead (stripping [extras] and version markers), falling back to the munged import name only when no pip spec is declared. New _pip_dist_name() helper. Adds tests/test_cookbook_package_detection.py covering the llama_cpp mapping, extras/marker stripping, plain names, the no-pip-spec fallback, and that the route wires the helper in (guarding against the exact regression).	2026-06-02 22:52:37 +09:00
ghreprimand	06a3468967	Surface deep research probe errors (#1086 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 22:51:25 +09:00
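
Note on 8c376d2b0e and c39d8db12a: the budget rule those two messages describe can be sketched roughly as below. This is an illustration reconstructed from the commit text only, not the code in src/context_budget.py. The function signature, the `fallback` parameter, the `parse_hard_max` helper name, and the treatment of negative values are assumptions; `DEFAULT_HARD_MAX`, the 85% factor, and the 6000 default come from the messages.

```python
from typing import Optional

DEFAULT_HARD_MAX = 200_000      # value named in the commit message
LEGACY_DEFAULT_BUDGET = 6000    # previous default, used when the window is unknown


def compute_input_token_budget(
    context_length: Optional[int],
    explicit_budget: Optional[int] = None,
    hard_max: int = DEFAULT_HARD_MAX,
    fallback: int = LEGACY_DEFAULT_BUDGET,
) -> int:
    """Sketch of the described rule (signature is hypothetical)."""
    if explicit_budget is not None:
        # Explicit branch: honour the user value, clamped to the window.
        # hard_max is deliberately ignored here.
        if context_length:
            return min(explicit_budget, context_length)
        return explicit_budget
    if not context_length:
        # Window unknown: behave as before the change.
        return fallback
    # Auto branch: about 85% of the window, capped at hard_max.
    # Integer arithmetic avoids float rounding surprises.
    return min(context_length * 85 // 100, hard_max)


def parse_hard_max(raw: object) -> int:
    """Hypothetical defensive parse: missing, non-int, zero (and, as an
    assumption, negative) values fall back to DEFAULT_HARD_MAX."""
    if isinstance(raw, bool) or not isinstance(raw, int) or raw <= 0:
        return DEFAULT_HARD_MAX
    return raw
```

With these assumptions, a 1M-token window and `hard_max=900_000` gives 850,000 on the auto path, matching the worked example in c39d8db12a, while an explicit 900,000 budget is returned unchanged whatever `hard_max` is.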

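Note on 87babb58d5: a minimal sketch of the kind of URL guard that message describes, using only the standard library and an injectable resolver. The function name `is_safe_endpoint_url`, the `block_private` parameter (standing in for EMBEDDING_BLOCK_PRIVATE_IPS=true), and the choice to reject unresolvable hosts are assumptions for illustration; the real logic lives in src/url_safety.py and may differ.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit


def _system_resolver(host: str) -> List[str]:
    # Default resolver; tests can inject a fake to avoid real DNS.
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_endpoint_url(
    url: str,
    block_private: bool = False,
    resolver: Callable[[str], List[str]] = _system_resolver,
) -> bool:
    """Return True if the URL passes the sketched SSRF checks."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ("http", "https"):
        return False  # file://, gopher://, ftp:// and so on
    host = parts.hostname
    if not host:
        return False
    try:
        # Literal IP in the URL: no lookup needed.
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            addresses = [ipaddress.ip_address(a) for a in resolver(host)]
        except (OSError, ValueError):
            return False  # assumption: unresolvable hosts are rejected
    if not addresses:
        return False
    for addr in addresses:
        # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) before classifying.
        mapped = getattr(addr, "ipv4_mapped", None)
        if mapped is not None:
            addr = mapped
        if (addr.is_link_local or addr.is_multicast
                or addr.is_reserved or addr.is_unspecified):
            return False
        if block_private and (addr.is_private or addr.is_loopback):
            return False
    return True
```

Loopback and LAN targets pass by default, which fits the local-first setup the message describes; passing `block_private=True` corresponds to the full lockdown mode.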
437 Commits