odysseus

Author	SHA1	Message	Date
red person	d0c925f6c8	Chat attachments: allow picker to choose any file type	2026-06-02 20:55:30 +09:00
ghreprimand	aa0a9e8b5a	Search: align service content extraction Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 20:53:07 +09:00
ghreprimand	eddb9ce6db	Search: align service provider guards Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 20:52:13 +09:00
Leo	6c15dc7d33	Chat metrics: surface backend generation speed * Chat metrics: show backend's true generation t/s, not tokens÷wall-clock The per-message tokens/sec read low and felt wrong because it was computed as output_tokens / total_duration, where total_duration is wall-clock including prefill, tool calls, and network — not pure decode time. llama.cpp already reports the correct gen speed in its stream (timings.predicted_per_second), but it was being dropped. - llm_core.py: when parsing the OpenAI-compatible usage chunk, also read the sibling `timings` block llama.cpp includes — pass predicted_per_second through as gen_tps and prompt_per_second as prefill_tps on the usage event. - agent_loop.py: capture backend_gen_tps/backend_prefill_tps from usage events; in _compute_final_metrics prefer backend_gen_tps over the wall-clock division when present (fall back to computed for cloud APIs that omit timings). Tag the result with tps_source ("backend" vs "computed") and surface prefill_tps. Result: the displayed t/s now matches the model's real decode speed and is stable regardless of prompt length (a long prefill no longer deflates it). Checks: py_compile passes; verified extraction against a real llama.cpp final chunk (gen 79 t/s surfaced vs the deflated wall-clock figure shown before). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Chat metrics: surface true t/s on the direct-chat path too Follow-up to the gen-tps work: the non-agent direct-chat stream path in chat_routes turned the raw `usage` event straight into a metrics event but only copied token counts — it never set tokens_per_second or response_time. So simple (non-tool) replies showed "Speed: n/a" / "Time: undefineds" and the chip fell back to a bare token count ("27 tok") instead of t/s. 
Map the usage event's gen_tps (llama.cpp timings.predicted_per_second, added in the prior commit) into tokens_per_second here too, tag tps_source=backend, and set response_time from wall-clock for the stats popup. Checks: py_compile passes; verified llama.cpp emits usage+timings on the final stream chunk (gen ~90 t/s) that this path consumes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Tests: backend gen/prefill t/s passthrough and preference Cover the two pieces of the true-t/s metric so it can be reviewed on its own: - stream_llm surfaces llama.cpp's timings.predicted_per_second / prompt_per_second as gen_tps / prefill_tps on the usage event (captured llama.cpp final-chunk fixture), and omits them when the backend reports no timings. - _compute_final_metrics prefers backend_gen_tps over output/wall-clock, tags tps_source ("backend" vs "computed"), and surfaces prefill_tps. Reuses the fake-client stream harness from test_llm_core_streaming.py. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 20:52:08 +09:00
ghreprimand	4cec31d988	Chat: route image sessions only to matching image endpoints Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 20:52:03 +09:00
Ernest Hysa	064c1ace91	Uploads: write uploads index atomically * fix(upload): atomic-rename writes for uploads.json + .bak recovery UploadHandler.save_upload does a read-modify-write of uploads.json via two open(..., 'w') + json.dump blocks, with no lock, no temp+rename, and no recovery. N concurrent inserts lost N-1 entries (last writer wins after the read snapshot is taken); a SIGKILL/SIGTERM mid-json.dump truncated the file and the bare 'except Exception: logger.warning(...)' recovery path returned {}, silently dropping every prior upload. The handler now serialises the RMW under a per-instance threading.Lock and writes through _atomic_write_json, which writes to a tempfile in the same directory, fsyncs, snapshots the previous live to .bak, and renames the temp onto the target via os.replace. os.replace is atomic on POSIX, so a reader sees either the old or the new state, never a half-written file. _load_upload_index tries the live file first, then falls back to the .bak sibling if the live is corrupt. Cross-process safety is still on the deployer: gunicorn workers on the same uploads dir will race the lock, and the atomic-rename is the kernel-level guarantee that prevents torn reads. If multi-worker writes are expected, fcntl.flock around the rename is a follow-up; single-worker and async deployments are correct as-is. * fix(upload): reload uploads.json inside _index_lock on dedupe path The duplicate-detection branch in save_upload() was reading uploads.json before taking _index_lock, then writing that stale snapshot under the lock. A duplicate upload racing with a new-entry insert could clobber the new entry because the duplicate's snapshot predated the insert. The new-entry branch already reloaded inside the lock; the duplicate branch now does the same. It also re-resolves the storage key inside the lock, because a concurrent insert can have changed the dict's keys. 
If the entry has been cleaned up between the outer read and the inner write, the function falls through to the fresh-insert path instead of silently writing a stale row. Boundary note: the _index_lock serialises writers within a single Python process. Cross-process / multi-worker deployments still need flock or a database; the inline comment is updated to make this explicit. The atomic-rename write keeps the on-disk state consistent but does not serialise writers across processes. Tests: - Existing concurrent-insert and partial-write-recovery tests still pass. - New test_atomic_write_primitives_present_in_production_code asserts the production module has at least two 'with self._index_lock:' blocks (regression net for this fix). - New smoke tests: normal upload, duplicate detection, info lookup after a backup-recovery scenario.	2026-06-02 20:51:39 +09:00
Shaw	db10c8d95b	Sessions: allow deleting memory-only ghost sessions A session that exists only in the in-memory SessionManager — never persisted, or whose DB row was removed out-of-band — was listed by GET /api/sessions (the list is built from the in-memory manager) but 404'd on every per-session operation, so it could never be deleted. Two causes, both fixed: 1. _verify_session_owner() only consulted the DB and raised 404 when no row existed. It now falls back to the in-memory session's owner when (and only when) a session_manager is supplied and the caller actually owns the ghost. The DB row stays authoritative when present, and a ghost owned by another user still 404s, so the ownership/security model is unchanged. The new parameter defaults to None, preserving behavior for all other callers. 2. SessionManager.delete_session() only removed the in-memory entry when a DB row was found, so memory-only ghosts survived. It now drops the in-memory copy regardless and reports success when either the DB row or the in-memory entry was removed. Added tests/test_session_ghost_delete.py covering both layers, including the cross-owner 404, the unauthenticated 403, DB-row-wins precedence, and backward compatibility when no manager is passed. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 20:51:26 +09:00
mechramc	8e87d3002b	Tasks: clean up queued cancellation state	2026-06-02 20:51:21 +09:00
SurprisedDuck	f975279b26	Notes: parse natural-language due dates on update The 'add' action runs due_date through parse_due_for_user (natural language like 'tomorrow at 9am', plus user-tz anchoring for naive ISO), but 'update' stored the raw value verbatim. A reminder edited with natural language was saved as an unparseable literal the frontend's new Date() can't read, so it never fired. Route update's due_date through the same parser as add.	2026-06-02 20:51:16 +09:00
mechramc	8efd7b3df6	Windows: improve Git Bash detection	2026-06-02 20:45:48 +09:00
red person	4709bb022e	Windows: add Docker update script	2026-06-02 20:45:32 +09:00
Tatlatat	7f97ab3032	Topics: hydrate session history before analysis analyze_topics() iterates session_manager.sessions and reads session_data.get("history", []) directly. But SessionManager.load_sessions seeds sessions metadata-only with empty history — messages are loaded lazily, only when get_session(session_id) is called. So analyze_topics saw empty history for every session that hadn't been individually opened this process lifetime and reported total_topics: 0, even when the database held plenty of matching messages. Hydrate each candidate session via session_manager.get_session(session_id) (the existing lazy-load path) before reading its history, after the owner/archived filters so skipped sessions aren't loaded. Falls back to the raw cached history when the manager has no get_session (test stubs). tests/test_topic_analyzer.py: new test_topic_analyzer_hydrates_sessions seeds a real SQLite DB with a session + message, runs the real SessionManager (asserting cached history starts empty), then asserts analyze_topics finds the topic. Fails before this change. The existing keyword tests now pass an explicit owner to satisfy the owner-required early return.	2026-06-02 20:44:27 +09:00
SurprisedDuck	d73c0a13f4	YouTube: enforce comment fetch timeout while waiting asyncio.wait_for wrapped create_subprocess_exec, which returns as soon as the child is spawned, so the timeout never bounded the actual work. yt-dlp could hang indefinitely on proc.communicate() and the except asyncio.TimeoutError branch was unreachable. Bind the wait to communicate() and kill/reap the child if it overruns.	2026-06-02 20:44:24 +09:00
Tatlatat	e084dc993e	Chat: merge consecutive user messages for strict providers After a non-native tool round, the agent appends tool results as a {role: 'user'} message next to the user's original 'user' prompt, producing two consecutive 'user' messages. Strict provider APIs (Anthropic/Claude) reject consecutive same-role messages, so the follow-up generation request fails silently — search returns sources, then nothing is generated. _sanitize_llm_messages now merges consecutive 'user' messages (joining their content). Only user/user is merged; normal chat and agent/tool turns already alternate and are untouched. Scoped down per maintainer review: the agent_loop 'output' source-extraction change is already on main (#898/#901) and the broad-mocking web-sources test was dropped. Added a focused test that runs consecutive-user messages through the real _build_anthropic_payload and asserts the payload alternates correctly.	2026-06-02 20:44:13 +09:00
Tatlatat	51cf63009e	TTS: include mp3 files in cache stats TTSService._put_cache writes .mp3 for MP3 audio (ID3/MPEG-framed bytes) and .wav otherwise, and the rest of the class treats both as cache entries (_get_cache iterates (".mp3", ".wav"); eviction globs "."). But get_stats() enumerated the cache with `glob("*.wav")` only, so both cache_entries and cache_size_mb undercounted — reporting 0 whenever the cache held MP3 files, which is the common case for most TTS providers. Glob both extensions so the reported stats match what's actually cached. tests/test_tts_cache_stats.py writes an MP3-headed blob via _put_cache and asserts get_stats() reports one entry with non-zero size. Fails before this change.	2026-06-02 20:43:29 +09:00
Tatlatat	3885f9fa90	STT: clean temp audio files on transcription failure STTService._transcribe_local writes the audio to a NamedTemporaryFile (delete=False) and only unlinks it on the success path, before the except. If model.transcribe() raises (corrupt audio, model/runtime error, etc.) the function logs, returns None, and leaves the .webm temp file behind — so every failed local transcription leaks a file in the system temp dir. Initialize tmp_path = None up front and move the unlink into a finally block so the temp file is cleaned up whether transcription succeeds or raises. tests/test_stt_leak.py stubs the whisper model to raise during transcribe, runs _transcribe_local, and asserts it returns None and leaves no new .webm file in the temp dir. Fails before this change.	2026-06-02 20:43:24 +09:00
Collin	f8e3bfeaff	Add endpoint probing behavior tests ROADMAP "Backend → more tests around endpoint probing and provider setup". TestSetupProbeSafety already covers _probe_endpoint's keyed/unkeyed curated fallback; this adds the rest of the probe surface, with httpx faked the same way (no network): - _probe_endpoint: OpenAI {"data"} vs native Ollama {"models"} list parsing, the /api/tags fallback for Ollama builds lacking /v1/models, and the no-models-found result. - _ping_endpoint (previously untested): 2xx reachable, auth failure (reached but not reachable), the /login-redirect "that's Odysseus, not a model server" trap, generic redirects, transport errors, and the native Ollama /api/version fallback. - _probe_single_model (previously untested): ok/fail/timeout status mapping, dict/string upstream error extraction, and OpenAI vs Anthropic request routing (x-api-key, /v1/messages, tool schema). - _classify_endpoint: the Tailscale CGNAT 100.64.0.0/10 local range and its boundaries.	2026-06-02 20:42:48 +09:00
Collin	e8dea7d456	Add provider classification and upstream-error tests ROADMAP "Backend → more tests around endpoint probing and provider setup" and the "Provider setup/probing audit" item. test_provider_endpoints.py covers URL/header building; this adds the provider-identification and degraded-state error reporting around it, against the real src.llm_core: - _detect_provider: host-based (not substring) provider matching, with look-alike-host and domain-in-path guards, and the OpenAI-compatible fallback that xAI / DeepSeek / Gemini correctly use. - _provider_label: human names used in error messages (incl. native vs cloud Ollama and the generic local-endpoint case). - _format_upstream_error: 401/403/404/429/5xx → provider-aware sentences, with JSON / string / plain-text / bytes body detail extraction. - _uses_max_completion_tokens: gpt-5 / o-series detection (gpt-4o stays on plain max_tokens).	2026-06-02 20:42:43 +09:00
Alexandre Teixeira	2e961cee93	tests: cover calendar route owner gates	2026-06-02 20:42:37 +09:00
Alexandre Teixeira	033e7a8f0d	tests: cover API token CRUD routes	2026-06-02 20:42:32 +09:00
Alexandre Teixeira	4bbffbfb05	tests: cover upload route owner gates	2026-06-02 20:42:26 +09:00
Alexandre Teixeira	6255852bef	tests: cover cleanup owner scope	2026-06-02 20:42:21 +09:00
Alexandre Teixeira	ff8b9e9ab6	tests: cover research route owner gates	2026-06-02 20:42:15 +09:00
Mihail Filippov	d92d6b5e67	Add tests for open-signup endpoint	2026-06-02 20:42:10 +09:00
Yavor Ivanov	7cc8fdb2f5	Models: avoid hidden models in default fallback Both get_default_chat and _recover_empty_session_model picked the first model from cached_models[0] without checking hidden_models. If the first cached model was hidden (e.g. minimax-m3), it was returned as the default or used to repair empty session models, even though the model list endpoints already filter hidden_models. - Add _visible_models() helper that filters cached_models by hidden_models (mirrors the filtering in list_model_endpoints) - Use _visible_models() in get_default_chat fallback (when no explicit default_model is saved) - Use _visible_models() in _recover_empty_session_model (when repairing a session whose model field is empty before chat send) - Add regression tests for hidden-model filtering in default chat resolution, and unit tests for _visible_models helper	2026-06-02 20:37:14 +09:00
Shaw	8115cb01a2	Models: allow API keys for local endpoints Self-hosted endpoints on a LAN are sometimes protected by an API key. The admin "Local" add/test form only sent base_url (+ model_type), so such an endpoint could not be added — it just errored out — even though the backend POST /api/model-endpoints and /model-endpoints/test already accept an optional api_key form field (the cloud "API" form already uses it). Adds an optional masked "API key" input (adm-epLocalApiKey) to the Local form and wires it into the local Test and Add handlers, sending api_key only when filled (an empty value is omitted so we never send a blank Bearer). The field is cleared after a successful add, matching the cloud form. Tested: tests/test_local_endpoint_api_key_js.py extracts the two click handlers and runs them under node with mocked DOM/FormData/fetch, asserting api_key is sent when the field is filled and omitted when blank, plus that the input exists as a password field. `node --check static/js/admin.js` passes. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 20:36:54 +09:00
Tatlatat	dac64f20d9	Text: strip dangling think blocks after visible text `strip_think` removes a dangling (unclosed) `<think>` block via `_THINK_OPEN_RE`, but that pattern was anchored to the start of the string (`^\s*<think>`). An unclosed `<think>` (or `<thinking>`) opener that appears *after* any leading output was therefore only half-handled: the stray tag itself was removed by `_THINK_TAG_RE`, but the reasoning content following it leaked straight to the user. strip_think("Hello! <think> I am thinking.") # -> "Hello! I am thinking." (leak) strip_think("Sure.\n<think>\nLet me reconsider...") # -> leaks the reasoning `strip_think` feeds user-facing output across research, email replies, notes, and scheduled tasks, so this leaks chain-of-thought to end users. Un-anchor `_THINK_OPEN_RE` so a dangling opener anywhere strips from the opener to end of string, consistent with the existing start-of-string behavior. Content before the opener, closed `<think>...</think>` blocks, and tag-free text are all preserved. tests/test_strip_think.py covers the mid-text leak (fails before this change), start-anchored unclosed, closed blocks, no-tag passthrough, content-before-opener, and mixed closed+unclosed. Full existing think suite still passes.	2026-06-02 20:36:37 +09:00
Tatlatat	8ad436d25a	DB: enable SQLite foreign key cascades * fix(db): enable SQLite foreign keys so ondelete cascades actually fire core/database.py declares DB-level FK actions throughout (ondelete="CASCADE" / "SET NULL"), but SQLite disables foreign-key enforcement per connection by default and the engine had no connect-event listener turning it on. So every one of those ondelete actions was dead. Concrete impact: cleanup_old_sessions() in src/cleanup_service.py removes old sessions with a bulk `query(Session).delete()`, which bypasses the ORM-level relationship cascade and relies solely on the DB-level ondelete="CASCADE" on ChatMessage.session_id. With foreign keys off, the messages are never deleted — they pile up as orphaned rows on every cleanup cycle. Add the standard SQLAlchemy connect listener issuing `PRAGMA foreign_keys=ON`, guarded by `isinstance(conn, sqlite3.Connection)` so it only affects SQLite and leaves other backends untouched. tests/test_sqlite_foreign_keys.py inserts a Session + ChatMessage, deletes the Session via bulk `query().delete()`, and asserts the ChatMessage is cascade-deleted. Fails before this change (orphan remains). * docs(db): clarify FK pragma scope per review; trim test comments Address review feedback on the foreign_keys PRAGMA change: - Note that the class-level connect listener fires for every Engine in the process and is a no-op on non-SQLite backends (isinstance guard). - Warn near init_db() that FK enforcement is now global, so a migration that temporarily violates FK constraints must disable foreign_keys around that work. - Drop the step-by-step narration comments from the regression test. No behavior change.	2026-06-02 20:36:13 +09:00
Tatlatat	bd78e1d5c2	Admin: wipe gallery albums with images The /api/admin/wipe/gallery branch deleted GalleryImage rows but left every GalleryAlbum row behind (GalleryAlbum wasn't even imported). After "wipe gallery" the user is left with orphaned, empty albums whose cover_id points at now-deleted images — inconsistent with the other wipe branches, which clear both parent and child tables. Delete GalleryAlbum alongside GalleryImage and include both in the returned count. Adds tests/test_admin_wipe_gallery.py: seeds a real in-memory SQLite DB with an album + image, runs the actual wipe handler, and asserts both tables are emptied. Fails before this change (albums survive).	2026-06-02 20:35:57 +09:00
SurprisedDuck	62f06ab740	Docs: respect path boundary when clearing exclusions add_directory cleared exclusions with a raw path.startswith(directory) test, which also matched sibling directories sharing a name prefix — adding /docs would silently un-exclude files under /docs2. Match the directory itself or paths under it (directory + os.sep) instead.	2026-06-02 20:35:44 +09:00
SurprisedDuck	78747b56ca	Documents: strip PDF marker without corrupting text _process_pdf prepends "\n\n[PDF content]:" to extracted text, and two call sites in document_routes.py stripped it with .lstrip("\n[PDF content]:"). str.lstrip(chars) treats its argument as a set of characters, so it keeps eating into the page text that follows the marker — e.g. a body starting with "to the board" loses its leading "to" because 't'/'o' are in the marker's character set. Replace both sites with a shared strip_pdf_content_marker() helper that uses str.removeprefix.	2026-06-02 20:35:27 +09:00
Ernest Hysa	996a2027dd	Cookbook: surface pip install failures in logs _pip_install_fallback_chain silently discarded pip stderr via 2>/dev/null on every attempt. When pip failed (network error, venv mismatch, disk full), the wrapper exited 0 and the Cookbook UI showed the download as running: the silent-failure mode from #354. Extract _pip_install_attempt(), which wraps each pip invocation in a bash -c subshell that captures output to a temp file, prints tail -5 on failure, cleans up, and exits with pip's real exit code. This avoids the | tail pipefail masking (the first blocker on #363) while surfacing the last 5 lines of pip output in the tmux log so users can see what went wrong. Both local wrapper and remote SSH runner use the same helper through _pip_install_fallback_chain, so the fix is symmetric.	2026-06-02 20:34:52 +09:00
Hayk Arzumanyan	514050d098	Models: rewrite Docker loopback endpoints to host gateway In Docker, a model-endpoint URL pointing at loopback (e.g. the LM Studio default http://localhost:1234/v1) targets the Odysseus container itself, not the host running the server, so the probe gets a connection error and the endpoint is rejected with a misleading 'No models found for that provider/key'. Rewrite loopback to host.docker.internal (which compose already maps to host-gateway) for the probe and the saved URL, mirroring the existing Ollama handling. Gated on actually being in a container with the gateway reachable, so native installs and gateway-less deploys are untouched. Fixes #25 Co-authored-by: Claude <noreply@anthropic.com>	2026-06-02 20:34:40 +09:00
SurprisedDuck	4307cac966	Research: report empty search provider results clearly Deep Research surfaced 'Error: unknown error' whenever every search provider returned an empty result set without raising (e.g. SearXNG is reachable but all its engines fail internally). _last_search_error was only set on exceptions, so the empty-but-no-exception path left it unset and the caller fell back to 'unknown error'. Record an actionable reason on that path naming the providers that were tried, so users can tell it's a search-backend problem rather than a model problem. The provider-raised path is unchanged. Re: #344. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 20:34:25 +09:00
Tatlatat	67517eaed1	Gallery: match image endpoint URLs with exact v1 suffix The image-edit endpoint lookup compared stored vs incoming base URLs with `.rstrip("/v1")`. `str.rstrip(chars)` treats its argument as a character set, not a suffix, so any URL ending in '/', 'v', or '1' is over-stripped (e.g. `http://host1/v1` -> `http://host`). Two endpoints that are not the same can then compare equal, or the real endpoint fails to match its own stored record, leaving `api_key` unset and sending the upstream image call unauthenticated. Use `.removesuffix("/v1")` (exact-suffix removal) with surrounding `.rstrip("/")` on both sides so only a genuine trailing `/v1` is dropped. Adds a focused test that parses the actual comparison expression out of gallery_routes.py via AST and evaluates it — it fails if the fix is reverted and uses no mocking.	2026-06-02 20:34:05 +09:00
SurprisedDuck	d06b6d87d3	Models: prefer longest known context match KNOWN_CONTEXT_WINDOWS lists 'o1' (200k) before 'o1-mini' (128k), and _lookup_known returned on the first substring hit — so "o1-mini" matched 'o1' and reported 200000 instead of 128000. Track the longest matching key instead, so the most specific entry wins regardless of table order.	2026-06-02 20:33:09 +09:00
mist	0b0be3c339	Email: recognize forwarded message dividers `_ORIG_RE` (and its JS mirror `_TALON_ORIG_RE`) already recognised the Japanese forward marker `転送` alongside the "Original Message" delimiters, but not the English "Forwarded message" one. So Gmail-style forwards — including the ones Odysseus itself emits (`---------- Forwarded message ----------`, static/js/emailInbox.js) — were not treated as a quote boundary: - with a following Outlook From:/Date: header block, the divider line leaked into the level-0 reply bubble as noise; - with only the divider marking the forward (no header block), the body was not split into turns at all. Add `Forwarded\s+message` to the same `[-_=]{3,}`-delimited alternation in both the server-side parser and the JS mirror, so forward dividers are consumed as an attribution boundary like "----- Original Message -----". Locale variants of "Forwarded message" can follow the existing pattern. Tests cover both manifestations plus a negative control (the bare words "forwarded message" without `[-_=]{3,}` delimiters must not split). Checks: python -m pytest tests/test_forwarded_message_divider.py (3 passed), python -m py_compile src/email_thread_parser.py, node --check static/js/emailLibrary/utils.js, git diff --check.	2026-06-02 20:32:56 +09:00
mist	e249fa4557	Tools: match keyword hints on word boundaries `get_tools_for_query` force-includes whole tool families when the query mentions an intent keyword, but matched with a raw substring test (`kw in ql`). Short hints therefore fired inside unrelated words, bloating the tool set with irrelevant tools: - "fix" matched "prefix" -> document tools - "line" matched "deadline"/"online" -> document tools - "serve" matched "observe"/"reserve" -> cookbook serve tools - "reply" matched "replying" -> all email tools - "unread" matched "unreadable" -> all email tools Match each keyword on word boundaries instead (`re.search(rf"\b{re.escape(kw)}\b", ql)`), the same fix already applied to the keyword matcher in topic_analyzer.py. Genuine intent keywords ("reply to this email", "edit the document", "serve the model") still match. This only removes substring-inside-a-word matches; it does not change whole-word matches (so e.g. an unrelated whole word like "tell" is a separate keyword-choice question, left untouched here). Checks: python -m pytest tests/test_tool_index_keyword_boundaries.py (4 passed; 3 of them fail on the pre-fix substring code), python -m py_compile src/tool_index.py, git diff --check.	2026-06-02 20:32:20 +09:00
mist	8f0518c0ae	Presets: fill missing built-in defaults on load PresetManager.load already heals a forward-incompatible presets.json: the block just above repairs the legacy `custom` shape and re-saves the file. But if the file exists and is missing a whole built-in preset (e.g. an older install written before `reason` existed), load returned it as-is, so that built-in stayed permanently absent — silently missing from the picker that GET /api/presets feeds, with no way for the user to get it back. Extend the same self-heal: after the legacy migration, fill in any built-in presets the loaded file is missing, defaults-first so user edits win, and persist the result. This never clobbers an intentional removal — there is no delete path for the built-in keys (only user_templates entries can be deleted), and presets are hidden via an `enabled: False` flag, not removal. Checks: python -m pytest tests/test_preset_fill_missing_defaults.py (3 passed; 2 fail on the pre-fix code), the existing preset cases in tests/test_review_regressions.py still pass, python -m py_compile src/preset_manager.py, git diff --check.	2026-06-02 20:32:08 +09:00
Mahdi Salmanzade	280c29d572	Security: owner-scope v1 chat endpoint fallback The sync-chat endpoint's Case 3 fallback selected a ModelEndpoint with an unscoped `query(ModelEndpoint).filter(is_enabled == True).first()` and then used that row's decrypted `api_key` for the LLM call. ModelEndpoint is a per-user resource (owner non-null = private to that user), so a chat-scoped API token for user A that sent no session and no api_key could fall back onto user B's PRIVATE endpoint — spending B's API key/quota and reaching whatever internal base_url B configured. This is the same multi-tenant owner-scoping class already fixed for the session gate on this very endpoint (_caller_owns_session) and for companion/models. Scope the fallback to the token owner's own rows plus legacy null-owner (shared) rows via the existing owner_filter helper, matching routes/model_routes.py and companion/routes.py. A null/empty owner stays a no-op, preserving single-user/legacy behaviour. Add regression tests pinning the scoped fallback (cross-owner, shared-only, no-visible-row, disabled-owned, and the legacy null-owner no-op).	2026-06-02 20:31:35 +09:00
Refuse	323f027865	Security: sanitize export and gallery filenames Co-authored-by: RefuseOdd <refuseodd@users.noreply.github.com>	2026-06-02 20:29:56 +09:00
Refuse	4218bfe71e	Tools: restrict app_api and serve_preset to admins Co-authored-by: RefuseOdd <refuseodd@users.noreply.github.com>	2026-06-02 20:29:47 +09:00
Lohinth	12ba535c7d	Companion: fix pairing admin guard import Co-authored-by: Lohinth <lohinth25@proton.me>	2026-06-02 20:29:37 +09:00
mechramc	493c815371	Chat: scope active document fallbacks by owner	2026-06-02 20:29:27 +09:00
Tatlatat	cd247ed107	Skills: delete owner-scoped skills with owner The DELETE /api/skills/{skill_id} handler resolves the caller, loads the skill with skills_manager.load(owner=user), and verifies ownership with _verify_owner(match, user) — but then calls skills_manager.delete_skill(match.get("name")) without the owner. SkillsManager.delete_skill filters candidates with `(sk.owner or "") != (owner or "")`, so when owner is None an owner-scoped skill is skipped and the method returns False. The route then raises a spurious 404 "Skill not found" — meaning a logged-in user can never delete their own skills through the API. Pass the resolved owner through to delete_skill so the skill is matched and removed. tests/test_skills_delete_owner.py drops a real owner-scoped SKILL.md on disk and (1) checks the manager directly: delete_skill without owner returns False (regression lock) while delete_skill(owner="alice") returns True and removes the dir; (2) drives the real DELETE route handler and asserts it returns {"ok": True} and deletes the file. The route test fails before this change (404). Real SkillsManager + real filesystem, no mocking.	2026-06-02 20:28:36 +09:00
Tatlatat	9389cabed0	API keys: skip undecryptable entries on load APIKeyManager.load() decrypts every stored key with a dict comprehension and no error handling. If the .key file no longer matches the ciphertext in api_keys.json (key rotated, a partial/mismatched data restore, or a corrupted .key), Fernet.decrypt raises cryptography.fernet.InvalidToken. app_initializer.py calls api_key_manager.load() during startup, so a single undecryptable entry takes down the whole app at boot, and the user can't reach the UI to fix it. Decrypt each key in a loop and, on InvalidToken/ValueError, log a warning and skip that one entry while still returning every key that decrypts cleanly. One bad/stale key no longer blocks startup. tests/test_api_key_manager_resilience.py saves a valid key, then injects an entry encrypted under a different Fernet key (InvalidToken) and a malformed token (ValueError), and asserts load() returns the good key and skips the bad ones without raising. Fails before this change.	2026-06-02 20:28:26 +09:00
Tatlatat	da3876c168	Webhook: block IPv6 SSRF bypasses The webhook URL guard's _ip_is_private() only checks a hardcoded _PRIVATE_NETWORKS list, which misses several addresses that route internally. validate_webhook_url() therefore ALLOWED: - http://[::]/ (IPv6 unspecified, reaches localhost) - http://[::ffff:127.0.0.1]/ (IPv4-mapped IPv6 loopback = 127.0.0.1) - http://[::ffff:169.254.169.254]/ (IPv4-mapped cloud metadata endpoint) The last one is the dangerous case: a webhook pointed at the mapped 169.254.169.254 can pull cloud instance credentials (SSRF -> credential theft). Harden _ip_is_private(): first unwrap IPv4-mapped IPv6 to its embedded IPv4 (addr.ipv4_mapped), then reject via the stdlib address properties (is_private, is_loopback, is_link_local, is_reserved, is_multicast, is_unspecified) in addition to the existing network list. Public addresses still pass. tests/test_webhook_ssrf_resilience.py asserts validate_webhook_url raises for the three IPv6 bypasses plus 127.0.0.1 and 0.0.0.0, and still accepts a public IP literal. The IPv6 cases fail before this change.	2026-06-02 20:28:12 +09:00
ghreprimand	431b98525b	Email: persist bulk read state to provider Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-02 20:28:01 +09:00
Ernest Hysa	a8a34bd22a	Ollama: pass discovered num_ctx in chat requests _build_ollama_payload sends options.temperature and options.num_predict to /api/chat, but never options.num_ctx. Ollama defaults num_ctx to 2048 when the option is omitted, so prompts going to any Ollama backend are silently truncated there regardless of the model's actual capability. Thread the discovered context length through the three call sites (llm_call, llm_call_async, stream_llm) and emit options.num_ctx when it is known and positive. The builder filters out the DEFAULT_CONTEXT fallback (128000) so we don't lie to Ollama about models whose window we couldn't actually discover. The issue's literal 'when > 2048' heuristic is dropped: a model with a real context smaller than 2048 would OOM if Ollama used its default, so we pass the real value regardless of size. Matches how src/context_compactor.py uses the same helper. Sister fix to PR #753 — that PR teaches the compactor the right budget, this one tells Ollama to actually use that budget on the way in.	2026-06-02 20:27:24 +09:00
Alexandre Teixeira	f6b0dcbe58	Tests: companion model JSON resilience	2026-06-02 13:15:22 +09:00
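
Note on 78747b56ca and 67517eaed1: both commits fix the same pitfall, namely that `str.lstrip(chars)` and `str.rstrip(chars)` treat their argument as a set of characters rather than as a prefix or suffix. The sketch below reproduces the two failure cases quoted in the commit messages. `strip_pdf_content_marker` is the helper name given in the commit, but its body here is an assumption; `normalize_base_url` is a hypothetical name for the comparison logic described in 67517eaed1. `str.removeprefix` and `str.removesuffix` need Python 3.9 or later.

```python
PDF_MARKER = "\n\n[PDF content]:"  # marker text as quoted in commit 78747b56ca


def strip_pdf_content_marker(text: str) -> str:
    """Remove the marker only when it is an exact prefix (body is an assumption)."""
    return text.removeprefix(PDF_MARKER)


def normalize_base_url(url: str) -> str:
    """Hypothetical helper: drop one genuine trailing '/v1' plus stray slashes."""
    return url.rstrip("/").removesuffix("/v1").rstrip("/")


body = PDF_MARKER + "to the board"

# Character-set semantics: keeps stripping while characters are in the set,
# so it eats into the page text that follows the marker.
buggy_text = body.lstrip("\n[PDF content]:")
fixed_text = strip_pdf_content_marker(body)

# Same pitfall on the right-hand side: '1', 'v' and '/' are all in the set.
buggy_url = "http://host1/v1".rstrip("/v1")
fixed_url = normalize_base_url("http://host1/v1")
```

With the exact-match variants, text that merely shares characters with the marker, or a host name that happens to end in `1`, is left alone.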

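Note on e249fa4557: the word-boundary match described in that commit can be sketched as follows. The regular expression is the one quoted in the commit message; the function name `keyword_hits` and its signature are hypothetical and are not taken from `src/tool_index.py`.

```python
import re


def keyword_hits(query: str, keywords: list[str]) -> list[str]:
    """Return the keywords that occur in the query as whole words."""
    ql = query.lower()
    # \b on both sides rejects matches inside a longer word, and re.escape
    # keeps keywords containing regex metacharacters literal.
    return [kw for kw in keywords if re.search(rf"\b{re.escape(kw)}\b", ql)]


substring_false_positives = keyword_hits(
    "add a prefix before the deadline", ["fix", "line"]
)
genuine_intent = keyword_hits("Reply to this email", ["reply", "unread"])
```

A plain `kw in ql` test would have reported both `fix` and `line` for the first query.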

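Note on da3876c168: a minimal sketch of the hardened address check that commit describes, using only the standard-library `ipaddress` module. The function name `ip_is_internal` is hypothetical, and the project's existing `_PRIVATE_NETWORKS` list is left out here, so this is an illustration of the approach rather than the code in the repository.

```python
import ipaddress


def ip_is_internal(host: str) -> bool:
    """Return True when an IP literal should be rejected as a webhook target."""
    addr = ipaddress.ip_address(host)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the embedded IPv4 address
    # is what gets classified. IPv4Address objects have no such attribute.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


bypasses = ["::", "::ffff:127.0.0.1", "::ffff:169.254.169.254"]
blocked = [ip_is_internal(h) for h in bypasses]
```

Unwrapping before classification means the check does not depend on how a given Python version classifies mapped addresses themselves.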
166 Commits