Commit Graph

578 Commits

Author SHA1 Message Date
Afonso Coutinho
55c7a4a546 fix: computeSnap throws when ctx.otherLayers is not an array (#1716) 2026-06-03 13:34:25 +09:00
Mubashir R
319ba50a44 fix: validate client-supplied image _endpoint to prevent SSRF (gallery proxies) (#1718)
POST /api/image/harmonize and POST /api/image/inpaint read an `_endpoint` from
the request body and issue server-side httpx POSTs to it with no validation. A
caller can set `_endpoint` to http://169.254.169.254/ (cloud instance metadata)
or any internal/loopback address the server can reach, turning these routes into
an SSRF primitive.

routes/embedding_routes.py already runs its user-supplied endpoint through
src.url_safety.check_outbound_url; these two routes were missing the same guard.
Validate `_endpoint` the same way before any outbound request: non-HTTP(S)
schemes and the link-local metadata range are always rejected, and
IMAGE_BLOCK_PRIVATE_IPS=true blocks private/loopback for full lockdown (the
local-first default still allows LAN diffusion servers).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:34:17 +09:00
Wes Huber
4baf168df0 docs: fix typo in ROADMAP.md (#1719)
"this is ship" → "this ship"

Fixes #1413

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:34:05 +09:00
Mubashir R
535d05c142 fix: SearchService.search() calls comprehensive_web_search incorrectly (broken public API) (#1720)
SearchService.search() did:

    raw_results = await comprehensive_web_search(
        query, max_results=10 * depth, fetch_content=fetch_content)

comprehensive_web_search is a synchronous function whose count knob is
`max_pages` (not `max_results`) and which has no `fetch_content` parameter, so
the call raised TypeError on argument binding; `await` on its non-coroutine
return would also fail. It returns a context string, or a (context, sources)
tuple with return_sources=True — not the list of dicts the wrapper iterates.

The method is exported in services/search/__init__.py and services/__init__.py
with a usage example in its docstring, so any caller of the documented public
API hit an immediate crash. Call it correctly via asyncio.to_thread with
max_pages + return_sources=True and use the returned source list as the rows.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:33:56 +09:00
lekt8
126e91e8b9 Don't attempt the same (url, model) route twice in the fallback chains (#1733)
The fallback helpers (llm_call_with_fallback, llm_call_async_with_fallback,
stream_llm_with_fallback) build their candidate list as the primary target
followed by the configured fallbacks. Callers prepend the session's live
(url, model) to default_model_fallbacks, so if the user also lists their current
model among the fallbacks — a common misconfiguration — the chain re-attempts
the very route that just failed: a wasted round-trip (and, for the streaming
path, a spurious 'fallback' notice for a switch that didn't actually happen).

Add a small _dedupe_candidates() helper that filters malformed entries and drops
a later repeat of an already-seen (url, model), preserving order (first wins,
keeping its headers). Apply it in all three fallback chains.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:33:50 +09:00
lekt8
77614e9feb Don't force-include the email toolset on every "tell me" query (#1707) (#1735)
The agent tool-RAG force-includes a keyword hint's tools whenever any of its
keywords appears in the query (word-boundary match). The email-intent hint listed
"tell", which matches a huge fraction of requests — e.g. "visit <url> and tell
me the title" — so the whole email toolset was force-included and crowded out the
relevant tools. The model then saw a prompt dominated by email tools and reported
it had no web search / could not visit the URL.

Remove "tell" from the email keyword set. Genuine email intent still fires on
email/mail/gmail/inbox/unread/message/send/reply.

Test drives get_tools_for_query directly with retrieval stubbed (the keyword
hints are deterministic, no embeddings needed): a "...tell me..." web query no
longer pulls in email tools, a real email request still does.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:33:43 +09:00
Mubashir R
a8a5d6f56e fix: RAG keyword fallback leaked owner-less documents across users (#1722)
VectorRAG.search() filters with ChromaDB where={"owner": owner}, returning only
documents whose owner equals the requesting user. The keyword fallback
(_keyword_search_fallback, used when the primary query raises) guarded with
`if doc_owner and doc_owner != owner: continue`, so a document with a
missing/empty owner fell through and was returned to whichever user issued the
query — a cross-user information leak on the fallback path.

Match the primary path's strict filter: skip any doc whose owner != the
requested owner, including owner-less docs.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:31:33 +09:00
Afonso Coutinho
ada30aa039 fix: evaluate_turn_regex crashes on a non-string agent_reply (#1723) 2026-06-03 13:31:26 +09:00
Afonso Coutinho
290d398900 fix: rewriting a message is lost on reload due to a non-existent DB column (#1729) 2026-06-03 13:31:19 +09:00
Afonso Coutinho
d9e6071528 fix: odysseus-mail read crashes on an empty IMAP fetch payload (#1730) 2026-06-03 13:31:10 +09:00
Afonso Coutinho
c5bc39de88 fix: _extract_entities crashes on a non-string query (#1724) 2026-06-03 13:30:28 +09:00
Afonso Coutinho
0c37943267 fix: search service crashes on a non-dict result row (#1725) 2026-06-03 13:30:19 +09:00
Mubashir R
fefac05ab1 fix: history DB fallback returned hidden (compaction) messages to the client (#1726)
GET /api/history/{session_id} skips messages whose metadata has `hidden` (e.g.
compaction summaries kept for AI context, not shown to the user) on the
in-memory path. The DB fallback — used when the in-memory history is empty,
e.g. after a restart — built the response from every stored row with no such
filter, so hidden messages leaked to the client on DB-served sessions.

Filter `hidden` out of the response on the DB path too. The rebuilt in-memory
session.history still includes them, so AI context (the compaction summaries)
is preserved.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:30:11 +09:00
Wes Huber
49885ff9e7 fix(documents): use strip_pdf_content_marker instead of lstrip for PDF auto-open (#1727)
lstrip("\n[PDF content]:") treats the argument as a character set,
not a prefix, so it chews into the following [Page N text]: marker —
e.g. turning [Page 1 text]: into "age 1 text]:". The correct helper
strip_pdf_content_marker (which uses removeprefix) already exists in
the same file and is used by other call sites.

Fixes #1663

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:30:04 +09:00
Mubashir R
4907b16d9b fix: personal-docs path confinement used abspath, allowing symlink escape (#1728)
_resolve_allowed_personal_dir confined a user-supplied path to PERSONAL_DIR with
os.path.abspath + os.path.commonpath. abspath normalises `..` but does NOT
resolve symlinks, so a symlink placed inside PERSONAL_DIR pointing outside it
passes the commonpath check and lets index_personal_documents read files outside
the root. Use os.path.realpath for both the base and the candidate so symlinks
are resolved before the confinement check.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:29:57 +09:00
Ethan
0e538ecd29 Fix RAG remove_directory wiping the entire shared collection (#1660) (#1734)
Removing one RAG directory destroyed the whole shared ChromaDB collection
(all owners + base index) instead of just that directory's chunks. Shared
root cause: PersonalDocsManager.remove_directory called rebuild_index()
(delete_collection + recreate) then re-indexed only the remaining tracked
dirs (ownerless, never personal_dir). The targeted VectorRAG.remove_directory
that should have been used was itself broken (where={"source":{"$contains":dir}}
selects nothing on scalar metadata and would over-delete siblings), and the
dead do_manage_rag path fired a second unconditional rebuild.

- VectorRAG.remove_directory: select chunks in Python by a path-boundary match
  on the stored absolute `source` (dir or dir+os.sep), abspath-normalized.
  Keys on `source` (always written), never `owner` -- no migration.
- PersonalDocsManager.remove_directory: call the targeted remove instead of
  rebuild_index() + partial reindex.
- do_manage_rag (dead code): drop the second rebuild_index() (hygiene).
- rag_server.py add path: abspath so indexed `source` matches the remove.

No schema change. Prevents future wipes (does not recover already-wiped
vectors). Adds hermetic regression tests at three layers.

Fixes #1660

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:29:51 +09:00
Wes Huber
9964e9f3fb fix: use correct column name (timestamp) in history_routes queries (#1736)
Three endpoints in history_routes.py ordered by
DbChatMessage.created_at, but the ChatMessage model has no
created_at column — only timestamp. This caused AttributeError
(HTTP 500) on mark-stopped, update-last-meta, and
merge-last-assistant. Other queries in the same file already use
the correct column.

Fixes #1659

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:29:44 +09:00
Ethan
b9c382006e Clamp Anthropic temperature to [0.0, 1.0] in _build_anthropic_payload (#1737)
Anthropic's Messages API rejects temperature > 1.0 with HTTP 400, but
_build_anthropic_payload forwarded it verbatim. The shipped "Nietzsche" preset
uses temperature 1.2 and the UI slider allows up to 2.0, so every Claude request
under such a preset hard-broke. Clamp into [0.0, 1.0] in the Anthropic builder
only (OpenAI keeps its wider 0.0-2.0 range). Covers all three Anthropic call
paths, which build through this one function. None is passed through unchanged.

Fixes #1615

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:29:36 +09:00
Afonso Coutinho
96a874c604 fix: a non-dict finding silently drops all raw research findings (#1739) 2026-06-03 13:29:29 +09:00
Afonso Coutinho
7f94c43a45 fix: langIcon throws on an explicit null opts argument (#1740) 2026-06-03 13:29:21 +09:00
Afonso Coutinho
fc8efca49d fix: backup import drops a user's memory when its text matches another user's (#1743) 2026-06-03 13:29:14 +09:00
Afonso Coutinho
063e7114e3 fix: youtube transcript formatter crashes on a non-dict segment (#1745) 2026-06-03 13:29:08 +09:00
Afonso Coutinho
6e38d3f2ef fix: youtube (services) comment formatter crashes on a non-dict comment (#1746) 2026-06-03 13:29:01 +09:00
lekt8
9aa2445ec7 Reconnect after a failed SEARCH ALL so the email poller doesn't desync IMAP (#1613) (#1748)
On a large Gmail mailbox the email-summary poller's SINCE scan often finds
nothing (INTERNALDATE/date-header quirks), so it falls back to SEARCH ALL. That
returns one enormous UID line; the socket read can time out mid-response, and the
exception was swallowed — leaving the unread '* SEARCH 325188 …' bytes on the
socket. The next command (the downstream re-select) then read those leftover
bytes and failed with 'EXAMINE => unexpected response: b'325188 …''.

Extract the fallback into _latest_inbox_fallback_uids(conn, reconnect): on a
failed SEARCH ALL it logs out the poisoned connection and reconnects, returning
the fresh connection for downstream use. Reconnecting is correct by construction
— a new connection cannot carry the old one's leftover bytes — so the re-select
always runs on a clean socket.

The same SEARCH ALL + reuse pattern also exists in mcp_servers/email_server.py
and routes/email_routes.py; left for a separate change to keep this surgical.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:28:53 +09:00
Afonso Coutinho
133948cc78 fix: uploads with _ or - in the extension become permanently unreadable (#1756) 2026-06-03 13:28:45 +09:00
Afonso Coutinho
992866e167 fix: document library language facet undercounts text documents (#1758) 2026-06-03 13:28:38 +09:00
lekt8
a096e872f5 Let orphaned documents be reopened from the library (#1602) (#1761)
After an AI-written document is closed, its session_id is nulled (the detach
behaviour from #1238). Both Open controls in the Documents library — the card's
expanded Open button and the card dropdown's Open item — gated on
`doc.session_id`: they wired `libraryOpenInSession` (which early-returns with no
session) and DISABLED the control otherwise, so the user's own document showed a
grayed-out Open button and couldn't be reopened.

The module already has `libraryOpenDocument`, which explicitly handles the
orphaned case ("just open in editor without switching session" -> _loadDocument
by id). Route the no-session path there instead of disabling.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:28:31 +09:00
ghreprimand
6f001af2a3 Add a 'Rebuild llama.cpp' Cookbook action to force a fresh GPU build (#1787)
The serve bootstrap builds llama-server from source only when it is missing
from PATH, so a host that first compiled CPU-only (no nvcc present at build
time) reuses that CPU-only binary on every later serve and never gets a GPU
build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to
force a rebuild.

Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears
the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally
or on the selected remote server) so the next serve recompiles and picks up
CUDA/HIP if a toolchain is now present. It installs and downloads nothing.

- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses
  the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME

Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to
re-trigger the build).

Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 13:28:19 +09:00
Afonso Coutinho
51857c9008 fix: chat memory extraction crashes on a non-dict message (#1749) 2026-06-03 13:25:48 +09:00
Afonso Coutinho
a714915afe fix: _derive_title crashes on non-string content instead of returning Untitled (#1751) 2026-06-03 13:25:41 +09:00
clockworksquirrel
2625e97f11 Stop conversations crashing during compaction on tool-call turns (#1777)
context_compactor.maybe_compact built its summary text with
msg.get('content', '')[:2000], which raised
TypeError: 'NoneType' object is not subscriptable on assistant turns
whose content is None (turns that carried only native tool_calls).
Once a conversation crossed the 85% compaction threshold — reached
after only a few turns on small-context local models plus the large
agent prompt — every subsequent message failed ("send more than three
messages and it stops working").

Flatten message content to text first via a _content_as_text helper
(str passthrough, multimodal list blocks joined, None -> "") and
tolerate a missing role. Adds tests/test_context_compactor.py covering
the helper and a >=4-message conversation that forces compaction with
a None-content tool-call turn (fails before this change, passes after).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:25:33 +09:00
ooovenenoso
12696a05ae fix(markdown): keep allowed-html placeholders out of fenced code (#1788) 2026-06-03 13:25:26 +09:00
Afonso Coutinho
2fa4d50115 fix: is_youtube_url crashes on a non-string url (#1752) 2026-06-03 13:24:33 +09:00
Afonso Coutinho
d2f6e8068d fix: is_youtube_url (services) crashes on a non-string url (#1753) 2026-06-03 13:24:24 +09:00
Ethan
33bf975597 Stop GET /api/search/config from leaking the Brave API key (#1661) (#1750)
get_search_config returned SEARCH_CONFIG.copy(), and update_search_config
cached the decrypted Brave key into that shared global at startup
(app_initializer), so the unauthenticated /api/search/config route exposed
the operator's key. The cache was dead weight: brave_search reads its key
via _get_provider_key (settings/env), never SEARCH_CONFIG.

- update_search_config: no longer stores the api_key in the shared global
  (accepted for backward compat; provider keys are read on demand).
- get_search_config: scrub any string-valued credential field before
  returning, preserving the has_api_key presence flag.

No schema change; brave_search/_get_provider_key untouched. Adds regression
tests.

Fixes #1661

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:24:17 +09:00
Wes Huber
3abb735200 fix(security): scope send_to_session agent tool by owner (#1757)
send_to_session was the only agent tool that didn't check session
ownership — an agent acting for user A could read from and write
into user B's session on a multi-user instance.

Add owner parameter and reject access when the target session
belongs to a different user, matching the pattern used by
create_session, list_sessions, and manage_session.

Fixes #1616

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:24:08 +09:00
Afonso Coutinho
b3da01efd5 fix: ui_control rejects the advertised rag toggle (#1763) 2026-06-03 13:24:00 +09:00
Afonso Coutinho
c4fcebd9c7 fix: disabling auth wipes all users' preferences on next pref save (#1764) 2026-06-03 13:23:50 +09:00
Lucas Daniel
68da800dcb fix(agent): stop sending tool schemas to native Ollama endpoints (#1765)
Models like gemma4, qwen3.5, and ministral served via Ollama's native
/api/chat respond to OpenAI-style tool schemas by emitting a single
native tool_call chunk and then stopping. The agent loop receives
1 token of round_response and no recognised ToolBlock, so the round
ends immediately — the user sees a one-token response.

Root cause: _is_api_model was True for any endpoint whose host appears
in _API_HOSTS (which includes "host.docker.internal" and "localhost")
OR whose model name matches a keyword like "gemma". Native Ollama
endpoints were never excluded from this path.

Fix: import _is_ollama_native_url from llm_core and treat native Ollama
endpoints (/api/chat, port 11434) as text-only by default — falling back
to the fenced-block tool path the local models are tuned for. The
per-endpoint supports_tools=True toggle (Settings → Endpoints) still
overrides this for users who have explicitly opted in.

Fixes #1567
2026-06-03 13:23:42 +09:00
lekt8
bf2a1365f6 Don't falsely declare a dependency build stale (#1568) (#1768)
Installing a heavy dependency like vllm crashes in a "stale — restarting" loop:
it restarts mid-install, reuses the cached wheels, then stalls again.

The download/install watchdog (cookbookRunning.js) keyed its stall signal purely
off the downloaded-byte counter ("1.81G/2.49G"). A dependency install spends long
stretches with NO byte counter — pip dependency resolution and the native CUDA
build/compile — so the signal froze and after STALE_PROGRESS_MS the watchdog
declared it stale and auto-restarted it mid-build, looping forever.

Extract the signal into a pure computeProgressSignal (cookbookProgressSignal.js):
keep the byte counter for the download phase (so a genuinely stuck download is
still caught, and an animating-but-frozen ETA frame is NOT mistaken for progress),
and when there's no byte counter fall back to a fingerprint of the output tail so
resolver/compile lines count as progress. Only a truly frozen tail now reads as
stalled.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:23:35 +09:00
Shatti2
4ca3b38667 fix(calendar): negotiate Digest auth in CalDAV test endpoint (#1767)
POST /api/calendar/test issues a single PROPFIND with raw httpx
Basic auth. CalDAV servers configured for Digest (Baïkal default,
SabreDAV-based servers, Radicale with htdigest) reject Basic with
401, so the UI "Test connection" button surfaces "Auth failed —
check username/password" even when the URL and credentials are
correct.

src/caldav_sync.py (the real sync path) uses caldav.DAVClient,
which negotiates the scheme via niquests, so production sync
already works against these servers. The test endpoint just
doesn't match. Bring it to parity: keep the cheap Basic first
attempt, and on a 401-with-Digest-challenge retry once with
httpx.DigestAuth before deciding it's an auth failure.

Repro: configure CalDAV against a stock Baïkal install — test
button returns 401, sync succeeds.

Co-authored-by: Shatti2 <codered5678@gmail.com>
2026-06-03 13:23:28 +09:00
Alexandre Teixeira
ea1079e1df docs: fix stale documentation references (#1769) 2026-06-03 13:23:21 +09:00
Lucas Daniel
12fd8b6570 fix(group): show all user-created personas in the participant selector (#1770)
_getCharacterList() had two bugs that silently dropped every
user-created persona from the group participant picker:

1. The /api/presets/templates endpoint returns a JSON array directly,
   but the code read `data.templates` (always undefined). The forEach
   over `data.templates || []` iterated over an empty array every time,
   so no user templates were ever added.

2. Even if the array had been read correctly, the `t.isCharacter` guard
   would have filtered them all out — user templates are saved by
   presets.js without that flag, which is only present on built-in
   PROMPT_TEMPLATES entries.

Fix: accept both the direct-array and the {templates:[]} shapes, drop
the isCharacter guard (user_templates are personas by definition), and
use the correct field name (system_prompt, not prompt) so the character
prompt actually reaches the group chat.

Fixes #1656
2026-06-03 13:23:14 +09:00
Afonso Coutinho
bde5f6adb3 fix: gallery tag filters and tag-cleanup are empty in single-user mode (#1771) 2026-06-03 13:23:08 +09:00
Afonso Coutinho
d25a860f71 fix: document tidy crashes on a duplicate with NULL timestamps (#1772) 2026-06-03 13:23:01 +09:00
Afonso Coutinho
db1596f3b4 fix: signature learning never skips support@/info@/admin@ senders (#1773) 2026-06-03 13:22:52 +09:00
Afonso Coutinho
694647375c fix: signature delimiter fold misses self-closing <br/> breaks (#1774) 2026-06-03 13:22:46 +09:00
Lucas Daniel
1d99429ba0 fix(cookbook): prevent auto-retry from restarting user-stopped downloads (#1778)
Two related bugs in the Cookbook task lifecycle:

1. "Stop all" fired kills via .click() inside a synchronous forEach but
   showed the success toast immediately after — the toast appeared before
   any of the async kill requests had been sent, giving the user false
   confidence the tasks were stopped.

2. The download auto-retry logic (triggered when DOWNLOAD_FAILED appears
   in the task output) had no way to distinguish a network interruption
   from a deliberate user stop. A download stopped via "Stop all" or the
   individual Stop button could be silently restarted up to two times by
   the background monitor.

Fix: persist _userStopped: true to localStorage at the moment the user
clicks Stop (individually) or Stop all. The auto-retry guard checks this
flag before relaunching the download. The flag is written BEFORE the
kill requests fire so there is no window where the monitor can race.

Fixes #1458
2026-06-03 13:22:39 +09:00
pewdiepie-archdaemon
ed7956cbd3 Owner-scope RAG doc ids so identical chunks across users don't collide (#1738, #1760)
_generate_doc_id hashed only text. add_document / add_documents_batch
early-return when the id exists, so the second owner indexing a
byte-identical chunk hit the first owner's id, was silently dropped,
and never stored under their owner — their owner-filtered search then
quietly omitted it. Hash owner + text; empty owner reproduces the
legacy id, so the unowned/base index keeps existing ids and isn't
re-churned. Same-owner identical chunks still dedupe.

Caught by #1738 and #1760 (independent reports of the same bug).
2026-06-03 11:36:31 +09:00
pewdiepie-archdaemon
8e2b9baf19 Rebuild memory vector index from the full saved set, not just the audited owner (#1747)
audit_memories saves final_entries merged with other owners' entries
(correct), but then rebuilt the shared vector collection from
final_entries alone — wiping every other owner from semantic search
until they happened to run their own audit. Keyword fallback masked
it, so it degraded silently. Capture saved_entries once and rebuild
from that.

Caught by #1747.
2026-06-03 11:36:24 +09:00