Commit Graph

439 Commits

Author SHA1 Message Date
Afonso Coutinho
13f0171ce8 fix: extract_youtube_id crashes on a non-string url instead of returning None (#1689) 2026-06-03 13:38:11 +09:00
Afonso Coutinho
35b9509da3 fix: memory entry validation crashes on a non-dict row from memory.json (#1691) 2026-06-03 13:38:02 +09:00
Afonso Coutinho
f0b172020e fix: require_privilege 500s on a non-dict privileges blob from auth.json (#1693) 2026-06-03 13:37:54 +09:00
Rolly Calma
933c461f38 fix: use running loop for shell stream deadlines (#1694) 2026-06-03 13:37:46 +09:00
Afonso Coutinho
02ff2e3cb0 fix: updating a calendar event ignores user timezone and shifts the time (#1695) 2026-06-03 13:37:39 +09:00
Afonso Coutinho
667b739af4 fix: reply-all Cc builder crashes on a non-string To or Cc field (#1700) 2026-06-03 13:37:22 +09:00
Afonso Coutinho
19e62208d2 fix: streaming drops providers that emit SSE data lines with no space (#1701) 2026-06-03 13:37:14 +09:00
Afonso Coutinho
3da4edb442 fix: token usage dropped when it rides on a non-empty finish delta (#1703) 2026-06-03 13:36:57 +09:00
Afonso Coutinho
9dd9bb8a3f fix: memory recall crashes on a non-dict row from the vector store (#1705) 2026-06-03 13:35:09 +09:00
Afonso Coutinho
86d3af743a fix: docs RAG query crashes on a non-dict row from the index (#1706) 2026-06-03 13:35:01 +09:00
Afonso Coutinho
076607c9b9 fix: archive browser model filter is suffix-only and drops matching models (#1709) 2026-06-03 13:34:54 +09:00
Afonso Coutinho
56123e052b fix: compacting a chat with image attachments destroys the attachment (#1710) 2026-06-03 13:34:47 +09:00
Afonso Coutinho
f6f86c4b34 fix: research source extraction crashes on a non-dict finding (#1714) 2026-06-03 13:34:40 +09:00
Afonso Coutinho
29e19f326a fix: _resolve_user_upload_path crashes on a non-dict resolve_upload result (#1715) 2026-06-03 13:34:33 +09:00
Afonso Coutinho
55c7a4a546 fix: computeSnap throws when ctx.otherLayers is not an array (#1716) 2026-06-03 13:34:25 +09:00
Mubashir R
319ba50a44 fix: validate client-supplied image _endpoint to prevent SSRF (gallery proxies) (#1718)
POST /api/image/harmonize and POST /api/image/inpaint read an `_endpoint` from
the request body and issue server-side httpx POSTs to it with no validation. A
caller can set `_endpoint` to http://169.254.169.254/ (cloud instance metadata)
or any internal/loopback address the server can reach, turning these routes into
an SSRF primitive.

routes/embedding_routes.py already runs its user-supplied endpoint through
src.url_safety.check_outbound_url; these two routes were missing the same guard.
Validate `_endpoint` the same way before any outbound request: non-HTTP(S)
schemes and the link-local metadata range are always rejected, and
IMAGE_BLOCK_PRIVATE_IPS=true blocks private/loopback for full lockdown (the
local-first default still allows LAN diffusion servers).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:34:17 +09:00
Mubashir R
535d05c142 fix: SearchService.search() calls comprehensive_web_search incorrectly (broken public API) (#1720)
SearchService.search() did:

    raw_results = await comprehensive_web_search(
        query, max_results=10 * depth, fetch_content=fetch_content)

comprehensive_web_search is a synchronous function whose count knob is
`max_pages` (not `max_results`) and which has no `fetch_content` parameter, so
the call raised TypeError on argument binding; `await` on its non-coroutine
return would also fail. It returns a context string, or a (context, sources)
tuple with return_sources=True — not the list of dicts the wrapper iterates.

The method is exported in services/search/__init__.py and services/__init__.py
with a usage example in its docstring, so any caller of the documented public
API hit an immediate crash. Call it correctly via asyncio.to_thread with
max_pages + return_sources=True and use the returned source list as the rows.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:33:56 +09:00
lekt8
126e91e8b9 Don't attempt the same (url, model) route twice in the fallback chains (#1733)
The fallback helpers (llm_call_with_fallback, llm_call_async_with_fallback,
stream_llm_with_fallback) build their candidate list as the primary target
followed by the configured fallbacks. Callers prepend the session's live
(url, model) to default_model_fallbacks, so if the user also lists their current
model among the fallbacks — a common misconfiguration — the chain re-attempts
the very route that just failed: a wasted round-trip (and, for the streaming
path, a spurious 'fallback' notice for a switch that didn't actually happen).

Add a small _dedupe_candidates() helper that filters malformed entries and drops
a later repeat of an already-seen (url, model), preserving order (first wins,
keeping its headers). Apply it in all three fallback chains.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:33:50 +09:00
lekt8
77614e9feb Don't force-include the email toolset on every "tell me" query (#1707) (#1735)
The agent tool-RAG force-includes a keyword hint's tools whenever any of its
keywords appears in the query (word-boundary match). The email-intent hint listed
"tell", which matches a huge fraction of requests — e.g. "visit <url> and tell
me the title" — so the whole email toolset was force-included and crowded out the
relevant tools. The model then saw a prompt dominated by email tools and reported
it had no web search / could not visit the URL.

Remove "tell" from the email keyword set. Genuine email intent still fires on
email/mail/gmail/inbox/unread/message/send/reply.

Test drives get_tools_for_query directly with retrieval stubbed (the keyword
hints are deterministic, no embeddings needed): a "...tell me..." web query no
longer pulls in email tools, a real email request still does.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:33:43 +09:00
Mubashir R
a8a5d6f56e fix: RAG keyword fallback leaked owner-less documents across users (#1722)
VectorRAG.search() filters with ChromaDB where={"owner": owner}, returning only
documents whose owner equals the requesting user. The keyword fallback
(_keyword_search_fallback, used when the primary query raises) guarded with
`if doc_owner and doc_owner != owner: continue`, so a document with a
missing/empty owner fell through and was returned to whichever user issued the
query — a cross-user information leak on the fallback path.

Match the primary path's strict filter: skip any doc whose owner != the
requested owner, including owner-less docs.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:31:33 +09:00
Afonso Coutinho
ada30aa039 fix: evaluate_turn_regex crashes on a non-string agent_reply (#1723) 2026-06-03 13:31:26 +09:00
Afonso Coutinho
290d398900 fix: rewriting a message is lost on reload due to a non-existent DB column (#1729) 2026-06-03 13:31:19 +09:00
Afonso Coutinho
d9e6071528 fix: odysseus-mail read crashes on an empty IMAP fetch payload (#1730) 2026-06-03 13:31:10 +09:00
Afonso Coutinho
c5bc39de88 fix: _extract_entities crashes on a non-string query (#1724) 2026-06-03 13:30:28 +09:00
Afonso Coutinho
0c37943267 fix: search service crashes on a non-dict result row (#1725) 2026-06-03 13:30:19 +09:00
Mubashir R
fefac05ab1 fix: history DB fallback returned hidden (compaction) messages to the client (#1726)
GET /api/history/{session_id} skips messages whose metadata has `hidden` (e.g.
compaction summaries kept for AI context, not shown to the user) on the
in-memory path. The DB fallback — used when the in-memory history is empty,
e.g. after a restart — built the response from every stored row with no such
filter, so hidden messages leaked to the client on DB-served sessions.

Filter `hidden` out of the response on the DB path too. The rebuilt in-memory
session.history still includes them, so AI context (the compaction summaries)
is preserved.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:30:11 +09:00
Mubashir R
4907b16d9b fix: personal-docs path confinement used abspath, allowing symlink escape (#1728)
_resolve_allowed_personal_dir confined a user-supplied path to PERSONAL_DIR with
os.path.abspath + os.path.commonpath. abspath normalises `..` but does NOT
resolve symlinks, so a symlink placed inside PERSONAL_DIR pointing outside it
passes the commonpath check and lets index_personal_documents read files outside
the root. Use os.path.realpath for both the base and the candidate so symlinks
are resolved before the confinement check.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:29:57 +09:00
Ethan
0e538ecd29 Fix RAG remove_directory wiping the entire shared collection (#1660) (#1734)
Removing one RAG directory destroyed the whole shared ChromaDB collection
(all owners + base index) instead of just that directory's chunks. Shared
root cause: PersonalDocsManager.remove_directory called rebuild_index()
(delete_collection + recreate) then re-indexed only the remaining tracked
dirs (ownerless, never personal_dir). The targeted VectorRAG.remove_directory
that should have been used was itself broken (where={"source":{"$contains":dir}}
selects nothing on scalar metadata and would over-delete siblings), and the
dead do_manage_rag path fired a second unconditional rebuild.

- VectorRAG.remove_directory: select chunks in Python by a path-boundary match
  on the stored absolute `source` (dir or dir+os.sep), abspath-normalized.
  Keys on `source` (always written), never `owner` -- no migration.
- PersonalDocsManager.remove_directory: call the targeted remove instead of
  rebuild_index() + partial reindex.
- do_manage_rag (dead code): drop the second rebuild_index() (hygiene).
- rag_server.py add path: abspath so indexed `source` matches the remove.

No schema change. Prevents future wipes (does not recover already-wiped
vectors). Adds hermetic regression tests at three layers.

Fixes #1660

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:29:51 +09:00
Ethan
b9c382006e Clamp Anthropic temperature to [0.0, 1.0] in _build_anthropic_payload (#1737)
Anthropic's Messages API rejects temperature > 1.0 with HTTP 400, but
_build_anthropic_payload forwarded it verbatim. The shipped "Nietzsche" preset
uses temperature 1.2 and the UI slider allows up to 2.0, so every Claude request
under such a preset hard-broke. Clamp into [0.0, 1.0] in the Anthropic builder
only (OpenAI keeps its wider 0.0-2.0 range). Covers all three Anthropic call
paths, which build through this one function. None is passed through unchanged.

Fixes #1615

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:29:36 +09:00
Afonso Coutinho
96a874c604 fix: a non-dict finding silently drops all raw research findings (#1739) 2026-06-03 13:29:29 +09:00
Afonso Coutinho
7f94c43a45 fix: langIcon throws on an explicit null opts argument (#1740) 2026-06-03 13:29:21 +09:00
Afonso Coutinho
fc8efca49d fix: backup import drops a user's memory when its text matches another user's (#1743) 2026-06-03 13:29:14 +09:00
Afonso Coutinho
063e7114e3 fix: youtube transcript formatter crashes on a non-dict segment (#1745) 2026-06-03 13:29:08 +09:00
Afonso Coutinho
6e38d3f2ef fix: youtube (services) comment formatter crashes on a non-dict comment (#1746) 2026-06-03 13:29:01 +09:00
lekt8
9aa2445ec7 Reconnect after a failed SEARCH ALL so the email poller doesn't desync IMAP (#1613) (#1748)
On a large Gmail mailbox the email-summary poller's SINCE scan often finds
nothing (INTERNALDATE/date-header quirks), so it falls back to SEARCH ALL. That
returns one enormous UID line; the socket read can time out mid-response, and the
exception was swallowed — leaving the unread '* SEARCH 325188 …' bytes on the
socket. The next command (the downstream re-select) then read those leftover
bytes and failed with 'EXAMINE => unexpected response: b'325188 …''.

Extract the fallback into _latest_inbox_fallback_uids(conn, reconnect): on a
failed SEARCH ALL it logs out the poisoned connection and reconnects, returning
the fresh connection for downstream use. Reconnecting is correct by construction
— a new connection cannot carry the old one's leftover bytes — so the re-select
always runs on a clean socket.

The same SEARCH ALL + reuse pattern also exists in mcp_servers/email_server.py
and routes/email_routes.py; left for a separate change to keep this surgical.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:28:53 +09:00
Afonso Coutinho
133948cc78 fix: uploads with _ or - in the extension become permanently unreadable (#1756) 2026-06-03 13:28:45 +09:00
Afonso Coutinho
992866e167 fix: document library language facet undercounts text documents (#1758) 2026-06-03 13:28:38 +09:00
lekt8
a096e872f5 Let orphaned documents be reopened from the library (#1602) (#1761)
After an AI-written document is closed, its session_id is nulled (the detach
behaviour from #1238). Both Open controls in the Documents library — the card's
expanded Open button and the card dropdown's Open item — gated on
`doc.session_id`: they wired `libraryOpenInSession` (which early-returns with no
session) and DISABLED the control otherwise, so the user's own document showed a
grayed-out Open button and couldn't be reopened.

The module already has `libraryOpenDocument`, which explicitly handles the
orphaned case ("just open in editor without switching session" -> _loadDocument
by id). Route the no-session path there instead of disabling.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:28:31 +09:00
ghreprimand
6f001af2a3 Add a 'Rebuild llama.cpp' Cookbook action to force a fresh GPU build (#1787)
The serve bootstrap builds llama-server from source only when it is missing
from PATH, so a host that first compiled CPU-only (no nvcc present at build
time) reuses that CPU-only binary on every later serve and never gets a GPU
build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to
force a rebuild.

Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears
the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally
or on the selected remote server) so the next serve recompiles and picks up
CUDA/HIP if a toolchain is now present. It installs and downloads nothing.

- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses
  the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME

Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to
re-trigger the build).

Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 13:28:19 +09:00
Afonso Coutinho
51857c9008 fix: chat memory extraction crashes on a non-dict message (#1749) 2026-06-03 13:25:48 +09:00
Afonso Coutinho
a714915afe fix: _derive_title crashes on non-string content instead of returning Untitled (#1751) 2026-06-03 13:25:41 +09:00
clockworksquirrel
2625e97f11 Stop conversations crashing during compaction on tool-call turns (#1777)
context_compactor.maybe_compact built its summary text with
msg.get('content', '')[:2000], which raised
TypeError: 'NoneType' object is not subscriptable on assistant turns
whose content is None (turns that carried only native tool_calls).
Once a conversation crossed the 85% compaction threshold — reached
after only a few turns on small-context local models plus the large
agent prompt — every subsequent message failed ("send more than three
messages and it stops working").

Flatten message content to text first via a _content_as_text helper
(str passthrough, multimodal list blocks joined, None -> "") and
tolerate a missing role. Adds tests/test_context_compactor.py covering
the helper and a >=4-message conversation that forces compaction with
a None-content tool-call turn (fails before this change, passes after).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:25:33 +09:00
ooovenenoso
12696a05ae fix(markdown): keep allowed-html placeholders out of fenced code (#1788) 2026-06-03 13:25:26 +09:00
Afonso Coutinho
2fa4d50115 fix: is_youtube_url crashes on a non-string url (#1752) 2026-06-03 13:24:33 +09:00
Afonso Coutinho
d2f6e8068d fix: is_youtube_url (services) crashes on a non-string url (#1753) 2026-06-03 13:24:24 +09:00
Ethan
33bf975597 Stop GET /api/search/config from leaking the Brave API key (#1661) (#1750)
get_search_config returned SEARCH_CONFIG.copy(), and update_search_config
cached the decrypted Brave key into that shared global at startup
(app_initializer), so the unauthenticated /api/search/config route exposed
the operator's key. The cache was dead weight: brave_search reads its key
via _get_provider_key (settings/env), never SEARCH_CONFIG.

- update_search_config: no longer stores the api_key in the shared global
  (accepted for backward compat; provider keys are read on demand).
- get_search_config: scrub any string-valued credential field before
  returning, preserving the has_api_key presence flag.

No schema change; brave_search/_get_provider_key untouched. Adds regression
tests.

Fixes #1661

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:24:17 +09:00
Afonso Coutinho
b3da01efd5 fix: ui_control rejects the advertised rag toggle (#1763) 2026-06-03 13:24:00 +09:00
Afonso Coutinho
c4fcebd9c7 fix: disabling auth wipes all users' preferences on next pref save (#1764) 2026-06-03 13:23:50 +09:00
lekt8
bf2a1365f6 Don't falsely declare a dependency build stale (#1568) (#1768)
Installing a heavy dependency like vllm crashes in a "stale — restarting" loop:
it restarts mid-install, reuses the cached wheels, then stalls again.

The download/install watchdog (cookbookRunning.js) keyed its stall signal purely
off the downloaded-byte counter ("1.81G/2.49G"). A dependency install spends long
stretches with NO byte counter — pip dependency resolution and the native CUDA
build/compile — so the signal froze and after STALE_PROGRESS_MS the watchdog
declared it stale and auto-restarted it mid-build, looping forever.

Extract the signal into a pure computeProgressSignal (cookbookProgressSignal.js):
keep the byte counter for the download phase (so a genuinely stuck download is
still caught, and an animating-but-frozen ETA frame is NOT mistaken for progress),
and when there's no byte counter fall back to a fingerprint of the output tail so
resolver/compile lines count as progress. Only a truly frozen tail now reads as
stalled.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:23:35 +09:00
Afonso Coutinho
bde5f6adb3 fix: gallery tag filters and tag-cleanup are empty in single-user mode (#1771) 2026-06-03 13:23:08 +09:00