POST /api/image/harmonize and POST /api/image/inpaint read an `_endpoint` from
the request body and issue server-side httpx POSTs to it with no validation. A
caller can set `_endpoint` to http://169.254.169.254/ (cloud instance metadata)
or any internal/loopback address the server can reach, turning these routes into
an SSRF primitive.
routes/embedding_routes.py already runs its user-supplied endpoint through
src.url_safety.check_outbound_url; these two routes were missing the same guard.
Validate `_endpoint` the same way before any outbound request: non-HTTP(S)
schemes and the link-local metadata range are always rejected, and
IMAGE_BLOCK_PRIVATE_IPS=true blocks private/loopback for full lockdown (the
local-first default still allows LAN diffusion servers).
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
SearchService.search() did:
raw_results = await comprehensive_web_search(
query, max_results=10 * depth, fetch_content=fetch_content)
comprehensive_web_search is a synchronous function whose count knob is
`max_pages` (not `max_results`) and which has no `fetch_content` parameter, so
the call raised TypeError on argument binding; `await` on its non-coroutine
return would also fail. It returns a context string, or a (context, sources)
tuple with return_sources=True — not the list of dicts the wrapper iterates.
The method is exported in services/search/__init__.py and services/__init__.py
with a usage example in its docstring, so any caller of the documented public
API hit an immediate crash. Call it correctly via asyncio.to_thread with
max_pages + return_sources=True and use the returned source list as the rows.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
The fallback helpers (llm_call_with_fallback, llm_call_async_with_fallback,
stream_llm_with_fallback) build their candidate list as the primary target
followed by the configured fallbacks. Callers prepend the session's live
(url, model) to default_model_fallbacks, so if the user also lists their current
model among the fallbacks — a common misconfiguration — the chain re-attempts
the very route that just failed: a wasted round-trip (and, for the streaming
path, a spurious 'fallback' notice for a switch that didn't actually happen).
Add a small _dedupe_candidates() helper that filters malformed entries and drops
a later repeat of an already-seen (url, model), preserving order (first wins,
keeping its headers). Apply it in all three fallback chains.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent tool-RAG force-includes a keyword hint's tools whenever any of its
keywords appears in the query (word-boundary match). The email-intent hint listed
"tell", which matches a huge fraction of requests — e.g. "visit <url> and tell
me the title" — so the whole email toolset was force-included and crowded out the
relevant tools. The model then saw a prompt dominated by email tools and reported
it had no web search / could not visit the URL.
Remove "tell" from the email keyword set. Genuine email intent still fires on
email/mail/gmail/inbox/unread/message/send/reply.
Test drives get_tools_for_query directly with retrieval stubbed (the keyword
hints are deterministic, no embeddings needed): a "...tell me..." web query no
longer pulls in email tools, a real email request still does.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
VectorRAG.search() filters with ChromaDB where={"owner": owner}, returning only
documents whose owner equals the requesting user. The keyword fallback
(_keyword_search_fallback, used when the primary query raises) guarded with
`if doc_owner and doc_owner != owner: continue`, so a document with a
missing/empty owner fell through and was returned to whichever user issued the
query — a cross-user information leak on the fallback path.
Match the primary path's strict filter: skip any doc whose owner != the
requested owner, including owner-less docs.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
GET /api/history/{session_id} skips messages whose metadata has `hidden` (e.g.
compaction summaries kept for AI context, not shown to the user) on the
in-memory path. The DB fallback — used when the in-memory history is empty,
e.g. after a restart — built the response from every stored row with no such
filter, so hidden messages leaked to the client on DB-served sessions.
Filter `hidden` out of the response on the DB path too. The rebuilt in-memory
session.history still includes them, so AI context (the compaction summaries)
is preserved.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
_resolve_allowed_personal_dir confined a user-supplied path to PERSONAL_DIR with
os.path.abspath + os.path.commonpath. abspath normalises `..` but does NOT
resolve symlinks, so a symlink placed inside PERSONAL_DIR pointing outside it
passes the commonpath check and lets index_personal_documents read files outside
the root. Use os.path.realpath for both the base and the candidate so symlinks
are resolved before the confinement check.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Removing one RAG directory destroyed the whole shared ChromaDB collection
(all owners + base index) instead of just that directory's chunks. Shared
root cause: PersonalDocsManager.remove_directory called rebuild_index()
(delete_collection + recreate) then re-indexed only the remaining tracked
dirs (ownerless, never personal_dir). The targeted VectorRAG.remove_directory
that should have been used was itself broken (where={"source":{"$contains":dir}}
selects nothing on scalar metadata and would over-delete siblings), and the
dead do_manage_rag path fired a second unconditional rebuild.
- VectorRAG.remove_directory: select chunks in Python by a path-boundary match
on the stored absolute `source` (dir or dir+os.sep), abspath-normalized.
Keys on `source` (always written), never `owner` -- no migration.
- PersonalDocsManager.remove_directory: call the targeted remove instead of
rebuild_index() + partial reindex.
- do_manage_rag (dead code): drop the second rebuild_index() (hygiene).
- rag_server.py add path: abspath so indexed `source` matches the remove.
No schema change. Prevents future wipes (does not recover already-wiped
vectors). Adds hermetic regression tests at three layers.
Fixes#1660
Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
Anthropic's Messages API rejects temperature > 1.0 with HTTP 400, but
_build_anthropic_payload forwarded it verbatim. The shipped "Nietzsche" preset
uses temperature 1.2 and the UI slider allows up to 2.0, so every Claude request
under such a preset hard-broke. Clamp into [0.0, 1.0] in the Anthropic builder
only (OpenAI keeps its wider 0.0-2.0 range). Covers all three Anthropic call
paths, which build through this one function. None is passed through unchanged.
Fixes#1615
Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
On a large Gmail mailbox the email-summary poller's SINCE scan often finds
nothing (INTERNALDATE/date-header quirks), so it falls back to SEARCH ALL. That
returns one enormous UID line; the socket read can time out mid-response, and the
exception was swallowed — leaving the unread '* SEARCH 325188 …' bytes on the
socket. The next command (the downstream re-select) then read those leftover
bytes and failed with 'EXAMINE => unexpected response: b'325188 …''.
Extract the fallback into _latest_inbox_fallback_uids(conn, reconnect): on a
failed SEARCH ALL it logs out the poisoned connection and reconnects, returning
the fresh connection for downstream use. Reconnecting is correct by construction
— a new connection cannot carry the old one's leftover bytes — so the re-select
always runs on a clean socket.
The same SEARCH ALL + reuse pattern also exists in mcp_servers/email_server.py
and routes/email_routes.py; left for a separate change to keep this surgical.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After an AI-written document is closed, its session_id is nulled (the detach
behaviour from #1238). Both Open controls in the Documents library — the card's
expanded Open button and the card dropdown's Open item — gated on
`doc.session_id`: they wired `libraryOpenInSession` (which early-returns with no
session) and DISABLED the control otherwise, so the user's own document showed a
grayed-out Open button and couldn't be reopened.
The module already has `libraryOpenDocument`, which explicitly handles the
orphaned case ("just open in editor without switching session" -> _loadDocument
by id). Route the no-session path there instead of disabling.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The serve bootstrap builds llama-server from source only when it is missing
from PATH, so a host that first compiled CPU-only (no nvcc present at build
time) reuses that CPU-only binary on every later serve and never gets a GPU
build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to
force a rebuild.
Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears
the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally
or on the selected remote server) so the next serve recompiles and picks up
CUDA/HIP if a toolchain is now present. It installs and downloads nothing.
- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses
the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME
Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to
re-trigger the build).
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
context_compactor.maybe_compact built its summary text with
msg.get('content', '')[:2000], which raised
TypeError: 'NoneType' object is not subscriptable on assistant turns
whose content is None (turns that carried only native tool_calls).
Once a conversation crossed the 85% compaction threshold — reached
after only a few turns on small-context local models plus the large
agent prompt — every subsequent message failed ("send more than three
messages and it stops working").
Flatten message content to text first via a _content_as_text helper
(str passthrough, multimodal list blocks joined, None -> "") and
tolerate a missing role. Adds tests/test_context_compactor.py covering
the helper and a >=4-message conversation that forces compaction with
a None-content tool-call turn (fails before this change, passes after).
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
get_search_config returned SEARCH_CONFIG.copy(), and update_search_config
cached the decrypted Brave key into that shared global at startup
(app_initializer), so the unauthenticated /api/search/config route exposed
the operator's key. The cache was dead weight: brave_search reads its key
via _get_provider_key (settings/env), never SEARCH_CONFIG.
- update_search_config: no longer stores the api_key in the shared global
(accepted for backward compat; provider keys are read on demand).
- get_search_config: scrub any string-valued credential field before
returning, preserving the has_api_key presence flag.
No schema change; brave_search/_get_provider_key untouched. Adds regression
tests.
Fixes#1661
Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
Installing a heavy dependency like vllm crashes in a "stale — restarting" loop:
it restarts mid-install, reuses the cached wheels, then stalls again.
The download/install watchdog (cookbookRunning.js) keyed its stall signal purely
off the downloaded-byte counter ("1.81G/2.49G"). A dependency install spends long
stretches with NO byte counter — pip dependency resolution and the native CUDA
build/compile — so the signal froze and after STALE_PROGRESS_MS the watchdog
declared it stale and auto-restarted it mid-build, looping forever.
Extract the signal into a pure computeProgressSignal (cookbookProgressSignal.js):
keep the byte counter for the download phase (so a genuinely stuck download is
still caught, and an animating-but-frozen ETA frame is NOT mistaken for progress),
and when there's no byte counter fall back to a fingerprint of the output tail so
resolver/compile lines count as progress. Only a truly frozen tail now reads as
stalled.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>