Allow Gmail quote attribution parsing to handle standard US weekday/month/day/year comma patterns while preserving existing formats, with JS regression coverage.
Make services.search.analytics tolerate missing counters in older or partial analytics files by merging loaded data over defaults, with regression coverage.
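A minimal sketch of the merge-over-defaults idea, assuming the analytics file is a flat JSON object. The counter names and the `load_analytics` helper are hypothetical, not the real ones in services.search.analytics.

```python
import copy
import json

# Hypothetical default counters; the real keys may differ.
DEFAULT_ANALYTICS = {"total_searches": 0, "cache_hits": 0, "errors": 0}


def load_analytics(raw_json: str) -> dict:
    """Start from a fresh copy of the defaults and overlay whatever was loaded."""
    data = copy.deepcopy(DEFAULT_ANALYTICS)
    loaded = json.loads(raw_json)
    # Ignore payloads that are not a JSON object (e.g. a truncated or rewritten file).
    if isinstance(loaded, dict):
        data.update(loaded)
    return data
```

With this shape, an older file that only has `total_searches` still yields every counter, so later `data["cache_hits"] += 1` style updates do not raise KeyError.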
Normalize scheduled email send_at values with timezone offsets or Z suffixes to naive UTC before storing, matching the poller's lexicographic comparison format and preventing early/late sends.
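A sketch of the normalization, assuming send_at arrives as an ISO 8601 string and is stored with a "T" separator at second precision. `normalize_send_at` is a hypothetical name and the stored format is an assumption.

```python
from datetime import datetime, timezone


def normalize_send_at(value: str) -> str:
    """Return a naive-UTC ISO string so plain string comparison orders correctly."""
    text = value.strip()
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is not None:
        # Convert to UTC, then drop the offset so the stored value is naive.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.isoformat(timespec="seconds")
```

Values that are already naive pass through unchanged, on the assumption that they are already in UTC.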
Fixes #375
Add setup troubleshooting notes for chromadb-client conflicts, LAN/Tailscale HTTPS exposure, and optional dependencies, and clean up chromadb-client in the macOS starter when present.
Fix services.memory bullet-list extraction by grouping the bullet/number regex before the capture, and cover both memory manager copies in the regression test.
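The precedence problem can be illustrated as follows. Both patterns are hypothetical stand-ins, not the actual regex in services.memory: without a group, `|` splits the whole pattern, so the capture only belongs to the numbered branch.

```python
import re

# Ungrouped: reads as "bullet marker" OR "number, dot, spaces, captured text".
BROKEN = re.compile(r"^\s*[-*]|\d+\.\s+(.+)$")
# Grouped: the marker alternation is isolated, so the capture applies to both forms.
FIXED = re.compile(r"^\s*(?:[-*]|\d+\.)\s+(.+)$")


def extract_items(text: str) -> list:
    """Collect the text of every bullet or numbered line."""
    items = []
    for line in text.splitlines():
        match = FIXED.match(line)
        if match:
            items.append(match.group(1).strip())
    return items
```

With the ungrouped pattern, a line such as "- likes tea" matches only the marker and leaves the capture group empty.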
The sidebar delete handler fired the DELETE API call without awaiting
it, then called loadSessions() which re-fetches the session list from
the server. If the server hadn't processed the deletion yet, the
session reappeared in the sidebar immediately after being removed.
Await the DELETE response before reloading so the server-side deletion
completes first.
Fixes #1358
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When submitting a message without a model/session configured, the
error path showed a help message but never cleared the textarea,
leaving the user's text stuck in the input field. Clear the input
and trigger autoResize on both the no-default-model and catch paths.
Fixes #1475
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gemma 4 and Phi-4 multimodal are natively vision-capable but their Ollama
tags ("gemma4:12b", "phi-4", "phi4") did not match any keyword in
_VISION_MODEL_KEYWORDS. The image was silently routed to the VL fallback
path instead of being passed directly to the model — users saw the model
respond to a placeholder like "[VL model unavailable - image not analyzed]"
rather than the actual image.
Adds "gemma-4"/"gemma4" and "phi-4"/"phi4" to the keyword list, following
the existing err-toward-True policy (#124): a text-only variant being
treated as vision is the safer failure than dropping a real image.
Fixes #1274 (partial: covers the Gemma 4 + Phi-4 case; the OpenRouter/free
vision fallback path is a separate issue).
POST /api/image/harmonize and POST /api/image/inpaint read an `_endpoint` from
the request body and issue server-side httpx POSTs to it with no validation. A
caller can set `_endpoint` to http://169.254.169.254/ (cloud instance metadata)
or any internal/loopback address the server can reach, turning these routes into
an SSRF primitive.
routes/embedding_routes.py already runs its user-supplied endpoint through
src.url_safety.check_outbound_url; these two routes were missing the same guard.
Validate `_endpoint` the same way before any outbound request: non-HTTP(S)
schemes and the link-local metadata range are always rejected, and
IMAGE_BLOCK_PRIVATE_IPS=true blocks private/loopback for full lockdown (the
local-first default still allows LAN diffusion servers).
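The policy described above can be sketched as below. This is not the implementation of src.url_safety.check_outbound_url; `check_endpoint` and its `block_private` parameter are hypothetical, with the flag standing in for IMAGE_BLOCK_PRIVATE_IPS.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def check_endpoint(url: str, block_private: bool = False) -> None:
    """Raise ValueError if the endpoint must not be contacted."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("only http(s) endpoints are allowed")
    host = parts.hostname
    if not host:
        raise ValueError("endpoint has no host")
    try:
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        # Not an IP literal: resolve the name and check every address it maps to.
        infos = socket.getaddrinfo(host, None)
        addresses = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    for address in addresses:
        if address.is_link_local:
            # Covers 169.254.0.0/16, which includes the cloud metadata address.
            raise ValueError("link-local addresses are always rejected")
        if block_private and (address.is_private or address.is_loopback):
            raise ValueError("private/loopback addresses are blocked")
```

By default a LAN address such as 192.168.1.20 passes, while 169.254.169.254 is rejected regardless of the flag.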
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
SearchService.search() did:
    raw_results = await comprehensive_web_search(
        query, max_results=10 * depth, fetch_content=fetch_content)
comprehensive_web_search is a synchronous function whose count knob is
`max_pages` (not `max_results`) and which has no `fetch_content` parameter, so
the call raised TypeError on argument binding; `await` on its non-coroutine
return would also fail. It returns a context string, or a (context, sources)
tuple with return_sources=True — not the list of dicts the wrapper iterates.
The method is exported in services/search/__init__.py and services/__init__.py
with a usage example in its docstring, so any caller of the documented public
API hit an immediate crash. Call it correctly via asyncio.to_thread with
max_pages + return_sources=True and use the returned source list as the rows.
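A sketch of the corrected call shape. The `comprehensive_web_search` below is an invented stand-in that only mirrors the signature described above (synchronous, `max_pages`, `return_sources`), and the source dict fields are assumptions.

```python
import asyncio


def comprehensive_web_search(query, max_pages=5, return_sources=False):
    """Stand-in for the real synchronous function; the body is made up."""
    sources = [
        {"title": f"Result {i} for {query}", "url": f"https://example.com/{i}"}
        for i in range(max_pages)
    ]
    context = "\n".join(source["title"] for source in sources)
    return (context, sources) if return_sources else context


async def search(query: str, depth: int = 1) -> list:
    # Run the blocking function in a worker thread instead of awaiting it directly.
    _context, sources = await asyncio.to_thread(
        comprehensive_web_search, query, max_pages=10 * depth, return_sources=True
    )
    return sources
```

asyncio.to_thread requires Python 3.9 or later; it keeps the event loop free while the synchronous search runs.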
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
The fallback helpers (llm_call_with_fallback, llm_call_async_with_fallback,
stream_llm_with_fallback) build their candidate list as the primary target
followed by the configured fallbacks. Callers prepend the session's live
(url, model) to default_model_fallbacks, so if the user also lists their current
model among the fallbacks — a common misconfiguration — the chain re-attempts
the very route that just failed: a wasted round-trip (and, for the streaming
path, a spurious 'fallback' notice for a switch that didn't actually happen).
Add a small _dedupe_candidates() helper that filters malformed entries and drops
a later repeat of an already-seen (url, model), preserving order (first wins,
keeping its headers). Apply it in all three fallback chains.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent tool-RAG force-includes a keyword hint's tools whenever any of its
keywords appears in the query (word-boundary match). The email-intent hint listed
"tell", which matches a huge fraction of requests — e.g. "visit <url> and tell
me the title" — so the whole email toolset was force-included and crowded out the
relevant tools. The model then saw a prompt dominated by email tools and reported
it had no web search / could not visit the URL.
Remove "tell" from the email keyword set. Genuine email intent still fires on
email/mail/gmail/inbox/unread/message/send/reply.
Test drives get_tools_for_query directly with retrieval stubbed (the keyword
hints are deterministic, no embeddings needed): a "...tell me..." web query no
longer pulls in email tools, a real email request still does.
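The word-boundary match can be sketched like this. `hint_matches` is a hypothetical helper; the keyword tuple is the post-change email set listed above.

```python
import re

EMAIL_KEYWORDS = ("email", "mail", "gmail", "inbox", "unread", "message", "send", "reply")


def hint_matches(query: str, keywords) -> bool:
    """True if any keyword appears in the query as a whole word."""
    return any(
        re.search(rf"\b{re.escape(keyword)}\b", query, re.IGNORECASE)
        for keyword in keywords
    )
```

Adding "tell" back to the tuple makes the web query from the example match again, which is the over-triggering this change removes.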
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
VectorRAG.search() filters with ChromaDB where={"owner": owner}, returning only
documents whose owner equals the requesting user. The keyword fallback
(_keyword_search_fallback, used when the primary query raises) guarded with
`if doc_owner and doc_owner != owner: continue`, so a document with a
missing/empty owner fell through and was returned to whichever user issued the
query — a cross-user information leak on the fallback path.
Match the primary path's strict filter: skip any doc whose owner != the
requested owner, including owner-less docs.
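The difference between the two guards, sketched on plain dicts. The document shape (a `metadata` dict with an optional `owner`) and the function name are assumptions for illustration.

```python
def keyword_fallback_filter(docs, owner):
    """Keep only documents whose owner equals the requesting user."""
    kept = []
    for doc in docs:
        doc_owner = (doc.get("metadata") or {}).get("owner")
        # Old guard: `if doc_owner and doc_owner != owner: continue`
        # skipped the check entirely when doc_owner was missing or empty.
        if doc_owner != owner:
            continue
        kept.append(doc)
    return kept
```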
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
GET /api/history/{session_id} skips messages whose metadata has `hidden` (e.g.
compaction summaries kept for AI context, not shown to the user) on the
in-memory path. The DB fallback — used when the in-memory history is empty,
e.g. after a restart — built the response from every stored row with no such
filter, so hidden messages leaked to the client on DB-served sessions.
Filter `hidden` out of the response on the DB path too. The rebuilt in-memory
session.history still includes them, so AI context (the compaction summaries)
is preserved.
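A sketch of the split between what is returned and what is kept, assuming each stored row is a dict with an optional `metadata` dict. The row shape and `visible_messages` are hypothetical.

```python
def visible_messages(rows):
    """Return the rows that may be shown to the client."""
    return [row for row in rows if not (row.get("metadata") or {}).get("hidden")]
```

The full row list would still be used to rebuild session.history; only the response body goes through the filter.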
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
lstrip("\n[PDF content]:") treats the argument as a character set,
not a prefix, so it chews into the following [Page N text]: marker —
e.g. turning [Page 1 text]: into "age 1 text]:". The correct helper
strip_pdf_content_marker (which uses removeprefix) already exists in
the same file and is used by other call sites.
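The difference is easy to reproduce. `strip_marker_sketch` below is a hypothetical stand-in for the existing helper, assumed only to call removeprefix (Python 3.9+).

```python
MARKER = "\n[PDF content]:"
text = MARKER + "\n[Page 1 text]: hello"

# lstrip removes any leading characters found in the set {'\n', '[', 'P', 'D', ...},
# so it keeps going past the marker and eats "\n[P" of the next one.
broken = text.lstrip(MARKER)


def strip_marker_sketch(value: str) -> str:
    """Remove the marker only when it is an exact prefix."""
    return value.removeprefix(MARKER)


fixed = strip_marker_sketch(text)
print(repr(broken))  # 'age 1 text]: hello'
print(repr(fixed))   # '\n[Page 1 text]: hello'
```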
Fixes #1663
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_resolve_allowed_personal_dir confined a user-supplied path to PERSONAL_DIR with
os.path.abspath + os.path.commonpath. abspath normalises `..` but does NOT
resolve symlinks, so a symlink placed inside PERSONAL_DIR pointing outside it
passes the commonpath check and lets index_personal_documents read files outside
the root. Use os.path.realpath for both the base and the candidate so symlinks
are resolved before the confinement check.
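A sketch of the confinement check with symlinks resolved. `resolve_inside` is a hypothetical name, not the signature of _resolve_allowed_personal_dir.

```python
import os


def resolve_inside(base_dir: str, candidate: str) -> str:
    """Return the real path of candidate, or raise if it escapes base_dir."""
    # realpath resolves symlinks as well as `..`; abspath only normalises the string.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, candidate))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the allowed directory")
    return target
```

Resolving the base as well matters when PERSONAL_DIR itself sits behind a symlink; otherwise every legitimate path would fail the comparison.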
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Removing one RAG directory destroyed the whole shared ChromaDB collection
(all owners + base index) instead of just that directory's chunks. Shared
root cause: PersonalDocsManager.remove_directory called rebuild_index()
(delete_collection + recreate) then re-indexed only the remaining tracked
dirs (ownerless, never personal_dir). The targeted VectorRAG.remove_directory
that should have been used was itself broken (where={"source":{"$contains":dir}}
selects nothing on scalar metadata and would over-delete siblings), and the
dead do_manage_rag path fired a second unconditional rebuild.
- VectorRAG.remove_directory: select chunks in Python by a path-boundary match
on the stored absolute `source` (dir or dir+os.sep), abspath-normalized.
Keys on `source` (always written), never `owner` -- no migration.
- PersonalDocsManager.remove_directory: call the targeted remove instead of
rebuild_index() + partial reindex.
- do_manage_rag (dead code): drop the second rebuild_index() (hygiene).
- rag_server.py add path: abspath so indexed `source` matches the remove.
No schema change. Prevents future wipes (does not recover already-wiped
vectors). Adds hermetic regression tests at three layers.
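The path-boundary match can be sketched as below. `is_under_directory` is a hypothetical helper; the real selection runs over the chunks' stored `source` metadata.

```python
import os


def is_under_directory(source: str, directory: str) -> bool:
    """True if source is the directory itself or lies beneath it."""
    src = os.path.abspath(source)
    root = os.path.abspath(directory)
    # Appending os.sep stops "/data/docs" from matching "/data/docs-old/...".
    return src == root or src.startswith(root.rstrip(os.sep) + os.sep)
```

A plain substring or prefix test without the separator would also select sibling directories that share the prefix, which is the over-delete noted above.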
Fixes #1660
Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
Three endpoints in history_routes.py ordered by
DbChatMessage.created_at, but the ChatMessage model has no
created_at column — only timestamp. This caused AttributeError
(HTTP 500) on mark-stopped, update-last-meta, and
merge-last-assistant. Other queries in the same file already use
the correct column.
Fixes #1659
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Anthropic's Messages API rejects temperature > 1.0 with HTTP 400, but
_build_anthropic_payload forwarded it verbatim. The shipped "Nietzsche" preset
uses temperature 1.2 and the UI slider allows up to 2.0, so every Claude request
under such a preset hard-broke. Clamp into [0.0, 1.0] in the Anthropic builder
only (OpenAI keeps its wider 0.0-2.0 range). Covers all three Anthropic call
paths, which build through this one function. A temperature of None is passed through unchanged.
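The clamp itself is small. `clamp_anthropic_temperature` is a hypothetical name; in the change it lives inside _build_anthropic_payload.

```python
def clamp_anthropic_temperature(temperature):
    """Clamp into [0.0, 1.0]; leave None alone so the API default applies."""
    if temperature is None:
        return None
    return max(0.0, min(1.0, float(temperature)))
```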
Fixes #1615
Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>