Commit Graph

606 Commits

Author SHA1 Message Date
red person
8af1f85665 Ignore non-string email thread bodies (#1654) 2026-06-03 14:06:31 +09:00
Afonso Coutinho
a54d34149a Parse standard Gmail quote attribution dates
Allow Gmail quote attribution parsing to handle standard US weekday/month/day/year comma patterns while preserving existing formats, with JS regression coverage.
2026-06-03 13:45:56 +09:00
Afonso Coutinho
46999debdb Decode email headers without injected spaces
Use email.header.make_header for MIME header decoding so adjacent encoded/plain header parts preserve RFC spacing, with regression coverage.
2026-06-03 13:45:33 +09:00
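
A minimal sketch of the decoding approach named in the commit above, using the standard library's `email.header.decode_header` and `make_header`. The wrapper name `decode_mime_header` is hypothetical and is not the project's function.

```python
from email.header import decode_header, make_header


def decode_mime_header(raw: str) -> str:
    """Decode an RFC 2047 header value into a plain string.

    make_header() reassembles the decoded parts itself, so the caller
    does not join the fragments with manually inserted spaces.
    """
    return str(make_header(decode_header(raw)))
```

For example, `decode_mime_header("=?utf-8?q?caf=C3=A9?=")` yields `"café"`, and an unencoded value is returned unchanged.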
Afonso Coutinho
f29c827e6e Merge search analytics defaults in services copy
Make services.search.analytics tolerate missing counters in older or partial analytics files by merging loaded data over defaults, with regression coverage.
2026-06-03 13:45:07 +09:00
Afonso Coutinho
10e797a1aa Normalize scheduled email offsets before storage
Normalize scheduled email send_at values with timezone offsets or Z suffixes to naive UTC before storing, matching the poller's lexicographic comparison format and preventing early/late sends.
2026-06-03 13:44:18 +09:00
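
The normalization described above could look roughly like the sketch below. The function name `normalize_send_at`, the stored `YYYY-MM-DDTHH:MM:SS` format, and the choice to treat offset-less input as already being UTC are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def normalize_send_at(value: str) -> str:
    """Convert an ISO 8601 timestamp to a naive UTC string for storage."""
    text = value.strip()
    # Rewrite a trailing Z as an explicit offset so older Python
    # versions of fromisoformat() can parse it.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is not None:
        # Shift to UTC, then drop the offset so every stored value has the same shape.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.isoformat(timespec="seconds")
```

Because every stored value then has the same fixed-width UTC form, comparing the strings lexicographically gives the same order as comparing the instants they represent.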
Sid
2ef496f622 Document setup troubleshooting and ChromaDB conflict
Fixes #375

Add setup troubleshooting notes for chromadb-client conflicts, LAN/Tailscale HTTPS exposure, optional dependencies, and clean up chromadb-client in the macOS starter when present.
2026-06-03 13:43:47 +09:00
Wes Huber
7d76fca21c Replace deprecated FastAPI on_event hooks with lifespan
Fixes #1448

Move startup and shutdown logic behind a FastAPI lifespan context while preserving the existing lifecycle bodies.
2026-06-03 13:43:14 +09:00
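
A standard-library sketch of the lifespan pattern referred to above. FastAPI accepts an async context manager through `FastAPI(lifespan=...)`; to keep this example runnable without FastAPI installed, the context manager is driven by hand with a placeholder `app`. The `events` list and `run_demo` helper are hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def lifespan(app):
    # Startup: the body that previously lived in an on_event("startup") hook.
    events.append("startup")
    try:
        yield
    finally:
        # Shutdown: the body that previously lived in an on_event("shutdown") hook.
        events.append("shutdown")


def run_demo() -> list[str]:
    """Enter and leave the lifespan once, the way a server would around serving."""
    events.clear()

    async def _serve() -> None:
        async with lifespan(app=None):  # placeholder instead of a real FastAPI app
            events.append("serving")

    asyncio.run(_serve())
    return list(events)
```

With FastAPI available, the same function would be passed as `FastAPI(lifespan=lifespan)` instead of being entered manually.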
Afonso Coutinho
28dbd5346c Treat non-string research summaries as low quality
Filter malformed non-string research summaries instead of letting the broad exception path classify them as usable, with regression coverage.
2026-06-03 13:42:24 +09:00
Afonso Coutinho
a880b17624 Skip malformed personal keyword index rows
Make personal keyword retrieval tolerate corrupted non-dict index entries and missing chunk lists, with regression coverage.
2026-06-03 13:42:05 +09:00
Mubashir R
61d62a3cb8 Fix memory bullet extraction in service copy
Fix services.memory bullet-list extraction by grouping the bullet/number regex before the capture, and cover both memory manager copies in the regression test.
2026-06-03 13:41:46 +09:00
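
The class of regex bug described above can be illustrated as follows. Both patterns are hypothetical reconstructions for illustration and are not copied from the project; they only show why the alternation has to be grouped before the capture.

```python
import re

# Ungrouped: the alternation splits the whole pattern, so the capture group
# belongs only to the numbered branch and "- item" captures nothing.
UNGROUPED = re.compile(r"^[-*]|\d+\.\s+(.+)$")

# Grouped: bullet or number first, then whitespace, then the captured text.
GROUPED = re.compile(r"^(?:[-*]|\d+\.)\s+(.+)$")


def extract_bullets(text: str, pattern: re.Pattern) -> list[str]:
    """Collect the captured item text from each bullet or numbered line."""
    items = []
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if match and match.group(1):
            items.append(match.group(1).strip())
    return items
```

With the input `"- buy milk\n1. call Bob\nplain line"`, the ungrouped pattern returns only the numbered item, while the grouped pattern returns both list items.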
Marius Popa
4ec53a296a Fix document editor scrollbar and line-number sync
Fixes #1501
Fixes #1496
2026-06-03 13:40:19 +09:00
Afonso Coutinho
13f0171ce8 fix: extract_youtube_id crashes on a non-string url instead of returning None (#1689) 2026-06-03 13:38:11 +09:00
Afonso Coutinho
35b9509da3 fix: memory entry validation crashes on a non-dict row from memory.json (#1691) 2026-06-03 13:38:02 +09:00
Afonso Coutinho
f0b172020e fix: require_privilege 500s on a non-dict privileges blob from auth.json (#1693) 2026-06-03 13:37:54 +09:00
Rolly Calma
933c461f38 fix: use running loop for shell stream deadlines (#1694) 2026-06-03 13:37:46 +09:00
Afonso Coutinho
02ff2e3cb0 fix: updating a calendar event ignores user timezone and shifts the time (#1695) 2026-06-03 13:37:39 +09:00
Wes Huber
a72ccf6484 fix(sessions): await DELETE before reloading sidebar session list (#1699)
The sidebar delete handler fired the DELETE API call without awaiting
it, then called loadSessions() which re-fetches the session list from
the server. If the server hadn't processed the deletion yet, the
session reappeared in the sidebar immediately after being removed.

Await the DELETE response before reloading so the server-side deletion
completes first.

Fixes #1358

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:37:29 +09:00
Afonso Coutinho
667b739af4 fix: reply-all Cc builder crashes on a non-string To or Cc field (#1700) 2026-06-03 13:37:22 +09:00
Afonso Coutinho
19e62208d2 fix: streaming drops providers that emit SSE data lines with no space (#1701) 2026-06-03 13:37:14 +09:00
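
A small sketch of tolerant SSE line handling, assuming the fix amounts to accepting `data:` with or without the optional space after the colon. The function name `sse_data_payload` is hypothetical.

```python
from typing import Optional


def sse_data_payload(line: str) -> Optional[str]:
    """Return the payload of an SSE data line, or None for any other line."""
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):]
    # The SSE format makes a single space after the colon optional,
    # so strip at most one.
    if payload.startswith(" "):
        payload = payload[1:]
    return payload
```

Both `data: {"a":1}` and `data:{"a":1}` then produce the same payload, while comment lines such as `: keep-alive` are ignored.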
Wes Huber
2e34bde07a fix(chat): clear input field when no model is selected (#1702)
When submitting a message without a model/session configured, the
error path showed a help message but never cleared the textarea,
leaving the user's text stuck in the input field. Clear the input
and trigger autoResize on both the no-default-model and catch paths.

Fixes #1475

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:37:06 +09:00
Afonso Coutinho
3da4edb442 fix: token usage dropped when it rides on a non-empty finish delta (#1703) 2026-06-03 13:36:57 +09:00
Lucas Daniel
578f56ab92 fix(vision): recognize Gemma 4 and Phi-4 as vision-capable models (#1704)
Gemma 4 and Phi-4 multimodal are natively vision-capable but their Ollama
tags ("gemma4:12b", "phi-4", "phi4") did not match any keyword in
_VISION_MODEL_KEYWORDS. The image was silently routed to the VL fallback
path instead of being passed directly to the model — users saw the model
respond to a placeholder like "[VL model unavailable - image not analyzed]"
rather than the actual image.

Adds "gemma-4"/"gemma4" and "phi-4"/"phi4" to the keyword list, following
the existing err-toward-True policy (#124): a text-only variant being
treated as vision is the safer failure than dropping a real image.

Fixes #1274 (partial — covers the Gemma 4 + Phi-4 case; the OpenRouter/free
vision fallback path is a separate issue).
2026-06-03 13:36:50 +09:00
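
The keyword check described above might be sketched like this. Only the four new keywords come from the commit message; the `"llava"` entry, the substring matching, and the function name are assumptions for illustration.

```python
# "llava" is an assumed stand-in for entries that existed before this commit.
VISION_KEYWORDS = ("llava", "gemma-4", "gemma4", "phi-4", "phi4")


def looks_vision_capable(model_tag: str) -> bool:
    """Return True when any known vision keyword appears in the model tag."""
    tag = model_tag.lower()
    return any(keyword in tag for keyword in VISION_KEYWORDS)
```

Under these assumptions, `"gemma4:12b"`, `"phi-4"`, and `"phi4"` are all treated as vision-capable, which matches the err-toward-True policy the commit cites.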
Afonso Coutinho
9dd9bb8a3f fix: memory recall crashes on a non-dict row from the vector store (#1705) 2026-06-03 13:35:09 +09:00
Afonso Coutinho
86d3af743a fix: docs RAG query crashes on a non-dict row from the index (#1706) 2026-06-03 13:35:01 +09:00
Afonso Coutinho
076607c9b9 fix: archive browser model filter is suffix-only and drops matching models (#1709) 2026-06-03 13:34:54 +09:00
Afonso Coutinho
56123e052b fix: compacting a chat with image attachments destroys the attachment (#1710) 2026-06-03 13:34:47 +09:00
Afonso Coutinho
f6f86c4b34 fix: research source extraction crashes on a non-dict finding (#1714) 2026-06-03 13:34:40 +09:00
Afonso Coutinho
29e19f326a fix: _resolve_user_upload_path crashes on a non-dict resolve_upload result (#1715) 2026-06-03 13:34:33 +09:00
Afonso Coutinho
55c7a4a546 fix: computeSnap throws when ctx.otherLayers is not an array (#1716) 2026-06-03 13:34:25 +09:00
Mubashir R
319ba50a44 fix: validate client-supplied image _endpoint to prevent SSRF (gallery proxies) (#1718)
POST /api/image/harmonize and POST /api/image/inpaint read an `_endpoint` from
the request body and issue server-side httpx POSTs to it with no validation. A
caller can set `_endpoint` to http://169.254.169.254/ (cloud instance metadata)
or any internal/loopback address the server can reach, turning these routes into
an SSRF primitive.

routes/embedding_routes.py already runs its user-supplied endpoint through
src.url_safety.check_outbound_url; these two routes were missing the same guard.
Validate `_endpoint` the same way before any outbound request: non-HTTP(S)
schemes and the link-local metadata range are always rejected, and
IMAGE_BLOCK_PRIVATE_IPS=true blocks private/loopback for full lockdown (the
local-first default still allows LAN diffusion servers).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:34:17 +09:00
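
A simplified sketch of the kind of check described above, using only `urllib.parse` and `ipaddress`. It is not the project's `check_outbound_url`; the function name, return shape, and `block_private` parameter are hypothetical, and hostnames are deliberately not resolved here.

```python
import ipaddress
from urllib.parse import urlsplit


def check_endpoint(url: str, block_private: bool = False) -> tuple[bool, str]:
    """Return (allowed, reason) for a client-supplied endpoint URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False, "scheme not allowed"
    host = parts.hostname
    if not host:
        return False, "missing host"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal. A production guard would
        # resolve it and check every resolved address; this sketch does not.
        return True, "hostname not resolved in this sketch"
    if address.is_link_local:
        # Covers 169.254.0.0/16, including the cloud metadata address.
        return False, "link-local address"
    if block_private and (address.is_private or address.is_loopback):
        return False, "private or loopback address"
    return True, "ok"
```

With the default `block_private=False`, a LAN address such as `http://192.168.1.10:7860/` is still allowed, while `http://169.254.169.254/` is rejected in both modes.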
Wes Huber
4baf168df0 docs: fix typo in ROADMAP.md (#1719)
"this is ship" → "this ship"

Fixes #1413

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:34:05 +09:00
Mubashir R
535d05c142 fix: SearchService.search() calls comprehensive_web_search incorrectly (broken public API) (#1720)
SearchService.search() did:

    raw_results = await comprehensive_web_search(
        query, max_results=10 * depth, fetch_content=fetch_content)

comprehensive_web_search is a synchronous function whose count knob is
`max_pages` (not `max_results`) and which has no `fetch_content` parameter, so
the call raised TypeError on argument binding; `await` on its non-coroutine
return would also fail. It returns a context string, or a (context, sources)
tuple with return_sources=True — not the list of dicts the wrapper iterates.

The method is exported in services/search/__init__.py and services/__init__.py
with a usage example in its docstring, so any caller of the documented public
API hit an immediate crash. Call it correctly via asyncio.to_thread with
max_pages + return_sources=True and use the returned source list as the rows.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:33:56 +09:00
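
The corrected call pattern can be sketched as follows. `comprehensive_web_search` here is a stand-in stub whose signature is inferred from the commit message, and the shape of each source row is an assumption.

```python
import asyncio


def comprehensive_web_search(query, max_pages=5, return_sources=False):
    """Stand-in for the real synchronous helper (signature assumed)."""
    sources = [
        {"title": f"result {i}", "url": f"https://example.com/{i}"}
        for i in range(max_pages)
    ]
    context = "\n".join(source["title"] for source in sources)
    return (context, sources) if return_sources else context


async def search(query: str, depth: int = 1) -> list[dict]:
    # Run the blocking function in a worker thread instead of awaiting it directly.
    _context, sources = await asyncio.to_thread(
        comprehensive_web_search,
        query,
        max_pages=10 * depth,
        return_sources=True,
    )
    # Use the returned source list as the result rows.
    return list(sources)
```

`asyncio.to_thread` keeps the event loop responsive while the synchronous search runs, and unpacking the `(context, sources)` tuple gives the wrapper the list it iterates over.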
lekt8
126e91e8b9 Don't attempt the same (url, model) route twice in the fallback chains (#1733)
The fallback helpers (llm_call_with_fallback, llm_call_async_with_fallback,
stream_llm_with_fallback) build their candidate list as the primary target
followed by the configured fallbacks. Callers prepend the session's live
(url, model) to default_model_fallbacks, so if the user also lists their current
model among the fallbacks — a common misconfiguration — the chain re-attempts
the very route that just failed: a wasted round-trip (and, for the streaming
path, a spurious 'fallback' notice for a switch that didn't actually happen).

Add a small _dedupe_candidates() helper that filters malformed entries and drops
a later repeat of an already-seen (url, model), preserving order (first wins,
keeping its headers). Apply it in all three fallback chains.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:33:50 +09:00
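
A sketch of the deduplication step described above, assuming each candidate is a `(url, model, headers)` triple. That tuple shape and the un-prefixed function name are assumptions; the project's `_dedupe_candidates` may differ.

```python
def dedupe_candidates(candidates):
    """Drop malformed entries and later repeats of an already-seen (url, model)."""
    seen = set()
    result = []
    for entry in candidates:
        # Skip anything that is not a three-element sequence.
        if not isinstance(entry, (tuple, list)) or len(entry) != 3:
            continue
        url, model, headers = entry
        if not isinstance(url, str) or not isinstance(model, str):
            continue
        key = (url, model)
        if key in seen:
            continue  # first occurrence wins, keeping its headers
        seen.add(key)
        result.append((url, model, headers))
    return result
```

If the session's live route is prepended and also appears among the configured fallbacks, only the first copy survives, so the chain never retries the route that just failed.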
lekt8
77614e9feb Don't force-include the email toolset on every "tell me" query (#1707) (#1735)
The agent tool-RAG force-includes a keyword hint's tools whenever any of its
keywords appears in the query (word-boundary match). The email-intent hint listed
"tell", which matches a huge fraction of requests — e.g. "visit <url> and tell
me the title" — so the whole email toolset was force-included and crowded out the
relevant tools. The model then saw a prompt dominated by email tools and reported
it had no web search / could not visit the URL.

Remove "tell" from the email keyword set. Genuine email intent still fires on
email/mail/gmail/inbox/unread/message/send/reply.

Test drives get_tools_for_query directly with retrieval stubbed (the keyword
hints are deterministic, no embeddings needed): a "...tell me..." web query no
longer pulls in email tools, a real email request still does.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:33:43 +09:00
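
The word-boundary keyword match described above can be sketched as below. The keyword tuple mirrors the list in the commit message; the function name and regex construction are assumptions.

```python
import re

EMAIL_KEYWORDS = ("email", "mail", "gmail", "inbox", "unread", "message", "send", "reply")


def has_email_intent(query: str, keywords=EMAIL_KEYWORDS) -> bool:
    """Return True when any keyword appears in the query as a whole word."""
    pattern = r"\b(?:" + "|".join(re.escape(word) for word in keywords) + r")\b"
    return re.search(pattern, query, flags=re.IGNORECASE) is not None
```

With `"tell"` in the keyword set, a query like "visit https://example.com and tell me the title" would trigger the email hint; without it, the same query no longer does, while "reply to the unread message in my inbox" still matches.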
Mubashir R
a8a5d6f56e fix: RAG keyword fallback leaked owner-less documents across users (#1722)
VectorRAG.search() filters with ChromaDB where={"owner": owner}, returning only
documents whose owner equals the requesting user. The keyword fallback
(_keyword_search_fallback, used when the primary query raises) guarded with
`if doc_owner and doc_owner != owner: continue`, so a document with a
missing/empty owner fell through and was returned to whichever user issued the
query — a cross-user information leak on the fallback path.

Match the primary path's strict filter: skip any doc whose owner != the
requested owner, including owner-less docs.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:31:33 +09:00
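
The difference between the two guards can be seen in a small sketch. Documents are modeled as plain dicts with an optional `owner` key, which is an assumption about the data shape; the function names are hypothetical.

```python
def loose_filter(docs, owner):
    """Old guard: only skips docs that have a different, non-empty owner."""
    return [
        doc for doc in docs
        if not (doc.get("owner") and doc.get("owner") != owner)
    ]


def strict_filter(docs, owner):
    """New guard: keeps only docs whose owner equals the requested owner."""
    return [doc for doc in docs if doc.get("owner") == owner]
```

A document with a missing or empty owner passes the loose guard for every user, but never passes the strict one.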
Afonso Coutinho
ada30aa039 fix: evaluate_turn_regex crashes on a non-string agent_reply (#1723) 2026-06-03 13:31:26 +09:00
Afonso Coutinho
290d398900 fix: rewriting a message is lost on reload due to a non-existent DB column (#1729) 2026-06-03 13:31:19 +09:00
Afonso Coutinho
d9e6071528 fix: odysseus-mail read crashes on an empty IMAP fetch payload (#1730) 2026-06-03 13:31:10 +09:00
Afonso Coutinho
c5bc39de88 fix: _extract_entities crashes on a non-string query (#1724) 2026-06-03 13:30:28 +09:00
Afonso Coutinho
0c37943267 fix: search service crashes on a non-dict result row (#1725) 2026-06-03 13:30:19 +09:00
Mubashir R
fefac05ab1 fix: history DB fallback returned hidden (compaction) messages to the client (#1726)
GET /api/history/{session_id} skips messages whose metadata has `hidden` (e.g.
compaction summaries kept for AI context, not shown to the user) on the
in-memory path. The DB fallback — used when the in-memory history is empty,
e.g. after a restart — built the response from every stored row with no such
filter, so hidden messages leaked to the client on DB-served sessions.

Filter `hidden` out of the response on the DB path too. The rebuilt in-memory
session.history still includes them, so AI context (the compaction summaries)
is preserved.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:30:11 +09:00
Wes Huber
49885ff9e7 fix(documents): use strip_pdf_content_marker instead of lstrip for PDF auto-open (#1727)
lstrip("\n[PDF content]:") treats the argument as a character set,
not a prefix, so it chews into the following [Page N text]: marker —
e.g. turning [Page 1 text]: into "age 1 text]:". The correct helper
strip_pdf_content_marker (which uses removeprefix) already exists in
the same file and is used by other call sites.

Fixes #1663

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:30:04 +09:00
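
The `lstrip` pitfall described above is easy to reproduce. The sample text is made up for illustration, and `strip_marker` is a hypothetical stand-in for the project's `strip_pdf_content_marker`.

```python
MARKER = "\n[PDF content]:"


def strip_marker(text: str) -> str:
    """Remove the marker only when it is an exact prefix (Python 3.9+)."""
    return text.removeprefix(MARKER)


sample = "\n[PDF content]:\n[Page 1 text]: hello"

# lstrip() treats MARKER as a set of characters and keeps stripping
# while the next character is in that set, eating into the page marker.
chewed = sample.lstrip(MARKER)

# removeprefix() strips the exact prefix once and leaves the rest intact.
clean = strip_marker(sample)
```

Here `chewed` starts with `age 1 text]:`, because `\n`, `[`, and `P` all belong to the marker's character set, while `clean` still begins with the intact `[Page 1 text]:` marker on its own line.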
Mubashir R
4907b16d9b fix: personal-docs path confinement used abspath, allowing symlink escape (#1728)
_resolve_allowed_personal_dir confined a user-supplied path to PERSONAL_DIR with
os.path.abspath + os.path.commonpath. abspath normalises `..` but does NOT
resolve symlinks, so a symlink placed inside PERSONAL_DIR pointing outside it
passes the commonpath check and lets index_personal_documents read files outside
the root. Use os.path.realpath for both the base and the candidate so symlinks
are resolved before the confinement check.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 13:29:57 +09:00
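
A sketch contrasting the two confinement checks described above. The function names are hypothetical; the logic follows the commit message (`abspath` plus `commonpath` versus `realpath` plus `commonpath`).

```python
import os


def is_confined_abspath(base: str, candidate: str) -> bool:
    """Old check: normalizes '..' but does not resolve symlinks."""
    base_abs = os.path.abspath(base)
    candidate_abs = os.path.abspath(candidate)
    return os.path.commonpath([base_abs, candidate_abs]) == base_abs


def is_confined_realpath(base: str, candidate: str) -> bool:
    """New check: resolves symlinks on both sides before comparing."""
    base_real = os.path.realpath(base)
    candidate_real = os.path.realpath(candidate)
    return os.path.commonpath([base_real, candidate_real]) == base_real
```

For a symlink inside the base directory that points outside it, the `abspath` variant still reports the path as confined, while the `realpath` variant rejects it. Resolving the base as well keeps the comparison consistent when the base directory itself sits behind a symlink.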
Ethan
0e538ecd29 Fix RAG remove_directory wiping the entire shared collection (#1660) (#1734)
Removing one RAG directory destroyed the whole shared ChromaDB collection
(all owners + base index) instead of just that directory's chunks. Shared
root cause: PersonalDocsManager.remove_directory called rebuild_index()
(delete_collection + recreate) then re-indexed only the remaining tracked
dirs (ownerless, never personal_dir). The targeted VectorRAG.remove_directory
that should have been used was itself broken (where={"source":{"$contains":dir}}
selects nothing on scalar metadata and would over-delete siblings), and the
dead do_manage_rag path fired a second unconditional rebuild.

- VectorRAG.remove_directory: select chunks in Python by a path-boundary match
  on the stored absolute `source` (dir or dir+os.sep), abspath-normalized.
  Keys on `source` (always written), never `owner` -- no migration.
- PersonalDocsManager.remove_directory: call the targeted remove instead of
  rebuild_index() + partial reindex.
- do_manage_rag (dead code): drop the second rebuild_index() (hygiene).
- rag_server.py add path: abspath so indexed `source` matches the remove.

No schema change. Prevents future wipes (does not recover already-wiped
vectors). Adds hermetic regression tests at three layers.

Fixes #1660

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:29:51 +09:00
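
The path-boundary match described in the first bullet above might be sketched like this. Chunks are modeled as `(chunk_id, source)` pairs, which is an assumption; the function names are hypothetical.

```python
import os


def is_under_directory(source: str, directory: str) -> bool:
    """True when source is the directory itself or a path beneath it."""
    src = os.path.abspath(source)
    root = os.path.abspath(directory)
    # Requiring the separator prevents "/data/docs" from matching "/data/docs2".
    return src == root or src.startswith(root + os.sep)


def select_chunks_to_remove(chunks, directory):
    """Return the ids of chunks whose stored source lies under the directory."""
    return [
        chunk_id for chunk_id, source in chunks
        if is_under_directory(source, directory)
    ]
```

Selecting in Python on the stored `source` avoids relying on a metadata `$contains` filter and leaves sibling directories with a shared name prefix untouched.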
Wes Huber
9964e9f3fb fix: use correct column name (timestamp) in history_routes queries (#1736)
Three endpoints in history_routes.py ordered by
DbChatMessage.created_at, but the ChatMessage model has no
created_at column — only timestamp. This caused AttributeError
(HTTP 500) on mark-stopped, update-last-meta, and
merge-last-assistant. Other queries in the same file already use
the correct column.

Fixes #1659

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:29:44 +09:00
Ethan
b9c382006e Clamp Anthropic temperature to [0.0, 1.0] in _build_anthropic_payload (#1737)
Anthropic's Messages API rejects temperature > 1.0 with HTTP 400, but
_build_anthropic_payload forwarded it verbatim. The shipped "Nietzsche" preset
uses temperature 1.2 and the UI slider allows up to 2.0, so every Claude request
under such a preset hard-broke. Clamp into [0.0, 1.0] in the Anthropic builder
only (OpenAI keeps its wider 0.0-2.0 range). Covers all three Anthropic call
paths, which build through this one function. None is passed through unchanged.

Fixes #1615

Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
2026-06-03 13:29:36 +09:00
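
A minimal sketch of the clamping behaviour described above. The helper name is hypothetical; in the project the clamp lives inside `_build_anthropic_payload`.

```python
from typing import Optional


def clamp_anthropic_temperature(temperature: Optional[float]) -> Optional[float]:
    """Clamp a temperature into [0.0, 1.0]; None is passed through unchanged."""
    if temperature is None:
        return None
    return max(0.0, min(1.0, float(temperature)))
```

A preset temperature of 1.2 or a slider value of 2.0 is sent as 1.0, values already in range are unchanged, and `None` still means that no temperature was specified.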
Afonso Coutinho
96a874c604 fix: a non-dict finding silently drops all raw research findings (#1739) 2026-06-03 13:29:29 +09:00
Afonso Coutinho
7f94c43a45 fix: langIcon throws on an explicit null opts argument (#1740) 2026-06-03 13:29:21 +09:00
Afonso Coutinho
fc8efca49d fix: backup import drops a user's memory when its text matches another user's (#1743) 2026-06-03 13:29:14 +09:00
Afonso Coutinho
063e7114e3 fix: youtube transcript formatter crashes on a non-dict segment (#1745) 2026-06-03 13:29:08 +09:00