Commit Graph

790 Commits

Author SHA1 Message Date
Alexandre Teixeira
7ce6ec7f50 fix(tests): use line-level PDF marker assertion
Updates the PDF marker regression test to check corrupted markers at line level instead of using a broad substring assertion. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:20:41 +01:00
WasserEsser
20cc23c9bd fix(models): make pinned models visible in chat UI (#2481)
Two bugs prevented pinned models from appearing in the chat model picker:

1. _fetch_models() only used _cached_model_ids(), ignoring pinned_models.
   Since Fireworks AI doesn't list kimi-k2p6-turbo in /v1/models, the
   cached list was empty, so the endpoint showed as offline with no models.

2. _curate_models() filtered unknown pinned IDs into models_extra, but the
   chat UI only reads models (primary list). Pinned models stayed invisible.

Fix: use _visible_models() to merge cached + pinned, then promote pinned
IDs from models_extra to models so they appear in the dropdown.

Closes #1521 follow-up
2026-06-04 19:17:37 +02:00
Ocean Bennett
34c9a8adb1 docs: point PR checklist at dev (#2594) 2026-06-04 19:15:08 +02:00
Alexandre Teixeira
8bc16ef245 fix(tests): use non-repeating split chunk fixture
Updates the split_chunks containment regression test to use deterministic non-repeating records instead of a repeating fixture that could produce accidental substring matches. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:11:42 +01:00
nubs
050283c145 fix(mcp): confine oauth file paths (#2272) 2026-06-04 19:10:23 +02:00
nubs
935eb05c63 refactor(search): make src analytics a service shim (#2264) 2026-06-04 18:57:24 +02:00
Alexandre Teixeira
3b292403dc fix(tests): accept verify in endpoint HTTP mocks
Updates endpoint/model-route test HTTP mocks to accept the verify keyword argument passed by endpoint probing code. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 17:53:18 +01:00
Kenny Van de Maele
1f00fff837 feat: add code-navigation tools (grep, glob, ls) + read_file line ranges (#1670)
Gives the agent first-class code navigation instead of shelling out via bash
(token-heavy, unreliable on weaker models, unstructured). Mirrors the
Grep/Glob/Read primitives that Claude Code / opencode expose.

- grep: regex search over file contents across a tree. Uses ripgrep when
  available (with explicit excludes so junk dirs are skipped even without a
  .gitignore); falls back to a pure-Python walk+regex when rg is absent.
  Returns file:line:match, capped.
- glob: find files by glob pattern (recursive), newest first.
- ls: list a directory (folders first, then files with sizes).
- read_file: optional offset/limit for line-range reads of large files
  (plain-path calls stay back-compatible).

All confined by the same path policy as read_file (_resolve_tool_path:
data/tmp allowlist + sensitive-file deny). Junk dirs (.git, node_modules,
venv, __pycache__, dist/build, …) skipped. Output capped (200 hits,
400 chars/line). Admin-gated like the other filesystem tools.

Wiring: schemas + native arg->content serializer (src/tool_schemas.py), tool
tags (src/agent_tools.py), always-available + descriptions (src/tool_index.py),
admin gate (src/tool_security.py), dispatch + impls (src/tool_execution.py).

Tests: tests/test_code_nav_tools.py — match/skip-junk/ignore-case/glob-filter,
allowlist rejection, glob/ls, read-range, and the no-ripgrep Python fallback.
2026-06-04 18:37:32 +02:00
Kenny Van de Maele
7443c36bd9 feat: Add edit_file tool + file-change diffs (#1239)
* Add edit_file tool + file-change diffs

edit_file is an exact old_string -> new_string replacement on a file on disk
(fails if old_string is missing or non-unique unless replace_all); write_file
also returns a unified diff. Diffs render collapsed in the tool bubble
(filename + +adds/-dels, theme colors); the raw JSON command box is hidden.

Security: edit_file is a sensitive filesystem-write tool, treated everywhere
write_file is —
  - added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner),
    so on auth-enabled deployments a non-admin cannot run it; execute_tool_block
    refuses it for non-admin owners.
  - confined by the same path policy as read_file/write_file (allowlist +
    sensitive-file deny) via _resolve_tool_path.

Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the
only way to write files (they show a diff) — never edit_document (editor panel)
or a bash heredoc/redirect.

Tests (tests/test_edit_file.py): non-admin block (policy + execution gate),
successful edit, not-found old_string, non-unique old_string (+ replace_all),
and path outside the allowed roots.

Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py,
src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css,
tests/test_edit_file.py.

* Drop redundant import os in write_file closure

os is already imported at module top.
2026-06-04 18:29:10 +02:00
Kenny Van de Maele
147d1fbde6 Show the serving provider in the model-info card (#2185)
* Show the serving provider in the model-info card

The model-info popup (click the model name on a message) shows the model
and pricing, with a logo inferred from the model NAME. But the same model
can be served by different endpoints — e.g. claude-haiku via OpenRouter
vs GitHub Copilot vs Anthropic direct — which the name-based logo can't
distinguish.

Add a 'Provider' line derived from the session's endpoint URL:
- new providerLabel(endpointUrl) in static/js/providers.js maps the host
  to a friendly name (GitHub Copilot, OpenRouter, Anthropic, OpenAI,
  Google, AWS Bedrock, DeepSeek, Mistral, Groq, Together, Fireworks,
  Perplexity, xAI), 'Local' for loopback/LAN, else the bare host.
- static/js/chatRenderer.js renders it under Model in the card, from
  window.sessionModule.getCurrentEndpointUrl().

* Anchor provider-label patterns to the hostname

providerLabel matched its patterns against the full endpoint URL with
unanchored substrings, so a host like max.airlines.com matched /x\.ai/ and was
mislabeled "xAI". Anchor each pattern to the end of the hostname ((^|.)domain$)
and test against the parsed host instead of the raw URL.
2026-06-04 18:22:31 +02:00
Kenny Van de Maele
8bfd79fe8e chore: deduplicate src/search modules (cache, content, query) into shims (#2506)
* chore: dedupe src/search/cache.py into a re-export shim

src/search/cache.py was a byte-identical copy of services/search/cache.py.
Convert it to a sys.modules alias of the canonical services module (matching
src/search/core.py, providers.py, ranking.py) so the two cannot drift, and add
an identity assertion to test_search_module_consolidation.py.

content.py and query.py are intentionally left as-is: the copies have drifted
and services lacks fixes that src has, so they need services reconciled first
before they can be shimmed safely.

* chore: dedupe src/search content.py and query.py into shims

Convert src/search/content.py and query.py to sys.modules aliases of the
canonical services/search/* (matching cache.py, core.py, providers.py,
ranking.py) so the duplicate copies cannot drift.

Repoint the two tests that were coupled to the src-copy internals onto the
canonical services surface (behaviour is equivalent):
- test_src_search_query_nonstring.py: import services.search.query instead of
  loading the src file by path.
- test_security_regressions.py::test_web_fetch_guard_blocks_redirect_into_private:
  mock httpx.get (services uses the module-level get, not httpx.Client) and
  assert on the canonical 'Blocked' message.

Drop the now-redundant [src_content, service_content] parametrization in
test_search_content_extraction_parity.py and test_search_content_url_guards.py
(after the shim both params are the same object); add content/query identity
assertions to test_search_module_consolidation.py.
2026-06-04 18:10:55 +02:00
Kenny Van de Maele
66fba78011 fix: live-resume chat stream on session re-entry (#2539) (#2561)
* fix: live-resume chat stream on session re-entry (#2539)

When a session was re-entered after a page refresh or in a new tab while
its agent run was still streaming, the UI showed a frozen "Generating
response..." spinner, polled stream_status until the run finished, and
then did a full reload. The live tokens were never shown.

Add resumeStream() in chat.js: it consumes GET /api/chat/resume/{id}
(which replays the run's buffer then streams live), renders reply tokens
as they arrive, and reloads the session on completion for the canonical
final render. sessions.js _checkServerStream now calls it on re-entry and
falls back to the previous spinner+poll path if it is unavailable.

* Finalize plain-text resume in place instead of reloading

On stream completion, resumeStream() called selectSession(), forcing a full
history re-fetch and a visible flicker right as the stream finished.

For plain text replies (no tool calls, sources, doc streaming, or multi-round
output) the live tokens are already rendered, so finalize in place: replace the
live bubble with a canonical single message via chatRenderer.addMessage (markdown
+ footer actions + metrics, the same renderer history uses), captured from the
streamed metrics event. No history refetch, no extra round-trip, no flicker.

Rich responses still reload, since their canonical render (tool bubbles, sources,
multi-bubble) is rebuilt from the saved DB record.

* Use a dedicated set for the resume re-attach lock; fix stale docblock

resumeStream() marked its re-attach lock in _backgroundStreams, which
checkBackgroundStream() also reads. On a second re-entry of the same session
while a resume was still live, checkBackgroundStream() mistook that entry for a
same-tab POST stream and spawned its own spinner+poll bubble. Move the lock to a
dedicated _resumingStreams set (also covered by hasActiveStream) so the two paths
no longer collide. Also update the resumeStream docblock to describe the
in-place finalize vs reload split.
2026-06-04 17:56:15 +02:00
Nicholai
c916224510 feat(memory): add provider interface (#72) 2026-06-04 16:26:11 +01:00
Kenny Van de Maele
a7e60ca7ec Merge pull request #2214 from vdmkenny/chore/rm-unused-upload-dir-import
chore: remove unused UPLOAD_DIR imports in document_routes
2026-06-04 17:11:15 +02:00
Kenny Van de Maele
dfccd8256d Merge pull request #2218 from vdmkenny/chore/rm-unused-uuid-import
chore: remove unused uuid import in app.py
2026-06-04 17:10:29 +02:00
Kenny Van de Maele
07e69ddf84 Merge pull request #1966 from vdmkenny/ci-checks
feat(ci): add CI workflow (syntax + tests)
2026-06-04 16:54:32 +02:00
Alexandre Teixeira
dd1fa7e1c4 refactor(tests): add shared CLI test helpers
Adds shared test helpers for CLI script loading and scoped core.database stubs, then converts a low-conflict pilot set of CLI tests. Part of #2523.
2026-06-04 15:44:25 +01:00
raf
cf5c5118d8 fix(hwfit): return no_fit instead of None when target_quant is a GGUF tier on multi-GPU (#2375)
The multi-GPU GGUF filter at fit.py:380 returned None unconditionally
for Q*/IQ quants on 2+ GPU systems. When the caller explicitly passes
target_quant, they are asking 'what happens if I try this?' and expect
a structured no_fit response, not a silent None.

Fix: skip the filter when target_quant is explicitly provided so the
call falls through to the existing no_fit path.

Fixes #
2026-06-04 14:25:36 +01:00
ooovenenoso
e5d3f2211b fix(document): render Mermaid in markdown preview (#2415) 2026-06-04 14:25:15 +01:00
Wes Huber
93b3e108a6 fix: re-export _SPORTS_HINT_RE from search ranking shim (#2273)
The compatibility re-export shim at src/search/ranking.py forgot
_SPORTS_HINT_RE, so tests importing src.search.ranking raised
AttributeError on the [src] parametrize variant.

Fixes #1995

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 14:24:53 +01:00
raf
d3e6935d62 fix(tests): update search service mock to match current API signature (#2334)
comprehensive_web_search now called with (query, max_pages, return_sources)
and returns a tuple (_context, results). The test mock still used the old
async signature with max_results/fetch_content and returned a plain list,
causing TypeError on every run.

Fixes #2331
2026-06-04 14:19:51 +01:00
Fellah Youssef
e92719263e feat(ui): allow expanding consolidated file chip regardless of count (#1849) (#2086) 2026-06-04 14:02:52 +01:00
Giuseppe
bc9104efe2 fix: SSE stream parser crashes with NoneType on providers sending null choice/usage/tc entries (#2389)
* fix: SSE parser crashes with NoneType on MiniMax-M3 (and any provider sending null choice/usage/tc)

Three guards added in stream_llm:

1. choices[0] null check — MiniMax (and some other providers) send a
   choices entry as None. `_choices[0].get("delta")` raised
   AttributeError. Now checks `_choices[0] is not None` before calling
   .get().

2. usage null guard — j["usage"] can arrive as None (not a dict) on
   some providers. Added `or {}` so subsequent .get() calls don't crash.

3. tool_calls null entry skip — individual entries in the tool_calls
   array can be None. Added `if tc is None: continue` before
   tc.get("function").

All three match the `or {}` / null-guard pattern used elsewhere in the
same block. Safe for all OpenAI-compatible providers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard null choice in elif-choices SSE branch

The usage-chunk path already guarded _choices[0] is not None, but the
elif "choices" branch that processes content/tool-call deltas did not.
A chunk like {"choices": [null]} or {"choices": [null], "usage": null}
reaches j["choices"][0].get("delta") and crashes with:

    'NoneType' object has no attribute 'get'

Fix: extract choices[0] into _c0 and continue to the next chunk when
it is None, matching the guard already applied in the usage path.

Adds three focused regressions covering the paths the maintainer flagged:
- {"choices": [null]}
- {"choices": [null], "usage": null}
- tool_calls array containing a null entry alongside a valid call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 13:53:10 +01:00
Joeseph Grey
fa1fe7f866 security: sanitize rendered research-report HTML (#364)
The visual research report is assembled from LLM output over crawled web
pages (untrusted content) and served under a relaxed `script-src
'unsafe-inline'` CSP. Two values reached that HTML without sanitization:

- `_md_to_html` rendered the report markdown via python-markdown, which
  passes raw HTML through verbatim, so `<script>` / `<img onerror>` /
  `<svg onload>` / `javascript:` links carried in crawled content ran in
  the app origin.
- `category` (from the /api/research/start request body, no enum check) was
  interpolated raw into `<body class="category-{category}">`.

Allowlist-sanitize the rendered markdown with nh3, keeping the formatting
the report emits (tables, code, details/summary, toc anchors, codehilite
classes, external-link target/rel) while dropping active content, and
html.escape the category. Adds regression tests.
2026-06-04 13:42:49 +01:00
Massab K.
594775dc4b Fix issue 135 chat context bleed (#281)
* Fix issue 135 chat context bleed

* Guard task delivery metadata access
2026-06-04 13:27:46 +01:00
Alexander Kenley
7b45a94b6d Fix calendar routing and user-local time context (#408)
* fix(chat): add user-local time context

* fix(chat): route calendar follow-up phrasing

* refactor(chat): log tool intent routing reasons

* test(chat): align user time prompt shim

---------

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-04 13:20:04 +01:00
tanmayraut45
f59edee611 Support extra CA bundle for private-CA LLM providers (#769)
Adding GigaChat (Sber) or an on-premise enterprise LLM gateway as a
model endpoint fails on first probe with

    CERTIFICATE_VERIFY_FAILED: self-signed certificate in certificate
    chain (_ssl.c:1000)

because their TLS chain is signed by a private root CA (Russian Trusted
Root CA for GigaChat; corporate CA for on-prem) that isn't part of the
default system / certifi trust store. The endpoint shows offline in
the picker even though the URL and API key are correct (issue #722).

The right fix is to extend the trust store, not to weaken verification.
This change:

- src/tls_overrides.py: new module that resolves an opt-in env var
  LLM_CA_BUNDLE at import time, builds a shared SSLContext via
  ssl.create_default_context() (so the system / certifi bundle is
  loaded first) and layers the operator's PEM on top with
  load_verify_locations(). Exposes llm_verify() returning a value
  suitable for httpx `verify=`. Defaults to True (httpx built-in
  trust) when the env var is unset, when the file is missing, or
  when the PEM fails to load — verification is never silently
  disabled, the warning is logged and we fall back to the safe path.

- src/llm_core.py: thread llm_verify() into the shared AsyncClient
  used by stream_llm / streaming completions.

- routes/model_routes.py: thread llm_verify() into the five httpx.get
  call sites in _probe_endpoint / _ping_endpoint so adding a
  private-CA endpoint goes green on the very first probe and the
  picker stops showing it offline.

- .env.example: document LLM_CA_BUNDLE with the GigaChat case as the
  concrete example.

Deliberately NOT included: a verify=False knob (global or per-host).
Disabling verification exposes the affected endpoint to MITM, and the
operator-supplied bundle is the correct fix for legitimate private-CA
providers — so the only switch in this PR is the safe one.

Closes #722.
2026-06-04 13:18:50 +01:00
SHORYA BAJ
f876fc7704 fix(cookbook): don't mark successful dependency installs as crashed (#1315)
Pip dependency installs are tracked as download tasks but finish with the
runner's "=== Process exited with code 0 ===" sentinel and pip's
"Successfully installed" line — never the HuggingFace download markers
(DONE / 100% / /snapshots/ / DOWNLOAD_OK) the download heuristics look for.

Once the tmux pane is gone, the backend's only completion check is the HF
cache lookup, which a pip package (e.g. llama-cpp-python[server], no "/")
never matches, so it reports "stopped" — and the frontend maps a stopped
download to "crashed". The reconnect loop's session-gone heuristic had the
same gap. Result: a clean install (exit 0) showed "crashed" in the Running
tab while the Dependencies tab correctly showed it installed.

Add a shared _depInstallSucceeded() helper that keys off the exit-0
sentinel (falling back to pip's success line, rejecting ERROR/Traceback)
and wire it into both the session-gone heuristic and the background status
reconciler, gated on payload._dep so real model downloads are unaffected.

Also fixes the pre-existing test_background_status_poll_reconciles_into_local_tasks
assertion that no longer matched the evolved reconciler, and adds regression
coverage for both paths.
2026-06-04 12:55:06 +01:00
ghreprimand
28c43121d7 Fix session export 500 on multimodal/None message content (#1984)
txt/html/md export joined and string-munged message.content directly, so a
multimodal turn (content is a list of blocks) crashed export with a TypeError
on join (txt) / AttributeError on .replace (html), and None content (tool-only
assistant turns) rendered as the literal 'None'. Add a _content_to_text helper
that flattens string/list/None to plain text and apply it at the three export
sites. JSON export is unchanged (it serializes structured content correctly).
Plain-string content is returned unchanged, so existing exports are identical.

Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-04 12:53:44 +01:00
pewdiepie-archdaemon
041c03bf11 Add dev/main branch model: PRs target dev, main is curated
Switching to a two-branch workflow: contributors open PRs against `dev`,
and `main` is fast-forwarded to a tested `dev` commit at each release.
This separates "things land in staging" (can move fast) from "things
ship to users" (slow, tested in a browser by the maintainer first).

CONTRIBUTING: add a Branch model section explaining the split + how to
retarget a PR.
PR template: add an explicit "this PR targets dev" checkbox at the top
so it's the first thing a contributor confirms.

End-users cloning the repo will now land on `dev` by default; they can
`git checkout main` if they want the curated branch.
2026-06-04 20:52:56 +09:00
Alexandre Teixeira
f2b11ba94e tools: add read-only PR blocker audit helper
Adds a standalone read-only PR blocker audit helper with Markdown, terminal, and JSON output plus focused tests and documentation.
2026-06-04 12:51:48 +01:00
Giuseppe
f6a5f6592f fix: log warnings on silently swallowed agent and endpoint failures (#2367)
get_builtin_overrides() was swallowing all exceptions with a bare
`except Exception: pass`, so misconfigured tool-description overrides
would silently produce wrong agent behaviour with no log trace.

The background endpoint refresh loop had the same pattern: any probe
failure was silently ignored, giving operators no signal that the
refresh was broken.

Also removes a circular self-import (`from src.agent_loop import
_build_base_prompt`) inside _build_system_prompt; the function is
already in scope and the import created a latent circular reference risk.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 12:29:31 +01:00
Kenny Van de Maele
2c7349503a chore: remove unused uuid import in app.py
app.py imports uuid but never uses it (pyflakes: 'uuid imported but
unused'). Drop the dead import — no behaviour change.
2026-06-04 13:17:23 +02:00
Kenny Van de Maele
e9dfd1a747 Remove unused UPLOAD_DIR imports in document_routes
routes/document_routes.py imports UPLOAD_DIR from src.constants in 8
separate function bodies but never uses it (pyflakes: 'imported but
unused' ×8). Drop the dead imports — no behaviour change.
2026-06-04 13:17:21 +02:00
Kenny Van de Maele
379a60e5d6 Add CI workflow for syntax + test checks
.github/workflows/ci.yml runs on push to main + PRs:
- python-syntax: compileall over app.py + core/routes/src/services/scripts/tests
- node-syntax: node --check on our JS (static/app.js + static/js)
- python-tests: pip install + pytest (continue-on-error for now)

Hardening: least-privilege `permissions: contents: read`, a `concurrency`
group that cancels superseded runs, and actions pinned to commit SHAs
(version in a comment) instead of mutable tags.
2026-06-04 13:17:08 +02:00
Kenny Van de Maele
d25052e43f chore: remove unused imports in calendar_routes (#2221)
routes/calendar_routes.py imports several names it never uses (pyflakes):
typing.Tuple, dateutil.rrule.{rruleset,DAILY,WEEKLY,MONTHLY,YEARLY}, and
auth_helpers.get_current_user. Drop them (the whole DAILY/WEEKLY/MONTHLY/
YEARLY line goes; rrulestr and require_user are kept). No behaviour change.
2026-06-04 12:13:18 +01:00
Giuseppe
68cb715914 fix(endpoint): import ModelEndpoint from core database
ModelEndpoint is defined in core.database, not src.database. The wrong
import silently prevented the module from loading in deployment
configurations that do not have a src/database.py shim, resulting in an
ImportError at startup.

Also adds a warning log when resolve_endpoint finds no usable model
(all models hidden or the list is empty), making the otherwise-silent
failure visible in operator logs.

The test_auth_regressions stub for src.endpoint_resolver was missing the
build_models_url attribute, which caused test collection errors.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 11:51:47 +01:00
Sahitya Madipalli
88754035ce fix(cookbook): stop-all no longer auto-retries interrupted HF downloads fixes (#1474)
* fix(cookbook): stop-all no longer auto-retries interrupted HF downloads

When C-c was sent to a running download, the bash wrapper printed
DOWNLOAD_FAILED on non-zero exit (SIGINT = 130). The reconnect polling
loop was still running at that point, saw the failure marker, and
silently relaunched the download — making "Stop all" appear to have no
effect while the UI showed the toast as if it succeeded.

Fix: abort the reconnect controller immediately when the stop button is
clicked (before the kill command is dispatched), and guard the
auto-retry condition with !controller.signal.aborted so that any
in-flight poll that completes after abort cannot trigger a retry.

Fixes #1458

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix Edge/Chromium sidebar section-title clipping (#1420)

Sidebar section titles were vertically clipped in Chromium/Edge (fine in
Firefox). Raise line-height 1 → 1.3, mirroring the existing .list-item fix.
The titles are flex-centred in a fixed-height (29px) header, so this adds
glyph headroom without any reflow.

* Drop GPU-only flags from the CPU-only (-ngl 0) serve command (#1433)

A CPU-only llama.cpp serve config still emitted --flash-attn on and exported
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 (independent toggles, often left on by an Auto
profile), so the command mixed "zero GPU layers" with CUDA/flash-attn and failed
to start (issue #1291). Gate both on a _cpuOnly check (ngl == 0). GPU serving is
unchanged — the gate only affects the ngl=0 path.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: APIKeyManager.load crashes app startup on a corrupt/wrong-shape api_keys.json (#1565)

* Don't lose deep-research findings when synthesis times out (#1551) (#1562)

Two problems made deep research report "No information could be gathered" even
after it had extracted findings, on slow local models (reporter served a 20B
via LM Studio):

- _synthesize hard-capped its LLM call at timeout=60, while extraction uses the
  user's extraction_timeout (300s here) and the final report uses 180s. The slow
  model needed >60s to synthesize the round's findings, so synthesis timed out
  after 3 attempts. Raised it to 180s to match the final-report call.

- When synthesis produced no report (it returns the unchanged, still-empty
  report on failure during round 1), the run hit
  `if not report: return "No information could be gathered…"` and discarded the
  findings it had already gathered. Now it falls back to a compiled report built
  from those findings (_fallback_report) so the user keeps the gathered material.

Tests stub the LLM (no live model/DB), pin the synthesis timeout >= 180, that the
fallback surfaces the findings rather than the give-up message, and that a failed
synthesis preserves the previous report.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: return sorted model list on first call in group chat (#1484)

Both _getModels() and getAllModels() store the sorted copy in a cache
variable but return the original unsorted array on first invocation.
Subsequent calls return the cache (sorted), causing inconsistent
model picker ordering on first render.

* fix: guard sp.destroy() in _loadScheduled against null spinner (#1495)

When the scheduled folder is opened with cached data, sp is null
(the loading spinner is skipped). _loadScheduled receives null and
calls sp.destroy() unconditionally, crashing with TypeError.

* fix: capture download exit code before test consumes it (#1497)

The shell pattern 'if [ $? -eq 0 ]; ... else ... echo DOWNLOAD_FAILED (exit $?)' always reports 'exit 1' because $? inside the else branch is the exit code of the [ test command, not the download. Capture into _ec first.

* fix: guard uid.decode() in auto-classify warning log against str UIDs (#1472)

Every other uid.decode() call in this function uses
'uid.decode() if isinstance(uid, bytes) else str(uid)' but the
warning at line 832 does bare uid.decode(), crashing with
AttributeError when uid is already a string.

* fix: guard AI tidy verdict against non-string LLM output (#1486)

The AI document-tidy endpoint parses verdicts from LLM JSON output
and calls .lower().strip() directly. If the model returns null or a
non-string element, this crashes with AttributeError. Coerce to str
so malformed output is treated as 'keep' instead of crashing.

* fix: rename local url-quote import to avoid shadowing module-level _q (#1471)

The 'from urllib.parse import quote as _q' at line 734 shadows the
module-level _q (istrstrstrstrstrstrIMAPutility) imported from email_helpers, causing
UnboundLocalError at lines 191 and 278 where _q is used before the
local import executes. This silently breaks the entire auto-summarize
pass.

* fix(ui): add missing Escape key handlers for email-lib-modal, model-picker-menu, and sort dropdowns (#1487)

CONTEXT: Several interactive elements lacked Escape key handlers: the email library modal was not in dynamicModals, the model-picker popup had no Escape close, and the session/model sort dropdowns only closed on outside click.

CHANGE: Adds email-lib-modal to the dynamicModals array in the Escape handler so it gets dismissed via dismissModal. Adds a check for model-picker-menu.open before the modal chain to close the dropdown on Escape. Adds checks for session-sort-dropdown and model-sort-dropdown display=block before the document panel minimize fallback.

WHY: Users expect consistent Escape-to-close behavior across all modals, overlays, and popups. These four were the only interactive containers in the app that ignored the Escape key entirely.

IMPACT: Pressing Escape now closes the email library modal, model picker popup, session sort dropdown, and model sort dropdown -- matching user expectations and the behavior of every other modal in the app.

* fix: mcp CLI _serialize crashes when stored env JSON is a list (#1609)

* fix: validate_caldav_url crashes with TypeError on a non-string URL (#1608)

* fix: _sanitize_export_filename crashes on a non-string session name (#1607)

* fix: shared MCP truncate() crashes on None/non-string tool output (#1605)

* fix: search query helpers crash on a non-string query (#1604)

* fix: rag_server add/remove_directory crashes on a non-string directory arg (#1614)

* fix: gallery CLI image serialization crashes on a non-string prompt (#1598)

* fix: research CLI summary crashes on a non-string query (#1596)

* fix: skills CLI summary crashes on a non-string description (#1595)

* fix(cookbook): set UTF-8 encoding for detached download/serve subprocesses (#1599)

On Windows, Python defaults to the active code page (cp1252) for
subprocess I/O. HuggingFace CLI outputs U+2713 (✓) when validating
tokens, which cp1252 cannot encode, crashing the download process.

Set PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 in the subprocess
environment so Unicode output from hf/pip/llama-server is handled
correctly.

Fixes #1543

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify host Ollama with Docker (#1594)

* fix(ui): stop welcome-screen tip from clipping on narrow phones (#1612)

The empty-state tip ("Add an AI endpoint from Settings...") shares a 60px
max-height ceiling with the one-line .welcome-sub / .welcome-version. On
narrow phones the welcome block shrink-wraps and the tip wraps to 4-5 lines
(~67px), so the shared ceiling clipped its last line ("...key into the
chat.") - the only setup hint a first-run user gets.

Give .welcome-tip its own taller max-height (120px), placed above the
@media (max-height: 650px) block so that rule's max-height:0 still collapses
the tip on short viewports. .welcome-sub / .welcome-version are untouched,
and desktop is unchanged (the tip is ~50px there, well under the ceiling).

* Save only string personal doc paths (#1566)

* Reject backup output inside data dir (#1587)

* Parse all AMD GPU check args (#1586)

* Require runnable dispatcher subcommands (#1585)

* Require runnable dispatcher subcommands

* Use modern dispatcher test loader

* Remove duplicate update database body (#1584)

* Skip invalid research service sources (#1583)

* Reject CalDAV writeback events without uid (#1582)

* Reject empty mail CLI recipients (#1581)

* Reject empty mail CLI recipients

* Keep mail CLI test imports isolated

* Validate signature CLI PNG data (#1580)

* Validate signature CLI PNG data

* Keep signature CLI test imports isolated

* Reject invalid preset CLI entries (#1579)

* Reject invalid preset CLI entries

* Use modern preset CLI test loader

* Normalize session CLI counters (#1578)

* Normalize session CLI counters

* Keep sessions CLI test imports isolated

* fix: monthly schedule label shows 21th/22th/31th (ordinal suffix for days >20) (#1577)

* fix: split_chunks emits a duplicate trailing chunk for text over size-overlap (#1573)

* fix: builtin_actions heuristics crash on a truthy non-string input (#1639)

* fix: skill test-task / precision helpers crash on a non-dict skill (#1638)

* fix: logs CLI _resolve crashes on a non-string name (#1631)

* fix: _extract_skill_json crashes on a truthy non-string teacher response (#1630)

* fix: tool-block parsing crashes on a non-string input (#1628)

* fix: check_outbound_url crashes on a truthy non-string URL (#1623)

* fix: document_actions title/content helpers crash on non-string input (#1621)

* fix: inside_base_dir raises TypeError on a non-string path instead of failing closed (#1619)

* fix: is_markitdown_format crashes on a non-string path (#1618)

* Close app_api blocklist gap for bare /api/tokens and /api/users

The blocklist prefixes had trailing slashes, so path.startswith() only
matched /api/tokens/{id} but not /api/tokens itself — the bare GET (list)
and POST (mint) endpoints were reachable via app_api. Same gap on
/api/users (list/create/delete). Drop trailing slashes so both bare and
sub-resource forms are blocked. /api/auth and /api/admin had no bare
endpoints today but get the same treatment to prevent future drift.

Caught by #1462.

* Decrypt CalDAV password before write-back (#1731)

writeback_event read cfg["password"] (the encrypted blob) and passed it
straight to DAVClient, so every local create/edit/delete authenticated
with the literal ciphertext, the remote rejected it, and the change
never reached the server — the exact silent-write-loss this module was
built to prevent. The pull path src/caldav_sync.py already decrypts;
mirror that. decrypt() is a no-op on legacy plaintext.

Caught by #1731.

* Memory MCP delete: match exact id, not prefix (#1303)

The delete action looked up the target with startswith() to capture
full_id, but then re-applied startswith() to filter the list — so a
short or ambiguous memory_id silently deleted every memory whose id
shared the prefix, while the success message reported only the first
match. The edit action used the first match and stopped, so the two
actions disagreed on multi-match behaviour. Use full_id for both.

Caught by #1303.

* Rebuild memory vector index from the full saved set, not just the audited owner (#1747)

audit_memories saves final_entries merged with other owners' entries
(correct), but then rebuilt the shared vector collection from
final_entries alone — wiping every other owner from semantic search
until they happened to run their own audit. Keyword fallback masked
it, so it degraded silently. Capture saved_entries once and rebuild
from that.

Caught by #1747.

* Owner-scope RAG doc ids so identical chunks across users don't collide (#1738, #1760)

_generate_doc_id hashed only text. add_document / add_documents_batch
early-return when the id exists, so the second owner indexing a
byte-identical chunk hit the first owner's id, was silently dropped,
and never stored under their owner — their owner-filtered search then
quietly omitted it. Hash owner + text; empty owner reproduces the
legacy id, so the unowned/base index keeps existing ids and isn't
re-churned. Same-owner identical chunks still dedupe.

Caught by #1738 and #1760 (independent reports of the same bug).

* Removed duplicate definition of _preview_text()

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Zeus-Deus <100132710+Zeus-Deus@users.noreply.github.com>
Co-authored-by: lekt8 <lewistham9x@gmail.com>
Co-authored-by: Afonso Coutinho <afonso@omelhorsite.pt>
Co-authored-by: Paulo Victor Cordeiro <146781332+pvcordeiro@users.noreply.github.com>
Co-authored-by: Zarl-prog <asimjunaidi5u@gmail.com>
Co-authored-by: Wes Huber <wesleybaxterhuber@gmail.com>
Co-authored-by: .bulat <its.bulat@icloud.com>
Co-authored-by: Mahdi Salmanzade <mahdisalmanzadehasl@gmail.com>
Co-authored-by: red person <redpersoncoding@gmail.com>
Co-authored-by: pewdiepie-archdaemon <pewdiepie-archdaemon@users.noreply.github.com>
2026-06-04 11:48:39 +01:00
Marius Popa
dc365a1b27 Fix Ollama agent single-token responses (#1591)
Agent mode treated local /v1 endpoints, including Ollama on :11434, as native-tool-capable by host/model heuristics. On Ollama's OpenAI-compatible surface some models that advertise tool support stop after a single token when schemas are sent (issue #1567). Default local Ollama /v1 back to fenced tool blocks unless the endpoint explicitly has supports_tools=True.

Also compare both the runtime chat URL and the normalized endpoint base when reading ModelEndpoint.supports_tools. That keeps a saved base URL such as http://localhost:11434/v1 effective when the active session URL is /v1/chat/completions.

Tests: .venv/bin/python -m pytest tests/test_tool_support_heuristic.py
2026-06-04 11:45:10 +01:00
Wes Huber
965185c6f9 fix(tests): pre-import real sqlalchemy/database in conftest to prevent stub contamination (#2398)
Some test files (e.g. test_llm_core_sanitize_tool_calls) stub
sqlalchemy and core.database at module level with
`if mod not in sys.modules`. During pytest collection these stubs
fire before the real modules are imported, contaminating every
subsequent test that needs real ORM objects (IntegrityError, missing
columns, etc.).

Pre-import the real modules in conftest.py so the module-level
guards find them already loaded and skip the stubs. Fixes ~10+
cascading test failures that only appear in the full suite.

Fixes #2395

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 11:39:49 +01:00
ooovenenoso
e163384015 fix: treat Nix files as readable uploads (#2249) 2026-06-04 12:06:24 +02:00
Povilas Kirna
68eeb7841c ci: harden description checks — unfilled dropdowns, gameable test plans, non-issue links (#2099)
* ci: harden description checks (dropdown placeholder, how-to-test, link \b)

- issue: flag sections still showing the "-- Please Select --" dropdown
  placeholder (added in #2068) as a single comma-separated line item;
  presence-only checks previously let an un-chosen dropdown pass.
- PR: replace the numbered-step "How to Test" rule with a non-trivial
  content requirement (>=30 chars). The old /\d+\.\s*\S/ rule both
  false-failed prose/code-block test plans and was gamed by an empty
  "1. 2. 3." shell; the message now explains what detail to provide.
- PR: tighten the linked-issue regex to /#\d+\b/ so a hex colour like
  #1a2b3c no longer counts as an issue reference.

---------

Co-authored-by: Povilas Kirna <povilas.kirna@pebble.net>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 08:16:36 +02:00
Nicholai
4dc11cfe6b refactor(memory): canonicalize memory imports (#50) 2026-06-04 05:31:15 +01:00
Yuri
a2e691da2b fix(models): stabilize proxy endpoint refresh behavior
* fix: support large proxy model endpoint refresh

Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline.

Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails.

Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage.

* fix: preserve endpoint ping status semantics
2026-06-04 04:56:11 +01:00
Sushanth Reddy
eee2167502 Stop API key save() from writing other providers' keys as plaintext (#1944)
save() called load(), which DECRYPTS every stored key, then re-encrypted
only the key being saved and wrote the whole dict back. The other
providers' keys were thus persisted in plaintext; on the next load()
Fernet raised InvalidToken on them and they were silently dropped.

Add _load_raw() that returns the still-encrypted on-disk dict (reusing the
existing missing/corrupt-file guards) and have save() build on that, so
untouched providers keep their ciphertext. load() now also goes through
_load_raw(), keeping its behavior identical.

Fixes #1914

Co-authored-by: EkaTantra Dev <dev@ekatantra.com>
2026-06-04 04:47:13 +01:00
Afonso Coutinho
09fe308720 fix(auth): revoke API tokens when deleting users
* fix: revoke API bearer tokens when their owner is deleted

* Re-run CI

* Invalidate bearer-token cache on user delete so warmed cached tokens stop working
2026-06-04 04:44:34 +01:00
Marius Popa
666babfd58 fix(documents): refresh library counters after removal (#1924) 2026-06-04 04:42:23 +01:00
Rudy Wolf
1c43daa564 fix(compare): stop blind mode leaking model identities via session names (#1318)
Blind Compare anonymized the pane headers, but each pane still created a helper chat session named "[CMP] <real-model>" and GET /api/sessions returned the session's model field. So the sidebar and the session-list API let a user map "Model A" back to its real model before voting, defeating the blind test.

- Frontend (static/js/compare/index.js, panes.js): in blind mode, name helper sessions by their neutral slot ("[CMP] Model A") instead of the model, matching the existing blind pane labels.
- Backend GET /api/sessions (routes/session_routes.py): blank the model field for [CMP]-prefixed helper sessions via a new _public_model helper.
- Backend /api/compare/start (routes/compare_routes.py): name blind sessions by slot and withhold model_left/model_right/mapping from the blind response (revealed at /vote).
- Tests: tests/test_blind_compare_redaction.py.

Fixes #1285.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 04:39:01 +01:00
hawktuahs
3d8c364689 [Bash] Fix Windows cookbook background tasks (#676)
* Fix Windows cookbook background tasks

* Add Windows Cookbook reliability follow-ups
2026-06-04 04:30:01 +01:00
Wes Huber
0f7ea7a936 fix: add 'willing to fix' dropdown to bug report issue template (#2063)
* fix: add 'willing to fix' dropdown to bug report issue template

The feature request template has an 'Are you willing to implement
this?' dropdown but the bug report template was missing it, leaving
a plain textarea with a placeholder hint instead. Add a matching
dropdown for consistency.

Fixes #2059

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add '-- Please Select --' default option to match feature_request template

Rebased on #2068 and added the placeholder option for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 04:25:04 +01:00