Commit Graph

504 Commits

Author SHA1 Message Date
Kenny Van de Maele
67782e684e fix: exclude slash-command/setup messages from LLM context (#2634) (#2640)
Slash-command replies and the echoed /setup command are persisted to session
history so they render in the transcript, but they are UI chatter the user
never meant as conversation. They were sent to the model on the next turn,
which then commented on '/setup ...' and exposed transient values (e.g. the
Copilot device user_code) to the LLM.

- get_context_messages() (the LLM-API view) now skips messages tagged
  metadata.source == 'slash'. Display/history-load paths use raw history and
  are unaffected.
- slashCommands.js tags the echoed user command with source:'slash' too (the
  assistant replies already carried it); the user line was the one untagged
  path that still reached context.

Fixes #2634.
2026-06-04 21:42:23 +02:00
Afonso Coutinho
baf9179d94 Fix truncate_messages persisting an inflated message_count (#2052)
truncate_messages deletes db_messages[keep_count:] (a no-op when
keep_count >= the real message total) then unconditionally wrote
db_session.message_count = keep_count. When keep_count exceeds the
number of messages that actually exist — e.g. the manage_session AI
tool defaults keep_count to 10, and the HTTP truncate endpoint passes
any client value — the persisted count is set too high (10 on a
3-message session), diverging from the real row count. That column
gates lazy DB-hydration in get_session (message_count > 0) and is
surfaced to the history UI, so it is correctness-relevant. Clamp to
min(keep_count, len(db_messages)); the in-memory slice already caps
naturally.
2026-06-04 21:19:16 +02:00
Kenny Van de Maele
1cd0aa2b8c feat(provider): add GitHub Copilot provider with device-flow auth (#1480)
* feat(provider): add GitHub Copilot provider with device-flow auth

Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5,
Claude, Gemini, …) work through the normal chat + agent loop, incl. native
tool calling and vision.

Auth is one-click via the GitHub OAuth device flow; the access token is stored
as the endpoint's (encrypted) api_key and sent directly as `Authorization:
Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to
the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only
provider-specific behaviour is a small set of required request headers,
injected centrally.

Sign-in is available from Settings → model endpoints ("Connect GitHub
Copilot") and from chat via `/setup copilot`.

- src/copilot.py (new), routes/copilot_routes.py (new): constants, header
  builders, device-flow start/poll, model discovery, owner-scoped endpoint
  provisioning.
- src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers,
  per-request x-initiator/vision.
- src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas.
- src/model_context.py: known context windows for Copilot (no unauthenticated
  /models probe).
- static/, README, tests/test_copilot*.py.

* Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process
2026-06-04 21:13:14 +02:00
Ocean Bennett
ca32b43b38 fix(history): tolerate tool-call turns during compact (#2626) 2026-06-04 20:59:41 +02:00
Vykos
9964f1382f Isolate HTML popup openers (#2501) 2026-06-04 20:52:41 +02:00
Vykos
ca8ca38a32 Guard image and QR DOM attributes (#2500) 2026-06-04 20:51:23 +02:00
Vykos
b59bbe80ce Harden chat streaming DOM sinks (#2498) 2026-06-04 20:49:37 +02:00
Vykos
e113c10d01 Harden email HTML URL sanitization (#2496) 2026-06-04 20:47:47 +02:00
Vykos
01c99c3990 Harden markdown raw HTML sanitization (#2497) 2026-06-04 20:46:10 +02:00
Vykos
3ae89599f3 Whitelist research source links (#2499) 2026-06-04 20:41:35 +02:00
Afonso Coutinho
ed933ac232 fix: renaming a user leaves their API tokens resolving to the old owner (#1932)
* fix: renaming a user leaves their API tokens resolving to the old owner

* Drive rename token-cache test through the real auth resolver instead of patching a closure
2026-06-04 20:37:59 +02:00
ooovenenoso
ab5311c44d fix(research): support timeout defaults in direct tests (#2624)
fix(research): honor planning query timeouts
2026-06-04 20:23:17 +02:00
Giuseppe
6d511f6e66 fix(llm): auto-detect <think> in content stream for unregistered thinking models (#2588)
* fix(llm): auto-detect <think> in content stream for unregistered thinking models

_THINKING_MODEL_PATTERNS only covers known model families by name. Qwen3-derived
models with non-standard names (e.g. Qwopus, custom QwQ forks) are not matched,
so their <think>...</think> content streams through as visible chat text instead
of being routed to the thinking display.

When the first content delta opens with <think> and the model was not already
identified as a thinking model, dynamically flag the stream as a thinking model
for the remainder of the response. This enables the existing </think> repair path
(line below) and ensures the frontend receives the full <think>...</think> wrapper
it needs to split thinking from the final answer.

The check is restricted to the very first content delta (_first_content_sent is
False) to avoid misidentifying models that happen to write "<think>" mid-answer.

Fixes #2225
Related: #2420 (covered by separate PR from @AmmarS-Analyst), #2224 (@RaresKeY)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace inert _thinking_model flag with _in_think_tag state machine

The original auto-detect set _thinking_model=True on the first <think> chunk
but still emitted it as a regular delta and set _first_content_sent=True
immediately, so no subsequent chunk could enter the repair path.

Replace with _in_think_tag bool: enter thinking mode when first content starts
with <think>, route all chunks to the thinking channel until </think> is found,
then the tail becomes the first regular delta. Adds three regression tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace _first_content_sent guard with _think_open_stripped

Opening-tag stripping used `not _first_content_sent` as the guard, but
_first_content_sent stays False throughout the entire think block (it only
flips when regular content is emitted). So `find(">")` ran on every
reasoning chunk — not just the first — and silently truncated everything
before the first ">" in any reasoning text containing comparisons, arrows,
or code.

Fix: add `_think_open_stripped = False` alongside `_in_think_tag`. Use it
as the strip guard in both the "still inside <think>" path and the
"</think> found in same chunk" split path. Set it True once the opening
tag is consumed so all subsequent chunks reach the thinking channel
unmolested.

Add regression test: 3-chunk stream where the middle chunk contains
"c > d" — confirms "more c " is not dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:18:19 +02:00
Alexandre Teixeira
0ead3a4eb2 fix(tests): isolate compare endpoint owner-scope test
Removes module-level core.database stubbing from the compare endpoint owner-scope regression test and patches ModelEndpoint per test with monkeypatch. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 19:17:15 +01:00
Zen0-99
7188737294 fix(hwfit): filter non-GGUF models on Windows (#2530)
Odysseus only supports llama.cpp on Windows (vLLM/SGLang are
explicitly blocked). llama.cpp requires GGUF, so AWQ/GPTQ/FP8
safetensors models without a GGUF alternate should not be
recommended in the Cookbook on Windows hosts.

Changes:
- hardware.py: add 'platform': 'windows' to _detect_windows()
  so downstream logic can identify Windows hosts.
- fit.py: include is_windows in the existing GGUF-only filter
  alongside apple_silicon and consumer_amd.
- tests: add test_hwfit_windows.py with regression tests.

Fixes #122, #614 (root cause: unservable models recommended).
2026-06-04 20:02:13 +02:00
Afonso Coutinho
abe04436a0 fix: merge-last-assistant deletes tool/system rows from the DB (history desync) (#1929) 2026-06-04 19:47:08 +02:00
Alexandre Teixeira
40cbfb7b94 fix(tests): align gallery owner filter null-user expectation
Updates the stale gallery owner-filter null-user test to match current single-user/auth-disabled behavior. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:39:45 +01:00
RaresKeY
c12c2aa233 fix: normalize Gemma 4 thought-channel output (#2224) 2026-06-04 19:26:58 +02:00
Alexandre Teixeira
7ce6ec7f50 fix(tests): use line-level PDF marker assertion
Updates the PDF marker regression test to check corrupted markers at line level instead of using a broad substring assertion. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:20:41 +01:00
Alexandre Teixeira
8bc16ef245 fix(tests): use non-repeating split chunk fixture
Updates the split_chunks containment regression test to use deterministic non-repeating records instead of a repeating fixture that could produce accidental substring matches. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:11:42 +01:00
nubs
050283c145 fix(mcp): confine oauth file paths (#2272) 2026-06-04 19:10:23 +02:00
nubs
935eb05c63 refactor(search): make src analytics a service shim (#2264) 2026-06-04 18:57:24 +02:00
Alexandre Teixeira
3b292403dc fix(tests): accept verify in endpoint HTTP mocks
Updates endpoint/model-route test HTTP mocks to accept the verify keyword argument passed by endpoint probing code. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 17:53:18 +01:00
Kenny Van de Maele
1f00fff837 feat: add code-navigation tools (grep, glob, ls) + read_file line ranges (#1670)
Gives the agent first-class code navigation instead of shelling out via bash
(token-heavy, unreliable on weaker models, unstructured). Mirrors the
Grep/Glob/Read primitives that Claude Code / opencode expose.

- grep: regex search over file contents across a tree. Uses ripgrep when
  available (with explicit excludes so junk dirs are skipped even without a
  .gitignore); falls back to a pure-Python walk+regex when rg is absent.
  Returns file:line:match, capped.
- glob: find files by glob pattern (recursive), newest first.
- ls: list a directory (folders first, then files with sizes).
- read_file: optional offset/limit for line-range reads of large files
  (plain-path calls stay back-compatible).

All confined by the same path policy as read_file (_resolve_tool_path:
data/tmp allowlist + sensitive-file deny). Junk dirs (.git, node_modules,
venv, __pycache__, dist/build, …) skipped. Output capped (200 hits,
400 chars/line). Admin-gated like the other filesystem tools.

Wiring: schemas + native arg->content serializer (src/tool_schemas.py), tool
tags (src/agent_tools.py), always-available + descriptions (src/tool_index.py),
admin gate (src/tool_security.py), dispatch + impls (src/tool_execution.py).

Tests: tests/test_code_nav_tools.py — match/skip-junk/ignore-case/glob-filter,
allowlist rejection, glob/ls, read-range, and the no-ripgrep Python fallback.
2026-06-04 18:37:32 +02:00
Kenny Van de Maele
7443c36bd9 feat: Add edit_file tool + file-change diffs (#1239)
* Add edit_file tool + file-change diffs

edit_file is an exact old_string -> new_string replacement on a file on disk
(fails if old_string is missing or non-unique unless replace_all); write_file
also returns a unified diff. Diffs render collapsed in the tool bubble
(filename + +adds/-dels, theme colors); the raw JSON command box is hidden.

Security: edit_file is a sensitive filesystem-write tool, treated everywhere
write_file is —
  - added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner),
    so on auth-enabled deployments a non-admin cannot run it; execute_tool_block
    refuses it for non-admin owners.
  - confined by the same path policy as read_file/write_file (allowlist +
    sensitive-file deny) via _resolve_tool_path.

Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the
only way to write files (they show a diff) — never edit_document (editor panel)
or a bash heredoc/redirect.

Tests (tests/test_edit_file.py): non-admin block (policy + execution gate),
successful edit, not-found old_string, non-unique old_string (+ replace_all),
and path outside the allowed roots.

Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py,
src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css,
tests/test_edit_file.py.

* Drop redundant import os in write_file closure

os is already imported at module top.
2026-06-04 18:29:10 +02:00
Kenny Van de Maele
8bfd79fe8e chore: deduplicate src/search modules (cache, content, query) into shims (#2506)
* chore: dedupe src/search/cache.py into a re-export shim

src/search/cache.py was a byte-identical copy of services/search/cache.py.
Convert it to a sys.modules alias of the canonical services module (matching
src/search/core.py, providers.py, ranking.py) so the two cannot drift, and add
an identity assertion to test_search_module_consolidation.py.

content.py and query.py are intentionally left as-is: the copies have drifted
and services lacks fixes that src has, so they need services reconciled first
before they can be shimmed safely.

* chore: dedupe src/search content.py and query.py into shims

Convert src/search/content.py and query.py to sys.modules aliases of the
canonical services/search/* (matching cache.py, core.py, providers.py,
ranking.py) so the duplicate copies cannot drift.

Repoint the two tests that were coupled to the src-copy internals onto the
canonical services surface (behaviour is equivalent):
- test_src_search_query_nonstring.py: import services.search.query instead of
  loading the src file by path.
- test_security_regressions.py::test_web_fetch_guard_blocks_redirect_into_private:
  mock httpx.get (services uses the module-level get, not httpx.Client) and
  assert on the canonical 'Blocked' message.

Drop the now-redundant [src_content, service_content] parametrization in
test_search_content_extraction_parity.py and test_search_content_url_guards.py
(after the shim both params are the same object); add content/query identity
assertions to test_search_module_consolidation.py.
2026-06-04 18:10:55 +02:00
Nicholai
c916224510 feat(memory): add provider interface (#72) 2026-06-04 16:26:11 +01:00
Alexandre Teixeira
dd1fa7e1c4 refactor(tests): add shared CLI test helpers
Adds shared test helpers for CLI script loading and scoped core.database stubs, then converts a low-conflict pilot set of CLI tests. Part of #2523.
2026-06-04 15:44:25 +01:00
raf
d3e6935d62 fix(tests): update search service mock to match current API signature (#2334)
comprehensive_web_search now called with (query, max_pages, return_sources)
and returns a tuple (_context, results). The test mock still used the old
async signature with max_results/fetch_content and returned a plain list,
causing TypeError on every run.

Fixes #2331
2026-06-04 14:19:51 +01:00
Giuseppe
bc9104efe2 fix: SSE stream parser crashes with NoneType on providers sending null choice/usage/tc entries (#2389)
* fix: SSE parser crashes with NoneType on MiniMax-M3 (and any provider sending null choice/usage/tc)

Three guards added in stream_llm:

1. choices[0] null check — MiniMax (and some other providers) send a
   choices entry as None. `_choices[0].get("delta")` raised
   AttributeError. Now checks `_choices[0] is not None` before calling
   .get().

2. usage null guard — j["usage"] can arrive as None (not a dict) on
   some providers. Added `or {}` so subsequent .get() calls don't crash.

3. tool_calls null entry skip — individual entries in the tool_calls
   array can be None. Added `if tc is None: continue` before
   tc.get("function").

All three match the `or {}` / null-guard pattern used elsewhere in the
same block. Safe for all OpenAI-compatible providers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard null choice in elif-choices SSE branch

The usage-chunk path already guarded _choices[0] is not None, but the
elif "choices" branch that processes content/tool-call deltas did not.
A chunk like {"choices": [null]} or {"choices": [null], "usage": null}
reaches j["choices"][0].get("delta") and crashes with:

    'NoneType' object has no attribute 'get'

Fix: extract choices[0] into _c0 and continue to the next chunk when
it is None, matching the guard already applied in the usage path.

Adds three focused regressions covering the paths the maintainer flagged:
- {"choices": [null]}
- {"choices": [null], "usage": null}
- tool_calls array containing a null entry alongside a valid call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 13:53:10 +01:00
Joeseph Grey
fa1fe7f866 security: sanitize rendered research-report HTML (#364)
The visual research report is assembled from LLM output over crawled web
pages (untrusted content) and served under a relaxed `script-src
'unsafe-inline'` CSP. Two values reached that HTML without sanitization:

- `_md_to_html` rendered the report markdown via python-markdown, which
  passes raw HTML through verbatim, so `<script>` / `<img onerror>` /
  `<svg onload>` / `javascript:` links carried in crawled content ran in
  the app origin.
- `category` (from the /api/research/start request body, no enum check) was
  interpolated raw into `<body class="category-{category}">`.

Allowlist-sanitize the rendered markdown with nh3, keeping the formatting
the report emits (tables, code, details/summary, toc anchors, codehilite
classes, external-link target/rel) while dropping active content, and
html.escape the category. Adds regression tests.
2026-06-04 13:42:49 +01:00
Alexander Kenley
7b45a94b6d Fix calendar routing and user-local time context (#408)
* fix(chat): add user-local time context

* fix(chat): route calendar follow-up phrasing

* refactor(chat): log tool intent routing reasons

* test(chat): align user time prompt shim

---------

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-04 13:20:04 +01:00
tanmayraut45
f59edee611 Support extra CA bundle for private-CA LLM providers (#769)
Adding GigaChat (Sber) or an on-premise enterprise LLM gateway as a
model endpoint fails on first probe with

    CERTIFICATE_VERIFY_FAILED: self-signed certificate in certificate
    chain (_ssl.c:1000)

because their TLS chain is signed by a private root CA (Russian Trusted
Root CA for GigaChat; corporate CA for on-prem) that isn't part of the
default system / certifi trust store. The endpoint shows offline in
the picker even though the URL and API key are correct (issue #722).

The right fix is to extend the trust store, not to weaken verification.
This change:

- src/tls_overrides.py: new module that resolves an opt-in env var
  LLM_CA_BUNDLE at import time, builds a shared SSLContext via
  ssl.create_default_context() (so the system / certifi bundle is
  loaded first) and layers the operator's PEM on top with
  load_verify_locations(). Exposes llm_verify() returning a value
  suitable for httpx `verify=`. Defaults to True (httpx built-in
  trust) when the env var is unset, when the file is missing, or
  when the PEM fails to load — verification is never silently
  disabled, the warning is logged and we fall back to the safe path.

- src/llm_core.py: thread llm_verify() into the shared AsyncClient
  used by stream_llm / streaming completions.

- routes/model_routes.py: thread llm_verify() into the five httpx.get
  call sites in _probe_endpoint / _ping_endpoint so adding a
  private-CA endpoint goes green on the very first probe and the
  picker stops showing it offline.

- .env.example: document LLM_CA_BUNDLE with the GigaChat case as the
  concrete example.

Deliberately NOT included: a verify=False knob (global or per-host).
Disabling verification exposes the affected endpoint to MITM, and the
operator-supplied bundle is the correct fix for legitimate private-CA
providers — so the only switch in this PR is the safe one.

Closes #722.
2026-06-04 13:18:50 +01:00
SHORYA BAJ
f876fc7704 fix(cookbook): don't mark successful dependency installs as crashed (#1315)
Pip dependency installs are tracked as download tasks but finish with the
runner's "=== Process exited with code 0 ===" sentinel and pip's
"Successfully installed" line — never the HuggingFace download markers
(DONE / 100% / /snapshots/ / DOWNLOAD_OK) the download heuristics look for.

Once the tmux pane is gone, the backend's only completion check is the HF
cache lookup, which a pip package (e.g. llama-cpp-python[server], no "/")
never matches, so it reports "stopped" — and the frontend maps a stopped
download to "crashed". The reconnect loop's session-gone heuristic had the
same gap. Result: a clean install (exit 0) showed "crashed" in the Running
tab while the Dependencies tab correctly showed it installed.

Add a shared _depInstallSucceeded() helper that keys off the exit-0
sentinel (falling back to pip's success line, rejecting ERROR/Traceback)
and wire it into both the session-gone heuristic and the background status
reconciler, gated on payload._dep so real model downloads are unaffected.

Also fixes the pre-existing test_background_status_poll_reconciles_into_local_tasks
assertion that no longer matched the evolved reconciler, and adds regression
coverage for both paths.
2026-06-04 12:55:06 +01:00
ghreprimand
28c43121d7 Fix session export 500 on multimodal/None message content (#1984)
txt/html/md export joined and string-munged message.content directly, so a
multimodal turn (content is a list of blocks) crashed export with a TypeError
on join (txt) / AttributeError on .replace (html), and None content (tool-only
assistant turns) rendered as the literal 'None'. Add a _content_to_text helper
that flattens string/list/None to plain text and apply it at the three export
sites. JSON export is unchanged (it serializes structured content correctly).
Plain-string content is returned unchanged, so existing exports are identical.

Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-04 12:53:44 +01:00
Alexandre Teixeira
f2b11ba94e tools: add read-only PR blocker audit helper
Adds a standalone read-only PR blocker audit helper with Markdown, terminal, and JSON output plus focused tests and documentation.
2026-06-04 12:51:48 +01:00
Giuseppe
68cb715914 fix(endpoint): import ModelEndpoint from core database
ModelEndpoint is defined in core.database, not src.database. The wrong
import silently prevented the module from loading in deployment
configurations that do not have a src/database.py shim, resulting in an
ImportError at startup.

Also adds a warning log when resolve_endpoint finds no usable model
(all models hidden or the list is empty), making the otherwise-silent
failure visible in operator logs.

The test_auth_regressions stub for src.endpoint_resolver was missing the
build_models_url attribute, which caused test collection errors.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 11:51:47 +01:00
Marius Popa
dc365a1b27 Fix Ollama agent single-token responses (#1591)
Agent mode treated local /v1 endpoints, including Ollama on :11434, as native-tool-capable by host/model heuristics. On Ollama's OpenAI-compatible surface some models that advertise tool support stop after a single token when schemas are sent (issue #1567). Default local Ollama /v1 back to fenced tool blocks unless the endpoint explicitly has supports_tools=True.

Also compare both the runtime chat URL and the normalized endpoint base when reading ModelEndpoint.supports_tools. That keeps a saved base URL such as http://localhost:11434/v1 effective when the active session URL is /v1/chat/completions.

Tests: .venv/bin/python -m pytest tests/test_tool_support_heuristic.py
2026-06-04 11:45:10 +01:00
Wes Huber
965185c6f9 fix(tests): pre-import real sqlalchemy/database in conftest to prevent stub contamination (#2398)
Some test files (e.g. test_llm_core_sanitize_tool_calls) stub
sqlalchemy and core.database at module level with
`if mod not in sys.modules`. During pytest collection these stubs
fire before the real modules are imported, contaminating every
subsequent test that needs real ORM objects (IntegrityError, missing
columns, etc.).

Pre-import the real modules in conftest.py so the module-level
guards find them already loaded and skip the stubs. Fixes ~10+
cascading test failures that only appear in the full suite.

Fixes #2395

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 11:39:49 +01:00
ooovenenoso
e163384015 fix: treat Nix files as readable uploads (#2249) 2026-06-04 12:06:24 +02:00
Nicholai
4dc11cfe6b refactor(memory): canonicalize memory imports (#50) 2026-06-04 05:31:15 +01:00
Yuri
a2e691da2b fix(models): stabilize proxy endpoint refresh behavior
* fix: support large proxy model endpoint refresh

Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline.

Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails.

Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage.

* fix: preserve endpoint ping status semantics
2026-06-04 04:56:11 +01:00
Afonso Coutinho
09fe308720 fix(auth): revoke API tokens when deleting users
* fix: revoke API bearer tokens when their owner is deleted

* Re-run CI

* Invalidate bearer-token cache on user delete so warmed cached tokens stop working
2026-06-04 04:44:34 +01:00
Marius Popa
666babfd58 fix(documents): refresh library counters after removal (#1924) 2026-06-04 04:42:23 +01:00
Rudy Wolf
1c43daa564 fix(compare): stop blind mode leaking model identities via session names (#1318)
Blind Compare anonymized the pane headers, but each pane still created a helper chat session named "[CMP] <real-model>" and GET /api/sessions returned the session's model field. So the sidebar and the session-list API let a user map "Model A" back to its real model before voting, defeating the blind test.

- Frontend (static/js/compare/index.js, panes.js): in blind mode, name helper sessions by their neutral slot ("[CMP] Model A") instead of the model, matching the existing blind pane labels.
- Backend GET /api/sessions (routes/session_routes.py): blank the model field for [CMP]-prefixed helper sessions via a new _public_model helper.
- Backend /api/compare/start (routes/compare_routes.py): name blind sessions by slot and withhold model_left/model_right/mapping from the blind response (revealed at /vote).
- Tests: tests/test_blind_compare_redaction.py.

Fixes #1285.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 04:39:01 +01:00
hawktuahs
3d8c364689 [Bash] Fix Windows cookbook background tasks (#676)
* Fix Windows cookbook background tasks

* Add Windows Cookbook reliability follow-ups
2026-06-04 04:30:01 +01:00
Afonso Coutinho
49c14af5c7 fix(calendar): scope CalDAV event lookup by calendar
* fix: CalDAV sync hijacks another user's event sharing a VEVENT uid

* Seed schema-valid dtstart/dtend in caldav uid-scope test fixture
2026-06-04 04:01:21 +01:00
.bulat
e340674c12 Persist user prefs atomically (#1840) 2026-06-04 03:55:22 +01:00
lekt8
ceb62385f1 Fetch full messages with BODY.PEEK[] so read_email works on iCloud IMAP (#1961) (#1963)
read_email, reply_to_email and download_attachment fetched the full message with
the legacy bare RFC822 item (UID FETCH <uid> (RFC822)). iCloud's IMAP server
silently ignores it — the fetch returns status OK but only (UID <uid>) with no
body tuple, so the parse reports 'Email not found with UID' even though the
message exists and list_emails (which uses RFC822.HEADER) shows it. Gmail honours
(RFC822), which is why it only reproduced on iCloud.

Switch the three full-message fetches to (BODY.PEEK[]), which iCloud and Gmail
both honour and which doesn't set \Seen. Response shape is unchanged (raw bytes
still at msg_data[0][1]), so parsing is unaffected; the RFC822.HEADER (listing)
and (UID) probe fetches are left as-is.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 03:53:14 +01:00
Ocean Bennett
a38df08a31 fix(tests): use current python for rag id stability (#1817) 2026-06-04 03:49:59 +01:00