Commit Graph

250 Commits

Author SHA1 Message Date
Kenny Van de Maele
1cd0aa2b8c feat(provider): add GitHub Copilot provider with device-flow auth (#1480)
* feat(provider): add GitHub Copilot provider with device-flow auth

Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5,
Claude, Gemini, …) work through the normal chat + agent loop, incl. native
tool calling and vision.

Auth is one-click via the GitHub OAuth device flow; the access token is stored
as the endpoint's (encrypted) api_key and sent directly as `Authorization:
Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to
the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only
provider-specific behaviour is a small set of required request headers,
injected centrally.

Sign-in is available from Settings → model endpoints ("Connect GitHub
Copilot") and from chat via `/setup copilot`.

- src/copilot.py (new), routes/copilot_routes.py (new): constants, header
  builders, device-flow start/poll, model discovery, owner-scoped endpoint
  provisioning.
- src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers,
  per-request x-initiator/vision.
- src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas.
- src/model_context.py: known context windows for Copilot (no unauthenticated
  /models probe).
- static/, README, tests/test_copilot*.py.

* Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process
2026-06-04 21:13:14 +02:00
ooovenenoso
ab5311c44d fix(research): support timeout defaults in direct tests (#2624)
fix(research): honor planning query timeouts
2026-06-04 20:23:17 +02:00
Giuseppe
6d511f6e66 fix(llm): auto-detect <think> in content stream for unregistered thinking models (#2588)
* fix(llm): auto-detect <think> in content stream for unregistered thinking models

_THINKING_MODEL_PATTERNS only covers known model families by name. Qwen3-derived
models with non-standard names (e.g. Qwopus, custom QwQ forks) are not matched,
so their <think>...</think> content streams through as visible chat text instead
of being routed to the thinking display.

When the first content delta opens with <think> and the model was not already
identified as a thinking model, dynamically flag the stream as a thinking model
for the remainder of the response. This enables the existing </think> repair path
(line below) and ensures the frontend receives the full <think>...</think> wrapper
it needs to split thinking from the final answer.

The check is restricted to the very first content delta (_first_content_sent is
False) to avoid misidentifying models that happen to write "<think>" mid-answer.

Fixes #2225
Related: #2420 (covered by separate PR from @AmmarS-Analyst), #2224 (@RaresKeY)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace inert _thinking_model flag with _in_think_tag state machine

The original auto-detect set _thinking_model=True on the first <think> chunk
but still emitted it as a regular delta and set _first_content_sent=True
immediately, so no subsequent chunk could enter the repair path.

Replace with _in_think_tag bool: enter thinking mode when first content starts
with <think>, route all chunks to the thinking channel until </think> is found,
then the tail becomes the first regular delta. Adds three regression tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace _first_content_sent guard with _think_open_stripped

Opening-tag stripping used `not _first_content_sent` as the guard, but
_first_content_sent stays False throughout the entire think block (it only
flips when regular content is emitted). So `find(">")` ran on every
reasoning chunk — not just the first — and silently truncated everything
before the first ">" in any reasoning text containing comparisons, arrows,
or code.

Fix: add `_think_open_stripped = False` alongside `_in_think_tag`. Use it
as the strip guard in both the "still inside <think>" path and the
"</think> found in same chunk" split path. Set it True once the opening
tag is consumed so all subsequent chunks reach the thinking channel
unmolested.

Add regression test: 3-chunk stream where the middle chunk contains
"c > d" — confirms "more c " is not dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:18:19 +02:00
Giuseppe
dd707ddb1e fix(agent): default bash/python cwd to data/ to prevent ephemeral file loss (#2586)
Agent subprocesses (bash, python) previously inherited the container's default
working directory (/app), so files created with relative paths landed in the
ephemeral container layer and were silently destroyed on any docker compose up
--build or container recreation.

Set cwd=_AGENT_WORKDIR (resolved to <repo_root>/data at import time) and
HOME=_AGENT_WORKDIR on both subprocess launchers so that:
- pwd inside a bash tool returns the persistent data directory
- relative paths and ~ resolve to a location that survives rebuilds
- the agent can still cd to any absolute path it needs

The resolution uses pathlib.Path(__file__).parent.parent / "data", which
works for both Docker (/app/src → /app/data) and manual installs
(<repo>/src → <repo>/data) without requiring a new env var or compose change.

Fixes #2512

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:16:04 +02:00
Giuseppe
531f426557 fix: KeyError on missing 'content' key in system messages (#2362)
A system message that arrives without a 'content' key — possible via
malformed tool results — raised a KeyError in the hot path of llm_call,
llm_call_async, and stream_llm. Replace m["content"] with
m.get("content") or "" in all three functions so a missing key degrades
to an empty string instead of crashing.

Also removes a redundant .rstrip() after .strip() in _model_activity_key.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:38:45 +02:00
Giuseppe
ff8f9f2188 fix: llm_call_async does not retry on HTTP 429/502/503/504 (#2364)
The retry loop raised immediately for any non-success HTTP response
regardless of attempt count. For transient upstream errors (rate limit,
bad gateway, gateway timeout) the function should back off and retry
within the existing attempt budget.

Also lets ConnectError / ConnectTimeout retry when the host has not been
cooled and attempts remain, instead of always raising on the first
connect failure.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:35:55 +02:00
RaresKeY
c12c2aa233 fix: normalize Gemma 4 thought-channel output (#2224) 2026-06-04 19:26:58 +02:00
nubs
935eb05c63 refactor(search): make src analytics a service shim (#2264) 2026-06-04 18:57:24 +02:00
Kenny Van de Maele
1f00fff837 feat: add code-navigation tools (grep, glob, ls) + read_file line ranges (#1670)
Gives the agent first-class code navigation instead of shelling out via bash
(token-heavy, unreliable on weaker models, unstructured). Mirrors the
Grep/Glob/Read primitives that Claude Code / opencode expose.

- grep: regex search over file contents across a tree. Uses ripgrep when
  available (with explicit excludes so junk dirs are skipped even without a
  .gitignore); falls back to a pure-Python walk+regex when rg is absent.
  Returns file:line:match, capped.
- glob: find files by glob pattern (recursive), newest first.
- ls: list a directory (folders first, then files with sizes).
- read_file: optional offset/limit for line-range reads of large files
  (plain-path calls stay back-compatible).

All confined by the same path policy as read_file (_resolve_tool_path:
data/tmp allowlist + sensitive-file deny). Junk dirs (.git, node_modules,
venv, __pycache__, dist/build, …) skipped. Output capped (200 hits,
400 chars/line). Admin-gated like the other filesystem tools.

Wiring: schemas + native arg->content serializer (src/tool_schemas.py), tool
tags (src/agent_tools.py), always-available + descriptions (src/tool_index.py),
admin gate (src/tool_security.py), dispatch + impls (src/tool_execution.py).

Tests: tests/test_code_nav_tools.py — match/skip-junk/ignore-case/glob-filter,
allowlist rejection, glob/ls, read-range, and the no-ripgrep Python fallback.
2026-06-04 18:37:32 +02:00
Kenny Van de Maele
7443c36bd9 feat: Add edit_file tool + file-change diffs (#1239)
* Add edit_file tool + file-change diffs

edit_file is an exact old_string -> new_string replacement on a file on disk
(fails if old_string is missing or non-unique unless replace_all); write_file
also returns a unified diff. Diffs render collapsed in the tool bubble
(filename + +adds/-dels, theme colors); the raw JSON command box is hidden.

Security: edit_file is a sensitive filesystem-write tool, treated everywhere
write_file is —
  - added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner),
    so on auth-enabled deployments a non-admin cannot run it; execute_tool_block
    refuses it for non-admin owners.
  - confined by the same path policy as read_file/write_file (allowlist +
    sensitive-file deny) via _resolve_tool_path.

Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the
only way to write files (they show a diff) — never edit_document (editor panel)
or a bash heredoc/redirect.

Tests (tests/test_edit_file.py): non-admin block (policy + execution gate),
successful edit, not-found old_string, non-unique old_string (+ replace_all),
and path outside the allowed roots.

Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py,
src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css,
tests/test_edit_file.py.

* Drop redundant import os in write_file closure

os is already imported at module top.
2026-06-04 18:29:10 +02:00
Kenny Van de Maele
8bfd79fe8e chore: deduplicate src/search modules (cache, content, query) into shims (#2506)
* chore: dedupe src/search/cache.py into a re-export shim

src/search/cache.py was a byte-identical copy of services/search/cache.py.
Convert it to a sys.modules alias of the canonical services module (matching
src/search/core.py, providers.py, ranking.py) so the two cannot drift, and add
an identity assertion to test_search_module_consolidation.py.

content.py and query.py are intentionally left as-is: the copies have drifted
and services lacks fixes that src has, so they need services reconciled first
before they can be shimmed safely.

* chore: dedupe src/search content.py and query.py into shims

Convert src/search/content.py and query.py to sys.modules aliases of the
canonical services/search/* (matching cache.py, core.py, providers.py,
ranking.py) so the duplicate copies cannot drift.

Repoint the two tests that were coupled to the src-copy internals onto the
canonical services surface (behaviour is equivalent):
- test_src_search_query_nonstring.py: import services.search.query instead of
  loading the src file by path.
- test_security_regressions.py::test_web_fetch_guard_blocks_redirect_into_private:
  mock httpx.get (services uses the module-level get, not httpx.Client) and
  assert on the canonical 'Blocked' message.

Drop the now-redundant [src_content, service_content] parametrization in
test_search_content_extraction_parity.py and test_search_content_url_guards.py
(after the shim both params are the same object); add content/query identity
assertions to test_search_module_consolidation.py.
2026-06-04 18:10:55 +02:00
Nicholai
c916224510 feat(memory): add provider interface (#72) 2026-06-04 16:26:11 +01:00
Wes Huber
93b3e108a6 fix: re-export _SPORTS_HINT_RE from search ranking shim (#2273)
The compatibility re-export shim at src/search/ranking.py forgot
_SPORTS_HINT_RE, so tests importing src.search.ranking raised
AttributeError on the [src] parametrize variant.

Fixes #1995

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 14:24:53 +01:00
Giuseppe
bc9104efe2 fix: SSE stream parser crashes with NoneType on providers sending null choice/usage/tc entries (#2389)
* fix: SSE parser crashes with NoneType on MiniMax-M3 (and any provider sending null choice/usage/tc)

Three guards added in stream_llm:

1. choices[0] null check — MiniMax (and some other providers) send a
   choices entry as None. `_choices[0].get("delta")` raised
   AttributeError. Now checks `_choices[0] is not None` before calling
   .get().

2. usage null guard — j["usage"] can arrive as None (not a dict) on
   some providers. Added `or {}` so subsequent .get() calls don't crash.

3. tool_calls null entry skip — individual entries in the tool_calls
   array can be None. Added `if tc is None: continue` before
   tc.get("function").

All three match the `or {}` / null-guard pattern used elsewhere in the
same block. Safe for all OpenAI-compatible providers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard null choice in elif-choices SSE branch

The usage-chunk path already guarded _choices[0] is not None, but the
elif "choices" branch that processes content/tool-call deltas did not.
A chunk like {"choices": [null]} or {"choices": [null], "usage": null}
reaches j["choices"][0].get("delta") and crashes with:

    'NoneType' object has no attribute 'get'

Fix: extract choices[0] into _c0 and continue to the next chunk when
it is None, matching the guard already applied in the usage path.

Adds three focused regressions covering the paths the maintainer flagged:
- {"choices": [null]}
- {"choices": [null], "usage": null}
- tool_calls array containing a null entry alongside a valid call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 13:53:10 +01:00
Joeseph Grey
fa1fe7f866 security: sanitize rendered research-report HTML (#364)
The visual research report is assembled from LLM output over crawled web
pages (untrusted content) and served under a relaxed `script-src
'unsafe-inline'` CSP. Two values reached that HTML without sanitization:

- `_md_to_html` rendered the report markdown via python-markdown, which
  passes raw HTML through verbatim, so `<script>` / `<img onerror>` /
  `<svg onload>` / `javascript:` links carried in crawled content ran in
  the app origin.
- `category` (from the /api/research/start request body, no enum check) was
  interpolated raw into `<body class="category-{category}">`.

Allowlist-sanitize the rendered markdown with nh3, keeping the formatting
the report emits (tables, code, details/summary, toc anchors, codehilite
classes, external-link target/rel) while dropping active content, and
html.escape the category. Adds regression tests.
2026-06-04 13:42:49 +01:00
Massab K.
594775dc4b Fix issue 135 chat context bleed (#281)
* Fix issue 135 chat context bleed

* Guard task delivery metadata access
2026-06-04 13:27:46 +01:00
Alexander Kenley
7b45a94b6d Fix calendar routing and user-local time context (#408)
* fix(chat): add user-local time context

* fix(chat): route calendar follow-up phrasing

* refactor(chat): log tool intent routing reasons

* test(chat): align user time prompt shim

---------

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-04 13:20:04 +01:00
tanmayraut45
f59edee611 Support extra CA bundle for private-CA LLM providers (#769)
Adding GigaChat (Sber) or an on-premise enterprise LLM gateway as a
model endpoint fails on first probe with

    CERTIFICATE_VERIFY_FAILED: self-signed certificate in certificate
    chain (_ssl.c:1000)

because their TLS chain is signed by a private root CA (Russian Trusted
Root CA for GigaChat; corporate CA for on-prem) that isn't part of the
default system / certifi trust store. The endpoint shows offline in
the picker even though the URL and API key are correct (issue #722).

The right fix is to extend the trust store, not to weaken verification.
This change:

- src/tls_overrides.py: new module that resolves an opt-in env var
  LLM_CA_BUNDLE at import time, builds a shared SSLContext via
  ssl.create_default_context() (so the system / certifi bundle is
  loaded first) and layers the operator's PEM on top with
  load_verify_locations(). Exposes llm_verify() returning a value
  suitable for httpx `verify=`. Defaults to True (httpx built-in
  trust) when the env var is unset, when the file is missing, or
  when the PEM fails to load — verification is never silently
  disabled, the warning is logged and we fall back to the safe path.

- src/llm_core.py: thread llm_verify() into the shared AsyncClient
  used by stream_llm / streaming completions.

- routes/model_routes.py: thread llm_verify() into the five httpx.get
  call sites in _probe_endpoint / _ping_endpoint so adding a
  private-CA endpoint goes green on the very first probe and the
  picker stops showing it offline.

- .env.example: document LLM_CA_BUNDLE with the GigaChat case as the
  concrete example.

Deliberately NOT included: a verify=False knob (global or per-host).
Disabling verification exposes the affected endpoint to MITM, and the
operator-supplied bundle is the correct fix for legitimate private-CA
providers — so the only switch in this PR is the safe one.

Closes #722.
2026-06-04 13:18:50 +01:00
Giuseppe
f6a5f6592f fix: log warnings on silently swallowed agent and endpoint failures (#2367)
get_builtin_overrides() was swallowing all exceptions with a bare
`except Exception: pass`, so misconfigured tool-description overrides
would silently produce wrong agent behaviour with no log trace.

The background endpoint refresh loop had the same pattern: any probe
failure was silently ignored, giving operators no signal that the
refresh was broken.

Also removes a circular self-import (`from src.agent_loop import
_build_base_prompt`) inside _build_system_prompt; the function is
already in scope and the import created a latent circular reference risk.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 12:29:31 +01:00
Giuseppe
68cb715914 fix(endpoint): import ModelEndpoint from core database
ModelEndpoint is defined in core.database, not src.database. The wrong
import silently prevented the module from loading in deployment
configurations that do not have a src/database.py shim, resulting in an
ImportError at startup.

Also adds a warning log when resolve_endpoint finds no usable model
(all models hidden or the list is empty), making the otherwise-silent
failure visible in operator logs.

The test_auth_regressions stub for src.endpoint_resolver was missing the
build_models_url attribute, which caused test collection errors.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 11:51:47 +01:00
Marius Popa
dc365a1b27 Fix Ollama agent single-token responses (#1591)
Agent mode treated local /v1 endpoints, including Ollama on :11434, as native-tool-capable by host/model heuristics. On Ollama's OpenAI-compatible surface some models that advertise tool support stop after a single token when schemas are sent (issue #1567). Default local Ollama /v1 back to fenced tool blocks unless the endpoint explicitly has supports_tools=True.

Also compare both the runtime chat URL and the normalized endpoint base when reading ModelEndpoint.supports_tools. That keeps a saved base URL such as http://localhost:11434/v1 effective when the active session URL is /v1/chat/completions.

Tests: .venv/bin/python -m pytest tests/test_tool_support_heuristic.py
2026-06-04 11:45:10 +01:00
ooovenenoso
e163384015 fix: treat Nix files as readable uploads (#2249) 2026-06-04 12:06:24 +02:00
Nicholai
4dc11cfe6b refactor(memory): canonicalize memory imports (#50) 2026-06-04 05:31:15 +01:00
Yuri
a2e691da2b fix(models): stabilize proxy endpoint refresh behavior
* fix: support large proxy model endpoint refresh

Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline.

Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails.

Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage.

* fix: preserve endpoint ping status semantics
2026-06-04 04:56:11 +01:00
Sushanth Reddy
eee2167502 Stop API key save() from writing other providers' keys as plaintext (#1944)
save() called load(), which DECRYPTS every stored key, then re-encrypted
only the key being saved and wrote the whole dict back. The other
providers' keys were thus persisted in plaintext; on the next load()
Fernet raised InvalidToken on them and they were silently dropped.

Add _load_raw() that returns the still-encrypted on-disk dict (reusing the
existing missing/corrupt-file guards) and have save() build on that, so
untouched providers keep their ciphertext. load() now also goes through
_load_raw(), keeping its behavior identical.

Fixes #1914

Co-authored-by: EkaTantra Dev <dev@ekatantra.com>
2026-06-04 04:47:13 +01:00
Afonso Coutinho
49c14af5c7 fix(calendar): scope CalDAV event lookup by calendar
* fix: CalDAV sync hijacks another user's event sharing a VEVENT uid

* Seed schema-valid dtstart/dtend in caldav uid-scope test fixture
2026-06-04 04:01:21 +01:00
Vykos
5f58f9a45f fix(ai): scope tool model resolution by owner
* Stabilize full test collection

* Scope AI tool model resolution by owner
2026-06-04 00:37:28 +01:00
Vykos
aaef6b1c49 fix(search): align content URL guards
* Stabilize full test collection

* Align search content URL guards
2026-06-04 00:34:06 +01:00
Vykos
193dc2f085 fix(uploads): bound direct upload reads
* Stabilize full test collection

* Add bounded reads for direct uploads
2026-06-04 00:32:50 +01:00
pewdiepie-archdaemon
089246614d feat: Claude Agent integration + cookbook reconnect + UI polish
- Claude Agent integration: AGENT_CONFIGS.claude, INTG_TYPES.claude,
  setup_claude_routes + integrations/claude/ skill bundle. Wired in
  app.py alongside the existing Codex integration; same scope-gated
  /api/codex/* backend; agent form has new description so users know
  it's setup for an external CLI, not an agent streamed inside Odysseus.
- Remove mark_email_boundaries action: not good enough yet. Stripped
  from task UI, scheduler defaults, registry, tool schema, clear-cache
  route. Added to RETIRED_HOUSEKEEPING_ACTIONS so existing rows + their
  task_runs auto-purge on startup.
- Cookbook download reliability: "Reconnect" fix button in the crash
  diagnosis runs _reconnectTask after probing has-session. 30s confirm
  window before marking a download "done" — kills the Finished/Downloading
  flicker when tmux briefly drops between captures.
- Mobile UX: tap anywhere on a note card body opens the editor;
  Update button morphs to Archive when no text was edited; bell icon
  accent-colored; chip-trashing notif pills fade so only the icon
  rotates into the trash zone.
- Settings integrations: SVG-per-provider in email + API preset
  dropdowns, custom drop-up-aware menus, accent sub-header icons
  (IMAP/SMTP), consistent card styling between list + edit, contacts
  Edit/Delete icons, agent form description copy.
2026-06-04 08:27:26 +09:00
pewdiepie-archdaemon
6861c41580 Reapply "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus"
This reverts commit cc8fe2f6e3.
2026-06-03 22:47:00 +09:00
pewdiepie-archdaemon
cc8fe2f6e3 Revert "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus"
This reverts commit 8161c1253d, reversing
changes made to 8c2705b42a.
2026-06-03 22:46:19 +09:00
Alexandre Teixeira
b1a4ed13b0 Harden API-token chat endpoint selection
Validate only token-supplied direct base_url values for API-token chat requests, while keeping admin-configured endpoints available for local/LAN providers.

Scope configured endpoint fallback selection to the API token owner, fail closed for unknown token owners, and preserve strict session ownership checks when resuming sessions from chat-scoped API tokens.

Add focused regression coverage for direct base_url SSRF rejection, configured endpoint fallback behavior, token-owner scoping, URL validation, and null-owner session/endpoint handling.
2026-06-03 13:05:13 +01:00
Alexandre Teixeira
a75dd4a231 fix(search): apply recency UTC fix to live ranking module 2026-06-03 12:49:32 +01:00
Shaw
49bf73b228 fix(forms): keep PDF-form export from dropping values when the label has '*' (#1407)
parse_markdown_to_values — the read-back path for export-pdf, the export
preview, and prepare-signed-reply — matched the bold field label with [^*]+, so
it could not match a label containing '*' (the near-universal required-field
marker: "Email *", "State *", "Signature *"). The value then stayed empty, so
the exported PDF and the signed-reply attachment came out blank for that field
with no error — a whole form of required fields could export completely empty.

Match the label non-greedily (.+?) so '*' in labels is tolerated while still
splitting at the first ':**' / '**[', which also preserves a value that itself
contains ':**'.

Adds tests/test_form_markdown_roundtrip.py (render -> parse roundtrip): asterisk
text/choice/signature labels survive (fail before, pass after); plain labels and
colon-bearing values are unaffected.

Co-authored-by: NubsCarson <nubs@nubs.site>
2026-06-03 14:24:07 +09:00
Afonso Coutinho
b55c970ec5 fix: sports-hint ranking penalty fires on 'transport'/'passport' substrings (#1473)
* fix: sports-hint ranking penalty fires on 'transport'/'passport' substrings

* Apply word-boundary sports-hint fix to src/search/ranking.py as well
2026-06-03 14:23:52 +09:00
Paulo Victor Cordeiro
1feb2ae7d5 fix: close AsyncExitStack on MCP init/tool-discovery failure (#1493)
If session.initialize() or list_tools() raises after the stdio
subprocess or SSE connection is already open, the AsyncExitStack is
never closed — leaking the child process or HTTP connection. Wrap the
setup phase in try/except to aclose() the stack before re-raising.
2026-06-03 14:23:46 +09:00
ghreprimand
8c4ea484a9 Cap inline attachment context across files (#1498)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 14:23:43 +09:00
Lucas Daniel
398892cced fix(settings): catch PermissionError in load_settings + error-path tests (#1570)
PermissionError was not in the except tuple so an unreadable settings.json
would crash the app instead of falling back to defaults. Added alongside the
existing FileNotFoundError/JSONDecodeError/ValueError catches.

Also adds test_settings_error_paths.py covering all four failure modes:
missing file, corrupted JSON, wrong type, and permission denied.
2026-06-03 14:23:27 +09:00
danielroytel
39848a168b fix: recognize Gemma 4 as a thinking model and add context entry (#1642)
Gemma 4 returns reasoning_content in streaming responses via
llama-server, but the model wasn't listed in _THINKING_MODEL_PATTERNS,
causing reasoning tokens to be mishandled. Add "gemma" to the pattern
list and register Gemma 4's 128K context window in KNOWN_CONTEXT_WINDOWS
so the agent loop budgets context correctly.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-03 14:23:18 +09:00
Afonso Coutinho
b45611e9c5 fix: _strip_reasoning_prose discards the answer when reasoning trails it (#1643) 2026-06-03 14:23:15 +09:00
Afonso Coutinho
3e33cf6439 Anchor shell-verb intent patterns to imperative or can-you position (#1664) 2026-06-03 14:23:10 +09:00
Afonso Coutinho
8a0b79bc84 fix: deep research runs the prompt's example queries when the model echoes them (#1666) 2026-06-03 14:23:07 +09:00
Afonso Coutinho
b396252af6 fix: monthly tasks scheduled for day 29-31 skip every short month (#1668) 2026-06-03 14:23:01 +09:00
Afonso Coutinho
1161040efe fix: visual report drops photos whose URL slug contains icon or logo (#1685) 2026-06-03 14:22:45 +09:00
Shaw
eb5727abda fix(agent): coerce non-object tool-call arguments instead of crashing (#1370)
A native function/tool call whose `arguments` field is valid JSON but not an
object — a bare array like ["ls -la"], or a string/number/bool/null — parsed
fine in function_call_to_tool_block and then every branch called args.get(...),
raising AttributeError ('list'/'str' object has no attribute 'get'). That
propagated out of the streamed agent loop (no surrounding try/except at the
call site in stream_agent_loop) and aborted the user's entire turn. Weaker and
local models routinely emit malformed args like this.

Coerce non-dict parsed arguments to {} (mirrors the existing empty-arguments
behavior), so the tool runs with empty args instead of killing the stream.

Adds tests/test_function_call_non_object_args.py covering array/string/number/
bool/null arguments — they fail before this change and pass after.
2026-06-03 14:14:37 +09:00
ghreprimand
41d2767b30 Replace task scheduler utcnow calls (#1456)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 14:14:30 +09:00
Marius Oppedal Ringsby
4f03f5ccdd Replace cleanup service datetime.utcnow calls (#1494)
datetime.utcnow() is deprecated in Python 3.12 and removed in 3.14.
Swap the five calls in src/cleanup_service.py for a local _utcnow()
helper returning naive UTC, matching the naive DateTime columns the
archive/delete cutoffs compare against (same approach as the
task-scheduler and core-database slices). Add a regression test
asserting the helper stays naive so the cutoff math can't hit a
naive/aware TypeError.

Part of #1116
2026-06-03 14:14:27 +09:00
ghreprimand
6fd52cf317 Replace webhook manager datetime.utcnow calls (#1499)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 14:14:23 +09:00
red person
56cd8add18 Fall back from invalid preset stores (#1402) 2026-06-03 14:12:31 +09:00