Commit Graph

230 Commits

Author SHA1 Message Date
lekt8
80de69ebb0 feat: document rrule in the manage_calendar tool schema (#1320) (#1324)
* feat: document rrule in the manage_calendar tool schema (#1320)

The create_event handler already persists `rrule` (a single event carrying an
iCalendar RRULE), but the manage_calendar tool schema didn't list it, so the
agent had no documented way to make a recurring event and took a roundabout
path. Add `rrule?` to the create_event field list with examples
(FREQ=WEEKLY;BYDAY=MO etc.) and an explicit note to create ONE event with the
rule rather than looping.

Covered by tests/test_calendar_rrule.py: do_manage_calendar create_event with an
rrule stores one event with that recurrence; without it, the event is single.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: restore SessionLocal via monkeypatch in #1320 rrule test (review)

Per review: the test patched core.database.SessionLocal at module import and
never restored it, which could leak the temp DB into later tests in the same
process. Move the patch into an autouse monkeypatch fixture so it is restored
after each test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 02:37:45 +09:00
Paulo Victor Cordeiro
97f855b40d fix: pass owner to start_research in chat stream path (#1265)
* fix: pass owner to start_research in chat stream path

Research launched from the chat stream omits the owner parameter,
causing those research sessions to never appear in the user's
research library (which filters by owner). All other start_research
call sites in this file already pass owner=_user.

* test: assert all start_research calls in chat_routes pass owner

Uses AST inspection to verify every start_research() call site
includes the owner= keyword argument, preventing regressions where
new call sites forget to scope research by user.
2026-06-03 02:32:38 +09:00
Vykos
5ee30cc144 Scope skills usage by owner (#1312) 2026-06-03 02:27:43 +09:00
Vykos
1adf21a7e5 Scope email account workflows by owner (#1309) 2026-06-03 02:21:02 +09:00
Vykos
e73545f64f Keep Bitwarden unlock password off argv (#1311) 2026-06-03 02:13:51 +09:00
Michael Gerber
e392be0d65 fix: Cookbook local GGUF serving inside Docker (#1264)
* fix: Cookbook local GGUF serving inside Docker

Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint.

Fixes included here:

use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub
fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts
preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal
Before this change, Docker local serves could fail in several ways:

Cookbook pointed llama.cpp at the wrong GGUF path
generated serve runner scripts crashed before launch with a shell syntax error
successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1
This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration

* test: add test for docker-local endpoint rewrites
2026-06-03 02:08:09 +09:00
Paulo Victor Cordeiro
5452bc96b1 fix: markdown table renders separator row as visible data (#1252)
* fix: markdown table renders separator row as visible data

The alignment separator (|---|---|) at row index 1 was rendered as a
<td> row with dashes as cell content. Skip it and only open <tbody>
at that point, so tables render as header + data without the garbage
separator row in between.

* test: add regression test for table separator row rendering

Verifies that the markdown table renderer skips the separator row
(|---|---|) instead of rendering it as a visible data row. Also
updates the test harness to handle the splitTableRow import.
2026-06-03 01:59:05 +09:00
Afonso Coutinho
a8395b4e4c fix: agent_input_token_budget wrongly treated as a secret and unsettable from chat (#1294)
* fix: don't classify agent_input_token_budget as a secret (token must be a suffix)

* test: agent_input_token_budget is settable from chat
2026-06-03 01:53:47 +09:00
lekt8
adde94e430 fix: closed document stays active & leaks into new chats (#1160) (#1238)
* fix: closed document no longer stays active and leaks into new chats (#1160)

Closing a document tab calls _detachDocFromSession: a doc with content is
PATCHed to session_id="" (unlinked, session_id -> NULL, is_active stays True),
an empty one is DELETEd. But the in-memory active-document pointer
(tool_implementations._active_document_id) was never cleared on either path.

The chat doc-injection last-resort looks up that pointer by id and injects it
when `not cand.session_id or cand.session_id == session`. An unlinked doc has
session_id NULL, so the stale pointer re-surfaced a closed document in later,
unrelated chats — the agent kept reading/suggesting edits to a doc the user
had closed.

Fix: add clear_active_document(doc_id) and call it when a document is unlinked
(PATCH session_id="") or deleted, so the pointer no longer resurrects a closed
document. clear_active_document only clears when the id matches (or no id), so a
different active doc is left untouched.

Covered by tests/test_active_document_clear.py (4 cases).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: add route-level regression for #1160 (detach/delete clears active doc)

Per review: prove the actual API path, not just the helper. Drives
PATCH /api/document/{id} (session_id="") and DELETE /api/document/{id}
through TestClient against a temp SQLite DB under real owner routing, and
asserts get_active_document() is cleared (and untouched when a different
document is closed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: make #1160 route regression hang-proof and dev-DB-independent

The route test could hang in other environments: it set DATABASE_URL at import
time, which is ignored if core.database was already imported, so it fell back to
the real dev DB and could contend for its locks (maintainer saw it hang, exit
124).

Rebind to a DEDICATED temporary SQLite engine (NullPool) and patch the document
route module's SessionLocal to it via an autouse fixture — so the test never
touches the dev DB and is independent of import order. Runs in ~0.3s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: drive #1160 route regression without TestClient (fixes local hang)

The route test used Starlette TestClient (middleware app + threadpool), which
hung in the maintainer's environment. Rework it to call the async route handlers
directly — extracted from the router — with a minimal fake request against a
temp-SQLite-patched SessionLocal. Same real coverage (handler + DB + owner
routing), but it completes reliably (~0.3s) with no TestClient/threadpool.

Verified the maintainer's exact batch now passes:
  pytest tests/test_document_close_clears_active_route.py \
         tests/test_active_document_clear.py \
         tests/test_document_tool_owner_scope.py  -> 14 passed

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 01:47:13 +09:00
lekt8
1507d140b8 feat: CalDAV write-back — push local event create/update/delete to the remote (#800) (#1282)
* feat: CalDAV write-back — push local event create/update/delete to the remote (#800)

CalDAV sync was pull-only (src/caldav_sync.py), so events created, edited, or
deleted in Odysseus on a CalDAV-backed calendar only changed local SQLite and
never reached the server — they silently vanished on the next pull and never
appeared on the user's phone (iCloud, etc.).

This adds the missing write half:
- src/caldav_writeback.py builds the VEVENT, re-discovers the remote calendar by
  the same URL-hash the local id was derived from (the remote URL isn't stored),
  and PUTs/DELETEs the event by UID via the caldav lib. The pure pieces
  (build_event_ical, find_remote_calendar, push_event) take inputs by argument so
  they unit-test against a fake client with no network.
- create/update/delete event handlers (routes/calendar_routes.py) call it
  best-effort for caldav-sourced calendars only: the local DB stays the source of
  truth, a remote failure is logged, never fatal, and local calendars are untouched.

Tests: tests/test_caldav_writeback.py (9, pure logic incl. iCal serialization,
hash discovery, create/update/delete orchestration) and
tests/test_caldav_writeback_route.py (3, route-level: a caldav calendar pushes,
a local one does not, delete pushes a delete). 12 passed.

Note: write-back re-discovers the remote calendar per write (the URL isn't
persisted locally); a follow-up could cache it. Live-iCloud verification needs a
real account — flagging for a maintainer pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: drive #800 route regression without TestClient (fixes local hang)

Same fix as the document route test: the CalDAV write-back route regression used
Starlette TestClient (middleware app + threadpool) which hung in the maintainer's
environment. Rework it to call the async create/delete calendar handlers directly
— extracted from the router — with a minimal fake request, temp-SQLite-patched
SessionLocal, and writeback_event stubbed to record calls. Same coverage (a
caldav calendar pushes, a local one does not, delete pushes a delete), completes
in ~0.3s with no TestClient.

Verified the maintainer's exact batch:
  pytest tests/test_caldav_writeback.py tests/test_caldav_writeback_route.py -> 12 passed

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 01:44:02 +09:00
Shreyas S Joshi
7504fedb17 fix: surface reasoning_content when content is empty (thinking models) (#1233)
Thinking models served via llama.cpp without --reasoning-format none
(e.g. Qwen3, DeepSeek-R1) route all tokens into reasoning_content and
return content="". Two call paths were silently broken:

- llm_call / llm_call_async (non-streaming): hard-keyed
  data["choices"][0]["message"]["content"] raises KeyError or returns
  empty string, discarding the entire response.

- stream_agent_loop end-of-round fallback: when full_response is empty
  but round_reasoning has content, the existing code replaced the
  response with the generic empty-response error message, discarding
  all reasoning tokens that were correctly accumulated during streaming.

Fix: in both non-streaming paths use msg.get("content") or
msg.get("reasoning_content") or "". In the streaming fallback, surface
round_reasoning as the answer before falling through to the error path.
2026-06-03 01:41:24 +09:00
nickorlabs
c39d8db12a fix(agent): make context-budget hard_max configurable via agent_input_token_hard_max setting (#1273)
Completes the reviewer requirement from PR #1190 review that was carried
over but not implemented in #1230:

> "The hard max is a function-local constant. For this setting, the ceiling
>  should be configurable or at least represented as a named setting/default
>  with tests."
                                                                — review on #1190

#1230 shipped the adaptive auto-derivation but left `DEFAULT_HARD_MAX = 200_000`
as a hardcoded module constant in src/context_budget.py. Admins on premium
APIs with large context windows (kimi-k2 / minimax-m3 at 1M, etc.) can use
their full window today only by setting `agent_input_token_budget`
explicitly — which then takes them off the adaptive auto-path entirely.

## What this PR changes

- src/settings.py: register `agent_input_token_hard_max` in
  DEFAULT_SETTINGS, default 200_000 (matches `DEFAULT_HARD_MAX`). Inline
  comment documents the no-op semantics in the explicit branch.

- src/agent_loop.py: read the setting at the call site and pass it as the
  `hard_max` kwarg of `compute_input_token_budget`. Defensive parsing —
  missing / non-int / zero values fall back to `DEFAULT_HARD_MAX`, so a
  misconfig cannot silently zero the budget.

- src/tool_implementations.py: three friendly aliases for `manage_settings`:
  - "hard max" -> agent_input_token_hard_max
  - "token budget cap" -> agent_input_token_hard_max
  - "input budget cap" -> agent_input_token_hard_max
  Plus the existing "token budget" -> agent_input_token_budget keeps a
  matching shorter alias "input budget".

- tests/test_context_budget.py: 6 new tests on top of the existing 6:
  - hard_max raises the auto ceiling (1M ctx + raised cap -> 85% of ctx)
  - hard_max lowers the auto ceiling (128K ctx + 50K cap -> 50K)
  - hard_max has no effect on the explicit branch
  - DEFAULT_SETTINGS contains the new key
  - manage_settings aliases are registered
  - the live get_setting path returns the override value, and malformed
    values fall back per the agent_loop defensive parsing

12 passed in 0.04s. No changes to the pure helper signature or semantics;
#1230's behavior is the default when the new setting is unset.

## How it lets users drop the explicit override

Before this PR, on a 1M-context model:
  agent_input_token_budget = 900_000  (explicit)  -> 900K  [user override]
  agent_input_token_budget = <unset>  (auto)      -> 200K  [HARD_MAX]

After this PR, same model:
  agent_input_token_budget = <unset>
  agent_input_token_hard_max = 900_000
                                      -> min(1M * 0.85, 900K) = 850K  [auto, no override needed]

The explicit-override path keeps working unchanged for users who prefer it.
2026-06-03 01:36:57 +09:00
Afonso Coutinho
926a4c59cb fix: 2FA bypassed when enabled but TOTP secret is missing (fail-open) (#1286)
* fix: fail closed when 2FA is enabled but the TOTP secret is missing

* test: totp_verify fails closed when secret missing, passes when 2FA off
2026-06-03 01:26:47 +09:00
Afonso Coutinho
65751186bd fix: merging consecutive user messages corrupts multimodal (image) content (#1277)
* fix: preserve multimodal content blocks when merging consecutive user messages

* test: consecutive user-message merge keeps multimodal image blocks
2026-06-03 01:21:57 +09:00
Afonso Coutinho
83aa35b83e fix: owner-less document query passes bare False to SQLAlchemy filter() (#1281)
* fix: use SQL false() for owner-less document query (filter(False) raises in SQLAlchemy 2.x)

* test: owner-less document query doesn't pass a bare False to filter
2026-06-03 01:20:43 +09:00
Afonso Coutinho
a3b3dbafde fix: uploaded files with no extension become permanently unresolvable (#1275)
* fix: accept extensionless upload ids so files like Dockerfile resolve

* test: upload id validation accepts extensionless ids
2026-06-03 01:16:30 +09:00
Afonso Coutinho
f62d6ea3d7 fix: research query misclassifies 'whatsapp'/'however' as questions (#1247)
* fix: detect question words as whole words, not prefixes

* fix: same question-word prefix bug in the services search copy

* test: question-word detection rejects prefix lookalikes
2026-06-03 01:10:06 +09:00
Afonso Coutinho
311f226d44 fix: calendar check-in digest drops events 7-8 days out (#1249)
* fix: close 1-day gap in calendar digest windows (events ~7-8 days out)

* test: calendar digest windows are contiguous and cover 7-8 day events
2026-06-03 01:03:58 +09:00
red person
aa420e2060 Ignore stale duplicate upload rows (#1256) 2026-06-03 00:59:01 +09:00
Afonso Coutinho
a04553013d fix: Anthropic responses with multiple text blocks lose all but the first (#1255)
* fix: concatenate all Anthropic text blocks, not just the first

* test: Anthropic response parsing concatenates text blocks
2026-06-03 00:57:20 +09:00
red person
a901992d03 Ignore non-object vault config (#1258) 2026-06-03 00:55:04 +09:00
Shreyas S Joshi
b29c200801 fix(mcp): invalidate tool prompt cache on connect/disconnect/error (#1235)
* fix(mcp): invalidate tool prompt cache on connect/disconnect/error

get_tool_descriptions_for_prompt cached its result keyed only on
(disabled_map, len(_tools)). If a server reconnects with the same
tool count (or transitions to error state), the cache was never
busted — the agent received stale tool descriptions for the new
connection state.

Add a _generation counter incremented on every structural change
(successful connect, disconnect, connection error) and include it in
the cache key.

* test(mcp): regression test for _generation cache invalidation
2026-06-03 00:49:29 +09:00
ghreprimand
77320b617f Fix owner-scoped skill updates (#1240)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 00:42:56 +09:00
Afonso Coutinho
3137ee4946 fix: theme color parsing breaks on #rgb shorthand hex (#1213)
* refactor: add pure hexToRgb helper that handles #rgb shorthand

* fix: handle #rgb shorthand hex in theme color parsing

* test: hexToRgb expands shorthand and rejects invalid input
2026-06-03 00:30:03 +09:00
Afonso Coutinho
203c4d83df fix: search analytics crashes recording when the JSON file predates a counter (#1224)
* refactor: single _default_analytics() instead of duplicated default dicts

* fix: merge analytics defaults so an old/partial file doesn't KeyError on record

* test: analytics load merges defaults; record survives a partial file
2026-06-03 00:26:37 +09:00
lekt8
975fd42e32 fix: rank recency by UTC, not local time (#1116) (#1234)
src/search/ranking.py computed result age as `(datetime.now() - dt).days`, where
`dt` is parsed from a UTC-style published date with no timezone. Using local
`datetime.now()` skewed the age by the host's UTC offset (off-by-up-to-a-day near
boundaries), and was a latent crash: once neighbouring code becomes timezone-aware
the naive/aware subtraction raises TypeError (the landmine called out in #1116).

Recency is now measured against naive UTC. The scoring is also lifted out of the
rank_search_results closure into a module-level, time-injectable `recency_score`
so it's unit-testable, and `_utcnow_naive()` avoids `datetime.utcnow()` (removed in
Python 3.14).

Covered by tests/test_search_ranking_recency.py (5 cases); the existing
tests/test_search_ranking.py still passes.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 00:18:15 +09:00
lekt8
8c376d2b0e feat: adapt agent_input_token_budget to the model context window (#1170) (#1230)
The agent soft-trims input context to `agent_input_token_budget` (default 6000).
The old computation `min(context_length or budget, budget)` made the 6000 default
a hard ceiling for every model, so 128K/1M context models were silently capped at
6000 input tokens — now that num_ctx is sent correctly (#1056), this was the last
barrier to actually using a long context window.

This derives the default budget from the model's discovered context window
(~85%, capped at a generous hard max) while honouring an explicit user setting
exactly (clamped to the window). When the window is unknown it falls back to the
previous value, so behaviour is unchanged for that case.

- src/context_budget.py: pure `compute_input_token_budget()` (unit-testable)
- src/settings.py: `is_setting_overridden()` to tell an explicit user value from
  the merged default (load_settings merges DEFAULT_SETTINGS, so equality alone
  can't distinguish them)
- src/agent_loop.py: use the helper in the soft-trim path

Covered by tests/test_context_budget.py (6 cases).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 00:13:53 +09:00
ghreprimand
1fda906407 Fix Cookbook container-local model endpoints (#1223)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 00:09:48 +09:00
ghreprimand
e72b9a8a95 Fix stale deleted sessions in sidebar (#1203)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 23:52:22 +09:00
lekt8
87babb58d5 fix: SSRF hardening for the custom embedding endpoint URL (#132) (#1206)
POST /api/embeddings/endpoint takes a user-supplied URL and immediately
makes an outbound httpx request to it with no validation. The admin gate
added earlier (PR #80) closed the unauthenticated-access part of #132; this
addresses the remaining request: validate the URL before fetching it.

Odysseus is local-first, so pointing the embedding endpoint at a loopback or
LAN server (local vLLM / llama.cpp / Ollama) is a normal setup — a blanket
private-IP block would break the primary use case. So the guard:

  - always rejects non-HTTP(S) schemes (file://, gopher://, ftp:// …),
  - always rejects the link-local range (169.254.0.0/16, incl. the cloud
    instance-metadata 169.254.169.254 exfil vector) plus multicast /
    reserved / unspecified, and IPv4-mapped-IPv6 forms of the above,
  - keeps loopback/LAN allowed by default, and
  - adds EMBEDDING_BLOCK_PRIVATE_IPS=true for full SSRF lockdown on exposed
    multi-tenant deployments.

Logic lives in src/url_safety.py (stdlib only, resolver injectable) so it is
unit-testable without real DNS; the route calls it before the health-check
request. Covered by tests/test_url_safety.py (8 cases).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 23:46:33 +09:00
red person
258e6fc0d4 fix(ui): allow manual prompt bar resize (#1201) 2026-06-02 23:43:53 +09:00
red person
42ae905df7 fix(models): clear deleted endpoint fallback refs (#1207) 2026-06-02 23:41:04 +09:00
red person
cc6e43da44 Report provider-specific search API keys correctly (#1202)
* fix(search): report provider-specific API keys

* fix(search): include provider env keys in status
2026-06-02 23:37:15 +09:00
lekt8
f2f437f4a8 feat: add /api/ready readiness probe (DB, data dir, local-first) (#1200)
/api/health is a liveness ping. This adds /api/ready as a readiness /
integrity self-check that returns 503 unless every critical subsystem is
whole, so an orchestrator (Docker/Compose/k8s) can gate traffic on real
readiness rather than mere process liveness:

  - database: opens a connection and runs SELECT 1
  - data_dir: confirms the data directory exists and is writable
  - local_first: reports whether storage stays on the host (informational;
    a remote database is a valid deployment, so it never fails readiness)

The check logic lives in src/readiness.py so it is unit-testable in
isolation; the route is a thin wrapper. Covered by tests/test_readiness.py.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 23:33:22 +09:00
red person
76a7685105 fix(models): clear stale speech endpoint settings (#1196) 2026-06-02 23:32:01 +09:00
red person
69ab350919 fix(ui): keep minimized windows above composer (#1197) 2026-06-02 23:31:09 +09:00
red person
0db441b191 fix(ui): contain email split divider (#1194) 2026-06-02 23:28:24 +09:00
Mayank Ukey
f96edfe5ca fix: deepseek-r1 on Ollama returns HTTP 400 when tool schemas are sent (#1169)
* fix: exclude deepseek from local tool-calling keyword list

deepseek-r1 on Ollama returns HTTP 400 when tool schemas are sent.
The cloud API (api.deepseek.com) is already caught by the _API_HOSTS
check, so the generic 'deepseek' keyword match was only causing false
positives for local Ollama-served models.

* fix: add model no-tools blocklist and regression tests for deepseek-r1

The previous fix removed 'deepseek' from the keyword allow-list, but
_is_api_model is still True for localhost endpoints because 'localhost'
appears in _API_HOSTS — so the keyword change had no effect for Ollama.

Proper fix: add an explicit _model_no_tools blocklist ('deepseek-r1')
that overrides the endpoint URL check. The endpoint's supports_tools DB
flag still takes priority either way (True forces tools on, False forces
them off), so users can override per-endpoint when needed.

Also refined the deepseek allow-list: 'deepseek-v' and 'deepseek-chat'
cover the cloud models (v2, v3, chat) that do support tools, without
matching deepseek-r1 variants.

13 regression tests cover:
- deepseek-r1 on localhost/docker: no tools (was HTTP 400)
- deepseek-v3/chat on api.deepseek.com: tools enabled (no regression)
- endpoint_supports=True/False overrides both lists
- qwen/llama on localhost: unaffected
2026-06-02 23:22:57 +09:00
spooky
f667667da3 fix: distinguish external cookbook runtimes (#1188) 2026-06-02 23:20:00 +09:00
PrabinDevkota
6b7dd4ea28 fix(auth): case-insensitive owner migration on username rename (#1183)
Use func.lower() when updating SQL owner columns, match prefs keys
case-insensitively, and normalize session usernames before comparing
during rename. Prevents silently skipping legacy mixed-case owner data.

Fixes #1165
2026-06-02 23:18:15 +09:00
spooky
5b87e69221 feat: add vllm kv cache dtype option (#1185) 2026-06-02 23:17:16 +09:00
ghreprimand
7b43fa9372 Improve calendar event text contrast (#1184)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 23:14:52 +09:00
Ernest Hysa
c12ae79c42 fix(tools): strict path confinement with sensitive-subpath deny list (#1072)
Rework read_file / write_file confinement after review feedback:

- Remove $HOME from default allow roots. Only project data/ and system
  temp dirs are allowed out of the box.
- Add a sensitive-subpath deny list (.ssh, .gnupg, shell rc files,
  .env, .netrc, SSH key filenames). Checked BEFORE allowlist so it
  blocks even when a broader root is configured.
- Add "tool_path_extra_roots" setting for opt-in broader access.
- Sensitive subpaths remain blocked regardless of configured roots.

Tests: 24 cases covering /etc/shadow, ~/.ssh/authorized_keys,
symlink into .ssh, traversal, shell rc files, key filenames,
extra roots, and dispatch-level end-to-end.
2026-06-02 23:13:30 +09:00
Shaw
16f7feee0a fix(hwfit): honor manual "metal" backend in the hardware simulator (#1090)
The Cookbook's manual hardware simulator ("what if I had this setup") let users
pick a backend, but _apply_manual_hardware only accepted cuda/rocm/cpu_x86/
cpu_arm and silently coerced anything else to cuda. So selecting Apple/Metal
simulated a CUDA box instead — and ranked safetensors-only repos a Mac can't
serve, even though the rest of hwfit (services.hwfit.fit, the serve-command
generation) already supports Metal as GGUF-only via llama.cpp/Ollama.

Add "metal" to the accepted backends (now a named _MANUAL_BACKENDS set, kept a
subset of what fit.py understands) and set unified_memory=True for it — Apple
Silicon shares one memory pool with the GPU — while clearing that flag for the
discrete (cuda/rocm) and CPU backends. _apply_manual_hardware is lifted to
module scope so it is directly unit-testable; both route call sites are
unchanged.

Adds tests/test_hwfit_manual_backend.py, including an end-to-end check that a
simulated Metal box only recommends GGUF-servable models.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:12:34 +09:00
red person
c7ddfd7dd2 Use shared IMAP timeout for account tests (#1088) 2026-06-02 23:11:04 +09:00
ghreprimand
21b6f9344e Normalize native select option theming (#1178)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-02 23:09:15 +09:00
RosenTomov
37356d8e3e Discover LM Studio via host/port scanning and native-API fingerprint (#1126)
Scan port 1234 and any custom port from LM_STUDIO_URL, add the LM_STUDIO_URL host to the discovery sweep alongside the Ollama env vars, and tag each discovered endpoint with its provider by fingerprinting the native /api/v1/models response (entries carrying key + architecture). Documents LM_STUDIO_URL in .env.example.
2026-06-02 23:04:58 +09:00
Jordan Urbs
c0c1ceb36d Treat Venice as a tool-capable SOTA cloud provider (#1173)
Follow-up to the Venice provider PR. Wire api.venice.ai into the three
host allowlists so Venice behaves like the other paid OpenAI-compatible
clouds:

- agent_loop: add api.venice.ai to _API_HOSTS so the agent sends native
  OpenAI tool-call schemas (Venice supports function calling) instead of
  degrading to fenced-block parsing.
- teacher_escalation: add api.venice.ai to _SOTA_HOSTS so the escalation
  loop stays OFF for Venice (it's a paid top-tier API; no need to add
  teacher-model latency).
- webhook_routes: add venice to KNOWN_PROVIDERS so the sync chat webhook
  can auto-resolve base_url from provider=venice.

Tests: tests/test_venice_hosts.py pins tool-host matching + SOTA
classification for Venice; py_compile on touched modules.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-02 23:03:46 +09:00
Mayank Ukey
3799dc102f fix: ICS export — escape X-WR-CALNAME and honour is_utc on DTSTART/DTEND (#1174)
Two bugs in the export_ics path:

1. X-WR-CALNAME was written raw: calendar names containing commas,
   semicolons or backslashes produced invalid ICS (RFC 5545 §3.3.11
   requires those characters to be escaped as \, \; and \\).
   Fix: wrap cal.name in the existing _ics_escape() helper, which is
   already used for SUMMARY, DESCRIPTION, and LOCATION on the lines
   immediately below.

2. DTSTART and DTEND on non-all-day events always emitted the naive
   ISO string (e.g. 20260602T100000) regardless of CalendarEvent.is_utc.
   Consumers treat a naive datetime as floating/local time, so UTC
   events imported into Google Calendar or Apple Calendar shifted by
   the user's timezone offset.  Fix: append 'Z' when is_utc is True,
   matching the pattern already used by the serialise_event() helper
   at line 408.
2026-06-02 23:02:28 +09:00
RosenTomov
a493fb49b0 Use LM Studio-reported vision capability for image passthrough (#1130)
Read a model's capabilities.vision flag from LM Studio's native /api/v1/models so vision finetunes whose names lack a vision keyword still receive images, falling back to the name heuristic when the endpoint doesn't report it. The probe is short-TTL cached and restricted to local/LAN hosts, so remote/cloud endpoints are never contacted.
2026-06-02 23:01:04 +09:00