odysseus waits on searxng's healthcheck (depends_on: condition: service_healthy),
so when the upstream `searxng:latest` tag is broken the whole app never starts.
The 2026.6.2 image crashes on boot with `KeyError: 'default_doi_resolver'`,
failing the healthcheck and blocking fresh Docker installs (issue #1414).
Pin to the last known-good tag (2026.5.31-7159b8aed) instead of :latest, with a
comment to bump it deliberately after verifying a newer tag boots clean.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Cookbook download path showed its error toasts with the default ~1.2s
duration, so an actionable message like "tmux is required for Cookbook
background downloads/serves … install it with your OS package manager" vanished
before it could be read (issue #1355). The serve path already uses multi-second
durations.
Give the three "Download failed" toasts a 9s duration to match.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: MCP reconnect via tool passes only server_id to connect_server
connect_server requires name, transport, command, args, env, and url
but the reconnect path in do_manage_mcp only passed the server_id,
causing a TypeError on every reconnect attempt. Mirror the pattern
used in mcp_routes.py reconnect_server.
* test: verify MCP reconnect passes full server config to connect_server
Mocks the MCP manager and DB to assert that do_manage_mcp reconnect
passes name, transport, command, args, env, and url — not just the
server_id.
The decorative banner under the title wasn't in a fenced code block, so GitHub's
markdown collapsed its leading whitespace and joined the box-drawing rules,
rendering the ASCII art misaligned instead of monospace-as-typed (issue #1390).
Fence it; the H1 title stays a real heading.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
iOS Safari auto-zooms when a focused input has font-size < 16px. Bump
text-entry controls to 16px under (hover: none) and (pointer: coarse) so
desktop sizing is untouched. Date/time inputs and selects are excluded —
they open native pickers and never zoom.
Doc-editor tiers keep their size hierarchy: Large lands at 18px (above the
16px threshold) instead of collapsing onto Medium, and the email rich-body
Large (17px) is left alone since it was already zoom-safe. All three editor
layers (textarea, highlight overlay, line numbers) move together so the
syntax overlay stays metrically aligned.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
PRs #738 and #644 committed their before/after review screenshots into the
repo (docs/a11y/focus-*.png, docs/a11y/login-*.png, docs/gallery-314-*.png).
Nothing references these files, so they only showed up as "random images" in
the doc folder (issue #1335). The README hero image and the feature preview
clips are referenced and are left untouched.
Add tests/test_docs_no_orphan_images.py to guard against recurrence: it fails
if any image under docs/ is referenced by no tracked text file.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After a deep-research job completes, a follow-up like "check it out" / "read
that report" had the agent web_fetch the /api/research/report/{id} HTML render
(and then drift into unrelated searches) instead of reading the saved report
(issue #1363). The report text is already available via the manage_research
tool (action read), and action list returns ids most-recent-first, so the
agent can resolve "the recent report" itself.
Strengthen the manage_research instructions: read a finished report via
action list -> action read; do NOT web_fetch/app_api the report URL (it renders
HTML, not clean text) and do NOT start a fresh web_search just to read an
existing report. Annotate the app_api endpoint list to say the same.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
_detect_nvidia parsed nvidia-smi --query-gpu=memory.total,name and did
float(memory.total) per row, dropping the row on ValueError. Grace Blackwell
GB10 (DGX Spark, sm_121) reports memory.total as '[N/A]'/'Not Supported'
because the GPU shares the system LPDDR pool rather than carrying discrete VRAM
— so the only GPU row was dropped and a real GB10 (even with vLLM running on it)
was reported as 'No GPU', breaking Cookbook recommendations and model switching.
Keep a named device whose memory.total is non-numeric: when there are no
discrete-VRAM rows but such unified devices exist, report a unified-memory CUDA
GPU backed by the system RAM pool (has_gpu, name, backend=cuda, count,
unified_memory=True) — mirroring how Apple Silicon and AMD APUs are already
handled. Discrete GPUs are unchanged, and a box with a real discrete GPU keeps
the discrete path.
Adds tests/test_hwfit_unified_nvidia.py with a GB10 nvidia-smi fixture: the
device is detected (not dropped), surfaces through detect_system with
unified_memory propagated, discrete GPUs stay non-unified, and a discrete GPU
takes precedence over an N/A-memory row.
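
A minimal sketch of the row classification described above, not the actual _detect_nvidia code. The function name `parse_gpu_rows` is hypothetical, and the input is assumed to be comma-separated `memory.total, name` rows (for example from adding `--format=csv,noheader,nounits` to the query, which is an assumption about the invocation):

```python
import csv
from io import StringIO


def parse_gpu_rows(output: str) -> dict:
    """Classify 'memory.total, name' rows into discrete or unified-memory GPUs."""
    discrete = []  # (name, memory) for rows with a numeric memory.total
    unified = []   # names of devices whose memory.total is not numeric
    for row in csv.reader(StringIO(output)):
        if len(row) < 2:
            continue  # blank or malformed line
        mem, name = row[0].strip(), row[1].strip()
        try:
            discrete.append((name, float(mem)))
        except ValueError:
            # e.g. '[N/A]' or 'Not Supported': keep the named device, do not drop it
            if name:
                unified.append(name)
    if discrete:
        # A real discrete GPU takes precedence over any N/A-memory row.
        return {"has_gpu": True, "name": discrete[0][0], "backend": "cuda",
                "count": len(discrete), "unified_memory": False}
    if unified:
        return {"has_gpu": True, "name": unified[0], "backend": "cuda",
                "count": len(unified), "unified_memory": True}
    return {"has_gpu": False}


print(parse_gpu_rows("[N/A], NVIDIA GB10\n"))
```

The fixture row here is illustrative; the real test fixture may differ.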
Co-authored-by: NubsCarson <nubs@nubs.site>
POST /api/skills/{id}/markdown set sk.name = slugify(sk.name or match['name']),
taking the name parsed from the edited markdown frontmatter. A changed name
makes update_skill() move the skill directory on disk and re-key its usage
sidecar, orphaning the original id. The UI still holds that original id, so the
next DELETE /api/skills/{id} fails the name/id lookup and 404s — 'can't delete
them now'.
The audit save path (_apply_skill_md) already guards against exactly this with
sk.name = name and an explicit 'must NEVER rename the skill' comment. Apply the
same pin here: keep the stored name on markdown save (content edits still take
effect; only the rename is suppressed). Drops the now-unused slugify import.
Adds tests/test_skill_save_no_rename.py: saving markdown whose frontmatter
renames the skill keeps the original name and applies the edit, and a
subsequent delete-by-original-id succeeds. Pure unit test — calls the route
handlers directly with a mock Request (no server/network), like
test_skills_delete_owner.py.
Co-authored-by: lalalune <shawgotbags@gmail.com>
When a timezone is configured, `now` is tz-aware local time.
The comparison stripped tzinfo with `.replace(tzinfo=None)`,
producing naive local time, but `scheduled_date` is stored as
naive UTC. For users east of UTC this causes tasks to appear
expired prematurely; for users west they linger past due time.
Use `_to_utc_naive(now)` to convert to the same reference frame.
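
A small illustration of why stripping tzinfo is wrong here. `to_utc_naive` below is a hypothetical stand-in for the project's `_to_utc_naive`, whose real body is not shown in this message; treating naive input as already-UTC is an assumption:

```python
from datetime import datetime, timedelta, timezone


def to_utc_naive(dt: datetime) -> datetime:
    """Convert an aware datetime to naive UTC; pass naive values through."""
    if dt.tzinfo is None:
        # Assumption: naive inputs are already UTC, like scheduled_date.
        return dt
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


# A user at UTC+2 at 10:00 local time: the real instant is 08:00 UTC.
local_now = datetime(2026, 6, 1, 10, 0, tzinfo=timezone(timedelta(hours=2)))
scheduled_date = datetime(2026, 6, 1, 9, 0)  # naive UTC, one hour away

stripped = local_now.replace(tzinfo=None)  # 10:00, wrongly compared as UTC
converted = to_utc_naive(local_now)        # 08:00, same frame as scheduled_date

print(stripped > scheduled_date)   # True: the task looks expired too early
print(converted > scheduled_date)  # False: the task is still pending
```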
Deep research generated search queries from the LLM's training-cutoff
knowledge, so it emitted stale-year queries like "best Python tutorials
2025" when the actual year is later (issue #1341). The chat/agent path
already grounds the model with "Today is ..." (src/agent_loop.py); the
deep research planning and query-generation prompts had no equivalent.
Add a small current_date_context() helper and prepend it at the plan and
query-generation prompt sites (and the research_handler plan preview path
that reuses RESEARCH_PLAN_PROMPT). System-TZ local, portable strftime.
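
A rough sketch of what such a helper could look like. The exact wording of the real preamble and the `build_plan_prompt` call site are assumptions, not the shipped code:

```python
from datetime import datetime
from typing import Optional


def current_date_context(now: Optional[datetime] = None) -> str:
    """Return a one-line date preamble in system-local time."""
    now = now or datetime.now()
    # Use the day attribute instead of %-d, which is not portable to Windows.
    return f"Today is {now.strftime('%A, %B')} {now.day}, {now.year}."


def build_plan_prompt(question: str) -> str:
    # Hypothetical prompt site: prepend the date so generated queries use the real year.
    return f"{current_date_context()}\n\nPlan research for: {question}"


print(build_plan_prompt("best Python tutorials"))
```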
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
If any exception occurred after conn was created but before the
explicit conn.logout() call, the IMAP connection leaked. Use
try/finally to guarantee cleanup on all exit paths.
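
The pattern, sketched against a fake connection so it runs without a mail server. `FakeImapConnection` and `fetch_inbox` are hypothetical names for illustration only:

```python
class FakeImapConnection:
    """Stand-in for an IMAP connection; select() fails to simulate an error."""

    def __init__(self):
        self.logged_out = False

    def select(self, mailbox):
        raise RuntimeError("mailbox unavailable")

    def logout(self):
        self.logged_out = True


def fetch_inbox(conn):
    try:
        conn.select("INBOX")  # any exception here used to skip logout()
        return "ok"
    finally:
        conn.logout()  # runs on success, on exception, and on early return


conn = FakeImapConnection()
try:
    fetch_inbox(conn)
except RuntimeError:
    pass
print(conn.logged_out)
```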
* feat: document rrule in the manage_calendar tool schema (#1320)
The create_event handler already persists `rrule` (a single event carrying an
iCalendar RRULE), but the manage_calendar tool schema didn't list it, so the
agent had no documented way to make a recurring event and took a roundabout
path. Add `rrule?` to the create_event field list with examples
(FREQ=WEEKLY;BYDAY=MO etc.) and an explicit note to create ONE event with the
rule rather than looping.
Covered by tests/test_calendar_rrule.py: do_manage_calendar create_event with an
rrule stores one event with that recurrence; without it, the event is single.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: restore SessionLocal via monkeypatch in #1320 rrule test (review)
Per review: the test patched core.database.SessionLocal at module import and
never restored it, which could leak the temp DB into later tests in the same
process. Move the patch into an autouse monkeypatch fixture so it is restored
after each test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
If c.store() or c.expunge() raised an exception, the connection was
never logged out. Use try/finally to ensure c.logout() is always
called regardless of how the function exits.
* fix: pass owner to start_research in chat stream path
Research launched from the chat stream omitted the owner parameter,
so those research sessions never appeared in the user's
research library (which filters by owner). All other start_research
call sites in this file already pass owner=_user.
* test: assert all start_research calls in chat_routes pass owner
Uses AST inspection to verify every start_research() call site
includes the owner= keyword argument, preventing regressions where
new call sites forget to scope research by user.
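
One way such an AST check could be written, as a sketch rather than the actual test. The helper name `calls_missing_keyword` and the sample source are made up for illustration:

```python
import ast


def calls_missing_keyword(source: str, func_name: str, keyword: str) -> list:
    """Return line numbers of calls to func_name that lack the keyword argument."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        # Match bare calls (f(...)) and attribute calls (obj.f(...)).
        if isinstance(target, ast.Name):
            name = target.id
        else:
            name = getattr(target, "attr", None)
        if name != func_name:
            continue
        if not any(kw.arg == keyword for kw in node.keywords):
            missing.append(node.lineno)
    return sorted(missing)


SAMPLE = '''
start_research(query, owner=_user)
svc.start_research(query)
'''

print(calls_missing_keyword(SAMPLE, "start_research", "owner"))
```

A real test would read the route module's source from disk and assert the returned list is empty.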
* chore: add PR template, issue templates, and triage action
Adds a complete contribution quality layer to reduce maintainer triage burden:
- .github/pull_request_template.md — structured PR description with checklist
enforcing target branch, one-concern rule, CI green, no print(), schema
regeneration, and ADR/CONTEXT.md update requirements
- .github/ISSUE_TEMPLATE/bug_report.yml — required-field YAML form; GitHub
blocks submission until reproduction steps and environment are filled in
- .github/ISSUE_TEMPLATE/feature_request.yml — required problem/proposal fields
with duplicate-check prompt
- .github/ISSUE_TEMPLATE/config.yml — disables blank issues; funnels questions
to Discussions
- .github/workflows/triage.yml — auto-closes issues and PRs from accounts
younger than 7 days, and closes anything with an empty or unfilled body
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: simplify to templates only — drop triage workflow
- PR template: target main (not dev), strip TS/pnpm/ADR checklist items
that aren't enforced in the current codebase yet
- Remove .github/workflows/triage.yml — account-age and auto-close
policy needs explicit maintainer sign-off before automation
Issue templates and config.yml are unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: drop CI-green item — no active CI workflow yet
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: upgrade templates with feedback from #1222 and #1211 thread
Bug report:
- Add install method dropdown (Docker / pip / Windows / macOS)
- Split into separate Expected Behaviour and Actual Behaviour fields
- Add Model / Backend field for LLM-related bugs
- Add prerequisites checkboxes: duplicate search, security vuln redirect,
running latest main
- Add Additional Information free-text field
Feature request:
- Add prerequisites checkboxes (searched issues, searched discussions,
concrete proposal)
- Add area dropdown (Chat/Email/Calendar/Cookbook/etc.) for triage
- Rename and tighten Problem and Solution fields
- Add Prior Art / Related Issues field
- Add Alternatives Considered field
config.yml:
- Replace two generic links with three specific ones: Q&A discussions,
Ideas discussions, and GitHub Security Advisories for vulnerabilities
PR template:
- Rename Summary section with clearer placeholder text
- Add Linked Issue section (Fixes #NNN)
- Add How to Test section with numbered placeholder steps
- Add Screenshots section for UI changes
- Add duplicate-search checklist item
- Remove No print() item (style note, not a structural requirement)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: Cookbook local GGUF serving inside Docker
Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint.
Fixes included here:
- use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub
- fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts
- preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal
Before this change, Docker local serves could fail in several ways:
- Cookbook pointed llama.cpp at the wrong GGUF path
- generated serve runner scripts crashed before launch with a shell syntax error
- successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1
This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration
* test: add test for docker-local endpoint rewrites
* fix: markdown table renders separator row as visible data
The alignment separator (|---|---|) at row index 1 was rendered as a
<td> row with dashes as cell content. Skip it and only open <tbody>
at that point, so tables render as header + data without the garbage
separator row in between.
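
The same idea sketched in Python for illustration (the real renderer is not Python, and `split_table_row` / `render_table` here are hypothetical analogues, not the project's functions):

```python
import re
from html import escape


def split_table_row(line: str) -> list:
    """Split '| a | b |' into ['a', 'b']."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def is_separator(line: str) -> bool:
    """True for alignment rows such as |---|:---:|."""
    return all(re.fullmatch(r":?-+:?", cell) for cell in split_table_row(line))


def render_table(lines: list) -> str:
    out = ["<table>"]
    body_open = False
    for i, line in enumerate(lines):
        if i == 0:
            cells = "".join(f"<th>{escape(c)}</th>" for c in split_table_row(line))
            out.append(f"<thead><tr>{cells}</tr></thead>")
            continue
        if i == 1:
            out.append("<tbody>")  # the body starts where the separator sits
            body_open = True
            if is_separator(line):
                continue  # skip the separator instead of emitting a row of dashes
        cells = "".join(f"<td>{escape(c)}</td>" for c in split_table_row(line))
        out.append(f"<tr>{cells}</tr>")
    if body_open:
        out.append("</tbody>")
    out.append("</table>")
    return "".join(out)


print(render_table(["| a | b |", "|---|---|", "| 1 | 2 |"]))
```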
* test: add regression test for table separator row rendering
Verifies that the markdown table renderer skips the separator row
(|---|---|) instead of rendering it as a visible data row. Also
updates the test harness to handle the splitTableRow import.
_createChip called URL.createObjectURL directly, bypassing the
_getPreviewUrl/_revokePreviewUrl cache. Each re-render of the
attachment strip leaked blob URLs that were never revoked.
* fix: closed document no longer stays active and leaks into new chats (#1160)
Closing a document tab calls _detachDocFromSession: a doc with content is
PATCHed to session_id="" (unlinked, session_id -> NULL, is_active stays True),
an empty one is DELETEd. But the in-memory active-document pointer
(tool_implementations._active_document_id) was never cleared on either path.
The chat doc-injection last-resort looks up that pointer by id and injects it
when `not cand.session_id or cand.session_id == session`. An unlinked doc has
session_id NULL, so the stale pointer re-surfaced a closed document in later,
unrelated chats — the agent kept reading/suggesting edits to a doc the user
had closed.
Fix: add clear_active_document(doc_id) and call it when a document is unlinked
(PATCH session_id="") or deleted, so the pointer no longer resurrects a closed
document. clear_active_document only clears when the id matches (or no id), so a
different active doc is left untouched.
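
The matching rule can be sketched as follows. This is a simplified module-level version under assumed names; `set_active_document` is hypothetical and the real pointer lives in tool_implementations:

```python
from typing import Optional

_active_document_id: Optional[str] = None


def set_active_document(doc_id: str) -> None:
    global _active_document_id
    _active_document_id = doc_id


def get_active_document() -> Optional[str]:
    return _active_document_id


def clear_active_document(doc_id: Optional[str] = None) -> None:
    """Clear the pointer only when doc_id matches it, or when no id is given."""
    global _active_document_id
    if doc_id is None or doc_id == _active_document_id:
        _active_document_id = None


set_active_document("doc-a")
clear_active_document("doc-b")  # a different document was closed: no effect
print(get_active_document())
clear_active_document("doc-a")  # the active document was closed: cleared
print(get_active_document())
```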
Covered by tests/test_active_document_clear.py (4 cases).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: add route-level regression for #1160 (detach/delete clears active doc)
Per review: prove the actual API path, not just the helper. Drives
PATCH /api/document/{id} (session_id="") and DELETE /api/document/{id}
through TestClient against a temp SQLite DB under real owner routing, and
asserts get_active_document() is cleared (and untouched when a different
document is closed).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: make #1160 route regression hang-proof and dev-DB-independent
The route test could hang in other environments: it set DATABASE_URL at import
time, which is ignored if core.database was already imported, so it fell back to
the real dev DB and could contend for its locks (maintainer saw it hang, exit
124).
Rebind to a DEDICATED temporary SQLite engine (NullPool) and patch the document
route module's SessionLocal to it via an autouse fixture — so the test never
touches the dev DB and is independent of import order. Runs in ~0.3s.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: drive #1160 route regression without TestClient (fixes local hang)
The route test used Starlette TestClient (middleware app + threadpool), which
hung in the maintainer's environment. Rework it to call the async route handlers
directly — extracted from the router — with a minimal fake request against a
temp-SQLite-patched SessionLocal. Same real coverage (handler + DB + owner
routing), but it completes reliably (~0.3s) with no TestClient/threadpool.
Verified the maintainer's exact batch now passes:
pytest tests/test_document_close_clears_active_route.py \
tests/test_active_document_clear.py \
tests/test_document_tool_owner_scope.py -> 14 passed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat: CalDAV write-back — push local event create/update/delete to the remote (#800)
CalDAV sync was pull-only (src/caldav_sync.py), so events created, edited, or
deleted in Odysseus on a CalDAV-backed calendar only changed local SQLite and
never reached the server — they silently vanished on the next pull and never
appeared on the user's phone (iCloud, etc.).
This adds the missing write half:
- src/caldav_writeback.py builds the VEVENT, re-discovers the remote calendar by
the same URL-hash the local id was derived from (the remote URL isn't stored),
and PUTs/DELETEs the event by UID via the caldav lib. The pure pieces
(build_event_ical, find_remote_calendar, push_event) take inputs by argument so
they unit-test against a fake client with no network.
- create/update/delete event handlers (routes/calendar_routes.py) call it
best-effort for caldav-sourced calendars only: the local DB stays the source of
truth, a remote failure is logged, never fatal, and local calendars are untouched.
Tests: tests/test_caldav_writeback.py (9, pure logic incl. iCal serialization,
hash discovery, create/update/delete orchestration) and
tests/test_caldav_writeback_route.py (3, route-level: a caldav calendar pushes,
a local one does not, delete pushes a delete). 12 passed.
Note: write-back re-discovers the remote calendar per write (the URL isn't
persisted locally); a follow-up could cache it. Live-iCloud verification needs a
real account — flagging for a maintainer pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: drive #800 route regression without TestClient (fixes local hang)
Same fix as the document route test: the CalDAV write-back route regression used
Starlette TestClient (middleware app + threadpool) which hung in the maintainer's
environment. Rework it to call the async create/delete calendar handlers directly
— extracted from the router — with a minimal fake request, temp-SQLite-patched
SessionLocal, and writeback_event stubbed to record calls. Same coverage (a
caldav calendar pushes, a local one does not, delete pushes a delete), completes
in ~0.3s with no TestClient.
Verified the maintainer's exact batch:
pytest tests/test_caldav_writeback.py tests/test_caldav_writeback_route.py -> 12 passed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Thinking models served via llama.cpp without --reasoning-format none
(e.g. Qwen3, DeepSeek-R1) route all tokens into reasoning_content and
return content="". Two call paths were silently broken:
- llm_call / llm_call_async (non-streaming): hard-keyed
data["choices"][0]["message"]["content"] raises KeyError or returns
empty string, discarding the entire response.
- stream_agent_loop end-of-round fallback: when full_response is empty
but round_reasoning has content, the existing code replaced the
response with the generic empty-response error message, discarding
all reasoning tokens that were correctly accumulated during streaming.
Fix: in both non-streaming paths use msg.get("content") or
msg.get("reasoning_content") or "". In the streaming fallback, surface
round_reasoning as the answer before falling through to the error path.
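
A minimal sketch of the non-streaming fallback. The `extract_text` wrapper is a hypothetical name, and the response shape is assumed to be an OpenAI-compatible chat completion:

```python
def extract_text(data: dict) -> str:
    """Prefer content, fall back to reasoning_content, never raise on missing keys."""
    choices = data.get("choices") or [{}]
    msg = choices[0].get("message") or {}
    return msg.get("content") or msg.get("reasoning_content") or ""


# A thinking model that routed everything into reasoning_content:
response = {"choices": [{"message": {"content": "", "reasoning_content": "The answer is 4."}}]}
print(extract_text(response))
```

Using `or` rather than a default in `.get()` matters here: the key is present but holds an empty string, so a plain default would never trigger.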
Completes the reviewer requirement from PR #1190 review that was carried
over but not implemented in #1230:
> "The hard max is a function-local constant. For this setting, the ceiling
> should be configurable or at least represented as a named setting/default
> with tests."
(from the review on #1190)
#1230 shipped the adaptive auto-derivation but left `DEFAULT_HARD_MAX = 200_000`
as a hardcoded module constant in src/context_budget.py. Admins on premium
APIs with large context windows (kimi-k2 / minimax-m3 at 1M, etc.) can use
their full window today only by setting `agent_input_token_budget`
explicitly — which then takes them off the adaptive auto-path entirely.
## What this PR changes
- src/settings.py: register `agent_input_token_hard_max` in
DEFAULT_SETTINGS, default 200_000 (matches `DEFAULT_HARD_MAX`). Inline
comment documents the no-op semantics in the explicit branch.
- src/agent_loop.py: read the setting at the call site and pass it as the
`hard_max` kwarg of `compute_input_token_budget`. Defensive parsing —
missing / non-int / zero values fall back to `DEFAULT_HARD_MAX`, so a
misconfig cannot silently zero the budget.
- src/tool_implementations.py: three friendly aliases for `manage_settings`:
- "hard max" -> agent_input_token_hard_max
- "token budget cap" -> agent_input_token_hard_max
- "input budget cap" -> agent_input_token_hard_max
Plus the existing "token budget" -> agent_input_token_budget keeps a
matching shorter alias "input budget".
- tests/test_context_budget.py: 6 new tests on top of the existing 6:
- hard_max raises the auto ceiling (1M ctx + raised cap -> 85% of ctx)
- hard_max lowers the auto ceiling (128K ctx + 50K cap -> 50K)
- hard_max has no effect on the explicit branch
- DEFAULT_SETTINGS contains the new key
- manage_settings aliases are registered
- the live get_setting path returns the override value, and malformed
values fall back per the agent_loop defensive parsing
12 passed in 0.04s. No changes to the pure helper signature or semantics;
#1230's behavior is the default when the new setting is unset.
## How it lets users drop the explicit override
Before this PR, on a 1M-context model:
agent_input_token_budget = 900_000 (explicit) -> 900K [user override]
agent_input_token_budget = <unset> (auto) -> 200K [HARD_MAX]
After this PR, same model:
agent_input_token_budget = <unset>
agent_input_token_hard_max = 900_000
-> min(1M * 0.85, 900K) = 850K [auto, no override needed]
The explicit-override path keeps working unchanged for users who prefer it.
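
The arithmetic above, as a sketch. The real `compute_input_token_budget` signature is not shown in this message, so the parameter names (other than `hard_max`) and the `parse_hard_max` helper are assumptions; only the 85% factor, the 200_000 default, and the fallback behaviour come from the description:

```python
DEFAULT_HARD_MAX = 200_000
AUTO_PERCENT = 85  # auto budget is 85% of the context window


def compute_input_token_budget(context_window, explicit_budget=None,
                               hard_max=DEFAULT_HARD_MAX):
    if explicit_budget:
        # Explicit branch: hard_max has no effect here.
        return int(explicit_budget)
    return min(context_window * AUTO_PERCENT // 100, hard_max)


def parse_hard_max(raw):
    """Missing, non-int, or non-positive values fall back to the default."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_HARD_MAX
    return value if value > 0 else DEFAULT_HARD_MAX


print(compute_input_token_budget(1_000_000))                    # capped by the default
print(compute_input_token_budget(1_000_000, hard_max=900_000))  # 85% of the window
print(compute_input_token_budget(128_000, hard_max=50_000))     # lowered ceiling
```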
* fix: use SQL false() for owner-less document query (filter(False) raises in SQLAlchemy 2.x)
* test: owner-less document query doesn't pass a bare False to filter
* fix: detect question words as whole words, not prefixes
* fix: same question-word prefix bug in the services search copy
* test: question-word detection rejects prefix lookalikes
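
For illustration, the prefix-versus-whole-word difference can be shown like this. The word list and function name are assumptions, and the real check may look at more than the start of the query:

```python
import re

QUESTION_WORDS = ("who", "what", "when", "where", "why", "how")  # assumed list

# \b requires a word boundary after the question word, so "however" is rejected.
_QUESTION_RE = re.compile(r"^(?:%s)\b" % "|".join(QUESTION_WORDS), re.IGNORECASE)


def starts_with_question_word(text: str) -> bool:
    return bool(_QUESTION_RE.match(text.strip()))


print("however it works".startswith("how"))           # True: the prefix bug
print(starts_with_question_word("however it works"))  # False: whole-word match
print(starts_with_question_word("How does it work"))  # True
```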
The dispatch_reminder call on line 699 references _gcu(request), which is
never defined. The local helper wrapping get_current_user is _owner.
As a result, every POST to /api/notes/fire-reminder raises NameError and returns 500.
* fix(mcp): invalidate tool prompt cache on connect/disconnect/error
get_tool_descriptions_for_prompt cached its result keyed only on
(disabled_map, len(_tools)). If a server reconnects with the same
tool count (or transitions to error state), the cache was never
busted — the agent received stale tool descriptions for the new
connection state.
Add a _generation counter incremented on every structural change
(successful connect, disconnect, connection error) and include it in
the cache key.
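
A toy model of the cache-key change, not the MCP manager itself. `ToolDescriptionCache` and its methods are hypothetical, and a set of disabled names stands in for disabled_map:

```python
class ToolDescriptionCache:
    def __init__(self):
        self._tools = {}       # tool name -> description
        self._generation = 0   # bumped on every structural change
        self._cached = None    # (key, rendered text)
        self.renders = 0

    def replace_tools(self, tools: dict) -> None:
        """Simulate a connect or reconnect."""
        self._tools = dict(tools)
        self._generation += 1

    def descriptions_for_prompt(self, disabled=frozenset()) -> str:
        # Without _generation, a reconnect with the same tool count reuses stale text.
        key = (frozenset(disabled), len(self._tools), self._generation)
        if self._cached is not None and self._cached[0] == key:
            return self._cached[1]
        self.renders += 1
        text = "\n".join(f"{name}: {desc}"
                         for name, desc in sorted(self._tools.items())
                         if name not in disabled)
        self._cached = (key, text)
        return text


cache = ToolDescriptionCache()
cache.replace_tools({"search": "old description"})
first = cache.descriptions_for_prompt()
cache.replace_tools({"search": "new description"})  # same count, new content
second = cache.descriptions_for_prompt()
print(first != second)
```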
* test(mcp): regression test for _generation cache invalidation