Commit Graph

276 Commits

Author SHA1 Message Date
Kenny Van de Maele
0a2adc9c96 Add ask_user tool: agent-posed multiple-choice questions (#2111)
Let the agent pause and ask the user a multiple-choice question when a
task is genuinely ambiguous and the answer changes what it does next —
choosing between approaches, confirming an assumption, picking a target —
instead of guessing.

Modeled on the existing `ui_control` marker pattern: the `ask_user` tool
returns an `ask_user` payload that the agent loop emits as an SSE event
and then ends the turn. The frontend renders the question with clickable
option buttons, a free-text "Other" input, and an x to dismiss; the user's
choice is sent as the next message and the agent resumes with it in
context.

- src/tool_execution.py: `ask_user` handler — pure UI marker, no I/O.
  Validates a non-empty question + 2..6 options, normalizes string/object
  options, returns the payload.
- src/agent_loop.py: emit the `ask_user` event and break the round loop so
  the turn ends and waits for the user's selection. Stream the question as
  assistant text so it persists/replays (prevents a re-ask loop).
- Registration: TOOL_TAGS, ALWAYS_AVAILABLE, BUILTIN_TOOL_DESCRIPTIONS,
  FUNCTION_TOOL_SCHEMAS, the system-prompt blurb. Not admin-gated (any
  user can be asked); the structured args serialize via the default
  json.dumps path.
- routes/chat_routes.py: relay the `ask_user` event to the client.
- static/js/chat.js + static/style.css: render the question card (options +
  free-text Other + dismiss x; removed once answered). Reuses CSS vars and
  the .modal-close button; emoji go through the monochrome-SVG pipeline.
  Bump chat.js cache pin.
- tests/test_ask_user_tool.py: payload, multi flag, string options, option
  cap, validation errors, serializer round-trip, registration.
2026-06-05 11:49:11 +02:00
Kenny Van de Maele
367858a587 Merge branch 'main' into dev
Bring main's maintainer-curated work (cookbook scheduler, calendar rendering/sync, settings polish, agent debug loop) into dev so dev is a superset of main (resolves the dev/main drift, #2543).
2026-06-05 10:50:51 +02:00
Vykos
11ba46505b Constrain generated-image paths to image root (#2837) 2026-06-05 10:33:47 +02:00
nubs
8b386a172e fix(calendar): route read requests to agent (#2452) 2026-06-05 09:24:04 +01:00
Nicholai
4df4cfeaff Merge pull request #2387 from cirim-au/fix/manage-memory-always-available
fix(tool_index): add manage_memory to ALWAYS_AVAILABLE
2026-06-05 02:14:10 -06:00
nubs
5271d529d6 fix(tool-schemas): preserve web_search time_filter through native tool-call conversion (#2757) 2026-06-05 08:00:59 +01:00
pewdiepie-archdaemon
fbd34334a5 Calendar overnight-event rendering + clickable [View note] link from chat
- Calendar overnight events render proportionally across day boundaries
  via --start-frac / --end-frac CSS vars instead of bleeding as full-day
  on day 2.
- Recurring-event delete strips the master uid + all master::* sibling
  instances optimistically so the row clears immediately instead of
  waiting for the next sync re-render.
- manage_notes(create) now returns note_id + open_url, and agent_loop
  appends a markdown [View note](#note-<id>) link mirroring the
  deep-research pattern.
- chatRenderer's hash-link router (already wired for #note-id) reaches
  the new notes.openNote(id) helper, which force-closes/reopens the
  Notes panel, polls for the target card, and runs a brief outline
  flash so the user can locate it on long lists.
2026-06-05 14:41:48 +09:00
pewdiepie-archdaemon
e2f449f4ef Cookbook scheduler + serve: schedule via Tasks, Stop verifies kill, Ollama auto port-pick
- Schedule cookbook serves through the existing ScheduledTask system: the
  serve preset gets a ^ button next to Launch that opens a daily/hourly/
  weekly form mirroring the admin-switch style; the schedule action runs
  action_cookbook_serve, which delegates to /api/model/serve and stamps
  the resulting task with _scheduledStopAtMs. A background
  cookbook_serve_lifecycle loop ticks every 60s and kills any serve
  whose window has ended, also dropping the auto-registered endpoint
  so the model picker doesn't keep pointing at a dead server.
- Stop and remove on a Running serve now awaits the SSH/tmux kill,
  re-checks tmux has-session, and surfaces an error toast (leaving the
  row) when the kill failed. Previously fire-and-forget, so a failed
  SSH/tmux call silently left the live serve running while the row
  vanished from the UI.
- Cookbook tasks/status orphan-adoption sweep no longer requires the
  serve-/cookbook- session-id prefix; any tmux session whose pane is
  running a known model-server process gets auto-pulled into Running.
  Without this loosening, a cookbook-launched serve whose tmux id
  fell back to a bare number was invisible — you couldn't see it,
  let alone stop it.
- Ollama serve always launches a fresh process under cookbook's tmux
  (no more monitor-mode reattach to a systemd/Docker ollama Stop can't
  reach). The handler pre-picks a free port by probing the target
  host over SSH and mutates req.cmd's OLLAMA_HOST so the runner script
  AND the auto-registered endpoint agree on the same bind port.
- Auto-register uses host.docker.internal (when running inside Docker)
  instead of localhost, matching the URL /setup adds for Ollama by
  hand. Local cookbook serves now produce a chat-reachable endpoint
  on first launch.
- Cascade-delete: removing a scheduled cookbook task also deletes any
  linked calendar event (cookbook_task_id marker in the description).
- Tasks list groups cookbook_serve under a "Cookbook" category that
  sorts above the rest, so scheduler-launched serves are easy to find.
2026-06-05 14:41:43 +09:00
pewdiepie-archdaemon
f8aaeab245 Merge remote-tracking branch 'origin/dev' 2026-06-05 12:14:34 +09:00
pewdiepie-archdaemon
f19ac6ed03 Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus
# Conflicts:
#	static/js/cookbookRunning.js
2026-06-05 11:23:15 +09:00
nubs
ae48ea7064 fix(mcp): sanitize and cap rendered MCP tool param hints (#2682) 2026-06-05 03:00:22 +02:00
nubs
19a3fc59c9 fix(model-context): key context-window cache by (endpoint, model) (#2614)
get_context_length() cached the resolved context window by model id alone,
so two different remote endpoints serving the same model id (e.g. a capped
proxy at 8k vs. the full provider at 200k) collided: the first to resolve
won process-wide and the other endpoint was served the wrong window. That
silently over-trims conversations on the larger-window endpoint (it feeds
context_compactor) or overflows the smaller one (provider 400s).

Key the cache on (endpoint_url, model). Local endpoints already always
re-query, so they are unaffected.

Fixes #2603
2026-06-05 02:50:56 +02:00
L1
f8cf791491 fix(caldav): don't prune locally-created events on sync (#2706)
The CalDAV pull prunes events in the synced calendar+window whose UID the
server didn't just return, to propagate upstream deletions. But CalendarEvent
had no field distinguishing a server-pulled row from a locally-created one, so
the prune also deleted events that were never on the server: events created by
the agent / email triage (which never write back to the server) and UI events
whose best-effort write-back failed. Result: silent, unrecoverable loss of the
user's appointments (hard db.delete, no soft-delete).

Add an 'origin' column to calendar_events (lightweight idempotent migration,
mirroring _migrate_add_calendar_is_utc), set origin='caldav' on rows the sync
inserts/updates, and gate the prune on origin == 'caldav'. Locally-created
events carry origin NULL and are never pruned. On the first sync after the
migration nothing is pruned (all rows NULL until re-marked), erring toward
keeping data.

Fixes #2704
2026-06-05 02:48:03 +02:00
Abylaikhan Zulbukharov
1d80bf5e65 feat(mcp): add Streamable HTTP transport with OAuth 2.0 (#1033)
* feat(mcp): add Streamable HTTP transport with OAuth 2.0

  Odysseus could only reach MCP servers over stdio and SSE, so modern
  remote servers like https://mcp.higgsfield.ai/mcp (Streamable HTTP,
  gated behind OAuth) could not be connected.

  Add an `http` transport that connects via the SDK's
  streamablehttp_client and authenticates with the SDK's
  OAuthClientProvider: RFC 9728 protected-resource discovery, RFC 8414
  authorization-server metadata, Dynamic Client Registration,
  authorization-code + PKCE, and token refresh. A small bridge
  (src/mcp_oauth.py) connects the SDK's blocking callback to the existing
  web callback route via an asyncio.Future keyed by the OAuth `state`,
  and the dynamic client registration plus tokens persist per-server in a
  new encrypted `oauth_tokens` column.

  The connect runs as a bounded background task so the "Add server"
  request returns immediately; redirect_handler publishes needs_auth +
  auth_url to connection state as soon as discovery/DCR completes (which
  can exceed the bounded wait), and the UI polls until connected. Remote
  users finish via the existing paste-back flow. The Google OAuth path is
  left unchanged.

  - core/database.py: encrypted oauth_tokens column + migration
  - src/mcp_oauth.py: OAuth provider, DB-backed TokenStorage, state registry
  - src/mcp_manager.py: http dispatch, background connect, _connect_http
  - routes/mcp_routes.py: http validation, needs_auth/auth_url, callback bridge
  - static/js/settings.js: Streamable HTTP option + OAuth flow with polling
  - tests: 5 new unit tests (transport dispatch, registry, token storage)

  Verified against the live Higgsfield server: discovery, DCR (client_id
  issued), loopback redirect accepted, and a PKCE authorization URL with
  needs_auth status. No regressions (full suite delta is only the 5 added
  passing tests).

* fix(mcp): address PR #1033 review feedback

  - mcp_oauth: derive redirect URI from OAUTH_REDIRECT_BASE_URL/APP_PUBLIC_URL
    (default http://localhost:7000) instead of hardcoding the port
  - mcp_oauth: leave OAuth scope unset so the SDK derives it from the server's
    WWW-Authenticate/protected-resource metadata; hardcoding an OIDC scope broke
    non-OpenID MCP servers (verified: Higgsfield still gets its server-derived
    scope)
  - mcp_oauth: prune abandoned OAuth flows (_prune_stale + _pending_ts) so the
    module-level registries can't grow unbounded
  - mcp_oauth: persist tokens/client-info in a single DB session/commit
    (_update) instead of a load+save double round-trip
  - mcp_manager: cancel and drop the background connect task in
    disconnect_server so a deleted server stops publishing status
  - database: document why the oauth_tokens migration uses TEXT while the model
    declares EncryptedText (encryption is applied at the Python layer)
  - settings.js: surface persistent OAuth-poll failures and an explicit timeout
    message instead of silently swallowing errors
  - tests: cover the stale-flow pruning

* static/js/settings.js now shows an in-flight loading state on the buttons that fire requests:
2026-06-05 02:40:52 +02:00
Isaiah Gardner
134c608466 fix: degrade missing/None content key in system messages to empty string (#2570) 2026-06-05 00:10:11 +02:00
Kenny Van de Maele
2be3779e6e feat: Add workspace: confine agent tools to a folder (#1103)
* feat: Add workspace: confine agent tools to a folder

Pick a server folder as the agent's workspace so its file/shell tools work
there and don't touch files outside it. File tools are hard-confined; bash/
python run with cwd set to the folder.

Includes a slash command: `/workspace` (alias `/ws`) — show / `set <path>` /
`clear` / `pick` (open the directory browser).

- routes/workspace_routes.py: GET /api/workspace/browse (admin-only).
- src/tool_execution.py: hard path confinement for read_file/write_file;
  bash/python cwd. Threaded route → stream_agent_loop → execute_tool_block.
- src/agent_loop.py: workspace note prepended to the system prompt.
- static/: overflow menu item, input-bar pill, directory-browser modal, and
  the /workspace slash command.
- tests/test_workspace_confine.py.

* Wire workspace confinement into tools that landed after this PR

edit_file (#1239) and grep/glob/ls (#1670) merged after workspace-confine was
written, so they bypassed the workspace boundary. Thread the workspace through:
  - edit_file: _do_edit_file resolves via _resolve_tool_path_in_workspace
  - grep/glob/ls: _resolve_search_root confines to the workspace (root + paths)
  - bash/python/bg cwd: workspace or _AGENT_WORKDIR (keep the #2586 data-dir
    default when no workspace is set)
Tests cover edit_file + grep/ls confinement (inside ok, outside rejected).

* Workspace picker: editable path bar + modal style cohesion + cross-platform hardening

- Make the current-folder strip an editable address bar: type/paste a full
  path and press Enter to navigate (also reaches other Windows drives and
  hidden dirs the up-only browser cannot).
- Reuse shared modal CSS: drop bespoke .workspace-modal-content/.workspace-btn*
  in favour of base .modal-content/.modal-body and the .confirm-btn button
  family; separators/hover use var(--border). Net -31 CSS lines.
- Fix the path field overflowing the modal right edge (flex stretch + margin
  vs an overflow:auto scrollbar-feedback loop): full-bleed, no h-margin.
- Cross-platform confinement: normcase the workspace commonpath check so
  containment holds on case-insensitive filesystems (Windows/macOS).
- Make tests OS-portable: sibling temp dirs instead of /etc, python os.getcwd()
  instead of pwd. 5 pass.
2026-06-05 00:06:37 +02:00
Kenny Van de Maele
7b4365fe57 Make write_file/edit_file always-available like read_file (#2684)
read_file/grep/glob/ls are in ALWAYS_AVAILABLE but the on-disk write tools
(write_file, edit_file) were only surfaced via per-query tool-RAG retrieval.
On a bare 'edit X' request the retriever could miss them, so the model was
never offered edit_file/write_file and wrongly fell back to edit_document
(editor panel) or improvised with bash sed. Add both to ALWAYS_AVAILABLE
next to read_file; they stay admin-gated by tool_security so non-admin
exposure is unchanged.

Fixes #2683
2026-06-05 00:02:14 +02:00
pewdiepie-archdaemon
a260e0abd4 Revert calendar-based cookbook scheduler
Reverts b98ee04 + 4ed48ba + a19b6d2.

Calendar events turned out to be the wrong abstraction for scheduling model serve windows. Pivoting to the existing ScheduledTask infrastructure (cron / daily / weekly recurrence, next_run tracking, edit-from-Tasks-tab UI) in a follow-up commit. The ScheduledTask path:

  - reuses dispatch logic the rest of the app already understands
  - drops the calendar dependency entirely (no auto-created "Cookbook" calendar, no calendar.js hook)
  - shows up in the Tasks UI that already exists for everything else

What this revert removes:
  - src/cookbook_scheduler.py — calendar reconciler
  - routes/cookbook_schedule_routes.py — /api/cookbook/schedule/* endpoints
  - static/js/cookbookSchedule.js — Schedule modal / settings card
  - cookbook_scheduler_enabled + cookbook_schedule_calendar_href settings keys
  - The window.cookbookOpenScheduleForm hook in calendar.js
  - The Schedule button + paired-button CSS in cookbookServe.js + style.css
2026-06-05 06:57:21 +09:00
Michiel Van de Velde
7ddc5eaef4 Merge pull request #2529 from NubsCarson/codex/2509-mcp-tool-input-params
fix(mcp): expose MCP tool input parameters to the agent
2026-06-04 23:07:42 +02:00
Kenny Van de Maele
64d65b73c1 feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)
* feat: round-limit handling — Continue affordance at the cap + configurable cap

When the agent loop runs out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:

1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
   a "Continue" pill at the bottom of the chat that resumes the task from where
   it left off. Repeated cap-hits each get a fresh Continue (multiple continues
   in a row).
2. The cap is configurable in Settings → Agent ("Max steps per message"),
   validated on the client, at the save endpoint, and at the read site.

- src/agent_loop.py: track `_exhausted_rounds` (set only when a full
  tool-executing round completes on the last allowed round — i.e. the agent
  wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
  `max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
  agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
  existing .stopped-indicator / .continue-btn classes and theme vars
  --border/--fg/--bg/--accent); appended to the chat container so it survives
  the message re-render at stream finalize. chat.js cache version bumped.

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
2026-06-04 22:36:05 +02:00
Kenny Van de Maele
1cd0aa2b8c feat(provider): add GitHub Copilot provider with device-flow auth (#1480)
* feat(provider): add GitHub Copilot provider with device-flow auth

Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5,
Claude, Gemini, …) work through the normal chat + agent loop, incl. native
tool calling and vision.

Auth is one-click via the GitHub OAuth device flow; the access token is stored
as the endpoint's (encrypted) api_key and sent directly as `Authorization:
Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to
the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only
provider-specific behaviour is a small set of required request headers,
injected centrally.

Sign-in is available from Settings → model endpoints ("Connect GitHub
Copilot") and from chat via `/setup copilot`.

- src/copilot.py (new), routes/copilot_routes.py (new): constants, header
  builders, device-flow start/poll, model discovery, owner-scoped endpoint
  provisioning.
- src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers,
  per-request x-initiator/vision.
- src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas.
- src/model_context.py: known context windows for Copilot (no unauthenticated
  /models probe).
- static/, README, tests/test_copilot*.py.

* Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process
2026-06-04 21:13:14 +02:00
ooovenenoso
ab5311c44d fix(research): support timeout defaults in direct tests (#2624)
fix(research): honor planning query timeouts
2026-06-04 20:23:17 +02:00
Giuseppe
6d511f6e66 fix(llm): auto-detect <think> in content stream for unregistered thinking models (#2588)
* fix(llm): auto-detect <think> in content stream for unregistered thinking models

_THINKING_MODEL_PATTERNS only covers known model families by name. Qwen3-derived
models with non-standard names (e.g. Qwopus, custom QwQ forks) are not matched,
so their <think>...</think> content streams through as visible chat text instead
of being routed to the thinking display.

When the first content delta opens with <think> and the model was not already
identified as a thinking model, dynamically flag the stream as a thinking model
for the remainder of the response. This enables the existing </think> repair path
(line below) and ensures the frontend receives the full <think>...</think> wrapper
it needs to split thinking from the final answer.

The check is restricted to the very first content delta (_first_content_sent is
False) to avoid misidentifying models that happen to write "<think>" mid-answer.

Fixes #2225
Related: #2420 (covered by separate PR from @AmmarS-Analyst), #2224 (@RaresKeY)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace inert _thinking_model flag with _in_think_tag state machine

The original auto-detect set _thinking_model=True on the first <think> chunk
but still emitted it as a regular delta and set _first_content_sent=True
immediately, so no subsequent chunk could enter the repair path.

Replace with _in_think_tag bool: enter thinking mode when first content starts
with <think>, route all chunks to the thinking channel until </think> is found,
then the tail becomes the first regular delta. Adds three regression tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace _first_content_sent guard with _think_open_stripped

Opening-tag stripping used `not _first_content_sent` as the guard, but
_first_content_sent stays False throughout the entire think block (it only
flips when regular content is emitted). So `find(">")` ran on every
reasoning chunk — not just the first — and silently truncated everything
before the first ">" in any reasoning text containing comparisons, arrows,
or code.

Fix: add `_think_open_stripped = False` alongside `_in_think_tag`. Use it
as the strip guard in both the "still inside <think>" path and the
"</think> found in same chunk" split path. Set it True once the opening
tag is consumed so all subsequent chunks reach the thinking channel
unmolested.

Add regression test: 3-chunk stream where the middle chunk contains
"c > d" — confirms "more c " is not dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:18:19 +02:00
Giuseppe
dd707ddb1e fix(agent): default bash/python cwd to data/ to prevent ephemeral file loss (#2586)
Agent subprocesses (bash, python) previously inherited the container's default
working directory (/app), so files created with relative paths landed in the
ephemeral container layer and were silently destroyed on any docker compose up
--build or container recreation.

Set cwd=_AGENT_WORKDIR (resolved to <repo_root>/data at import time) and
HOME=_AGENT_WORKDIR on both subprocess launchers so that:
- pwd inside a bash tool returns the persistent data directory
- relative paths and ~ resolve to a location that survives rebuilds
- the agent can still cd to any absolute path it needs

The resolution uses pathlib.Path(__file__).parent.parent / "data", which
works for both Docker (/app/src → /app/data) and manual installs
(<repo>/src → <repo>/data) without requiring a new env var or compose change.

Fixes #2512

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:16:04 +02:00
pewdiepie-archdaemon
4ed48baf68 Cookbook scheduler: inline settings card at the top of the Cookbook tab
The earlier scheduler commit shipped the backend + Schedule modal but left the feature dormant — no way to toggle it from the UI. This adds the missing knob:

* DEFAULT_SETTINGS gains `cookbook_scheduler_enabled` (False) and `cookbook_schedule_calendar_href` ("") so `/api/auth/settings` POST will actually persist them. Without this, the POST silently dropped unknown keys.

* cookbookSchedule.js gains a self-contained settings card injected at the top of the Cookbook tab body whenever the cookbook modal opens. Card contents:
  - Enable toggle (writes cookbook_scheduler_enabled)
  - Calendar dropdown populated from /api/calendar/calendars (writes cookbook_schedule_calendar_href)
  - Status line: off / pick-a-calendar / N scheduled in next 24h · M running now · K skipped
  - "Reconcile now" button that POSTs /api/cookbook/schedule/reconcile-now

* The same module reveals/hides the Schedule… buttons on serve panels whenever the feature flag changes, so toggling on immediately surfaces the schedule UI without a refresh.

Settings UI lives in cookbookSchedule.js (not settings.js) so the entire scheduler surface — backend, reconciler, modal, settings — collapses cleanly: delete src/cookbook_scheduler.py + routes/cookbook_schedule_routes.py + static/js/cookbookSchedule.js, drop the two DEFAULT_SETTINGS keys, and the two app.py registration lines, and the feature is gone.
2026-06-05 02:40:35 +09:00
Giuseppe
531f426557 fix: KeyError on missing 'content' key in system messages (#2362)
A system message that arrives without a 'content' key — possible via
malformed tool results — raised a KeyError in the hot path of llm_call,
llm_call_async, and stream_llm. Replace m["content"] with
m.get("content") or "" in all three functions so a missing key degrades
to an empty string instead of crashing.

Also removes a redundant .rstrip() after .strip() in _model_activity_key.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:38:45 +02:00
Giuseppe
ff8f9f2188 fix: llm_call_async does not retry on HTTP 429/502/503/504 (#2364)
The retry loop raised immediately for any non-success HTTP response
regardless of attempt count. For transient upstream errors (rate limit,
bad gateway, gateway timeout) the function should back off and retry
within the existing attempt budget.

Also lets ConnectError / ConnectTimeout retry when the host has not been
cooled and attempts remain, instead of always raising on the first
connect failure.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:35:55 +02:00
pewdiepie-archdaemon
a19b6d2d4d Cookbook scheduler: calendar events drive model serve windows (experimental, feature-flagged)
Add a calendar-driven scheduler so a user can pick a model in Cookbook, click "Schedule…" instead of "Launch", choose time windows + days of the week + (optional) end date, and have Odysseus auto-launch the serve when the window starts and hard-kill it when the window ends. The calendar IS the source of truth — events on a designated calendar are interpreted as serve schedules, so editing the event in the calendar UI immediately changes the schedule.

Whole feature is gated by setting `cookbook_scheduler_enabled` (default False). Disabling the setting silences the reconciler and the API refuses requests; setting + three new files = entire surface, easy to revert.

New files:
  - src/cookbook_scheduler.py — background reconciler: ticks every 60s, reads next ±90s of calendar events on the designated calendar, launches/kills serves to match. Honors "refuse if GPUs busy" (skips with reason, no retry). Adopts pre-existing manual serves matching the event's model so window-end cleanup still applies. Tags scheduler-owned tasks with `_scheduledBy: <event_uid>` so it never kills serves it doesn't own.
  - routes/cookbook_schedule_routes.py — POST /api/cookbook/schedule/from-cookbook builds RRULE+ICS events from the modal's input (model, slots[], days[], until). GET /upcoming returns the next 24h with per-event status (scheduled / running / adopted / skipped / failed / ended) for the UI. POST /reconcile-now manually kicks the reconciler.
  - static/js/cookbookSchedule.js — Schedule button click handler + modal. Daily/hourly time slot picker, multi-slot ("+ add another time slot"), weekday chips with Weekdays/Weekend/Every-day quicksets, optional Until date. Calls /from-cookbook on save. Whole module is a single IIFE; deleting the file plus its <script> tag removes the UI surface.

Existing files touched (minimal):
  - app.py: register the new router + add the reconcile loop as a startup task (~10 lines, all in one block). Reconcile loop checks the feature flag on every tick, so leaving it running with the flag off costs ~one settings lookup per minute.
  - static/index.html: one new <script> tag for cookbookSchedule.js.
  - static/js/cookbookServe.js: add a "Schedule…" button next to the existing Launch button. Hidden by default; cookbookSchedule.js reveals it after confirming the feature flag is on.
  - static/style.css: ~80 lines for the modal styles (mobile-aware via @media).

User choices baked in:
  - Calendar events are the source of truth.
  - Refuse to launch if GPUs busy (skip + log reason in scheduler.events[uid].reason).
  - Hard kill at event end.
  - No retry on a skipped event within the window.
  - Multi-slot per day supported (one calendar event per slot, shared RRULE).
  - Pre-existing manual serves get adopted at window start so they're killed at end.

Known follow-ups (not in this commit):
  - Settings UI to pick the schedule calendar + toggle the feature flag.
  - Calendar event color/badge for status (running/skipped/failed).
  - "Lazy launch on first request" — currently launches at event start. Replacing _launch_serve with a proxy that defers vllm until the first chat request is a contained future change.
2026-06-05 02:35:23 +09:00
RaresKeY
c12c2aa233 fix: normalize Gemma 4 thought-channel output (#2224) 2026-06-04 19:26:58 +02:00
nubs
935eb05c63 refactor(search): make src analytics a service shim (#2264) 2026-06-04 18:57:24 +02:00
Kenny Van de Maele
1f00fff837 feat: add code-navigation tools (grep, glob, ls) + read_file line ranges (#1670)
Gives the agent first-class code navigation instead of shelling out via bash
(token-heavy, unreliable on weaker models, unstructured). Mirrors the
Grep/Glob/Read primitives that Claude Code / opencode expose.

- grep: regex search over file contents across a tree. Uses ripgrep when
  available (with explicit excludes so junk dirs are skipped even without a
  .gitignore); falls back to a pure-Python walk+regex when rg is absent.
  Returns file:line:match, capped.
- glob: find files by glob pattern (recursive), newest first.
- ls: list a directory (folders first, then files with sizes).
- read_file: optional offset/limit for line-range reads of large files
  (plain-path calls stay back-compatible).

All confined by the same path policy as read_file (_resolve_tool_path:
data/tmp allowlist + sensitive-file deny). Junk dirs (.git, node_modules,
venv, __pycache__, dist/build, …) skipped. Output capped (200 hits,
400 chars/line). Admin-gated like the other filesystem tools.

Wiring: schemas + native arg->content serializer (src/tool_schemas.py), tool
tags (src/agent_tools.py), always-available + descriptions (src/tool_index.py),
admin gate (src/tool_security.py), dispatch + impls (src/tool_execution.py).

Tests: tests/test_code_nav_tools.py — match/skip-junk/ignore-case/glob-filter,
allowlist rejection, glob/ls, read-range, and the no-ripgrep Python fallback.
2026-06-04 18:37:32 +02:00
Kenny Van de Maele
7443c36bd9 feat: Add edit_file tool + file-change diffs (#1239)
* Add edit_file tool + file-change diffs

edit_file is an exact old_string -> new_string replacement on a file on disk
(fails if old_string is missing or non-unique unless replace_all); write_file
also returns a unified diff. Diffs render collapsed in the tool bubble
(filename + +adds/-dels, theme colors); the raw JSON command box is hidden.

Security: edit_file is a sensitive filesystem-write tool, treated everywhere
write_file is —
  - added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner),
    so on auth-enabled deployments a non-admin cannot run it; execute_tool_block
    refuses it for non-admin owners.
  - confined by the same path policy as read_file/write_file (allowlist +
    sensitive-file deny) via _resolve_tool_path.

Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the
only way to write files (they show a diff) — never edit_document (editor panel)
or a bash heredoc/redirect.

Tests (tests/test_edit_file.py): non-admin block (policy + execution gate),
successful edit, not-found old_string, non-unique old_string (+ replace_all),
and path outside the allowed roots.

Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py,
src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css,
tests/test_edit_file.py.

* Drop redundant import os in write_file closure

os is already imported at module top.
2026-06-04 18:29:10 +02:00
Kenny Van de Maele
8bfd79fe8e chore: deduplicate src/search modules (cache, content, query) into shims (#2506)
* chore: dedupe src/search/cache.py into a re-export shim

src/search/cache.py was a byte-identical copy of services/search/cache.py.
Convert it to a sys.modules alias of the canonical services module (matching
src/search/core.py, providers.py, ranking.py) so the two cannot drift, and add
an identity assertion to test_search_module_consolidation.py.

content.py and query.py are intentionally left as-is: the copies have drifted
and services lacks fixes that src has, so they need services reconciled first
before they can be shimmed safely.

* chore: dedupe src/search content.py and query.py into shims

Convert src/search/content.py and query.py to sys.modules aliases of the
canonical services/search/* (matching cache.py, core.py, providers.py,
ranking.py) so the duplicate copies cannot drift.

Repoint the two tests that were coupled to the src-copy internals onto the
canonical services surface (behaviour is equivalent):
- test_src_search_query_nonstring.py: import services.search.query instead of
  loading the src file by path.
- test_security_regressions.py::test_web_fetch_guard_blocks_redirect_into_private:
  mock httpx.get (services uses the module-level get, not httpx.Client) and
  assert on the canonical 'Blocked' message.

Drop the now-redundant [src_content, service_content] parametrization in
test_search_content_extraction_parity.py and test_search_content_url_guards.py
(after the shim both params are the same object); add content/query identity
assertions to test_search_module_consolidation.py.
2026-06-04 18:10:55 +02:00
Nicholai
c916224510 feat(memory): add provider interface (#72) 2026-06-04 16:26:11 +01:00
pewdiepie-archdaemon
9112861d8e cookbook agent debug loop: persistent log files, auto-adopt orphan tmux, Codex/Claude skill parity
Three converging fixes so the chat agent + external Codex/Claude skills can actually debug a crashed serve instead of staring at a post-crash neofetch banner:

* Serves now `tee` to /tmp/odysseus-tmux/SESSION.log on the host running them. Runner saves fds 3/4 before the tee and restores them right before `exec ${SHELL}`, so the post-crash interactive zsh banner does NOT pollute the log file.
* `tail_serve_output` (chat agent) and `/api/codex/cookbook/output/{sid}` (Codex+Claude skills) both prefer the persistent log file over the tmux pane. Pane is fallback for sessions predating the tee runner. Default tail bumped 150 -> 400.
* `list_served_models` "recent log" snippet seeks to the Traceback line instead of showing the last 6 lines (which was always the bash prompt).

Cookbook auto-adoption sweep on `/api/cookbook/tasks/status`: every 20s (rate-limited) the cookbook SSHes each configured server, finds `serve-*` / `cookbook-*` tmux sessions running an actual model process (vllm/python/llama-server/etc., filtered via `pane_current_command`), and writes them into state.tasks. So when the agent falls back to raw ssh+tmux, the session appears in the Cookbook UI on the next poll.

`serve_model` error path now reads `data["detail"]` in addition to `data["error"]` so the FastAPI HTTPException message ("Invalid characters in cmd") actually reaches the agent instead of being swallowed as a generic "Serve failed". Tool description updated to warn against `cd …`/`source …`/`&&` prefixes.

Intent-without-action supervisor in agent_loop: when the model writes "Let me tail the output" / "I'll check the logs" / "Let me investigate" and ends the turn without emitting a tool call, the loop injects a sharp system nudge ("You said you would X — DO IT NOW") and continues. Capped at 2 nudges per chat so a model that genuinely cannot use the tool does not pin the loop.

Codex/Claude skill parity: adds `/cookbook/cached`, `/cookbook/presets`, `/cookbook/preset/{name}`, `/cookbook/adopt` so external agents have the same surface as the chat agent. SKILL.md docs + odysseus_api.py wrapper updated for both bundles.

`adopt_served_model` promoted to the always-on tool set so the agent has a documented fallback when serve_model rejects a cmd.

Also various cookbook UI tweaks accumulated alongside the above (cookbook.js, cookbookRunning.js, cookbookServe.js, cookbook-diagnosis.js, settings.js, style.css).
2026-06-04 23:27:18 +09:00
Wes Huber
93b3e108a6 fix: re-export _SPORTS_HINT_RE from search ranking shim (#2273)
The compatibility re-export shim at src/search/ranking.py forgot
_SPORTS_HINT_RE, so tests importing src.search.ranking raised
AttributeError on the [src] parametrize variant.

Fixes #1995

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 14:24:53 +01:00
NubsCarson
39825867a4 fix(mcp): route literal MCP requests to external schemas 2026-06-04 13:00:17 +00:00
Giuseppe
bc9104efe2 fix: SSE stream parser crashes with NoneType on providers sending null choice/usage/tc entries (#2389)
* fix: SSE parser crashes with NoneType on MiniMax-M3 (and any provider sending null choice/usage/tc)

Three guards added in stream_llm:

1. choices[0] null check — MiniMax (and some other providers) send a
   choices entry as None. `_choices[0].get("delta")` raised
   AttributeError. Now checks `_choices[0] is not None` before calling
   .get().

2. usage null guard — j["usage"] can arrive as None (not a dict) on
   some providers. Added `or {}` so subsequent .get() calls don't crash.

3. tool_calls null entry skip — individual entries in the tool_calls
   array can be None. Added `if tc is None: continue` before
   tc.get("function").

All three match the `or {}` / null-guard pattern used elsewhere in the
same block. Safe for all OpenAI-compatible providers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: guard null choice in elif-choices SSE branch

The usage-chunk path already guarded _choices[0] is not None, but the
elif "choices" branch that processes content/tool-call deltas did not.
A chunk like {"choices": [null]} or {"choices": [null], "usage": null}
reaches j["choices"][0].get("delta") and crashes with:

    'NoneType' object has no attribute 'get'

Fix: extract choices[0] into _c0 and continue to the next chunk when
it is None, matching the guard already applied in the usage path.

Adds three focused regressions covering the paths the maintainer flagged:
- {"choices": [null]}
- {"choices": [null], "usage": null}
- tool_calls array containing a null entry alongside a valid call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 13:53:10 +01:00
NubsCarson
019f8d614f fix(mcp): expose MCP tool input parameters to the agent
MCP server tools were presented to the agent with only their name and a
truncated description: get_tool_descriptions_for_prompt() emitted
"- name: description" and get_all_tools() dropped input_schema entirely.
On the fenced-block tool path (used by Ollama models), the agent could
not see a tool's declared inputs and guessed argument names from the
description alone, so tool calls failed (issue #2509). MCP inspector
showed the schemas fine, confirming the loss was on our side.

- get_all_tools() now carries each tool's input_schema.
- get_tool_descriptions_for_prompt() renders a compact args hint
  (parameter names, coarse types, required-ness) via a new
  _format_mcp_params() helper, matching the "Args (JSON): {...}" style
  the built-in tool descriptions already use.

Fixes #2509
2026-06-04 12:51:31 +00:00
Joeseph Grey
fa1fe7f866 security: sanitize rendered research-report HTML (#364)
The visual research report is assembled from LLM output over crawled web
pages (untrusted content) and served under a relaxed `script-src
'unsafe-inline'` CSP. Two values reached that HTML without sanitization:

- `_md_to_html` rendered the report markdown via python-markdown, which
  passes raw HTML through verbatim, so `<script>` / `<img onerror>` /
  `<svg onload>` / `javascript:` links carried in crawled content ran in
  the app origin.
- `category` (from the /api/research/start request body, no enum check) was
  interpolated raw into `<body class="category-{category}">`.

Allowlist-sanitize the rendered markdown with nh3, keeping the formatting
the report emits (tables, code, details/summary, toc anchors, codehilite
classes, external-link target/rel) while dropping active content, and
html.escape the category. Adds regression tests.
2026-06-04 13:42:49 +01:00
Massab K.
594775dc4b Fix issue 135 chat context bleed (#281)
* Fix issue 135 chat context bleed

* Guard task delivery metadata access
2026-06-04 13:27:46 +01:00
Alexander Kenley
7b45a94b6d Fix calendar routing and user-local time context (#408)
* fix(chat): add user-local time context

* fix(chat): route calendar follow-up phrasing

* refactor(chat): log tool intent routing reasons

* test(chat): align user time prompt shim

---------

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-04 13:20:04 +01:00
tanmayraut45
f59edee611 Support extra CA bundle for private-CA LLM providers (#769)
Adding GigaChat (Sber) or an on-premise enterprise LLM gateway as a
model endpoint fails on first probe with

    CERTIFICATE_VERIFY_FAILED: self-signed certificate in certificate
    chain (_ssl.c:1000)

because their TLS chain is signed by a private root CA (Russian Trusted
Root CA for GigaChat; corporate CA for on-prem) that isn't part of the
default system / certifi trust store. The endpoint shows offline in
the picker even though the URL and API key are correct (issue #722).

The right fix is to extend the trust store, not to weaken verification.
This change:

- src/tls_overrides.py: new module that resolves an opt-in env var
  LLM_CA_BUNDLE at import time, builds a shared SSLContext via
  ssl.create_default_context() (so the system / certifi bundle is
  loaded first) and layers the operator's PEM on top with
  load_verify_locations(). Exposes llm_verify() returning a value
  suitable for httpx `verify=`. Defaults to True (httpx built-in
  trust) when the env var is unset, when the file is missing, or
  when the PEM fails to load — verification is never silently
  disabled, the warning is logged and we fall back to the safe path.

- src/llm_core.py: thread llm_verify() into the shared AsyncClient
  used by stream_llm / streaming completions.

- routes/model_routes.py: thread llm_verify() into the five httpx.get
  call sites in _probe_endpoint / _ping_endpoint so adding a
  private-CA endpoint goes green on the very first probe and the
  picker stops showing it offline.

- .env.example: document LLM_CA_BUNDLE with the GigaChat case as the
  concrete example.

Deliberately NOT included: a verify=False knob (global or per-host).
Disabling verification exposes the affected endpoint to MITM, and the
operator-supplied bundle is the correct fix for legitimate private-CA
providers — so the only switch in this PR is the safe one.

Closes #722.
2026-06-04 13:18:50 +01:00
Giuseppe
f6a5f6592f fix: log warnings on silently swallowed agent and endpoint failures (#2367)
get_builtin_overrides() was swallowing all exceptions with a bare
`except Exception: pass`, so misconfigured tool-description overrides
would silently produce wrong agent behaviour with no log trace.

The background endpoint refresh loop had the same pattern: any probe
failure was silently ignored, giving operators no signal that the
refresh was broken.

Also removes a circular self-import (`from src.agent_loop import
_build_base_prompt`) inside _build_system_prompt; the function is
already in scope and the import created a latent circular reference risk.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 12:29:31 +01:00
Giuseppe
68cb715914 fix(endpoint): import ModelEndpoint from core database
ModelEndpoint is defined in core.database, not src.database. The wrong
import silently prevented the module from loading in deployment
configurations that do not have a src/database.py shim, resulting in an
ImportError at startup.

Also adds a warning log when resolve_endpoint finds no usable model
(all models hidden or the list is empty), making the otherwise-silent
failure visible in operator logs.

The test_auth_regressions stub for src.endpoint_resolver was missing the
build_models_url attribute, which caused test collection errors.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 11:51:47 +01:00
Marius Popa
dc365a1b27 Fix Ollama agent single-token responses (#1591)
Agent mode treated local /v1 endpoints, including Ollama on :11434, as native-tool-capable by host/model heuristics. On Ollama's OpenAI-compatible surface some models that advertise tool support stop after a single token when schemas are sent (issue #1567). Default local Ollama /v1 back to fenced tool blocks unless the endpoint explicitly has supports_tools=True.

Also compare both the runtime chat URL and the normalized endpoint base when reading ModelEndpoint.supports_tools. That keeps a saved base URL such as http://localhost:11434/v1 effective when the active session URL is /v1/chat/completions.

Tests: .venv/bin/python -m pytest tests/test_tool_support_heuristic.py
2026-06-04 11:45:10 +01:00
ooovenenoso
e163384015 fix: treat Nix files as readable uploads (#2249) 2026-06-04 12:06:24 +02:00
Nicholai
4dc11cfe6b refactor(memory): canonicalize memory imports (#50) 2026-06-04 05:31:15 +01:00
Dan (cirim)
911fd61100 fix(tool_index): add manage_memory to ALWAYS_AVAILABLE 2026-06-04 14:04:32 +10:00
Yuri
a2e691da2b fix(models): stabilize proxy endpoint refresh behavior
* fix: support large proxy model endpoint refresh

Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline.

Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails.

Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage.

* fix: preserve endpoint ping status semantics
2026-06-04 04:56:11 +01:00