886 Commits

Author SHA1 Message Date
afonsopc
28b296a712 Fix auto-memory vector dedup dropping a user's fact on cross-tenant match
extract_and_store dedups each extracted fact against the vector store
before the (owner-scoped) text fallback. The vector store is a single
shared ChromaDB collection storing only {"source": "memory"} — no
owner — and find_similar queries it with no owner filter, so it can
return a memory_id belonging to a different tenant. The old code
continue'd (skipped storing) on any vector hit without checking
ownership, so when ChromaDB is healthy (the common path) a user's
freshly-extracted fact was silently dropped because it was merely
semantically similar to another user's memory — the text fallback that
IS owner-scoped never ran. Gate the skip on the matched memory being
this user's own (or legacy unowned), mirroring the text dedup predicate;
cross-tenant or stale matches fall through. Same bug class as #1743.
2026-06-04 23:45:13 +01:00
Alexandre Teixeira
23fb5e169a fix(tests): make cookbook venv fallback test deterministic
Makes the cookbook venv fallback-chain test deterministic by simulating the inside-venv shell state directly instead of depending on the GitHub runner Python environment. Final focused #2580 CI-baseline cleanup.
2026-06-04 23:35:34 +01:00
Alexandre Teixeira
795782917f fix(tests): call live tool_execution module in edit-file gate test
Calls execute_tool_block through the live src.tool_execution module in the edit-file admin-gate test so the monkeypatched _owner_is_admin seam and the called function belong to the same module object. Fixes the scoped #2580 CI-order edit-file failure. Remaining Python failure is the unrelated cookbook fallback-chain environment test.
2026-06-04 23:22:02 +01:00
Isaiah Gardner
134c608466 fix: degrade missing/None content key in system messages to empty string (#2570) 2026-06-05 00:10:11 +02:00
Kenny Van de Maele
2be3779e6e feat: Add workspace: confine agent tools to a folder (#1103)
* feat: Add workspace: confine agent tools to a folder

Pick a server folder as the agent's workspace so its file/shell tools work
there and don't touch files outside it. File tools are hard-confined; bash/
python run with cwd set to the folder.

Includes a slash command: `/workspace` (alias `/ws`) — show / `set <path>` /
`clear` / `pick` (open the directory browser).

- routes/workspace_routes.py: GET /api/workspace/browse (admin-only).
- src/tool_execution.py: hard path confinement for read_file/write_file;
  bash/python cwd. Threaded route → stream_agent_loop → execute_tool_block.
- src/agent_loop.py: workspace note prepended to the system prompt.
- static/: overflow menu item, input-bar pill, directory-browser modal, and
  the /workspace slash command.
- tests/test_workspace_confine.py.

* Wire workspace confinement into tools that landed after this PR

edit_file (#1239) and grep/glob/ls (#1670) merged after workspace-confine was
written, so they bypassed the workspace boundary. Thread the workspace through:
  - edit_file: _do_edit_file resolves via _resolve_tool_path_in_workspace
  - grep/glob/ls: _resolve_search_root confines to the workspace (root + paths)
  - bash/python/bg cwd: workspace or _AGENT_WORKDIR (keep the #2586 data-dir
    default when no workspace is set)
Tests cover edit_file + grep/ls confinement (inside ok, outside rejected).

* Workspace picker: editable path bar + modal style cohesion + cross-platform hardening

- Make the current-folder strip an editable address bar: type/paste a full
  path and press Enter to navigate (also reaches other Windows drives and
  hidden dirs the up-only browser cannot).
- Reuse shared modal CSS: drop bespoke .workspace-modal-content/.workspace-btn*
  in favour of base .modal-content/.modal-body and the .confirm-btn button
  family; separators/hover use var(--border). Net -31 CSS lines.
- Fix the path field overflowing the modal right edge (flex stretch + margin
  vs an overflow:auto scrollbar-feedback loop): full-bleed, no h-margin.
- Cross-platform confinement: normcase the workspace commonpath check so
  containment holds on case-insensitive filesystems (Windows/macOS).
- Make tests OS-portable: sibling temp dirs instead of /etc, python os.getcwd()
  instead of pwd. 5 pass.
2026-06-05 00:06:37 +02:00
Kenny Van de Maele
7b4365fe57 Make write_file/edit_file always-available like read_file (#2684)
read_file/grep/glob/ls are in ALWAYS_AVAILABLE but the on-disk write tools
(write_file, edit_file) were only surfaced via per-query tool-RAG retrieval.
On a bare 'edit X' request the retriever could miss them, so the model was
never offered edit_file/write_file and wrongly fell back to edit_document
(editor panel) or improvised with bash sed. Add both to ALWAYS_AVAILABLE
next to read_file; they stay admin-gated by tool_security so non-admin
exposure is unchanged.

Fixes #2683
2026-06-05 00:02:14 +02:00
pewdiepie-archdaemon
a260e0abd4 Revert calendar-based cookbook scheduler
Reverts b98ee04 + 4ed48ba + a19b6d2.

Calendar events turned out to be the wrong abstraction for scheduling model serve windows. Pivoting to the existing ScheduledTask infrastructure (cron / daily / weekly recurrence, next_run tracking, edit-from-Tasks-tab UI) in a follow-up commit. The ScheduledTask path:

  - reuses dispatch logic the rest of the app already understands
  - drops the calendar dependency entirely (no auto-created "Cookbook" calendar, no calendar.js hook)
  - shows up in the Tasks UI that already exists for everything else

What this revert removes:
  - src/cookbook_scheduler.py — calendar reconciler
  - routes/cookbook_schedule_routes.py — /api/cookbook/schedule/* endpoints
  - static/js/cookbookSchedule.js — Schedule modal / settings card
  - cookbook_scheduler_enabled + cookbook_schedule_calendar_href settings keys
  - The window.cookbookOpenScheduleForm hook in calendar.js
  - The Schedule button + paired-button CSS in cookbookServe.js + style.css
2026-06-05 06:57:21 +09:00
Alexandre Teixeira
fb852bd62e fix(tests): restore webhook manager after review test import
Restores src.webhook_manager after a review-regression test imports it against a fake src.database. Fixes one focused #2580 CI-baseline pollution bucket.
2026-06-04 22:28:00 +01:00
Michiel Van de Velde
7ddc5eaef4 Merge pull request #2529 from NubsCarson/codex/2509-mcp-tool-input-params
fix(mcp): expose MCP tool input parameters to the agent
2026-06-04 23:07:42 +02:00
Alexandre Teixeira
70812955d1 fix(tests): restore core module attrs in session owner test
Restores core.database/core.models/core.session_manager parent package attributes after session-owner test import stubs. Fixes one focused #2580 CI-baseline pollution bucket.
2026-06-04 21:43:25 +01:00
Kenny Van de Maele
64d65b73c1 feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)
* feat: round-limit handling — Continue affordance at the cap + configurable cap

When the agent loop runs out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:

1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
   a "Continue" pill at the bottom of the chat that resumes the task from where
   it left off. Repeated cap-hits each get a fresh Continue (multiple continues
   in a row).
2. The cap is configurable in Settings → Agent ("Max steps per message"),
   validated on the client, at the save endpoint, and at the read site.

- src/agent_loop.py: track `_exhausted_rounds` (set only when a full
  tool-executing round completes on the last allowed round — i.e. the agent
  wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
  `max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
  agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
  existing .stopped-indicator / .continue-btn classes and theme vars
  --border/--fg/--bg/--accent); appended to the chat container so it survives
  the message re-render at stream finalize. chat.js cache version bumped.

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
2026-06-04 22:36:05 +02:00
Alexandre Teixeira
a54f41037d fix(tests): restore src.database after webhook import
Restores both sys.modules and parent src.database package state after the webhook SSRF tests import src.webhook_manager against the real database module. Fixes one focused #2580 CI-baseline pollution bucket.
2026-06-04 21:21:51 +01:00
Alexandre Teixeira
3426e0cb5e fix(tests): isolate session route import stubs
Keeps src.request_models real and restores both sys.modules and parent routes.session_routes package attributes after temporary test stubs. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 21:05:52 +01:00
Ocean Bennett
e69298888b fix(history): block compact during active runs (#2635) 2026-06-04 21:50:16 +02:00
Kenny Van de Maele
67782e684e fix: exclude slash-command/setup messages from LLM context (#2634) (#2640)
Slash-command replies and the echoed /setup command are persisted to session
history so they render in the transcript, but they are UI chatter the user
never meant as conversation. They were sent to the model on the next turn,
which then commented on '/setup ...' and exposed transient values (e.g. the
Copilot device user_code) to the LLM.

- get_context_messages() (the LLM-API view) now skips messages tagged
  metadata.source == 'slash'. Display/history-load paths use raw history and
  are unaffected.
- slashCommands.js tags the echoed user command with source:'slash' too (the
  assistant replies already carried it); the user line was the one untagged
  path that still reached context.

Fixes #2634.
2026-06-04 21:42:23 +02:00
Afonso Coutinho
baf9179d94 Fix truncate_messages persisting an inflated message_count (#2052)
truncate_messages deletes db_messages[keep_count:] (a no-op when
keep_count >= the real message total) then unconditionally wrote
db_session.message_count = keep_count. When keep_count exceeds the
number of messages that actually exist — e.g. the manage_session AI
tool defaults keep_count to 10, and the HTTP truncate endpoint passes
any client value — the persisted count is set too high (10 on a
3-message session), diverging from the real row count. That column
gates lazy DB-hydration in get_session (message_count > 0) and is
surfaced to the history UI, so it is correctness-relevant. Clamp to
min(keep_count, len(db_messages)); the in-memory slice already caps
naturally.
2026-06-04 21:19:16 +02:00
Giulio Zelante
a8d0c117bb fix(docker): opt-in INSTALL_OPTIONAL build arg for AGPL extras (#2633)
Default image installs requirements.txt only. Set INSTALL_OPTIONAL=true
at build time to add requirements-optional.txt (PyMuPDF, markitdown, etc.)
without baking AGPL into the standard distributed image.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-04 21:15:44 +02:00
Kenny Van de Maele
1cd0aa2b8c feat(provider): add GitHub Copilot provider with device-flow auth (#1480)
* feat(provider): add GitHub Copilot provider with device-flow auth

Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5,
Claude, Gemini, …) work through the normal chat + agent loop, incl. native
tool calling and vision.

Auth is one-click via the GitHub OAuth device flow; the access token is stored
as the endpoint's (encrypted) api_key and sent directly as `Authorization:
Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to
the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only
provider-specific behaviour is a small set of required request headers,
injected centrally.

Sign-in is available from Settings → model endpoints ("Connect GitHub
Copilot") and from chat via `/setup copilot`.

- src/copilot.py (new), routes/copilot_routes.py (new): constants, header
  builders, device-flow start/poll, model discovery, owner-scoped endpoint
  provisioning.
- src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers,
  per-request x-initiator/vision.
- src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas.
- src/model_context.py: known context windows for Copilot (no unauthenticated
  /models probe).
- static/, README, tests/test_copilot*.py.

* Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process
2026-06-04 21:13:14 +02:00
Ocean Bennett
ca32b43b38 fix(history): tolerate tool-call turns during compact (#2626) 2026-06-04 20:59:41 +02:00
Maruf Hasan
24220155af chore: remove orphaned static/landing.html (superseded by docs/index.html) (#2632) 2026-06-04 20:55:51 +02:00
Vykos
9964f1382f Isolate HTML popup openers (#2501) 2026-06-04 20:52:41 +02:00
Vykos
ca8ca38a32 Guard image and QR DOM attributes (#2500) 2026-06-04 20:51:23 +02:00
Vykos
b59bbe80ce Harden chat streaming DOM sinks (#2498) 2026-06-04 20:49:37 +02:00
Vykos
e113c10d01 Harden email HTML URL sanitization (#2496) 2026-06-04 20:47:47 +02:00
Vykos
01c99c3990 Harden markdown raw HTML sanitization (#2497) 2026-06-04 20:46:10 +02:00
Vykos
3ae89599f3 Whitelist research source links (#2499) 2026-06-04 20:41:35 +02:00
Afonso Coutinho
ed933ac232 fix: renaming a user leaves their API tokens resolving to the old owner (#1932)
* fix: renaming a user leaves their API tokens resolving to the old owner

* Drive rename token-cache test through the real auth resolver instead of patching a closure
2026-06-04 20:37:59 +02:00
Alex Little
33425a9c6c fix(ui): modal drag + removed startDrag func (#2430)
* fixed

* removed legacy startDrag fc, unified modal dragging

* fixes post feedback
2026-06-04 20:34:18 +02:00
ooovenenoso
ab5311c44d fix(research): support timeout defaults in direct tests (#2624)
fix(research): honor planning query timeouts
2026-06-04 20:23:17 +02:00
Giuseppe
6d511f6e66 fix(llm): auto-detect <think> in content stream for unregistered thinking models (#2588)
* fix(llm): auto-detect <think> in content stream for unregistered thinking models

_THINKING_MODEL_PATTERNS only covers known model families by name. Qwen3-derived
models with non-standard names (e.g. Qwopus, custom QwQ forks) are not matched,
so their <think>...</think> content streams through as visible chat text instead
of being routed to the thinking display.

When the first content delta opens with <think> and the model was not already
identified as a thinking model, dynamically flag the stream as a thinking model
for the remainder of the response. This enables the existing </think> repair path
(line below) and ensures the frontend receives the full <think>...</think> wrapper
it needs to split thinking from the final answer.

The check is restricted to the very first content delta (_first_content_sent is
False) to avoid misidentifying models that happen to write "<think>" mid-answer.

Fixes #2225
Related: #2420 (covered by separate PR from @AmmarS-Analyst), #2224 (@RaresKeY)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace inert _thinking_model flag with _in_think_tag state machine

The original auto-detect set _thinking_model=True on the first <think> chunk
but still emitted it as a regular delta and set _first_content_sent=True
immediately, so no subsequent chunk could enter the repair path.

Replace with _in_think_tag bool: enter thinking mode when first content starts
with <think>, route all chunks to the thinking channel until </think> is found,
then the tail becomes the first regular delta. Adds three regression tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(llm): replace _first_content_sent guard with _think_open_stripped

Opening-tag stripping used `not _first_content_sent` as the guard, but
_first_content_sent stays False throughout the entire think block (it only
flips when regular content is emitted). So `find(">")` ran on every
reasoning chunk — not just the first — and silently truncated everything
before the first ">" in any reasoning text containing comparisons, arrows,
or code.

Fix: add `_think_open_stripped = False` alongside `_in_think_tag`. Use it
as the strip guard in both the "still inside <think>" path and the
"</think> found in same chunk" split path. Set it True once the opening
tag is consumed so all subsequent chunks reach the thinking channel
unmolested.

Add regression test: 3-chunk stream where the middle chunk contains
"c > d" — confirms "more c " is not dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:18:19 +02:00
Alexandre Teixeira
0ead3a4eb2 fix(tests): isolate compare endpoint owner-scope test
Removes module-level core.database stubbing from the compare endpoint owner-scope regression test and patches ModelEndpoint per test with monkeypatch. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 19:17:15 +01:00
Giuseppe
dd707ddb1e fix(agent): default bash/python cwd to data/ to prevent ephemeral file loss (#2586)
Agent subprocesses (bash, python) previously inherited the container's default
working directory (/app), so files created with relative paths landed in the
ephemeral container layer and were silently destroyed on any docker compose up
--build or container recreation.

Set cwd=_AGENT_WORKDIR (resolved to <repo_root>/data at import time) and
HOME=_AGENT_WORKDIR on both subprocess launchers so that:
- pwd inside a bash tool returns the persistent data directory
- relative paths and ~ resolve to a location that survives rebuilds
- the agent can still cd to any absolute path it needs

The resolution uses pathlib.Path(__file__).parent.parent / "data", which
works for both Docker (/app/src → /app/data) and manual installs
(<repo>/src → <repo>/data) without requiring a new env var or compose change.

Fixes #2512

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 20:16:04 +02:00
Zen0-99
7188737294 fix(hwfit): filter non-GGUF models on Windows (#2530)
Odysseus only supports llama.cpp on Windows (vLLM/SGLang are
explicitly blocked). llama.cpp requires GGUF, so AWQ/GPTQ/FP8
safetensors models without a GGUF alternate should not be
recommended in the Cookbook on Windows hosts.

Changes:
- hardware.py: add 'platform': 'windows' to _detect_windows()
  so downstream logic can identify Windows hosts.
- fit.py: include is_windows in the existing GGUF-only filter
  alongside apple_silicon and consumer_amd.
- tests: add test_hwfit_windows.py with regression tests.

Fixes #122, #614 (root cause: unservable models recommended).
2026-06-04 20:02:13 +02:00
pewdiepie-archdaemon
b98ee04e2f Cookbook scheduler: reuse the standard calendar event card + auto-create Cookbook calendar
Drop the custom Schedule modal in favor of opening the calendar's existing event-creation form pre-filled with the model's name + cookbook YAML in the description. The user lands in the same event editor they already know from regular calendar use, just pointed at the auto-created "Cookbook" calendar.

Backend:
  - POST /api/cookbook/schedule/ensure-calendar — idempotent: creates a calendar named "Cookbook" if one doesn't exist for the current user, saves its href into cookbook_schedule_calendar_href, flips cookbook_scheduler_enabled on. Verifies the saved href against /api/calendar/calendars on every call so a manually-deleted calendar self-heals.

Frontend:
  - calendar.js: expose window.cookbookOpenScheduleForm(draft) which opens the calendar modal (if not open), calls _showEventForm, then pre-fills summary / description / rrule / calendar dropdown. Force-expands the "Add details" section so the user can see which calendar it's heading into.
  - cookbookSchedule.js: Schedule-button click now calls ensure-calendar, builds the cookbook: YAML block, and routes to window.cookbookOpenScheduleForm instead of openModal(). The legacy custom modal stays as a fallback for the case where calendar.js hasn't loaded.

UX tweak:
  - cookbookServe.js: replace the standalone "Schedule…" text button with a small icon-only button (clock SVG) glued to the right edge of Launch. The pair forms one visual unit — Launch on the left, schedule-now on the right — sharing a thin divider. CSS handles the rounded corners + divider.
2026-06-05 02:52:07 +09:00
Afonso Coutinho
abe04436a0 fix: merge-last-assistant deletes tool/system rows from the DB (history desync) (#1929) 2026-06-04 19:47:08 +02:00
Giuseppe
bc83479f94 fix: bool('false') is True coerces endpoint toggles incorrectly (#2361)
Python's bool('false') returns True because the string is non-empty.
A JS client serialising a boolean as the string 'false' would have
supports_tools or is_enabled silently flipped to True — so 'disable
tool support' would actually enable it.

Use an explicit lookup dict for supports_tools and a case-insensitive
string check for is_enabled so both string and native bool inputs are
handled correctly.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:43:38 +02:00
pewdiepie-archdaemon
4ed48baf68 Cookbook scheduler: inline settings card at the top of the Cookbook tab
The earlier scheduler commit shipped the backend + Schedule modal but left the feature dormant — no way to toggle it from the UI. This adds the missing knob:

* DEFAULT_SETTINGS gains `cookbook_scheduler_enabled` (False) and `cookbook_schedule_calendar_href` ("") so `/api/auth/settings` POST will actually persist them. Without this, the POST silently dropped unknown keys.

* cookbookSchedule.js gains a self-contained settings card injected at the top of the Cookbook tab body whenever the cookbook modal opens. Card contents:
  - Enable toggle (writes cookbook_scheduler_enabled)
  - Calendar dropdown populated from /api/calendar/calendars (writes cookbook_schedule_calendar_href)
  - Status line: off / pick-a-calendar / N scheduled in next 24h · M running now · K skipped
  - "Reconcile now" button that POSTs /api/cookbook/schedule/reconcile-now

* The same module reveals/hides the Schedule… buttons on serve panels whenever the feature flag changes, so toggling on immediately surfaces the schedule UI without a refresh.

Settings UI lives in cookbookSchedule.js (not settings.js) so the entire scheduler surface — backend, reconciler, modal, settings — collapses cleanly: delete src/cookbook_scheduler.py + routes/cookbook_schedule_routes.py + static/js/cookbookSchedule.js, drop the two DEFAULT_SETTINGS keys, and the two app.py registration lines, and the feature is gone.
2026-06-05 02:40:35 +09:00
Alexandre Teixeira
40cbfb7b94 fix(tests): align gallery owner filter null-user expectation
Updates the stale gallery owner-filter null-user test to match current single-user/auth-disabled behavior. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:39:45 +01:00
Giuseppe
531f426557 fix: KeyError on missing 'content' key in system messages (#2362)
A system message that arrives without a 'content' key — possible via
malformed tool results — raised a KeyError in the hot path of llm_call,
llm_call_async, and stream_llm. Replace m["content"] with
m.get("content") or "" in all three functions so a missing key degrades
to an empty string instead of crashing.

Also removes a redundant .rstrip() after .strip() in _model_activity_key.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:38:45 +02:00
Giuseppe
ff8f9f2188 fix: llm_call_async does not retry on HTTP 429/502/503/504 (#2364)
The retry loop raised immediately for any non-success HTTP response
regardless of attempt count. For transient upstream errors (rate limit,
bad gateway, gateway timeout) the function should back off and retry
within the existing attempt budget.

Also lets ConnectError / ConnectTimeout retry when the host has not been
cooled and attempts remain, instead of always raising on the first
connect failure.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 19:35:55 +02:00
pewdiepie-archdaemon
a19b6d2d4d Cookbook scheduler: calendar events drive model serve windows (experimental, feature-flagged)
Add a calendar-driven scheduler so a user can pick a model in Cookbook, click "Schedule…" instead of "Launch", choose time windows + days of the week + (optional) end date, and have Odysseus auto-launch the serve when the window starts and hard-kill it when the window ends. The calendar IS the source of truth — events on a designated calendar are interpreted as serve schedules, so editing the event in the calendar UI immediately changes the schedule.

Whole feature is gated by setting `cookbook_scheduler_enabled` (default False). Disabling the setting silences the reconciler and the API refuses requests; setting + three new files = entire surface, easy to revert.

New files:
  - src/cookbook_scheduler.py — background reconciler: ticks every 60s, reads next ±90s of calendar events on the designated calendar, launches/kills serves to match. Honors "refuse if GPUs busy" (skips with reason, no retry). Adopts pre-existing manual serves matching the event's model so window-end cleanup still applies. Tags scheduler-owned tasks with `_scheduledBy: <event_uid>` so it never kills serves it doesn't own.
  - routes/cookbook_schedule_routes.py — POST /api/cookbook/schedule/from-cookbook builds RRULE+ICS events from the modal's input (model, slots[], days[], until). GET /upcoming returns the next 24h with per-event status (scheduled / running / adopted / skipped / failed / ended) for the UI. POST /reconcile-now manually kicks the reconciler.
  - static/js/cookbookSchedule.js — Schedule button click handler + modal. Daily/hourly time slot picker, multi-slot ("+ add another time slot"), weekday chips with Weekdays/Weekend/Every-day quicksets, optional Until date. Calls /from-cookbook on save. Whole module is a single IIFE; deleting the file plus its <script> tag removes the UI surface.

Existing files touched (minimal):
  - app.py: register the new router + add the reconcile loop as a startup task (~10 lines, all in one block). Reconcile loop checks the feature flag on every tick, so leaving it running with the flag off costs ~one settings lookup per minute.
  - static/index.html: one new <script> tag for cookbookSchedule.js.
  - static/js/cookbookServe.js: add a "Schedule…" button next to the existing Launch button. Hidden by default; cookbookSchedule.js reveals it after confirming the feature flag is on.
  - static/style.css: ~80 lines for the modal styles (mobile-aware via @media).

User choices baked in:
  - Calendar events are the source of truth.
  - Refuse to launch if GPUs busy (skip + log reason in scheduler.events[uid].reason).
  - Hard kill at event end.
  - No retry on a skipped event within the window.
  - Multi-slot per day supported (one calendar event per slot, shared RRULE).
  - Pre-existing manual serves get adopted at window start so they're killed at end.

Known follow-ups (not in this commit):
  - Settings UI to pick the schedule calendar + toggle the feature flag.
  - Calendar event color/badge for status (running/skipped/failed).
  - "Lazy launch on first request" — currently launches at event start. Replacing _launch_serve with a proxy that defers vllm until the first chat request is a contained future change.
2026-06-05 02:35:23 +09:00
RaresKeY
c12c2aa233 fix: normalize Gemma 4 thought-channel output (#2224) 2026-06-04 19:26:58 +02:00
Alexandre Teixeira
7ce6ec7f50 fix(tests): use line-level PDF marker assertion
Updates the PDF marker regression test to check corrupted markers at line level instead of using a broad substring assertion. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:20:41 +01:00
WasserEsser
20cc23c9bd fix(models): make pinned models visible in chat UI (#2481)
Two bugs prevented pinned models from appearing in the chat model picker:

1. _fetch_models() only used _cached_model_ids(), ignoring pinned_models.
   Since Fireworks AI doesn't list kimi-k2p6-turbo in /v1/models, the
   cached list was empty, so the endpoint showed as offline with no models.

2. _curate_models() filtered unknown pinned IDs into models_extra, but the
   chat UI only reads models (primary list). Pinned models stayed invisible.

Fix: use _visible_models() to merge cached + pinned, then promote pinned
IDs from models_extra to models so they appear in the dropdown.

Closes #1521 follow-up
2026-06-04 19:17:37 +02:00
Ocean Bennett
34c9a8adb1 docs: point PR checklist at dev (#2594) 2026-06-04 19:15:08 +02:00
Alexandre Teixeira
8bc16ef245 fix(tests): use non-repeating split chunk fixture
Updates the split_chunks containment regression test to use deterministic non-repeating records instead of a repeating fixture that could produce accidental substring matches. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 18:11:42 +01:00
nubs
050283c145 fix(mcp): confine oauth file paths (#2272) 2026-06-04 19:10:23 +02:00
nubs
935eb05c63 refactor(search): make src analytics a service shim (#2264) 2026-06-04 18:57:24 +02:00
Alexandre Teixeira
3b292403dc fix(tests): accept verify in endpoint HTTP mocks
Updates endpoint/model-route test HTTP mocks to accept the verify keyword argument passed by endpoint probing code. Restores one focused part of the Python CI baseline tracked in #2580.
2026-06-04 17:53:18 +01:00
Kenny Van de Maele
1f00fff837 feat: add code-navigation tools (grep, glob, ls) + read_file line ranges (#1670)
Gives the agent first-class code navigation instead of shelling out via bash
(token-heavy, unreliable on weaker models, unstructured). Mirrors the
Grep/Glob/Read primitives that Claude Code / opencode expose.

- grep: regex search over file contents across a tree. Uses ripgrep when
  available (with explicit excludes so junk dirs are skipped even without a
  .gitignore); falls back to a pure-Python walk+regex when rg is absent.
  Returns file:line:match, capped.
- glob: find files by glob pattern (recursive), newest first.
- ls: list a directory (folders first, then files with sizes).
- read_file: optional offset/limit for line-range reads of large files
  (plain-path calls stay back-compatible).

All confined by the same path policy as read_file (_resolve_tool_path:
data/tmp allowlist + sensitive-file deny). Junk dirs (.git, node_modules,
venv, __pycache__, dist/build, …) skipped. Output capped (200 hits,
400 chars/line). Admin-gated like the other filesystem tools.

Wiring: schemas + native arg->content serializer (src/tool_schemas.py), tool
tags (src/agent_tools.py), always-available + descriptions (src/tool_index.py),
admin gate (src/tool_security.py), dispatch + impls (src/tool_execution.py).

Tests: tests/test_code_nav_tools.py — match/skip-junk/ignore-case/glob-filter,
allowlist rejection, glob/ls, read-range, and the no-ripgrep Python fallback.
2026-06-04 18:37:32 +02:00