Commit Graph

29 Commits

Author SHA1 Message Date
Duarte Antunes
448401a0fc Harden PDF document markers against cross-owner upload access (#445)
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 22:38:14 +09:00
red person
e1102585bf Fix chat stream recovery and PDF library indexing (#468) 2026-06-01 22:33:35 +09:00
Alexander Kenley
3c6b084f08 Secure by default uplift (#511)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 22:30:07 +09:00
Afonso Coutinho
c38932e6c6 fix: deep research discards valid sources mentioning cookies/copyright (#481)
* fix: drop over-broad 'cookie'/'copyright' low-quality markers

* fix: detect cookie/copyright boilerplate via phrases, not bare words

* test: keep research findings that merely mention cookies or copyright
2026-06-01 22:26:37 +09:00
Alexander Kenley
07d92556a3 Fix visual report chapter navigation (#505)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 22:26:13 +09:00
Afonso Coutinho
1eff46579a fix: ChromaDB unreachable blocks app startup for 30-60s (#326) (#476)
* fix: fail fast when ChromaDB is unreachable instead of blocking startup

* fix: only cache the ChromaDB client after a successful heartbeat

* test: cover ChromaDB fast-fail preflight and no-cache-on-failure
2026-06-01 22:22:41 +09:00
pewdiepie-archdaemon
5ed9b74cd0 Polish email tasks and window controls 2026-06-01 20:56:46 +09:00
Afonso Coutinho
3884f2b8b7 Prevent task session delivery NOT NULL crashes
* fix: coerce null endpoint_url when delivering task result to a session

* fix: also coerce null model so the session insert satisfies NOT NULL

* test: cover task session delivery on an empty database
2026-06-01 18:28:48 +09:00
red person
2f87dbcfbc Show a clear message when PyMuPDF is missing 2026-06-01 18:27:17 +09:00
Rifqi Akram
5b1e56407b Add SSRF-guarded web fetch agent tool
* feat(web-fetch): add web_fetch tool to read a specific URL's content

* test(web-fetch): add SSRF coverage and fail closed on empty DNS resolution

Add explicit SSRF regression tests for the web_fetch path covering
loopback, private LAN ranges, link-local/metadata, IPv6 private/local,
redirect-into-private, and unsupported schemes. Harden _public_http_url
to fail closed when a hostname resolves to no addresses.
2026-06-01 16:57:28 +09:00
pewdiepie-archdaemon
be260f43e8 Handle incomplete detached agent streams 2026-06-01 16:54:11 +09:00
Duarte Antunes
e77d87fa80 Enforce owner checks for upload attachments 2026-06-01 16:47:48 +09:00
pewdiepie-archdaemon
0888a3b3e6 Add native Windows compatibility layer 2026-06-01 15:09:47 +09:00
pewdiepie-archdaemon
b998c52dd0 Add Deep Research extraction controls 2026-06-01 14:55:33 +09:00
Alexander Kenley
cb8a0b268d Route calendar action requests to tools
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 14:32:41 +09:00
LittleLlama
7e7e441fec Re-enable VectorRAG init with lazy retry
Personal Docs (POST /api/personal/add_directory and friends) currently
returns HTTP 503 'RAG system is not available' for every request,
because get_rag_manager() and rag_manager are both hardcoded off. The
disablement was added when chromadb 1.4.1 / pydantic 2.12 were mutually
incompatible at the client init layer.

That compat issue is fixed in the current pins (chromadb 1.5.x +
pydantic 2.13.x). Verified by calling the original lazy initializer
against a running chroma server — VectorRAG instantiates, reports
healthy=True, and indexes successfully.

This change:

1. src/rag_singleton.py — replace the hardcoded `return None` in
   get_rag_manager() with the original lazy init body. Keeps the
   30s retry-throttle so a missing chroma server doesn't busy-retry
   on every request.

2. app.py — replace the parallel `rag_manager = None` /
   `rag_available = False` hardcoding with a get_rag_manager() call.
   Logs the resolved state at startup. If chroma isn't reachable yet,
   rag_manager stays None and personal-doc routes still return 503,
   but the *next* request will hit the retry-throttle path in
   get_rag_manager() and try to init again.

Doesn't touch requirements.txt. Repos using docker-compose get chroma
automatically; manual installs that want Personal Docs to work still
need to either pip install chromadb (full package) and run `chroma run`
or point at an external chroma instance via env. That can be a
follow-up README / requirements-optional note.
2026-06-01 14:32:13 +09:00
Fernando Lazzarin
93d3cc49c2 harden(teacher): treat escalation trace as untrusted data (#275)
The teacher-escalation loop distills a failed turn's trace into a
persisted skill, but the trace includes raw tool output (web pages,
emails, retrieved documents) that can carry prompt-injection. Skills are
later injected as authoritative "follow step by step" guidance, so an
injected instruction in tool output could be laundered into a skill the
student follows on a later turn -- bypassing the untrusted-content
wrapper that protects the live turn.

Fence the trace in both teacher prompts and add an explicit "this is
data, not instructions" guard so the teacher won't copy directives out
of tool output into a procedure. Additive prompt hardening; no
default-UX change.

Ran: python -m py_compile src/teacher_escalation.py + a format/fencing
smoke test (both templates format; an injected instruction stays fenced
inside the untrusted block).

Co-authored-by: Fernando Lazzarin <263019791+waitdeadai@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:31:39 +09:00
Alexander Kenley
2c4b8b57dd feat(ai): add OpenRouter and Ollama Cloud providers (#231)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 14:26:10 +09:00
LittleLlama
ec43ba83dd Fix NPX MCP server crash (skip if not installed, alternative shape to #242 / #252) (#253)
* Fix NPX MCP server crash by checking install state instead of timing out

When @playwright/mcp (or any future npx-based built-in server) isn't
already cached, npx tries to download and install it on first invoke.
That can take minutes or hang on a fresh install missing Playwright
system deps. The previous code bounded that wait with
asyncio.wait_for(mcp_manager.connect_server(...), timeout=30), but the
cancellation that wait_for fires on timeout propagates into
mcp.client.stdio.stdio_client's internal anyio task group, which
raises:

    RuntimeError: Attempted to exit cancel scope in a different task
    than it was entered in

The error fires in a sibling background task (Task exception was never
retrieved) so the surrounding try/except BaseException doesn't catch
it, and the orphaned cancel scope cascades cancellations into other
tasks in the same event loop. Running requests start failing and the
process needs a restart.

Fix: detect whether the package is already cached before invoking
connect_server, instead of trying to bound the connect with a timeout.
A new _is_npx_package_cached helper runs:

    npx --no-install <pkg> --version

The --no-install flag makes npx fail fast on a cache miss instead of
downloading, so the probe returns in <500ms either way. If the package
isn't cached, we log a warning with the exact command the user can run
to install it, and skip the server. If it is cached, we call
connect_server normally with no wait_for wrapper, so there's no
cancellation that could enter stdio_client's task group.

This removes the entire bug class instead of papering over it. No
asyncio.wait_for around stdio_client, no shielded-task leak, no
shutdown-time RuntimeError. Verified against current versions
(mcp library on Python 3.14, anyio 4.13.0) with the existing
@playwright/mcp@latest cached, and with a deliberately uncached
package spec to exercise the skip path.

* Make first-run setup explicit when NPX MCP package isn't cached

Per @pewdiepie-archdaemon review on #253:

- src/builtin_mcp.py: expand the skip-server warning into a multi-line
  block with Reason/Impact/Fix/Notes lines, so the message stands out
  in startup logs and clearly tells the user what to run.

- README.md: add 'Built-in MCP servers (optional setup)' subsection
  under Configuration, with the install command and a brief note that
  it's optional and skipped if not cached.
2026-06-01 14:23:19 +09:00
AzaelMew
7023468cea Fix YEARLY recurring CalDAV events only showing on DTSTART year (#179)
* Fix YEARLY recurring CalDAV events only showing on DTSTART year (#170)

Recurring events with RRULE:FREQ=YEARLY only appeared in the calendar
on the year matching DTSTART, not in subsequent years. The list_events
query filtered by , which excludes
recurring events whose original dtend (e.g. 2019-07-22) falls before
the requested window (e.g. 2026).

Fix: split the query into two branches — non-recurring events still
require window overlap, but recurring events (with non-empty RRULE)
are fetched by dtstart < end_dt alone. A new helper,
_expand_rrule_occurrences(), uses dateutil.rrule to expand each
recurring event into individual occurrence dicts within the requested
date range, so YEARLY/WEEKLY/MONTHLY events render correctly across
all years.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* recurrence: compound UIDs, frontend fixes, python-dateutil req, tests

- Replace _expand_rrule_occurrences with _expand_rrule that emits stable
  compound UIDs ({base_uid}::{date_or_datetime}) so the frontend can
  distinguish occurrences from the same series. Non-recurring events
  pass through with is_recurrence=false and series_uid=uid.

- Add _resolve_base_uid() to extract the base series UID from compound
  UIDs — used by PUT/DELETE /api/calendar/events/{uid} and the
  manage_calendar tool so edits/deletes always target the base row.

- Update manage_calendar tool to import and use _resolve_base_uid.

- Frontend _updateEvent / _deleteEvent: detect compound UIDs and
  invalidate localStorage cache after success so stale sibling
  occurrences aren't shown.

- Add python-dateutil to requirements.txt as an explicit dependency.

- Add 14 regression tests in tests/test_calendar_recurrence.py
  covering _resolve_base_uid edge cases, _expand_rrule with
  yearly/weekly/monthly/all-day/bad-rrule, unique UIDs, and
  metadata inheritance.

- Merge upstream's cleaner SQLAlchemy or_/and_ query pattern.

* recurrence: overlapping malformed-RRULE, exclusive end, multi-day crossings

Fix three edge cases in _expand_rrule:

1. Malformed-RRULE fallback now checks window overlap. list_events
   fetches recurring rows with only dtstart < end_dt, so a broken
   old recurring event could appear in unrelated future windows.
   Now fallback returns [] unless the base event's dtstart/dtend
   actually intersect [start, end).

2. Exclusive end boundary. rule.between(start, end, inc=True) was
   inclusive on end, but the route contract and non-recurring SQL
   filter both use [start, end). Added occ_start >= end guard.

3. Multi-day crossings. A recurring occurrence that starts before
   the window but ends inside it was missed (only occ_start was
   checked). Now expands from start - duration and filters by
   occ_start < end AND occ_end > start, matching non-recurring
   overlap behavior.

Tests: +4 tests for these cases (18 total)

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 13:42:44 +09:00
Håkon Julius Størholt
91d3511580 Recognize local vision models so their images aren't dropped (#185)
An image attachment only got through if the model name was on a short
built-in list. Anything else was treated as text-only and the image was
quietly dropped, so the model never saw it. That left out a lot of the
smaller vision models you can run locally (moondream was the one I hit).

Pulled the check into is_vision_model() in chat_helpers, broadened it to
cover those, and added a test. Models that already worked are unaffected.

Fixes #124.
2026-06-01 13:09:21 +09:00
pewdiepie-archdaemon
a66f241e21 Preserve large pasted messages in context 2026-06-01 12:38:35 +09:00
Chat Sumlin
178befddd7 Fix duplicate CalDAV sync UIDs
Track uncommitted CalendarEvent rows during a CalDAV sync batch so duplicate UIDs update the pending row instead of inserting twice.
2026-06-01 02:17:43 +00:00
Juan Pablo Jiménez
4a04068818 Fix vision attachment timeout and stale cache
Increase local vision model timeout and avoid caching transient VL failure placeholders.\n\nCloses #202.
2026-06-01 02:04:46 +00:00
pewdiepie-archdaemon
0e3734a318 Align SearXNG fallback URL 2026-06-01 10:50:07 +09:00
pewdiepie-archdaemon
d9d95b4855 Improve OpenRouter and Groq provider requests 2026-06-01 10:32:14 +09:00
pewdiepie-archdaemon
d026e13a5a Fix provider setup and strip message metadata 2026-06-01 10:20:18 +09:00
pewdiepie-archdaemon
fc7f107b22 Improve Ollama setup and model endpoint handling 2026-06-01 10:00:15 +09:00
pewdiepie-archdaemon
e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00