Commit Graph

132 Commits

Author SHA1 Message Date
ooovenenoso
5e47e69e99 Allow serving cached local llama.cpp models
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-01 23:10:08 +09:00
Afonso Coutinho
9b1acf6612 Fix year extraction in research queries
* fix: extract full year in research query entities, not just the century

* fix: same year capture-group bug in the services search copy

* test: research query extracts the full year
2026-06-01 23:09:41 +09:00
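Note on 9b1acf6612: the commit does not show the pattern it changed, so the following is only a minimal sketch of how a capture-group bug of this kind typically arises with `re.findall`. Both patterns and the `extract_years` helper are hypothetical, not taken from the repository.

```python
import re

# Hypothetical buggy pattern: with a capturing group, findall returns
# only the group contents, i.e. the century prefix.
BUGGY_YEAR = re.compile(r"\b(19|20)\d{2}\b")

# Hypothetical fix: a non-capturing group makes findall return the full match.
FIXED_YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_years(query: str) -> list[str]:
    """Return every four-digit year (1900-2099) mentioned in the query."""
    return FIXED_YEAR.findall(query)
```

With the buggy pattern, a query such as "best GPUs of 2024" yields only the prefix "20"; the non-capturing variant yields "2024".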
Areon Lundkvist
f853a3fc67 Harden streaming deltas against null payloads 2026-06-01 23:09:17 +09:00
Mikael A
e7d61c724f Let calendar handle Escape while open 2026-06-01 23:08:57 +09:00
Yizreel Schwartz Sipahutar
42380a8693 Keep Cookbook POSIX paths stable on Windows hosts 2026-06-01 23:08:39 +09:00
Steven French
4bbf82c2ab Fix macOS launcher Python path usage 2026-06-01 23:08:20 +09:00
Strahil Peykov
370fe6b501 Warn when localhost auth bypass is enabled 2026-06-01 23:08:01 +09:00
LittleLlama
74dedcad37 Remove duplicate tool index startup warmup
get_tool_index() calls index_builtin_tools() on first init
(src/tool_index.py:469-470), and _warmup_tool_index then calls it
explicitly right after. Every cold boot embeds all 58 built-in tools
twice and double-upserts them into the ChromaDB collection.

The remaining get_tools_for_query call still pre-warms the query path.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 23:07:42 +09:00
pewdiepie-archdaemon
7711e14f90 Polish email reply and task controls 2026-06-01 23:02:25 +09:00
spooky
033852ab14 fix: require GGUF sources for llama downloads (#368) 2026-06-01 22:47:47 +09:00
pewdiepie-archdaemon
f2d55f8726 Fix cached GGUF model metadata in Cookbook Serve 2026-06-01 22:46:54 +09:00
pewdiepie-archdaemon
743c074b2e Harden Cookbook package SSH probe 2026-06-01 22:44:34 +09:00
pewdiepie-archdaemon
e5b927597e Fix Cookbook serve exit code reporting 2026-06-01 22:41:25 +09:00
spooky
15822e91ff fix: keep serve preflight errors visible (#398) 2026-06-01 22:40:06 +09:00
spooky
4b72dd407b fix: report serve dependency readiness (#412) 2026-06-01 22:39:36 +09:00
red person
39cec53284 Normalize setup admin username (#448) 2026-06-01 22:38:56 +09:00
Duarte Antunes
448401a0fc Harden PDF document markers against cross-owner upload access (#445)
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 22:38:14 +09:00
red person
b2e8d692a4 Scope personal RAG uploads by owner (#446) 2026-06-01 22:36:53 +09:00
red person
d36896c5f7 Gate image editor AI endpoints by privilege (#447) 2026-06-01 22:35:24 +09:00
william-napitupulu
758a1824c7 Update Styles.css (#463)
Small update to a style that bothered me: in the calendar window/modal, when editing a day, the time icons had a mask that overlapped the icon. I simply added the 'background-image: none' property to it.
2026-06-01 22:34:24 +09:00
red person
e1102585bf Fix chat stream recovery and PDF library indexing (#468) 2026-06-01 22:33:35 +09:00
Filip
92a81480f7 feat: allow memory import without session (#493) 2026-06-01 22:32:17 +09:00
Dr-Shadow
7be4ece224 Allow customizing the render GID to match the one on the host (#515) 2026-06-01 22:31:33 +09:00
Carlos Arroyo
00320972dc fix: CUDA/GPU detection for vLLM and llama.cpp in Docker (#479)
Two bugs caused GPU inference to silently fall back to CPU inside the
Odysseus Docker container even when the GPU was correctly passed through.

## entrypoint.sh — CUDA_HOME detection only covered CUDA 13.x wheels

The nvcc glob only searched nvidia/cu13, which matches the
nvidia-nvcc-cu13 pip wheel layout. CUDA 12.x wheels install nvcc to
nvidia/cuda_nvcc/bin/nvcc (nvidia-cuda-nvcc-cu12) or nvidia/cu12
(nvidia-nvcc-cu12) — completely different paths. The glob found nothing,
so CUDA_HOME was never set.

Worse, VLLM_USE_FLASHINFER_SAMPLER=0 was inside the same if-block, so it
was never set either. vLLM then tried to JIT-compile the FlashInfer
sampler at startup, failed with 'Could not find nvcc', and crashed — even
though the GPU was fully visible to the container.

Fix: expand the search to also check nvidia/cu12 and nvidia/cuda_nvcc.
Move VLLM_USE_FLASHINFER_SAMPLER=0 to an unconditional export after the
loop (it is sampler-only, no impact on the attention path, and the correct
setting for any container where CUDA headers may be incomplete).

## cookbook_routes.py — llama.cpp Linux source build silently fell back to CPU

The cmake invocation was:
  cmake -B build -DGGML_CUDA=ON 2>/dev/null || cmake -B build

2>/dev/null suppressed all configure errors. When nvcc is absent (the
slim base image has no CUDA toolkit — intentional), cmake fails silently,
then the || fallback re-runs without -DGGML_CUDA=ON. A CPU-only binary is
produced with no warning. Additionally, a stale CMakeCache.txt from the
failed CUDA attempt was reused (no rm -rf build), poisoning the next
configure run. The macOS branch already did rm -rf build for exactly this
reason; the Linux branch did not.

Fix: before cmake, detect pip-installed nvcc across the same three path
patterns as entrypoint.sh and expose it via CUDA_HOME/PATH. If nvcc is
found, run a clean CUDA build with full error visibility. If not, fall
back to a CPU build with an explicit warning telling the user how to get
a GPU build (install vLLM via Cookbook -> Dependencies, which brings the
CUDA wheels including nvcc, then re-launch).

## .env.example — document Windows COMPOSE_FILE separator

Added a comment showing the semicolon separator required on Windows
Docker Desktop alongside the existing colon-separator (Linux) example.
2026-06-01 22:30:51 +09:00
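Note on 00320972dc: the three-path nvcc search described above could look roughly like the sketch below. The real logic lives in entrypoint.sh and cookbook_routes.py and is not shown in the commit message; `find_cuda_home` is a hypothetical helper, and the `bin/nvcc` location under `nvidia/cu13` and `nvidia/cu12` is an assumption (the message only spells it out for `nvidia/cuda_nvcc`).

```python
from pathlib import Path
from typing import Optional

# Wheel layouts named in the commit message, relative to site-packages.
NVCC_SUBDIRS = ("nvidia/cu13", "nvidia/cu12", "nvidia/cuda_nvcc")


def find_cuda_home(site_packages: Path) -> Optional[Path]:
    """Return the first wheel directory that contains bin/nvcc, else None."""
    for sub in NVCC_SUBDIRS:
        candidate = site_packages / sub
        # Assumption: each layout ships the compiler at <dir>/bin/nvcc.
        if (candidate / "bin" / "nvcc").is_file():
            return candidate
    return None
```

A caller would export the result as CUDA_HOME and prepend its `bin` directory to PATH. When the function returns None, it would take the CPU path with an explicit warning, so the fallback is no longer silent.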
Alexander Kenley
3c6b084f08 Secure by default uplift (#511)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 22:30:07 +09:00
roxsand12
766ddcaa99 fix: add _setup_lock to prevent race condition in first-run setup (#508) 2026-06-01 22:29:03 +09:00
Sanjay Davis
508fabcb3b Restore dependency refresh after install AND persist safe download mode on retries. (#499) 2026-06-01 22:28:06 +09:00
Afonso Coutinho
c38932e6c6 fix: deep research discards valid sources mentioning cookies/copyright (#481)
* fix: drop over-broad 'cookie'/'copyright' low-quality markers

* fix: detect cookie/copyright boilerplate via phrases, not bare words

* test: keep research findings that merely mention cookies or copyright
2026-06-01 22:26:37 +09:00
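Note on c38932e6c6: a minimal sketch of phrase-based boilerplate detection, as opposed to matching the bare words "cookie" or "copyright". The phrase list and the `looks_like_boilerplate` name are illustrative assumptions; the commit does not list the actual markers.

```python
# Hypothetical phrases typical of consent banners and page footers.
BOILERPLATE_PHRASES = (
    "we use cookies",
    "accept all cookies",
    "cookie policy",
    "all rights reserved",
)


def looks_like_boilerplate(text: str) -> bool:
    """Flag text only when it contains a full boilerplate phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BOILERPLATE_PHRASES)
```

A finding that discusses HTTP cookies or copyright law passes through, while a consent banner is still flagged.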
Alexander Kenley
07d92556a3 Fix visual report chapter navigation (#505)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-01 22:26:13 +09:00
vidvuds
6ad617931d Fix import-review list not scrolling in Brain modal (#509)
The memory import-review list (.memory-suggestions) is shown inside the
overflow:hidden .admin-card but, unlike the sibling .memory-list, it had
no scroll bounding of its own (no flex:1 / min-height:0 / overflow-y).
A long review list therefore grew past the card and was clipped, leaving
lower entries and their controls unreachable with no usable scroll area.

Give .memory-suggestions the same flex:1 + min-height:0 + overflow-y:auto
bounding the memories list already uses so the review list scrolls
internally within the modal. Pin the review header (the title and the
save all / back controls) with position:sticky so they stay visible while
the items scroll under them, and add a small scrollbar gutter so the bar
does not sit flush against the item cards.

Fixes #455
2026-06-01 22:25:16 +09:00
Cosmin Enache
04fd963394 Fix duplicate compare modal on repeated clicks (#491)
Co-authored-by: cosminae <cosmin.e@annavas.io>
2026-06-01 22:24:27 +09:00
Afonso Coutinho
1eff46579a fix: ChromaDB unreachable blocks app startup for 30-60s (#326) (#476)
* fix: fail fast when ChromaDB is unreachable instead of blocking startup

* fix: only cache the ChromaDB client after a successful heartbeat

* test: cover ChromaDB fast-fail preflight and no-cache-on-failure
2026-06-01 22:22:41 +09:00
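Note on 1eff46579a: the two ideas in this commit, a fast reachability preflight and caching the client only after a successful heartbeat, can be sketched as follows. `LazyClient`, `tcp_probe`, and the one-second timeout are hypothetical; the sketch assumes only that the client object exposes a `heartbeat()` method that raises on failure.

```python
import socket
from typing import Any, Callable, Optional


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Cheap preflight: can a TCP connection be opened within the timeout?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class LazyClient:
    """Build the client on first use and cache it only once it is healthy."""

    def __init__(self, factory: Callable[[], Any], probe: Callable[[], bool]):
        self._factory = factory
        self._probe = probe
        self._client: Optional[Any] = None

    def get(self) -> Any:
        if self._client is not None:
            return self._client
        if not self._probe():
            # Fail fast instead of waiting on a long connect timeout.
            raise ConnectionError("vector store unreachable")
        client = self._factory()
        client.heartbeat()      # raises if the server is not healthy
        self._client = client   # cache only after the heartbeat succeeded
        return client
```

Because a failed attempt leaves nothing cached, a later call can retry cleanly once the server is back.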
Jamieson O'Reilly
171c29dcf3 Fix email-thread HTML injection, attachment path traversal, and missing authz (#475)
Hardens issues found in a security review of the current tree (separate from
the cookbook SSH PR):

- Email thread rendering (static/js/emailLibrary.js): the flat read path runs
  inbound HTML through the allowlist sanitizer, but the two threaded paths
  (_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected
  server-parsed `body_html` raw into the DOM. A crafted inbound email could
  inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS
  if a deployment relaxes the script CSP). Now sanitized on all paths.

- Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the
  on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with
  user-controlled folder/uid and no containment, so a folder like `../../tmp`
  could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a
  single safe segment and asserts containment.

- Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats,
  /api/rag/stats, /api/test/youtube, /api/test-research relied only on the
  global session check (any logged-in user). Now require_admin-gated.

- Defense-in-depth HTML escaping: session HTML export escapes the session name
  (routes/session_routes.py); the MCP OAuth page escapes the reflected Host
  header / server_id (routes/mcp_routes.py).

- Internal-tool token now compared with secrets.compare_digest (constant time)
  in core/middleware.py and app.py.

Adds regression tests in tests/test_security_regressions.py.
2026-06-01 22:20:17 +09:00
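Note on 171c29dcf3: the "flatten to a single safe segment and assert containment" approach for the attachment directory might be sketched like this. It is not the repository's attachment_extract_dir(); `safe_extract_dir`, its signature, and the allowed character set are assumptions made for illustration.

```python
import re
from pathlib import Path


def safe_extract_dir(base: Path, folder: str, uid: str) -> Path:
    """Map user-controlled folder/uid to one directory directly under base."""
    # Flatten: anything outside a conservative allowlist becomes "_",
    # so path separators and dots cannot survive.
    segment = re.sub(r"[^A-Za-z0-9_-]", "_", f"{folder}_{uid}")
    root = base.resolve()
    target = (root / segment).resolve()
    # Containment check as a second line of defense (Python 3.9+).
    if not target.is_relative_to(root):
        raise ValueError("attachment path escapes the base directory")
    return target
```

A folder such as `../../tmp` is reduced to underscores plus `tmp`, so the result stays a direct child of the base directory.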
Abhinav
9e8de43f25 fix: clear session headers on endpoint deletion (#477) 2026-06-01 22:19:54 +09:00
pewdiepie-archdaemon
5ed9b74cd0 Polish email tasks and window controls 2026-06-01 20:56:46 +09:00
red person
5c390d6b3e Fix sidebar brand text clipping (#362) 2026-06-01 19:04:08 +09:00
red person
fd2ea71cec Clarify first-run admin login 2026-06-01 18:59:24 +09:00
Ryan
5de7afd696 Create search cache directory in Docker image 2026-06-01 18:38:37 +09:00
Sirsyorrz
9955f5bc95 Fix VRAM estimates for pre-quantized HF repos
The Cookbook fit scanner was reporting impossibly low VRAM requirements
for some pre-quantized models — e.g. cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit
shown as 7.1 GB ('perfect' on a 12 GB card) when the real load is ~40 GB.

Root cause is in the catalog builder. When _entry_from_modelinfo falls
back to safetensors metadata for the parameter count, it stored
safetensors.total directly. For pre-quantized repos that figure reflects
*packed* element counts: AWQ/GPTQ-Int4 pack 8x 4-bit weights into one
I32, AWQ-8bit/GPTQ-Int8/FP8 pack 4x. The catalog therefore recorded
~1/8 of the real parameter count, and min_vram_gb = packed * bpp
double-applied the quantization.

Fix the safetensors fallback:

* prefer the per-dtype parameters dict when available and unpack only the
  I32/I64 entries (the F16/BF16 scale/zero tensors and embeddings are
  already at their real element counts)
* fall back to total * pack_factor when only total is exposed

Patch the catalog entries that were affected by the old fallback so the
fit ratings reflect reality without waiting for a full catalog rebuild:

* cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit  11.4B -> 79.7B (40.8 GB VRAM)
* stelterlab/Qwen3-Coder-30B-A3B-Instruct-AWQ  4.6B -> 30.5B
* stelterlab/NVIDIA-Nemotron-3-Nano-30B-A3B-AWQ  5.1B -> 30.5B
* warshanks/Qwen3-8B-abliterated-AWQ  2.2B -> 8.2B
* QuantTrio/sarvam-30b-AWQ  7B -> 30B
* QuantTrio/sarvam-105b-AWQ  19B -> 105B

Closes #377.
2026-06-01 18:32:58 +09:00
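Note on 9955f5bc95: the unpacking arithmetic can be sketched as below. The function names are hypothetical, and for simplicity the sketch unpacks only I32 entries (the commit also mentions I64), using the 32-bit container width to derive the pack factor: 8 for 4-bit weights, 4 for 8-bit weights.

```python
def unpacked_param_count(per_dtype: dict[str, int], weight_bits: int) -> int:
    """Estimate the real parameter count from per-dtype element counts."""
    pack_factor = 32 // weight_bits  # weights packed into one I32 element
    total = 0
    for dtype, count in per_dtype.items():
        if dtype == "I32":
            total += count * pack_factor  # packed quantized weights
        else:
            total += count  # scales, zeros, embeddings: already real counts
    return total


def unpacked_from_total(total: int, weight_bits: int) -> int:
    """Fallback when only the aggregate element count is exposed."""
    return total * (32 // weight_bits)
```

For example, 1,000 packed I32 elements plus 100 F16 elements at 4 bits per weight give 8,100 parameters rather than 1,100.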
Afonso Coutinho
16d6484492 Keep Cc recipients in reply-all
* fix: populate window._myEmailAddress from the active email account

* fix: keep Cc recipients in reply-all when own address is empty or unknown

* test: cover reply-all recipient building (issue #360)
2026-06-01 18:29:22 +09:00
Afonso Coutinho
3884f2b8b7 Prevent task session delivery NOT NULL crashes
* fix: coerce null endpoint_url when delivering task result to a session

* fix: also coerce null model so the session insert satisfies NOT NULL

* test: cover task session delivery on an empty database
2026-06-01 18:28:48 +09:00
Miles
df7d32c70c Require document privilege for PDF imports 2026-06-01 18:28:15 +09:00
red person
2f87dbcfbc Show a clear message when PyMuPDF is missing 2026-06-01 18:27:17 +09:00
Rifqi Akram
5b1e56407b Add SSRF-guarded web fetch agent tool
* feat(web-fetch): add web_fetch tool to read a specific URL's content

* test(web-fetch): add SSRF coverage and fail closed on empty DNS resolution

Add explicit SSRF regression tests for the web_fetch path covering
loopback, private LAN ranges, link-local/metadata, IPv6 private/local,
redirect-into-private, and unsupported schemes. Harden _public_http_url
to fail closed when a hostname resolves to no addresses.
2026-06-01 16:57:28 +09:00
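Note on 5b1e56407b: a fail-closed public-URL check in the spirit of this commit could be sketched as follows. It is not the repository's _public_http_url; `is_public_http_url`, `resolve_host`, and the injectable `resolver` parameter are assumptions made for illustration, and redirect handling is not covered here.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlsplit


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to its addresses; empty list on failure."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    return [info[4][0] for info in infos]


def is_public_http_url(
    url: str, resolver: Callable[[str], list[str]] = resolve_host
) -> bool:
    """Allow only http(s) URLs whose host resolves solely to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    addresses = resolver(parts.hostname)
    if not addresses:
        return False  # fail closed: no addresses means no fetch
    try:
        # Every resolved address must be globally routable.
        return all(ipaddress.ip_address(a).is_global for a in addresses)
    except ValueError:
        return False  # unparseable address: also fail closed
```

Requiring every resolved address to be public, rather than any one of them, rejects hostnames that mix a public and a private record.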
Daniel Grzelak
92c2392fd6 Clarify Docker dependency status inside containers
* fix: show docker as N/A inside the container

* test: cover in-container docker detection

* fix: make the N/A dependency chip legible

* refactor: make remote docker applicability explicit and tested
2026-06-01 16:56:42 +09:00
Boody
dea917b23f Clarify setup admin login instructions
* fixed confusing credentials prompt

* fix(setup): return status from create_default_admin function

* fix(setup): initialize admin creation status in main function

* fix(setup): enhance admin creation feedback and status handling

* Enhance admin user login messages with conditional feedback based on creation status

* Refine admin user creation feedback messages for clarity and actionability and formatted code

* Add fallback error message for admin creation failure in setup script
2026-06-01 16:55:42 +09:00
red person
c9c6b919ff Fix database stubs in regression tests (#301)
* Fix database stubs in regression tests

* Keep regression tests independent of SQLAlchemy

---------

Co-authored-by: red <red@red-MacBook-Air.local>
2026-06-01 16:55:09 +09:00
pewdiepie-archdaemon
be260f43e8 Handle incomplete detached agent streams 2026-06-01 16:54:11 +09:00
Duarte Antunes
e77d87fa80 Enforce owner checks for upload attachments 2026-06-01 16:47:48 +09:00
Nico Panu
8874a11baf Gate Cookbook quick run on downloaded models
Gate Cookbook "Run" on the model being downloaded
The What-Fits tab's quick "Run" button launched a serve task even when
the model was not downloaded. It POSTed directly to /api/model/serve and switched to the Running tab, so vLLM/SGLang would background-pull at launch (and llama.cpp just errors "No GGUF found") while the task showed as "running" without actually serving anything.
The Configure button and the Serve tab already gate on the cached-model
list; quick-Run did not. Mirror that gate: when the model isn't cached,
honor the button's "Download" half by kicking off the download instead of spawning a phantom serve task, and toast the user to Run again once it finishes.
2026-06-01 16:46:24 +09:00