* fix: extract full year in research query entities, not just the century
* fix: same year capture-group bug in the services search copy
* test: research query extracts the full year
get_tool_index() calls index_builtin_tools() on first init
(src/tool_index.py:469-470), and _warmup_tool_index then calls it
explicitly right after. Every cold boot embeds all 58 built-in tools
twice and double-upserts them into the ChromaDB collection.
The remaining get_tools_for_query call still pre-warms the query path.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
Small update to the styles that bothered me, i noticed in the window/modal for calendar when editing a day the time icons had a mask that overlapped the icon. I simply added 'background-image: none' prop to it/
Two bugs caused GPU inference to silently fall back to CPU inside the
Odysseus Docker container even when the GPU was correctly passed through.
## entrypoint.sh — CUDA_HOME detection only covered CUDA 13.x wheels
The nvcc glob only searched
vidia/cu13, which matches the
vidia-nvcc-cu13 pip wheel layout. CUDA 12.x wheels install nvcc to
vidia/cuda_nvcc/bin/nvcc (nvidia-cuda-nvcc-cu12) or
vidia/cu12
(nvidia-nvcc-cu12) — completely different paths. The glob found nothing,
so CUDA_HOME was never set.
Worse, VLLM_USE_FLASHINFER_SAMPLER=0 was inside the same if-block, so it
was never set either. vLLM then tried to JIT-compile the FlashInfer
sampler at startup, failed with 'Could not find nvcc', and crashed — even
though the GPU was fully visible to the container.
Fix: expand the search to also check nvidia/cu12 and nvidia/cuda_nvcc.
Move VLLM_USE_FLASHINFER_SAMPLER=0 to an unconditional export after the
loop (it is sampler-only, no impact on the attention path, and the correct
setting for any container where CUDA headers may be incomplete).
## cookbook_routes.py — llama.cpp Linux source build silently fell back to CPU
The cmake invocation was:
cmake -B build -DGGML_CUDA=ON 2>/dev/null || cmake -B build
2>/dev/null suppressed all configure errors. When nvcc is absent (the
slim base image has no CUDA toolkit — intentional), cmake fails silently,
then the || fallback re-runs without -DGGML_CUDA=ON. A CPU-only binary is
produced with no warning. Additionally, a stale CMakeCache.txt from the
failed CUDA attempt was reused (no rm -rf build), poisoning the next
configure run. The macOS branch already did rm -rf build for exactly this
reason; the Linux branch did not.
Fix: before cmake, detect pip-installed nvcc across the same three path
patterns as entrypoint.sh and expose it via CUDA_HOME/PATH. If nvcc is
found, run a clean CUDA build with full error visibility. If not, fall
back to a CPU build with an explicit warning telling the user how to get
a GPU build (install vLLM via Cookbook -> Dependencies, which brings the
CUDA wheels including nvcc, then re-launch).
## .env.example — document Windows COMPOSE_FILE separator
Added a comment showing the semicolon separator required on Windows
Docker Desktop alongside the existing colon-separator (Linux) example.
* fix: drop over-broad 'cookie'/'copyright' low-quality markers
* fix: detect cookie/copyright boilerplate via phrases, not bare words
* test: keep research findings that merely mention cookies or copyright
The memory import-review list (.memory-suggestions) is shown inside the
overflow:hidden .admin-card but, unlike the sibling .memory-list, it had
no scroll bounding of its own (no flex:1 / min-height:0 / overflow-y).
A long review list therefore grew past the card and was clipped, leaving
lower entries and their controls unreachable with no usable scroll area.
Give .memory-suggestions the same flex:1 + min-height:0 + overflow-y:auto
bounding the memories list already uses so the review list scrolls
internally within the modal. Pin the review header (the title and the
save all / back controls) with position:sticky so they stay visible while
the items scroll under them, and add a small scrollbar gutter so the bar
does not sit flush against the item cards.
Fixes#455
* fix: fail fast when ChromaDB is unreachable instead of blocking startup
* fix: only cache the ChromaDB client after a successful heartbeat
* test: cover ChromaDB fast-fail preflight and no-cache-on-failure
Hardens issues found in a security review of the current tree (separate from
the cookbook SSH PR):
- Email thread rendering (static/js/emailLibrary.js): the flat read path runs
inbound HTML through the allowlist sanitizer, but the two threaded paths
(_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected
server-parsed `body_html` raw into the DOM. A crafted inbound email could
inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS
if a deployment relaxes the script CSP). Now sanitized on all paths.
- Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the
on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with
user-controlled folder/uid and no containment, so a folder like `../../tmp`
could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a
single safe segment and asserts containment.
- Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats,
/api/rag/stats, /api/test/youtube, /api/test-research relied only on the
global session check (any logged-in user). Now require_admin-gated.
- Defense-in-depth HTML escaping: session HTML export escapes the session name
(routes/session_routes.py); the MCP OAuth page escapes the reflected Host
header / server_id (routes/mcp_routes.py).
- Internal-tool token now compared with secrets.compare_digest (constant time)
in core/middleware.py and app.py.
Adds regression tests in tests/test_security_regressions.py.
The Cookbook fit scanner was reporting impossibly low VRAM requirements
for some pre-quantized models — e.g. cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit
shown as 7.1 GB ('perfect' on a 12 GB card) when the real load is ~40 GB.
Root cause is in the catalog builder. When _entry_from_modelinfo falls
back to safetensors metadata for the parameter count, it stored
safetensors.total directly. For pre-quantized repos that figure reflects
*packed* element counts: AWQ/GPTQ-Int4 pack 8x 4-bit weights into one
I32, AWQ-8bit/GPTQ-Int8/FP8 pack 4x. The catalog therefore recorded
~1/8 of the real parameter count, and min_vram_gb = packed * bpp
double-applied the quantization.
Fix the safetensors fallback:
* prefer the per-dtype parameters dict when available and unpack only the
I32/I64 entries (the F16/BF16 scale/zero tensors and embeddings are
already at their real element counts)
* fall back to total * pack_factor when only total is exposed
Patch the catalog entries that were affected by the old fallback so the
fit ratings reflect reality without waiting for a full catalog rebuild:
* cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit 11.4B -> 79.7B (40.8 GB VRAM)
* stelterlab/Qwen3-Coder-30B-A3B-Instruct-AWQ 4.6B -> 30.5B
* stelterlab/NVIDIA-Nemotron-3-Nano-30B-A3B-AWQ 5.1B -> 30.5B
* warshanks/Qwen3-8B-abliterated-AWQ 2.2B -> 8.2B
* QuantTrio/sarvam-30b-AWQ 7B -> 30B
* QuantTrio/sarvam-105b-AWQ 19B -> 105B
Closes#377.
* fix: populate window._myEmailAddress from the active email account
* fix: keep Cc recipients in reply-all when own address is empty or unknown
* test: cover reply-all recipient building (issue #360)
* fix: coerce null endpoint_url when delivering task result to a session
* fix: also coerce null model so the session insert satisfies NOT NULL
* test: cover task session delivery on an empty database
* feat(web-fetch): add web_fetch tool to read a specific URL's content
* test(web-fetch): add SSRF coverage and fail closed on empty DNS resolution
Add explicit SSRF regression tests for the web_fetch path covering
loopback, private LAN ranges, link-local/metadata, IPv6 private/local,
redirect-into-private, and unsupported schemes. Harden _public_http_url
to fail closed when a hostname resolves to no addresses.
* fix: show docker as N/A inside the container
* test: cover in-container docker detection
* fix: make the N/A dependency chip legible
* refactor: make remote docker applicability explicit and tested
* fixed confusing credentials prompt
* fix(setup): return status from create_default_admin function
* fix(setup): initialize admin creation status in main function
* fix(setup): enhance admin creation feedback and status handling
* Enhance admin user login messages with conditional feedback based on creation status
* Refine admin user creation feedback messages for clarity and actionability and formatted code
* Add fallback error message for admin creation failure in setup script
Gate Cookbook "Run" on the model being downloaded
The What-Fits tab's quick "Run" button launched a serve task even when
the model was not downloaded. It POSTed directly to /api/model/serve and switched to the Running tab, so vLLM/SGLang would background-pull at launch (and llama.cpp just errors "No GGUF found") while the task showed as "running" without actually serving anything.
The Configure button and the Serve tab already gate on the cached-model
list; quick-Run did not. Mirror that gate: when the model isn't cached,
honor the button's "Download" half by kicking off the download instead of spawning a phantom serve task, and toast the user to Run again once it finishes.