odysseus

Author	SHA1	Message	Date
Ashvin	ba43c73d2a	fix(agent): confine glob literal lookups to the search root (#5010) GlobTool resolves its search root through _resolve_search_root (which confines it to the workspace or default allowlist), but the literal fast-path joined the model-supplied pattern onto that root without re-confining it. os.path.join lets an absolute pattern or one containing ../ escape the root, and normpath collapsed the .. segments, so glob returned the absolute path of arbitrary host files once they existed: an existence/path oracle that bypasses the confinement that read_file, write_file, grep, and ls all enforce. Keep the literal lookup inside the root via a commonpath containment check; an escaping literal falls through to the os.walk matcher, which only ever yields paths under the root. Wildcard matching was already confined.	2026-06-30 17:49:53 +01:00
Tal.Yuan	41420c59fc	refactor(routes): move memory domain into routes/memory/ subpackage (#5007) Slice 2c of the route-domain reorganization (#4082/#4071, per specs/architecture-runtime-inventory.md §6.3). Moves memory_routes.py into routes/memory/, leaving a backward-compat sys.modules shim at the old path. Pure file reorganization, no behavior change. The shim uses sys.modules replacement (same pattern as the merged gallery #4903 and research #4975 slices) so that `import routes.memory_routes`, `from routes.memory_routes import X`, `importlib.import_module(...)`, and the `import ... as mr` + `monkeypatch.setattr(mr, ...)` pattern used by test_memory_routes_session_owner.py / test_memory_owner_isolation.py all operate on the same module object the application uses. The canonical module does NOT depend on the shim: routes/memory/memory_routes.py imports only from services/, core/, src/, and stdlib (zero internal routes/ coupling). Four source-introspection test sites repointed to the new canonical path: - test_direct_upload_limits.py - test_upload_limits_centralized.py (two dict keys) - test_vision_owner_scope.py Adds tests/test_memory_routes_shim.py to pin the sys.modules shim contract (legacy and canonical paths resolve to the same module object; monkeypatch via legacy alias reaches the canonical module). Verified: compileall clean; full suite 4219 passed, 3 skipped.	2026-06-30 17:52:14 +02:00
botinate	69b9bb0869	fix(agent): execute fenced tool calls with inline args and route bare email tool names (#3681) * fix(agent): execute fenced tool calls with inline args and bare email tool names Two bugs made local (Ollama) models unable to use email tools, leaving raw fences like ```list_email_accounts {}``` in the chat: 1. _TOOL_BLOCK_RE required a newline right after the fence tag, so a tool call with args on the same line ("```list_email_accounts {}") never matched and was never executed. The fence now matches with optional spaces/newline after the tag. 2. Even when parsed, bare email tool names had no dispatch branch in tool_execution.py and fell through to "Unknown tool type". They now route to the email MCP server as mcp__email__<name>, matching how function_call_to_tool_block already maps them for native callers. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(security): block all bare email tool names for non-admins; harden fence-tag regex Review follow-up on #3681 (thanks @vgalin): 1. Routing bare email names made 10 of the 14 email tools executable by non-admin owners: is_public_blocked_tool() runs on the bare name before dispatch, and NON_ADMIN_BLOCKED_TOOLS only listed 4. Define the full email tool set once (BUILTIN_EMAIL_TOOLS in tool_security.py) and derive the blocklist, the fence tags (TOOL_TAGS), the bare-name dispatch, and the native-call mapping from it so they can't drift. This also fixes 4 tools (search_emails, draft_email, draft_email_reply, ai_draft_email_reply) that were missing from the old tool_schemas copy and therefore unreachable even for native function-calling models. 2. The relaxed fence regex from the previous commit could prefix-match longer fence tags: ```python3 parsed as tool "python" with content "3\nprint(...)" and executed as code. Add a (?![\w-]) boundary after the tag.
Tests: test_public_agent_policy_blocks_sensitive_tools now covers all 14 bare email names + the mcp__email__ form; new tests/test_fenced_inline_args.py pins inline-args parsing, the python3/hyphenated-tag non-matches, and strip/parse display mirroring. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(security): gate bare and mcp-qualified email names together; stop executing Markdown info strings Review follow-up on #3681 (thanks @RaresKeY): 1. P1: execute_tool_block() checked disabled_tools / the turn ToolPolicy only against the incoming block name, then the bare-email branch qualified it to mcp__email__<name> and called the MCP manager. Plan mode and the MCP settings toggle write the QUALIFIED name into the denylist, so a bare fence like ```list_emails``` sailed past a mcp__email__list_emails entry. Both gates now match on both spellings (bare <-> mcp__email__-qualified), in either direction. 2. P2: the relaxed fence regex accepted arbitrary same-line text after a recognized tag, which made ordinary Markdown info strings executable: ```python title="example.py" ran as a python tool call. Same-line content now only counts as tool input when it starts with { or [ (JSON args); anything else leaves the fence as display text, and strip_tool_blocks mirrors that (the fence stays visible). Tests: disabled-tools alias regression (qualified entry blocks bare name and vice versa, never reaching the MCP manager), ToolPolicy alias regression, python/bash title="..." non-execution + display retention, and inline JSON-array args still parsing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(security): reject brace-style fence metadata; cover the full email set in the friendly toggle Review follow-up round 3 on #3681 (thanks @RaresKeY): 1. Brace-style fence metadata no longer executes. The previous narrowing still treated any same-line {/[ after a recognized tag as tool input, so ```bash {title="setup"} ran as a bash call. 
The fence header is now captured separately and judged by one predicate shared between parse_tool_blocks and strip_tool_blocks (_fenced_tool_call), so the execute and display decisions can't disagree: same-line content only counts as inline args when the tag is NOT a code tag (bash/python never take same-line args — that text is Markdown fence attributes) AND the inline text (plus any continuation lines) parses as standalone JSON. ```bash {title="setup"}, ```python {"title":"example.py"} and ```list_emails {title="x"} all stay visible and inert. 2. The friendly `disable_tool email` toggle covered 3 of the 14 email tools (mcp__email__{list_emails,read_email,send_email}); the other bare aliases this PR routes stayed executable after an operator disabled email. The alias now derives from BUILTIN_EMAIL_TOOLS in BOTH spellings — bare (function-schema hiding, bare-fence dispatch) and mcp__email__* (MCP schema hiding, qualified runtime blocks) — so the toggle and the runtime gate can't drift apart. Tests: brace/bracket metadata regressions for parse and strip symmetry (code tags, invalid-JSON inline on a JSON tool, multi-line inline JSON still parsing), and disable_tool/enable_tool email covering all 14 names in both spellings. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(email): close remaining email-tool registry drift; classify every email tool for plan mode Deep self-review follow-up on #3681. Three review rounds each found another hand-maintained copy of the email tool list that had drifted; this commit hunts down ALL remaining copies and pins them to BUILTIN_EMAIL_TOOLS. The same 5 tools (search_emails, draft_email, draft_email_reply, ai_draft_email_reply, download_attachment) were missing from every advertising surface, so they were dispatchable but never offered: - FUNCTION_TOOL_SCHEMAS: native function-calling models never saw them (the round-1 fix covered dispatch only); schemas added, mirroring the email server's inputSchema definitions. 
- TOOL_SECTIONS: fenced-block models were never told about them; prompt sections added. - tool_index: absent from the RAG embedding registry (never retrievable), the email keyword hints, and the scheduled assistant's always-available set — the latter two now derive from BUILTIN_EMAIL_TOOLS. - agent_loop._DOMAIN_TOOL_MAP["email"], tool_policy._COMMON_TOOL_NAMES, the assistant tool-selector UI groups (assistant.js), and the default Assistant crew seed (task_scheduler) now derive from / cover the set. Plan mode now classifies every email tool explicitly: - list_email_accounts and search_emails join PLAN_MODE_READONLY_TOOLS. Without this, list_email_accounts sat in the plan-mode bare denylist (schema-derived) while its qualified form passed the MCP read-only filter — and the round-2 bare/qualified alias gate would have blocked the qualified call too, regressing read-only email discovery in plan mode. - draft_email, draft_email_reply, ai_draft_email_reply, and download_attachment join the fail-closed mutator backstop (drafts create documents; download_attachment writes to disk). Tests: tests/test_email_registry_sync.py pins every registry (including the email server source and assistant.js) to BUILTIN_EMAIL_TOOLS and asserts the plan-mode partition, so the next email tool can't drift; a parse/strip mirror grid covers 192 fence shapes (tag x header x body) asserting executed <=> stripped. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor: move the email alias rule into tool_security; extract the assistant seed constant Code-quality pass over the PR's own changes: - The bare<->qualified email aliasing rule lived inline in the generic dispatcher (_execute_tool_block_impl). It is policy knowledge, so it moves next to BUILTIN_EMAIL_TOOLS as email_tool_policy_names(); the dispatcher just consumes it, and the rule gets its own unit test (including the mcp__email__<not-a-tool> and mcp__other__ non-alias cases). 
- The default Assistant's enabled_tools list was an inline literal inside the CrewMember seed, and its registry-sync test asserted a source-code substring. Extracted to DEFAULT_ASSISTANT_ENABLED_TOOLS so the test imports and checks the actual value. - _fenced_tool_call return type tightened to Optional[Tuple[str, str]]. No behavior change; suite green (3295 passed). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * revert: move the email registry consolidation to a follow-up PR Per review feedback on scope, this PR stays narrow: fenced inline-args parsing, bare email tool routing, and the directly required safety gates. This commit reverts the registry/advertising consolidation from db29046 and 016ce47 (native schemas, prompt sections, RAG description index + keyword hints, assistant always-available set, guide-only known-names union, frontend tool-selector groups, default assistant seed, and their sync tests) — all of that moves to a dedicated follow-up PR together with the _EMAIL_TOOL_HINTS finding. Kept here because the narrow scope needs them: - email_tool_policy_names() in tool_security + its use in the execute_tool_block gates and its unit test (refactor of this PR's own round-2 alias fix), - list_email_accounts in PLAN_MODE_READONLY_TOOLS (the alias gate works both ways, and the schema-derived plan-mode bare denylist would otherwise block the qualified read-only call too), - the parse/strip mirror grid test (parser scope), - the narrow registry sync tests (email server <-> BUILTIN_EMAIL_TOOLS match, fence-tag coverage, non-admin blocklist coverage). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(email): execute empty email fences with empty args; reject non-object JSON args Two gaps found by replaying captured local-model traffic against the narrowed branch: 1. 
```list_email_accounts``` with NO body — a shape gemma really emits for no-arg tools — was silently dropped (parse skips empty content), so the model concluded email was broken: the original #337 symptom through a different door. Empty fences whose tag is a built-in email tool now dispatch with {} args and the tool's own validation answers (e.g. an empty send_email returns "to is required" instead of silence). Empty bash/python/other fences keep skipping, and strip stays mirrored (the fence was executed, so it is removed). 2. The fence parser accepts JSON arrays as inline args, but the email dispatch parsed only objects — an array silently became {} args. Non-object JSON now returns a correctable "arguments must be a JSON object" error before reaching the MCP server (same class as #3966). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(security): classify all email tools for plan mode statically; reject invalid email JSON bodies Review follow-up round 5 on #3681 (thanks @RaresKeY): 1. This PR makes every BUILTIN_EMAIL_TOOLS name fence-taggable, so each one must be explicitly classified for plan mode — the draft tools and download_attachment were in neither the read-only allowlist nor the static denylist, leaving their bare-alias plan-mode safety dependent on the MCP read-only inventory being present and current. search_emails joins PLAN_MODE_READONLY_TOOLS (explicit, not allowed-by-omission); draft_email, draft_email_reply, ai_draft_email_reply, and download_attachment join the fail-closed _PLAN_MODE_KNOWN_MUTATORS backstop. (Moved back from the #4053 split: the partition is directly required for this PR to merge independently.) 2. The classic tag/body fence form reaches execution unvalidated (only INLINE args are JSON-checked by the parser), so a body like {account: "work"} silently became {} args and read the DEFAULT mailbox instead of the intended one. 
JSON-looking bodies that fail to parse now return a correctable "not valid JSON" error before reaching the MCP server. Tests: a partition invariant (every email tool is explicitly read-only or plan-mode-denied), a mutating-alias probe that uses only the static denylist with a fake MCP manager (no inventory layer), and the body-form invalid-JSON regression. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(tool-dispatch): decode inline JSON args for legacy MCP tools; reject all non-object email bodies Review follow-up round 6 on #3681 (thanks @RaresKeY), both pre-existing on this branch, surfaced by the relaxed inline-args parser: 1. The relaxed parser accepts inline JSON for every non-code tag, but the legacy line-based arg builders (web_search/web_fetch/read_file/write_file/generate_image/manage_memory) wrapped the whole JSON string as the query/url/path/prompt, so `web_search {"query": "x"}` executed as a search for the literal string `{"query": "x"}`. _build_mcp_args now uses a fenced JSON object directly when it carries the tool's primary arg key (query/url/path/prompt/action). Keyed off membership so it can't drift; an object without the primary key (e.g. a freeform JSON query, or bare object content for write_file) falls through to the line parser unchanged. Also fixes the same corruption for the classic newline-JSON form. 2. The bare-email dispatch only rejected bodies starting with { or [, so a non-empty non-JSON body like `account: work` still fell through to {} args and silently read the DEFAULT mailbox. Now ANY non-empty body must decode to a JSON object or it returns a correctable error; only a truly empty body keeps the no-arg path (```list_email_accounts```). Tests: inline-JSON arg decoding for the five legacy tools plus the freeform and missing-primary-key fallbacks; the email body rejection extended to cover the brace-looking and bare `key: value` shapes.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(tool-dispatch): drop dead manage_memory JSON-decode entry; pin the live-path invariant Self-audit catch on the round-6 fix. manage_memory was added to _MCP_JSON_PRIMARY_KEYS, but _build_mcp_args is only reached via _call_mcp_tool, which only runs for _MCP_TOOL_MAP tools — and manage_memory isn't one (its tag routes through dispatch_ai_tool -> do_manage_memory, which line-parses). So the round-6 decode for manage_memory was dead code: the unit test exercising _build_mcp_args passed while a real `manage_memory {"action": ...}` fence still parsed the whole JSON blob as the action. Remove the dead entry and add test_mcp_json_primary_keys_are_all_live, which asserts every JSON-primary tool is in _MCP_TOOL_MAP so a dead decode can't be added again. The same inline-JSON corruption for manage_memory and the other tools that route through positional dispatchers (create_session, ui_control, send_to_session, search_chats, the document tools, etc.) is pre-existing (dev corrupts their newline JSON form too) and tracked separately; the proper fix there is to route fenced JSON through function_call_to_tool_block. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(tool-dispatch): decode inline JSON in WriteFileTool (its live path); round-6 fix was on the dead MCP path Self-audit: round 6 claimed to fix inline JSON args for write_file via _build_mcp_args, but there is no filesystem MCP server, so write_file always runs through _direct_fallback -> WriteFileTool, never through _build_mcp_args. WriteFileTool — unlike its siblings ReadFileTool / WebSearchTool / WebFetchTool, which all decode JSON — took lines[0] as the path, so `write_file {"path": "/tmp/x", "content": "y"}` wrote to a file literally named with the JSON blob. The round-6 _build_mcp_args entry decoded correctly but on a path that never executes (same class as the manage_memory dead entry), and the round-6 unit test passed on that dead path. 
WriteFileTool now decodes a JSON object carrying "path" (matching ReadFileTool directly above it), and the comment on _MCP_JSON_PRIMARY_KEYS records that only generate_image has a live MCP server today: the other entries are defense-in-depth for the MCP path; the live fix for each server-less tool is in its handler. Test: test_write_file_inline_json_args drives the LIVE path (execute_tool_block with no MCP) and asserts the intended path is used (verified to fail without the handler fix). web_search/web_fetch/read_file were already correct (their handlers decode); write_file was the gap. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(strip-fence): derive the live-strip TOOL_TAGS from the real set Semantic conflict from the dev merge that textual auto-merge didn't flag: dev added test_live_strip_email_tool_fences.py whose _tool_tags() helper source-scrapes only the TOOL_TAGS literal `{...}`, which worked on dev because the email tool names were listed inline there. This branch makes TOOL_TAGS the single source (`{...} | BUILTIN_EMAIL_TOOLS`), so the email names are no longer in the literal and the scraper missed them, leaving the email-fence strip assertions failing even though TOOL_TAGS does contain them at runtime. Import the real TOOL_TAGS instead of scraping source, so the test mirrors exactly what GET /api/tools serves (sorted(TOOL_TAGS)) and the live EXEC_FENCE_RE derives from, robust to however the set is composed. The source-level frontend/route guards in the same file are unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: botinate <285686135+botinate@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 16:50:32 +01:00
pewdiepie-archdaemon	e131245c91	Merge remote-tracking branch 'origin/dev'	2026-06-30 10:26:46 +00:00
red person	df9c20e6c2	Ignore invalid context budget numbers (#1831)	2026-06-29 19:56:17 +01:00
red person	bbbe145247	Ignore non-string personal doc text (#1832)	2026-06-29 19:24:29 +01:00
red person	387f95187e	Ignore invalid harmonize mask layers (#1829)	2026-06-29 19:16:26 +01:00
red person	00dfd2d47a	Keep snap helper safe without context (#1828)	2026-06-29 18:54:44 +01:00
red person	d2a6d73aa5	Ignore invalid serve profile inputs (#1827)	2026-06-29 18:47:19 +01:00
red person	139d76ab57	Reject resolver results without IPs (#1826)	2026-06-29 16:32:32 +01:00
pewdiepie-archdaemon	19e2326a6f	Rescue plain UI open-panel tool text	2026-06-29 14:07:48 +00:00
red person	3021569081	Reject non-string atomic text writes (#1819)	2026-06-29 14:36:21 +01:00
red person	a326a6a555	Skip invalid notes CLI item rows (#2005)	2026-06-29 14:26:46 +01:00
red person	dff79319d7	Normalize gallery CLI text fields (#2012)	2026-06-29 13:47:29 +01:00
red person	9731048ecd	Ignore non-string mail CLI recipients (#1824)	2026-06-29 13:41:22 +01:00
pewdiepie-archdaemon	c0a68acfc8	Guard document style against persona guessing	2026-06-28 21:49:09 +00:00
Alexandre Teixeira	893e490cdc	test: split provider endpoint tests (#4961)	2026-06-28 19:05:38 +02:00
Alexandre Teixeira	bad9ec2f9c	test: localize calendar recurrence helper import (#4944) * test: localize calendar recurrence helper import * test: share calendar route import helper	2026-06-28 19:04:15 +02:00
nikakhalatiani	927b1f7ecf	fix(llm): normalize OpenAI-compatible chat URLs Normalize OpenAI-compatible chat URL shapes so base /v1 endpoints route to /v1/chat/completions while already-full chat endpoints remain idempotent. Preserve native local Ollama routing for bare localhost:11434 endpoints, keep localhost:11434/v1 as OpenAI-compatible, and add focused regression coverage for provider detection, chat target URLs, and model listing from /v1. Part of #541.	2026-06-28 15:30:15 +01:00
pewdiepie-archdaemon	7094c8e285	Merge dev into main for testing	2026-06-28 14:07:23 +00:00
Tal.Yuan	bb2148db73	refactor(routes): move research domain into routes/research/ subpackage Move the research route domain into the canonical routes/research/ subpackage while preserving the legacy routes.research_routes import path through a sys.modules compatibility shim. The moved canonical module is behavior-preserving, app wiring now imports the canonical route setup function, source-introspection tests point at the new canonical path, and shim regression coverage pins legacy/canonical same-object behavior plus string-targeted monkeypatch reach-through. Refs #4082. Refs #4071.	2026-06-28 14:34:11 +01:00
Michael	e018c7cf6c	fix(cookbook): accept $(find) subshells in serve command validation Allow the generated Cookbook mmproj lookup command substitution while keeping serve-command validation constrained to explicit safe subshell patterns. Preserves the existing safe printf substitution, allowlists the generated find/sort/head mmproj lookup shape, and adds negative regression coverage for unrelated substitutions and pipelines. Fixes #4772.	2026-06-28 14:00:49 +01:00
nopoz	a7fc1343a3	fix(security): prevent ReDoS in verdict-prose and continuation matchers (#4943) Two py/polynomial-redos sinks ran regexes with two adjacent \s-matching quantifiers over untrusted model text, backtracking O(n^2) when the tail failed on a whitespace flood: - routes/skills_routes.py: the last-resort verdict-from-prose extractor used `["\'\s:]*\s*`; the class already matches \s, so the trailing \s* was a redundant second quantifier. Dropped it (extracted to a documented module constant _VERDICT_PROSE_RE); the matched text is identical, the scan linear. - src/agent_loop.py _EXPLICIT_CONTINUATION_RE: `\s*[.!?]*\s*$` put two \s* around `[.!?]*`. Rewrote as `\s*(?:[.!?]+\s*)?$`: same accepted tails (no two \s* adjacent), linear. Portable form (no possessive quantifiers). Both verified output-equivalent to the originals across a fuzz corpus. Adds tests/test_redos_verdict_continuation.py pinning the unchanged match sets and bounding the flood inputs (old patterns took seconds at 40k whitespace chars).	2026-06-28 11:42:20 +01:00
red person	827a6b2778	Reject blank ownerless claim owner (#4929)	2026-06-28 10:57:11 +01:00
Tal.Yuan	8066a8e0cd	refactor(routes): move gallery domain into routes/gallery subpackage (#4903) Move the gallery route domain into routes/gallery/ while preserving backward-compatible legacy import shims. - app imports the canonical gallery route module - canonical gallery route code imports canonical gallery helpers - legacy gallery route/helper paths remain compatibility aliases - add shim regression coverage for module identity and monkeypatch behavior - repoint gallery source-introspection tests to the canonical paths No intended behavior change.	2026-06-28 10:40:34 +01:00
Rudra Sarker	5b8bfdabab	fix(chat): sanitize web search query to strip markdown and code blocks (#4863) Layer a defensive cleanup on top of the generated-query web-search flow so the final selected query is sanitized before reaching comprehensive_web_search. - remove fenced code blocks from the final search query - preserve inline code as plain text - collapse whitespace and cap query length - cover generated-query success plus LLM failure/empty fallback paths Partially addresses #4547.	2026-06-28 01:23:08 +01:00
tanmayraut45	ff0f1b3450	fix(mcp): retain builtin startup tasks and reap npx probe Keep strong references to builtin MCP startup tasks until completion and kill/reap the npx probe subprocess when cancellation interrupts the probe. Includes focused regression coverage for both lifecycle paths.	2026-06-28 01:18:17 +01:00
Pedro Barbosa	9782e5bc94	fix(cookbook): load user-site pth hooks for runtime installs Replay user-site .pth hooks when checking cookbook runtime dependencies so packages installed with --user are visible to dependency completion. Includes focused regression coverage.	2026-06-28 01:01:44 +01:00
tanmayraut45	c01c09559a	fix(ai): offload model resolution from async paths Wrap blocking _resolve_model calls in asyncio.to_thread across async model interaction paths so endpoint/model resolution does not stall the event loop. Preserve owner-scoped resolution and add focused regression coverage.	2026-06-28 00:48:35 +01:00
hestiaOS	8b110c28e6	fix(tasks): keep scheduled-task prompt cache stable Move scheduled-task current-time context out of the system prompt and into a user-role context message so the system prompt remains stable for prompt caching. Preserve time grounding on both the agent-loop path and fallback direct-call path, with focused regression coverage.	2026-06-28 00:05:02 +01:00
Alexandre Teixeira	259662e914	test: split endpoint resolver tests (#4957)	2026-06-28 00:49:43 +02:00
nopoz	fbe3a0d73b	fix(security): prevent ReDoS in XML and args tool-call parsers (#4941) * fix(security): prevent ReDoS in XML and args tool-call parsers Four py/polynomial-redos sinks in tool_parsing.py ran lazy/greedy regexes over untrusted model output (tool-call markup is attacker-influenced via prompt injection). When the closing delimiter was absent, each rescanned to end-of-string from every opener -> O(n^2): - args => { ... } in _parse_tool_call_block: greedy \{([\s\S]*)\} restarted from every `args:{` opener. Now finds the opener once and takes through the last `}` (rfind): equivalent capture, O(n). - _XML_INVOKE_RE: lazy <invoke ...>([\s\S]*?)</invoke>. Now _iter_xml_invoke pairs each opener with the first reachable </invoke> and stops when none is. - _XML_DIRECT_TOOL_RE and the <tag>([\s\S]*?)</\1> param scan in _parse_tool_code_block: lazy backreference patterns. Now _iter_backref_blocks pairs each opener with the nearest matching closer and memoizes tag names with no remaining closer, so an opener flood stays O(n). All four are output-equivalent to the originals on well-formed tool-call markup; the lazy patterns remain defined (still re-exported via agent_tools) but no longer drive a finditer over untrusted text. Adds tests/test_redos_xml_tool_parsers.py pinning correctness and bounding the opener-flood inputs (old paths took 4-15s). fix(security): harden invoke-parameter and distinct-name tag scans Make forward-only the two residual ReDoS paths in the XML/tool parsers that the outer-delimiter fix left quadratic: - _parse_xml_invoke parsed <parameter> with _XML_PARAM_RE.finditer, so a closed <invoke> body full of unclosed <parameter> openers rescanned the body from every opener (O(n^2), ~11s at 8k openers). Now scans forward-only via _iter_named_blocks, factored out of _iter_xml_invoke. - _iter_backref_blocks only memoized repeated missing tag names; a flood of distinct unclosed names searched the suffix once per name (O(n^2)).
It now indexes every closer by name in one linear pass and binary-searches per opener (O(n log n)). Covers the direct and tool_code backref scans. Output-equivalent to the prior scanners (200k randomized trials match the memoized version for both the direct ci=True and tool_code ci=False configs). Adds regressions for the closed-invoke parameter flood and the distinct-name floods (45k openers now run in ~0.05s, were 5-6s).	2026-06-27 15:42:55 -07:00
Solanki Sumit	df9907c09f	fix(health): report unhealthy memory vector store as degraded Keep an unhealthy MemoryVectorStore instance available for health reporting instead of discarding it as disabled. This lets health checks report a degraded/down vector-store state while preserving focused regression coverage for initializer behavior.	2026-06-27 22:25:13 +01:00
Ricardo	3b4187e25d	fix(email): don't probe IMAP for send-only (SMTP-only) accounts (#4830) An account configured with SMTP only (no imap_host) has no inbox, but the inbox list path still called _imap_connect, which handed an empty host to imaplib. imaplib.IMAP4("", 993) silently dials localhost:993 and fails with "[Errno 111] Connection refused", so the email panel's poll logged a "Failed to list emails" ERROR every ~60s and surfaced a scary error in the UI. _imap_connect now fails fast with a typed EmailNotConfiguredError (subclass of RuntimeError, so existing broad handlers keep working) when no imap_host is set, and the inbox list returns an empty result for that case instead of an error. SMTP send is unaffected.	2026-06-27 21:52:26 +01:00
Alexandre Teixeira	20cf323ca4	test: split provider detection tests (#4933)	2026-06-27 21:46:33 +01:00
Alexandre Teixeira	2497160fd4	test: split llm-core temperature tests (#4935)	2026-06-27 22:02:41 +02:00
Afonso Coutinho	70d806019b	fix: tool results misthreaded to the wrong tool_call_id when a native call fails to convert (#1917) * fix: tool results misthreaded when a native call fails to convert * Unpack the third converted_calls return from _resolve_tool_blocks in the fenced-example tests	2026-06-27 19:31:17 +01:00
muhamed hamed	3e7af8634f	fix: improve uploaded document retrieval and deep research reuse (#4784) * fix: improve uploaded document retrieval and deep research reuse * test: add coverage for upload manifest and document pagination * chore: rerun CI * fix: restore _insert_before_latest_user helper * fix(agent_loop): restore missing upload context helper	2026-06-27 19:24:17 +01:00
Solanki Sumit	7e9bfb1700	fix(chat): guard non-numeric agent tool budget setting Guard the agent_max_tool_calls settings read so hand-edited or agent-written non-numeric settings.json values fall back to 0 instead of crashing agent-mode chat stream initialization. Add regression coverage for guarded coercion.	2026-06-27 19:20:48 +01:00
Arpit	e7c61a75b6	fix(search): use generated query for chat mode web search #4547 (#4557) * fix(search): use generated query for chat mode web search #4547 * style(search): tidy query generation call --------- Co-authored-by: Alexandre Teixeira <alexandremagteixeira@gmail.com>	2026-06-27 19:04:46 +01:00
Solanki Sumit	20691d6019	fix(upload): handle corrupt uploads index and malformed vision JSON Use the upload handler's tolerant index loader when reading upload metadata so corrupt uploads.json degrades to missing metadata instead of a 500. Return 400 for malformed vision JSON request bodies and add regression coverage for both paths.	2026-06-27 18:59:28 +01:00
Miraç Duran	228efbc70a	fix(calendar): accept time-first datetimes in _parse_dt Accept calendar datetime phrases such as "3pm tomorrow" by adding a time-first natural-language parser branch mirroring the reminder parser. Add regression coverage proving time-first forms match their existing day-first equivalents.	2026-06-27 18:51:18 +01:00
nopoz	c098355778	fix(security): prevent ReDoS in LLM-output tool/think parsers (#4704) * fix(security): prevent ReDoS in LLM-output tool/think parsers The regexes that parse untrusted model output in text_helpers.py and tool_parsing.py are delimiter-bounded with a lazy [\s\S]*? (or an ambiguous (\s+[^>]*)?). Applied with re.sub/re.finditer over a whole response, they degrade to O(n^2) when the closing delimiter is absent: the engine rescans to end-of-string from every opener. Model output is untrusted, so a prompt-injected or malicious model can stall the agent loop with many unclosed openers (measured ~25s on a 60KB <thought flood). - text_helpers.py: replace ambiguous <thought(\s+[^>]*)?> with <thought([^>]*)> (identical capture, no \s+/[^>]* overlap); skip the Gemma <|channel>...<channel|> subs when no <channel|> closer is present. - tool_parsing.py: gate _TOOL_CALL_RE, _XML_TOOL_CALL_RE and _TOOL_CODE_RE (in parse_tool_blocks and strip_tool_blocks) on a cheap presence check for their closing delimiter. With no closer the regex cannot match, so skipping is equivalent; only the wasted O(n^2) rescan is removed. Resolves CodeQL py/polynomial-redos #230, #231, #232, #233, #235, #236, #524. The _XML_OPEN_TOOL_CALL_RE alerts (#234, #477) are false positives (its greedy [\s\S]*\Z is linear) and left untouched. fix(security): close ReDoS gaps in tool/think parsers from review Addresses two review findings on the closer-guard approach: - Whole-string "closer exists?" checks were bypassable: a stale closer before an opener flood, or a closer with no reachable inner `}`, kept the guard true while every opener still rescanned to end-of-string (O(n^2)). Replace the substring guards with `_iter_delimited`, a forward-only scan that pairs each opener with a later closer and stops once none is reachable (O(n)). `parse_tool_blocks` and `strip_tool_blocks` (via `_strip_delimited`) both use it for the [TOOL_CALL], <tool_call>/<function_call>, and <tool_code> formats.
Verified equivalent to the original regexes on well-formed inputs. - `<thought([^>]*)>` dropped the tag-name boundary and corrupted unrelated tags (`<thoughtful>` -> `<thinkful>`). Use `<thought(\s[^>]*)?>`: the single fixed `\s` keeps the pattern linear (no `\s+`/`[^>]*` overlap) while restoring the boundary; capture is byte-for-byte identical for real `<thought ...>` openers. Adds regressions for stale-closer-before-opener, closer-present-without-inner-brace, and the <thoughtful>/<thoughts> passthrough. fix(security): close Gemma channel ReDoS guard flagged in review vdmkenny noted the same bypassable whole-string guard remained in text_helpers.py: `if "<channel|>" in out.lower()` gating the Gemma thought/response channel subs. A stale `<channel|>` before a `<|channel>thought` opener flood keeps the guard true while every opener still rescans to end-of-string (measured ~7.3s at 4k openers). Replace it with `_sub_delimited`, the same forward-only scan used for the tool-call parsers: pair each opener with a later closer, stop when none is reachable (O(n)). Verified output-equivalent to the original capture regexes on well-formed multi-channel inputs; the stale-closer case now runs in <2ms. Adds a regression for stale-closer-before-opener on the Gemma path. * fix(security): harden strip_think() think-tag ReDoS flagged in review The earlier fixes hardened normalize_thinking_markup and the delimiter scanners, but the production entrypoint strip_think() still ran _THINK_CLOSED_RE / _THINK_ATTR_RE / _THINK_OPEN_RE (and the stray-tag _THINK_TAG_RE) over untrusted model output. Those kept the same ReDoS shapes: the lazy `<open>[\s\S]*?</close>` rescanned to end-of-string from every opener, and `(?:\s+[^>]*)?` / `[^>]*` attribute scans ran to end-of-string from every opener on a "many openers, no closer" flood. On the prior head, malformed `<think` / `<thinking` / `<thought` floods took 6-14s through strip_think().
The shipped `<thought>` normalization had the same residual: the single-opener case was linear but an opener flood was still O(n^2) (~4.4s). - Replace the lazy multi-pass _THINK_CLOSED_RE loop with the existing forward-only _sub_delimited scan (pair each opener with the first reachable closer, stop when none is reachable). One pass collapses sequential and nested blocks as before. - Bound every opener/stray-tag attribute scan at `<` (`[^<>]*` not `[^>]*`) so a no-`>` opener flood can't drive a single match attempt to end-of-string. Identical capture for well-formed think/thought tags. - email_helpers._strip_think: compute had_think from the single linear _THINK_TAG_RE instead of the lazy closed/open `.search()` calls, which had the same O(n^2) on the email reply/summary/extraction paths. All flood variants now finish in <10ms (were 6-14s). Output verified byte-for-byte identical to the prior implementation over a 34-case corpus (nested, mismatched, attr, uppercase, Gemma, prose, prompt-echo). Adds strip_think() timing regressions for malformed openers, opener floods (all three tag names), the closed-opener flood, and the malformed-closer flood. docs: trim verbose comments in think-tag ReDoS fix	2026-06-27 10:12:28 -07:00
Rudra Sarker	090f4078d8	fix(llm-core): prevent cache-affinity fields from reaching Cerebras Recognize api.cerebras.ai as a Cerebras cloud provider so llama.cpp/LM Studio cache-affinity fields are not attached even when endpoint_kind is misconfigured as local. Add regression coverage for provider detection, self-hosted classification, and payload field exclusion.	2026-06-27 18:07:12 +01:00
Afonso Coutinho	ad745801c6	fix(visual_report): ignore fenced headings in TOC extraction Strip fenced code blocks before extracting visual-report headings so heading-looking lines inside code fences do not desync TOC anchors. Add regression coverage for backtick and tilde fences while preserving normal heading extraction.	2026-06-27 17:44:32 +01:00
Miraç Duran	d5286f926e	fix(visual_report): make TOC heading slugs unique Ensure generated visual-report TOC slugs cannot collide with naturally occurring slug names. Add regression coverage for duplicate headings, natural suffix collisions, and unchanged distinct headings.	2026-06-27 17:36:17 +01:00
Ashvin	67040a196f	fix(docker): install python-magic and libmagic for upload MIME sniffing Install libmagic1 and image-scoped python-magic in the Docker image so upload MIME detection can use content sniffing. Add regression coverage for the Dockerfile dependency pair and the libmagic-present sniffing path.	2026-06-27 17:31:46 +01:00
Catalin Iliescu	497c391f84	fix(cookbook): preserve scheduled serve server metadata (#4545) Co-authored-by: Cata <cata@bigjohn.local>	2026-06-27 16:48:53 +01:00
Ashvin	a6400c10af	fix(calendar): keep imported events with non-positive duration visible (#4484) A single-day all-day event whose source writes DTEND equal to DTSTART (treating DTEND as an inclusive bound rather than the RFC 5545 exclusive one) was stored verbatim as a zero-duration row. list_events selects events overlapping the window with `dtstart < end AND dtend > start`, so that row is filtered out for any window starting at or after its date and the event never appears, even though the import reported success. Events created via the API never hit this because creation always synthesizes a positive duration; only the two import paths can persist a non-positive one. Clamp a non-positive end at import (import_ics and the CalDAV pull) to the same default span used when DTEND is absent: one day for all-day events, one hour otherwise. Also repair the persisted state for users who already imported before this clamp existed. Their stored zero-duration row is invisible, and re-importing the same ICS hit the duplicate branch and skipped without touching it, so the event stayed hidden. The duplicate branch now backfills the clamp onto the matched row before skipping, and the response reports a `repaired` count. (The CalDAV pull already rewrites dtend on re-sync, so it self-heals.)	2026-06-27 16:52:40 +02:00
Afonso Coutinho	16ddfbf966	fix: vCard parser drops folded continuation lines, corrupting emails (#1870)	2026-06-27 14:41:57 +01:00
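
The forward-only pairing that commit c098355778 above attributes to `_iter_delimited` can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the names `iter_delimited` and `strip_delimited`, their signatures, and the tuple they yield are hypothetical, and the real helpers may differ (for example in case handling or in supporting several opener forms).

```python
from typing import Iterator, Tuple


def iter_delimited(text: str, opener: str, closer: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, body) for each opener paired with the next closer.

    Hypothetical sketch: the cursor only moves forward, so total work stays
    linear in len(text) even for many openers with no closer.
    """
    pos = 0
    while True:
        start = text.find(opener, pos)
        if start == -1:
            return
        body_start = start + len(opener)
        close = text.find(closer, body_start)
        if close == -1:
            # No closer after this opener means none after any later opener
            # either, so stop instead of rescanning from each one.
            return
        end = close + len(closer)
        yield start, end, text[body_start:close]
        pos = end


def strip_delimited(text: str, opener: str, closer: str) -> str:
    """Remove every paired opener...closer block, keeping the text around it."""
    kept = []
    pos = 0
    for start, end, _body in iter_delimited(text, opener, closer):
        kept.append(text[pos:start])
        pos = end
    kept.append(text[pos:])
    return "".join(kept)


# A stale closer followed by an opener flood: nothing pairs, one linear pass.
flood = "</tool_call>" + "<tool_call>" * 1000
assert strip_delimited(flood, "<tool_call>", "</tool_call>") == flood
```

Because a closer that precedes an opener is never consulted, the stale-closer bypass described in the commit message does not apply to this shape of scan.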


990 Commits