* fix: match skill tags as whole tokens, not substrings, in retrieval
* test: skill tag matching uses whole tokens, not substrings
* test: give skill fixtures status=published so they reach the scoring path
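A minimal sketch of whole-token tag matching, for illustration only. The `tokenize` and `tag_matches` names and the tokenization rule (lowercase runs of letters and digits) are assumptions, not the project's actual retrieval code.

```python
import re

def tokenize(text):
    # Assumed tokenization: lowercase alphanumeric runs.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def tag_matches(tag, query):
    """True only when every token of the tag appears as a whole token in the query."""
    tag_tokens = tokenize(tag)
    return bool(tag_tokens) and tag_tokens <= tokenize(query)
```

With a substring test, the tag `git` would match a query such as "digital art"; with token sets it matches only when `git` appears as its own word.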
parse_markdown_to_values — the read-back path for export-pdf, the export
preview, and prepare-signed-reply — matched the bold field label with [^*]+, so
it could not match a label containing '*' (the near-universal required-field
marker: "Email *", "State *", "Signature *"). The value then stayed empty, so
the exported PDF and the signed-reply attachment came out blank for that field
with no error — a whole form of required fields could export completely empty.
Match the label non-greedily (.+?) so '*' in labels is tolerated while still
splitting at the first ':**' / '**[', which also preserves a value that itself
contains ':**'.
Adds tests/test_form_markdown_roundtrip.py (render -> parse roundtrip): asterisk
text/choice/signature labels survive (fail before, pass after); plain labels and
colon-bearing values are unaffected.
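The difference between the two label patterns can be sketched as follows. The regexes are simplified stand-ins that assume a rendered line of the form `**Label:** value`; the real patterns in parse_markdown_to_values (including the `**[` choice form) may differ.

```python
import re

# Hypothetical patterns, simplified to the "**Label:** value" case.
OLD_FIELD = re.compile(r"\*\*([^*]+):\*\*\s*(.*)")
NEW_FIELD = re.compile(r"\*\*(.+?):\*\*\s*(.*)")

def parse_field(line, pattern):
    """Return (label, value) for a rendered field line, or None if it does not match."""
    m = pattern.match(line)
    return (m.group(1), m.group(2)) if m else None
```

Because `.+?` is non-greedy, the label stops at the first `:**`, so a value that itself contains `:**` is kept intact.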
Co-authored-by: NubsCarson <nubs@nubs.site>
_parse_vcards matched property names with a bare line.startswith("EMAIL") /
"TEL" / "FN:" / "UID:". RFC 6350 property groups — emitted by default by Apple
Contacts / iCloud and many CardDAV servers — prefix the name with a group token,
e.g. item1.EMAIL;type=pref:jane@example.com. Those lines never matched, so emails
and phone numbers from any Apple-synced or Apple-exported address book were
silently dropped (breaking contact search by email, composer autocomplete, and
vCard/CSV export round-trips).
Strip an optional leading group token before matching and value extraction;
no-op for non-grouped lines.
Adds tests/test_contacts_vcard_parse.py (grouped + plain) — the grouped case
fails before this change and passes after.
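A rough sketch of the group-stripping idea, not the actual _parse_vcards code. The `strip_group` and `extract_emails` helpers are hypothetical, the group token is assumed to be letters, digits, or hyphens followed by a dot, and folded continuation lines are not handled.

```python
import re

# Assumed shape of a group prefix: alphanumerics or hyphens, then a dot.
_GROUP_PREFIX = re.compile(r"^[A-Za-z0-9-]+\.")

def strip_group(line):
    """Drop a leading 'item1.'-style group token; other lines are unchanged."""
    return _GROUP_PREFIX.sub("", line, count=1)

def extract_emails(vcard_text):
    emails = []
    for raw in vcard_text.splitlines():
        line = strip_group(raw.strip())
        if line.upper().startswith("EMAIL"):
            # Parameters sit before the first ':'; the value follows it.
            _, _, value = line.partition(":")
            if value:
                emails.append(value.strip())
    return emails
```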
Co-authored-by: NubsCarson <nubs@nubs.site>
Cookbook dependency installs (vLLM and friends) build large wheels; pip's
default cache lives under $HOME/.cache/pip, so on a small home filesystem the
build dies mid-way with "[Errno 28] No space left on device" (issue #1219) and
the dependency ends up "installed" but unusable (issue #1459).
Add `--no-cache-dir` to the dependency pip-install command (the maintainer's
suggested PIP_CACHE_DIR= workaround, made the default) via a small
_pip_install_no_cache() helper applied at the install chokepoint. Consistent
with the existing --no-cache-dir on the llama-cpp-python build. Idempotent;
non-pip-install serve commands are untouched.
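One way such a helper could look, as a sketch only. The real _pip_install_no_cache() may work differently; this version assumes the command is a single shell-style string and inserts the flag right after `pip install`.

```python
import shlex

def pip_install_no_cache(cmd):
    """Insert --no-cache-dir after 'pip install' unless it is already present."""
    tokens = shlex.split(cmd)
    if "--no-cache-dir" in tokens:
        return cmd  # idempotent: nothing to add
    for i in range(len(tokens) - 1):
        if tokens[i] in ("pip", "pip3") and tokens[i + 1] == "install":
            tokens.insert(i + 2, "--no-cache-dir")
            return shlex.join(tokens)
    return cmd  # not a pip install command, leave untouched
```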
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
If session.initialize() or list_tools() raises after the stdio
subprocess or SSE connection is already open, the AsyncExitStack is
never closed — leaking the child process or HTTP connection. Wrap the
setup phase in try/except to aclose() the stack before re-raising.
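The cleanup pattern can be sketched in isolation like this. `fake_transport` and `failing_initialize` are stand-ins for the real stdio/SSE transport and for session.initialize() / list_tools(); they are not part of any MCP library.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events = []

@asynccontextmanager
async def fake_transport():
    # Stand-in for the stdio subprocess or SSE connection.
    events.append("opened")
    try:
        yield "transport"
    finally:
        events.append("closed")

async def failing_initialize(transport):
    # Stand-in for a setup call that raises after the transport is open.
    raise RuntimeError("initialize failed")

async def connect(initialize):
    stack = AsyncExitStack()
    try:
        transport = await stack.enter_async_context(fake_transport())
        await initialize(transport)
    except BaseException:
        # Release everything entered so far, then surface the original error.
        await stack.aclose()
        raise
    return stack  # on success the caller owns the stack and closes it later
```

Running `asyncio.run(connect(failing_initialize))` raises RuntimeError, and by then the transport has already been closed.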
Three test files (test_auth_regressions, test_auth_event_loop,
test_null_owner_gates) install stubs for core.database / core.auth /
src.endpoint_resolver at module-import time, so they outlive the
file and are still present in sys.modules when later-collected test
files try to import the real modules. The stubs are minimal (a
handful of MagicMock attrs) so the import chain that follows fails
with ImportError on the very next real import.
test_companion_pairing also leaks, with a twist: its _DBStub
subclass returns a MagicMock for *any* attribute including dunders,
so the next test that does `from core.database import *` reads
`__all__` as a MagicMock and dies with 'Item in __all__ must be
str, not MagicMock'.
Move the stub installation into an autouse fixture per file and
register each stub with monkeypatch.setitem so sys.modules is
restored to its pre-test state on teardown. Tighten _DBStub to
refuse dunder names so __all__ stays undefined. _CAPTURED is
cleared per test so the mint-token assertions see a fresh dict.
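A stdlib-only sketch of the two ideas: a stub that refuses dunder names, and scoped installation into sys.modules. It uses `unittest.mock.patch.dict` as a stand-in for pytest's `monkeypatch.setitem`, and `DBStub` / `with_stub` are illustrative names rather than the test suite's own.

```python
import sys
import types
from unittest import mock

class DBStub(types.ModuleType):
    """Stub module that answers ordinary attribute lookups but refuses dunders."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)  # keeps __all__ undefined
        return mock.MagicMock(name=name)

def with_stub(module_name, fn):
    """Run fn() with the stub installed, restoring sys.modules afterwards."""
    with mock.patch.dict(sys.modules, {module_name: DBStub(module_name)}):
        return fn()
```

In a pytest file the same scoping would come from an autouse fixture calling `monkeypatch.setitem(sys.modules, name, stub)`, which is undone on teardown.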
Before: 3 test files fail at collection time (test_chat_image_routing,
test_context_compactor, test_webhook_ssrf_resilience). After: 0
collection errors. 1365/1370 pass, 1 skip, 4 unrelated pre-existing
failures (verified against origin/main baseline).
Out of scope: test_task_scheduler_session_delivery::
test_session_delivery_survives_empty_database also fails in the
full suite due to order-dependent state from a different test
file. That's a separate leak with a different root cause.
PermissionError was not in the except tuple, so an unreadable settings.json
would crash the app instead of falling back to defaults. Add it alongside the
existing FileNotFoundError/JSONDecodeError/ValueError catches.
Also adds test_settings_error_paths.py covering all four failure modes:
missing file, corrupted JSON, wrong type, and permission denied.
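The fallback behavior for the four failure modes might look roughly like this. `load_settings` and `DEFAULTS` are hypothetical names, and merging loaded values over the defaults is an assumption about how the app combines them.

```python
import json

DEFAULTS = {"theme": "dark"}  # hypothetical defaults

def load_settings(path):
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        if not isinstance(data, dict):
            raise ValueError("settings must be a JSON object")
        return {**DEFAULTS, **data}
    except (FileNotFoundError, PermissionError, json.JSONDecodeError, ValueError):
        # Any of the four failure modes falls back to a copy of the defaults.
        return dict(DEFAULTS)
```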
The login page has its own inline <style> and doesn't load static/style.css,
so it never inherited the main app's touch-device rule that pins text inputs
to 16px. Its fields are 0.95rem (~15.2px) and the dynamically-inserted 2FA
input is 14px, so iOS Safari zooms the whole page when either is focused -
on the very first screen every user sees.
Add a `@media (hover: none) and (pointer: coarse)` rule raising
`input:not(.remember-check)` to 16px, mirroring the main app's approach.
!important also lifts the 2FA input, which pins font-size:14px inline.
Desktop is unchanged (inputs stay 0.95rem).
Gemma 4 returns reasoning_content in streaming responses via
llama-server, but the model wasn't listed in _THINKING_MODEL_PATTERNS,
causing reasoning tokens to be mishandled. Add "gemma" to the pattern
list and register Gemma 4's 128K context window in KNOWN_CONTEXT_WINDOWS
so the agent loop budgets context correctly.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The mark-stopped, update-last-meta, and merge-last-assistant handlers in
routes/history_routes.py ordered ChatMessage queries by
DbChatMessage.created_at. ChatMessage does not inherit TimestampMixin and
has only a `timestamp` column, so SQLAlchemy raised AttributeError at
query-build time -> HTTP 500 on Stop, last-message metadata updates, and
Continue/merge. Each handler mutates in-memory history before the failing
query, so a failed request also silently diverged the in-memory view from
the database.
Order by DbChatMessage.timestamp (already used elsewhere in the file and
covered by the ix_messages_session_time index). Add a regression test
pinning the model column reality, the corrected query, and a guard against
re-introducing created_at.
Fixes #1659
Co-authored-by: Ethan <23321960+0xLeathery@users.noreply.github.com>
A native function/tool call whose `arguments` field is valid JSON but not an
object — a bare array like ["ls -la"], or a string/number/bool/null — parsed
fine in function_call_to_tool_block and then every branch called args.get(...),
raising AttributeError ('list'/'str' object has no attribute 'get'). That
propagated out of the streamed agent loop (no surrounding try/except at the
call site in stream_agent_loop) and aborted the user's entire turn. Weaker and
local models routinely emit malformed args like this.
Coerce non-dict parsed arguments to {} (mirrors the existing empty-arguments
behavior), so the tool runs with empty args instead of killing the stream.
Adds tests/test_function_call_non_object_args.py covering array/string/number/
bool/null arguments — they fail before this change and pass after.
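The coercion can be sketched as a small standalone helper. `parse_tool_arguments` is a hypothetical name, and returning {} for invalid JSON is an assumption here; the commit only describes the valid-but-non-object case.

```python
import json

def parse_tool_arguments(raw):
    """Parse a tool call's arguments string, always returning a dict."""
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return {}  # assumption: unparseable input is treated like empty args
    # Valid JSON that is not an object (list, str, number, bool, null) becomes {}.
    return parsed if isinstance(parsed, dict) else {}
```

After this, every `args.get(...)` call downstream operates on a dict, so a malformed call runs with empty arguments instead of raising AttributeError.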
* fix: auto-naming for 24h time format
needs_auto_name() required an AM/PM suffix on default
frontend-generated names like 'deepseek-v4-flash 17:46:02'.
The frontend uses toLocaleTimeString(), which outputs 24h
format in most locales, so the regex never matched and
auto-naming was silently skipped.
Made AM/PM optional and added re.IGNORECASE so 'am'/'pm' also match.
* test: add regression tests for needs_auto_name (24h + 12h + custom)
---------
Co-authored-by: Calculator Dev <dev@calculator.local>
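For the auto-naming fix above, a pattern with an optional AM/PM suffix could look like the sketch below. The regex is a guess at the shape of a default name ("<model> H:MM:SS"), not the project's actual pattern.

```python
import re

# Hypothetical pattern: "<model> H:MM:SS" with an optional AM/PM suffix.
_DEFAULT_NAME = re.compile(
    r"^\S+ \d{1,2}:\d{2}:\d{2}(?:\s?[AP]M)?$",
    re.IGNORECASE,
)

def needs_auto_name(name):
    """True for empty names and for names that still look frontend-generated."""
    return not name or bool(_DEFAULT_NAME.match(name))
```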
datetime.utcnow() is deprecated since Python 3.12 and scheduled for removal in
a future release.
Swap the five calls in src/cleanup_service.py for a local _utcnow()
helper returning naive UTC, matching the naive DateTime columns the
archive/delete cutoffs compare against (same approach as the
task-scheduler and core-database slices). Add a regression test
asserting the helper stays naive so the cutoff math can't hit a
naive/aware TypeError.
Part of #1116
- Prefer dataset.raw (original markdown) over innerText in _serializeChatTranscript.
- This prevents HTML-to-text artifacts and redundant newlines added by the browser.
setup.py initialises auth.json and .env on a fresh install but was never
called by the Docker entrypoint, leaving new deployments without admin
credentials or a working config.
Adds a single gosu-wrapped call to setup.py before the final exec drop.
setup.py is fully idempotent (skips existing files) so subsequent starts
are unaffected. || true ensures a setup failure never blocks the app from
starting.
Fixes #1476