Commit Graph

14 Commits

Author SHA1 Message Date
lekt8
f2f437f4a8 feat: add /api/ready readiness probe (DB, data dir, local-first) (#1200)
/api/health is a liveness ping. This adds /api/ready as a readiness /
integrity self-check that returns 503 unless every critical subsystem is
whole, so an orchestrator (Docker/Compose/k8s) can gate traffic on real
readiness rather than mere process liveness:

  - database: opens a connection and runs SELECT 1
  - data_dir: confirms the data directory exists and is writable
  - local_first: reports whether storage stays on the host (informational;
    a remote database is a valid deployment, so it never fails readiness)

The check logic lives in src/readiness.py so it is unit-testable in
isolation; the route is a thin wrapper. Covered by tests/test_readiness.py.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 23:33:22 +09:00
tanmayraut45
55fa223e4d Exempt task webhook trigger from session auth (#784)
POSTing to the per-task webhook URL shown in the Tasks UI returned 401
Unauthorized even though the URL is labelled "no auth needed". The
trigger handler at routes/task_routes.py:873 (`POST
/api/tasks/{task_id}/webhook/{token}`) was written as an
unauthenticated endpoint — the 32-byte path-embedded `webhook_token`
generated by `secrets.token_urlsafe(32)` is the credential, and the
handler validates it against the row before doing anything. But
AuthMiddleware in app.py runs first and only knows about
AUTH_EXEMPT_EXACT (static path set) and AUTH_EXEMPT_PREFIXES (only
`/static`), so every external POST (curl, Zapier, n8n, Make,
Activepieces) got rejected before the route ever saw the request.
External callers can't supply a session cookie, which is precisely
why the per-task token exists.

Fix: add an AUTH_EXEMPT_PATTERNS list of compiled regexes for dynamic
public paths and route `^/api/tasks/[^/]+/webhook/[^/]+/?$` through
it. The route handler still enforces `ScheduledTask.webhook_token ==
token` and 404s on mismatch, so an attacker without the token gets a
404 (indistinguishable from a non-existent task), and a holder of the
token gets the documented "POST and a task fires" behaviour. The
sibling endpoint `/{task_id}/webhook-regenerate` is admin-gated and
deliberately does NOT match the pattern — it requires `_owner(request)`
and a session.

Tests: tests/test_webhook_trigger_auth_exempt.py extracts the regex
list out of app.py, applies it to a representative trigger path
(positive) and the four neighbouring task paths that must stay
authenticated (negative — `/api/tasks`, `/api/tasks/{id}`,
`/api/tasks/{id}/webhook-regenerate`, `/api/tasks/{id}/run`), and
pins the handler-side token check so a refactor of the route doesn't
quietly turn the endpoint into a truly anonymous one.

Closes #621.
2026-06-02 11:23:40 +09:00
Mahdi Salmanzade
000bd6d1ab Add read-only companion endpoints (ping/info/owner-scoped models) (#863)
First, smallest cut of a LAN companion bridge (split out of #855 per review):
a thin, additive, read-only layer so a LAN client can discover what a server
offers. No new LLM logic; auth is enforced by the existing AuthMiddleware.

- GET /api/companion/ping  -- cheap auth-validated health check
- GET /api/companion/info  -- server identity + capability flags
- GET /api/companion/models -- the CALLER's own model endpoints

/models scopes to the caller's real owner (the token's owner for bearer callers)
plus legacy null-owner shared rows, mirroring owner_filter, and never returns
api_key material. The owner rule lives in two pure helpers (token_owner,
owner_can_see) with direct tests proving a token for owner A cannot see owner B's
rows and that null-owner rows don't widen access.
2026-06-02 11:20:53 +09:00
Kevin
1494a0b7ee fix: normalize JS static MIME types on Windows
Refs #802
2026-06-02 01:32:00 +02:00
Strahil Peykov
370fe6b501 Warn when localhost auth bypass is enabled 2026-06-01 23:08:01 +09:00
LittleLlama
74dedcad37 Remove duplicate tool index startup warmup
get_tool_index() calls index_builtin_tools() on first init
(src/tool_index.py:469-470), and _warmup_tool_index then calls it
explicitly right after. Every cold boot embeds all 58 built-in tools
twice and double-upserts them into the ChromaDB collection.

The remaining get_tools_for_query call still pre-warms the query path.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 23:07:42 +09:00
Jamieson O'Reilly
171c29dcf3 Fix email-thread HTML injection, attachment path traversal, and missing authz (#475)
Hardens issues found in a security review of the current tree (separate from
the cookbook SSH PR):

- Email thread rendering (static/js/emailLibrary.js): the flat read path runs
  inbound HTML through the allowlist sanitizer, but the two threaded paths
  (_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected
  server-parsed `body_html` raw into the DOM. A crafted inbound email could
  inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS
  if a deployment relaxes the script CSP). Now sanitized on all paths.

- Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the
  on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with
  user-controlled folder/uid and no containment, so a folder like `../../tmp`
  could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a
  single safe segment and asserts containment.

- Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats,
  /api/rag/stats, /api/test/youtube, /api/test-research relied only on the
  global session check (any logged-in user). Now require_admin-gated.

- Defense-in-depth HTML escaping: session HTML export escapes the session name
  (routes/session_routes.py); the MCP OAuth page escapes the reflected Host
  header / server_id (routes/mcp_routes.py).

- Internal-tool token now compared with secrets.compare_digest (constant time)
  in core/middleware.py and app.py.

Adds regression tests in tests/test_security_regressions.py.
2026-06-01 22:20:17 +09:00
pewdiepie-archdaemon
8f93d44917 Validate internal tool owner attribution 2026-06-01 15:25:15 +09:00
pewdiepie-archdaemon
0888a3b3e6 Add native Windows compatibility layer 2026-06-01 15:09:47 +09:00
LittleLlama
7e7e441fec Re-enable VectorRAG init with lazy retry
Personal Docs (POST /api/personal/add_directory and friends) currently
returns HTTP 503 'RAG system is not available' for every request,
because get_rag_manager() and rag_manager are both hardcoded off. The
disablement was added when chromadb 1.4.1 / pydantic 2.12 were mutually
incompatible at the client init layer.

That compat issue is fixed in the current pins (chromadb 1.5.x +
pydantic 2.13.x). Verified by calling the original lazy initializer
against a running chroma server — VectorRAG instantiates, reports
healthy=True, and indexes successfully.

This change:

1. src/rag_singleton.py — replace the hardcoded `return None` in
   get_rag_manager() with the original lazy init body. Keeps the
   30s retry-throttle so a missing chroma server doesn't busy-retry
   on every request.

2. app.py — replace the parallel `rag_manager = None` /
   `rag_available = False` hardcoding with a get_rag_manager() call.
   Logs the resolved state at startup. If chroma isn't reachable yet,
   rag_manager stays None and personal-doc routes still return 503,
   but the *next* request will hit the retry-throttle path in
   get_rag_manager() and try to init again.

Doesn't touch requirements.txt. Repos using docker-compose get chroma
automatically; manual installs that want Personal Docs to work still
need to either pip install chromadb (full package) and run `chroma run`
or point at an external chroma instance via env. That can be a
follow-up README / requirements-optional note.
2026-06-01 14:32:13 +09:00
pewdiepie-archdaemon
1ce00b5dea Stop auto-adding Ollama endpoints 2026-06-01 11:52:49 +09:00
pewdiepie-archdaemon
67f1675130 Restrict credentialed CORS headers 2026-06-01 10:54:08 +09:00
pewdiepie-archdaemon
fc7f107b22 Improve Ollama setup and model endpoint handling 2026-06-01 10:00:15 +09:00
pewdiepie-archdaemon
e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00