Hardens issues found in a security review of the current tree (separate from
the cookbook SSH PR):
- Email thread rendering (static/js/emailLibrary.js): the flat read path runs
inbound HTML through the allowlist sanitizer, but the two threaded paths
(_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected
server-parsed `body_html` raw into the DOM. A crafted inbound email could
inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS
if a deployment relaxes the script CSP). Now sanitized on all paths.
- Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the
on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with
user-controlled folder/uid and no containment, so a folder like `../../tmp`
could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a
single safe segment and asserts containment.
- Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats,
/api/rag/stats, /api/test/youtube, /api/test-research relied only on the
global session check (any logged-in user). Now require_admin-gated.
- Defense-in-depth HTML escaping: session HTML export escapes the session name
(routes/session_routes.py); the MCP OAuth page escapes the reflected Host
header / server_id (routes/mcp_routes.py).
- Internal-tool token now compared with secrets.compare_digest (constant time)
in core/middleware.py and app.py.
Adds regression tests in tests/test_security_regressions.py.
chat_routes.py persisted a session's "mode" in three best-effort spots —
reading the current mode, writing the effective mode, and setting
research_pending on the stream path. Each opened a session with SessionLocal()
and called .close() as the LAST statement inside a try/except, so if anything
before close() raised (e.g. a SQLite "database is locked" under concurrent chat
streams) the except only logged and the connection was never returned to the
pool.
DATABASE_URL defaults to file-backed SQLite, whose engine uses SQLAlchemy's
default QueuePool (5 connections + 10 overflow). Repeated leaks on these hot
paths exhaust the pool; later requests then block for pool_timeout and fail
with "QueuePool limit ... reached", taking the app down until restart.
Move the logic into two best-effort helpers in core.database, next to the
existing session helpers (update_session_last_accessed, get_session_by_id):
- get_session_mode(session_id) -> Optional[str]
- set_session_mode(session_id, mode) -> bool
Both route through the existing get_db_session() context manager, which commits
on success, rolls back on error, and always closes in a finally, so the
connection is returned to the pool on every path. chat_routes.py now calls
these instead of hand-rolling sessions, also removing three copies of the same
try/except.
Add tests/test_session_mode_helpers.py: the helpers commit+close on success
and, on a mid-operation DB error, swallow + roll back + close (no leak). The
error-path tests fail against the old close()-inside-try pattern.