odysseus/.env.example
tanmayraut45 f59edee611 Support extra CA bundle for private-CA LLM providers (#769)
Adding GigaChat (Sber) or an on-premise enterprise LLM gateway as a
model endpoint fails on first probe with

    CERTIFICATE_VERIFY_FAILED: self-signed certificate in certificate
    chain (_ssl.c:1000)

because their TLS chain is signed by a private root CA (Russian Trusted
Root CA for GigaChat; corporate CA for on-prem) that isn't part of the
default system / certifi trust store. The endpoint shows offline in
the picker even though the URL and API key are correct (issue #722).

The right fix is to extend the trust store, not to weaken verification.
This change:

- src/tls_overrides.py: new module that resolves an opt-in env var
  LLM_CA_BUNDLE at import time, builds a shared SSLContext via
  ssl.create_default_context() (so the system / certifi bundle is
  loaded first) and layers the operator's PEM on top with
  load_verify_locations(). Exposes llm_verify(), which returns a value
  suitable for httpx `verify=`. It returns True (httpx built-in
  trust) when the env var is unset, when the file is missing, or
  when the PEM fails to load. Verification is never silently
  disabled: on a missing or unloadable file a warning is logged and
  we fall back to the safe path.

- src/llm_core.py: thread llm_verify() into the shared AsyncClient
  used by stream_llm / streaming completions.

- routes/model_routes.py: thread llm_verify() into the five httpx.get
  call sites in _probe_endpoint / _ping_endpoint so adding a
  private-CA endpoint goes green on the very first probe and the
  picker stops showing it offline.

- .env.example: document LLM_CA_BUNDLE with the GigaChat case as the
  concrete example.
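
The resolution logic described for src/tls_overrides.py can be sketched
roughly as follows. This is a minimal illustration, not the module's
actual code: the helper name resolve_llm_verify and its `env` parameter
are hypothetical, and it uses only the standard library, so the base
trust store here is whatever ssl.create_default_context() loads from
the system.

```python
import logging
import os
import ssl
from typing import Mapping, Union

logger = logging.getLogger(__name__)


def resolve_llm_verify(
    env: Mapping[str, str] = os.environ,
) -> Union[bool, ssl.SSLContext]:
    """Hypothetical helper: pick the value to pass as httpx `verify=`."""
    bundle = env.get("LLM_CA_BUNDLE", "").strip()
    if not bundle:
        return True  # env var unset: keep the default trust store

    if not os.path.isfile(bundle):
        logger.warning(
            "LLM_CA_BUNDLE=%s not found; using default trust store", bundle
        )
        return True

    try:
        # Start from the system trust store, then add the operator's roots.
        ctx = ssl.create_default_context()
        ctx.load_verify_locations(cafile=bundle)
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        logger.warning(
            "LLM_CA_BUNDLE=%s failed to load (%s); using default trust store",
            bundle,
            exc,
        )
        return True
    return ctx
```

Every branch returns either True or a verifying SSLContext, so no
failure mode can turn verification off. A call site would pass the
result straight through, for example as `verify=resolve_llm_verify()`
on an httpx client.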

Deliberately NOT included: a verify=False knob (global or per-host).
Disabling verification exposes the affected endpoint to MITM, and the
operator-supplied bundle is the correct fix for legitimate private-CA
providers — so the only switch in this PR is the safe one.
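
To make the distinction concrete, the sketch below compares a default
context (the kind an extra bundle would be layered onto) with one that
has verification switched off. The helper name is_verifying is
hypothetical and only exists for this illustration.

```python
import ssl


def is_verifying(ctx: ssl.SSLContext) -> bool:
    """True when the context checks both the chain and the hostname."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and ctx.check_hostname


# What an extra CA bundle builds on: verification fully enabled.
# Calling load_verify_locations() on it only adds trusted roots.
safe = ssl.create_default_context()

# What a verify=False style knob amounts to: no chain or hostname checks.
unsafe = ssl.create_default_context()
unsafe.check_hostname = False  # must be cleared before CERT_NONE is allowed
unsafe.verify_mode = ssl.CERT_NONE

print(is_verifying(safe), is_verifying(unsafe))
```

Adding roots widens the set of issuers that are accepted, while the
second context accepts any certificate from any peer.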

Closes #722.
2026-06-04 13:18:50 +01:00


# Odysseus UI — Environment Configuration
# Copy this file to .env and fill in your values.
# ============================================================
# LLM Configuration
# ============================================================
# Primary LLM host (default: localhost)
LLM_HOST=localhost
# Additional LLM hosts, comma-separated (for model discovery)
# Use hostnames/IPs only; Odysseus scans common serving ports, including Ollama's 11434.
# LLM_HOSTS=llm-host.local,backup-llm.local
# Optional Ollama base URL. In Docker, host Ollama is usually reachable here
# when started with OLLAMA_HOST=0.0.0.0:11434.
# OLLAMA_BASE_URL=http://host.docker.internal:11434/v1
# Optional LM Studio URL. In Docker, host LM Studio is reachable here
# when LM Studio is set to serve on all interfaces (0.0.0.0).
# LM_STUDIO_URL=http://host.docker.internal:1234
# OpenAI API key (only needed if using OpenAI models).
# Do not commit real keys. Keep this commented until needed.
# OPENAI_API_KEY=your_openai_api_key_here
# Research service LLM endpoint
# RESEARCH_LLM_ENDPOINT=http://localhost:8000/v1/chat/completions
# Extra CA bundle for LLM providers whose TLS chain isn't in the default
# trust store. Layered ON TOP of the system / certifi bundle — verification
# stays on for every host, the trust set just gets larger. Useful for:
# - GigaChat / Sber (Russian Trusted Root CA): without this the endpoint
# shows offline with CERTIFICATE_VERIFY_FAILED — self-signed certificate
# in certificate chain.
# - On-premise / corporate LLM gateways with an internal CA.
# Point at a PEM file containing the missing root(s).
# LLM_CA_BUNDLE=/etc/odysseus/ca/extra-roots.pem
# ============================================================
# Search & Web
# ============================================================
# SearXNG instance URL (self-hosted, for web search).
# Docker Compose overrides this to http://searxng:8080 for in-network access.
SEARXNG_INSTANCE=http://localhost:8080
# Optional SearXNG cookie/CSRF secret. If blank, Docker generates one on first boot
# and stores it in the searxng-data volume.
# SEARXNG_SECRET=
# ============================================================
# Database
# ============================================================
# SQLite database path (default: sqlite:///./data/app.db)
# DATABASE_URL=sqlite:///./data/app.db
# ============================================================
# Auth & Security
# ============================================================
# Enable authentication (default: true)
# AUTH_ENABLED=true
# Host bind address and port for the Odysseus web UI in Docker Compose.
# Keep APP_BIND on loopback unless you intentionally want LAN/reverse-proxy access.
# APP_BIND=127.0.0.1
# Change this if another local service already uses 7000 (macOS AirPlay often does).
# APP_PORT=7000
# Development-only auth bypass for loopback requests.
# Keep false for Docker, LAN, reverse proxy, and any shared deployment.
# LOCALHOST_BYPASS=false
# Mark session cookies Secure. Set true when Odysseus is served through HTTPS
# by a trusted reverse proxy or private access gateway.
# SECURE_COOKIES=true
# Optional: pre-seed the first admin password during setup.
# Do not commit a real password.
# ODYSSEUS_ADMIN_PASSWORD=change_me_before_first_boot
# CORS allowed origins (default: localhost-only; restrict to your public origin in production)
# ALLOWED_ORIGINS=http://localhost:7000,http://localhost:8000
# ============================================================
# ChromaDB (vector store)
# ============================================================
# ChromaDB service host.
# Manual host run: localhost:8100 when using `docker run -p 8100:8000 chromadb/chroma`.
# Docker Compose overrides these to chromadb:8000 for in-network access.
# CHROMADB_HOST=localhost
# CHROMADB_PORT=8100
# Docker Compose host-port bind addresses for bundled services.
# Defaults are loopback-only for safety. To expose ntfy only on Tailscale,
# set NTFY_BIND to your host's Tailscale IP and update NTFY_BASE_URL.
# CHROMADB_BIND=127.0.0.1
# NTFY_BIND=127.0.0.1
# NTFY_BASE_URL=http://localhost:8091
# Example:
# NTFY_BIND=100.x.y.z
# NTFY_BASE_URL=http://100.x.y.z:8091
# ============================================================
# RAG / Embeddings
# ============================================================
# Embedding API endpoint (OpenAI-compatible /v1/embeddings)
# Default: http://{LLM_HOST}:11434/v1/embeddings (Ollama)
# EMBEDDING_URL=http://localhost:11434/v1/embeddings
# Embedding model name (must be available at the endpoint above)
# EMBEDDING_MODEL=all-minilm:l6-v2
# Local fallback embedding model (used when no HTTP embedding API is available)
# Uses fastembed (ONNX) — downloads model on first run (~50MB)
# FASTEMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2
# FASTEMBED_CACHE_PATH= # defaults to ~/.cache/fastembed
# ============================================================
# Misc
# ============================================================
# Cleanup interval in hours (default: 24)
# CLEANUP_INTERVAL_HOURS=24
# In-process email pollers (default: on). Set to 0 if you're driving
# polling from cron / systemd via `scripts/odysseus-mail poll-scheduled`
# and `scripts/odysseus-mail poll-summary`, otherwise both schedulers
# race on the same SQLite.
# ODYSSEUS_INPROCESS_POLLERS=1
# In-process scheduled-task runner (default: on). Set to 0 to let an
# external driver fire scheduled tasks. Calendar reminders are
# frontend-driven (polling /api/notes from the browser) so no gate is
# needed there.
# ODYSSEUS_INPROCESS_TASKS=1
# Host used by the built-in "run_script" scheduled-task action.
# Empty/local/localhost runs scripts on the app host. Set to an SSH host alias
# if you intentionally want scheduled scripts to run remotely.
# ODYSSEUS_SCRIPT_HOST=localhost
# ============================================================
# GPU support (Docker Compose)
# ============================================================
# Pass the host GPU into the odysseus container. Default (unset) = CPU.
# COMPOSE_FILE is a native `docker compose` feature: a list of files
# merged left-to-right, separated by colons (semicolons on Windows).
# Pick ONE GPU line below, or leave all commented for CPU.
#
# NVIDIA (requires nvidia-container-toolkit + `nvidia-ctk runtime
# configure --runtime=docker` on the host):
# COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
# COMPOSE_FILE=docker-compose.yml;docker/gpu.nvidia.yml #(Windows)
#
# AMD ROCm (requires ROCm drivers on the host and the GID of the render group):
# COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
# Find the render GID with: getent group render | cut -d: -f3
# RENDER_GID=989
#
# These overlays only expose the GPU devices. The slim Odysseus image
# still needs CUDA/ROCm userspace via Cookbook -> Dependencies (vLLM,
# llama-cpp-python, etc.) before models can actually serve on GPU.