Merge remote-tracking branch 'origin/codex/production-intelligence-terminal' into codex/issue-22-source-fetch-instrumentation
All checks were successful
Codex Template Compliance / template-compliance (pull_request) Successful in 5s
Build / test-and-image (pull_request) Successful in 57s

# Conflicts:
#	test/fetch-utils.test.mjs
This commit is contained in:
MrSphay
2026-05-17 20:52:27 +02:00
7 changed files with 623 additions and 46 deletions

View File

@@ -134,6 +134,9 @@ STALE_ALERT_COOLDOWN_MINUTES=60
DASHBOARD_URL=https://intelligence.example.internal
TERMINAL_ACTIONS_ENABLED=true
SWEEP_TOKEN=
SSE_HEARTBEAT_INTERVAL_MS=25000
TERMINAL_ACTION_RATE_LIMIT_WINDOW_MS=60000
TERMINAL_ACTION_RATE_LIMIT_MAX=10
BRIEF_VERBOSITY=standard
LLM_PROVIDER=openrouter
@@ -185,9 +188,66 @@ LLM_MODEL=your-model
For Pangolin or another reverse proxy, forward HTTP traffic to `intelligence-terminal:3117` (or the `PORT` you set). Missing API keys do not crash sweeps; affected sources are reported as degraded in `/api/health`.
#### Terminal Action Exposure
`POST /api/action` and `POST /api/sweep` can trigger operational actions such as manual sweeps. The dashboard has a **SET TOKEN** control that stores your `SWEEP_TOKEN` in browser local storage and sends it as the `x-crucix-token` header; do not put action tokens in URLs.
Recommended settings:
| Deployment | Settings |
| --- | --- |
| Private local machine | `NODE_ENV=development`, optional `SWEEP_TOKEN`, optional `TERMINAL_ACTIONS_ENABLED=true`. Localhost can run actions without a token for development. |
| Private LAN / Dockge | Set a strong `SWEEP_TOKEN`, keep `TERMINAL_ACTIONS_ENABLED=true`, expose only to trusted clients. |
| Pangolin-authenticated reverse proxy | Set a strong `SWEEP_TOKEN`, keep Pangolin auth in front, use the dashboard **SET TOKEN** flow once per browser. |
| Public internet | Do not expose Terminal Actions directly. If exposure is unavoidable, require `SWEEP_TOKEN`, keep proxy authentication enabled, lower `TERMINAL_ACTION_RATE_LIMIT_MAX`, and monitor server audit logs. |
Action endpoints reject cross-origin POST origins, apply a small in-memory per-IP rate limit, and write sanitized audit lines without logging the token.
When data remains stale past `STALE_DATA_MAX_AGE_MINUTES`, the server sends an operator alert through configured Telegram/Discord channels after failed or degraded sweep attempts. `STALE_ALERT_COOLDOWN_MINUTES` prevents repeated stale alerts from spamming every refresh interval. Set `DASHBOARD_URL` to the Pangolin/public URL you want included in those alerts.
The dashboard Terminal Actions panel can trigger `status`, `sweep`, and `brief` through `/api/action`. Leave `TERMINAL_ACTIONS_ENABLED=true` for a private home-server deployment. For an internet-exposed deployment, set `SWEEP_TOKEN` and pass it through trusted automation, or set `TERMINAL_ACTIONS_ENABLED=false` to disable browser-triggered actions. If you protect actions with `SWEEP_TOKEN`, the browser can send it from `localStorage.crucix_sweep_token`.
#### Memory And Prediction Loop
Crucix stores longitudinal memory in `runs/intelligence.db` when the current Node.js build exposes `node:sqlite`. If SQLite is unavailable, the file is created as a harmless placeholder and `/api/health` reports the memory store as unavailable instead of failing the sweep.
The memory layer persists:
| Table | Purpose |
| --- | --- |
| `runs` | Sweep timestamps, source health counts, and delta direction summaries. |
| `entities` | Stable entity IDs for recurring countries, regions, and locations. |
| `events` | Stable event IDs for conflict, OSINT, urgent news, and new delta signals across sweeps. |
| `predictions` | Trade/intelligence hypotheses with evidence, confidence, horizon, outcome state, and latest grading. |
Query endpoints:
```text
GET /api/memory/search?q=iran&limit=25
GET /api/memory/predictions?state=open&limit=25
```
Memory endpoints use the same operator authorization gate as Terminal Actions. The dashboard Terminal Actions panel includes a `Memory` action for a quick operator-facing view of recent events and prediction states.
Retention, backup, and privacy expectations:
- Treat `runs/intelligence.db` as operator data. It can contain source excerpts, headlines, generated hypotheses, and URLs from your configured feeds.
- Back up `runs/` with the rest of your Dockge volume if you want longitudinal learning to survive container replacement.
- Delete `runs/intelligence.db` to reset SQLite memory; the next sweep recreates the schema.
- Do not commit `runs/` or `.env`. API credentials stay in `.env`; memory stores derived observations, not secrets.
- If you expose the dashboard through a reverse proxy, protect Terminal Actions and memory queries behind your normal authentication boundary.
#### Reverse Proxy SSE
The dashboard receives live sweep updates from `GET /events` using Server-Sent Events. The server sends `retry: 10000` reconnect guidance and lightweight heartbeat comments every `SSE_HEARTBEAT_INTERVAL_MS` milliseconds so reverse proxies do not close an otherwise idle stream between 15-minute sweeps.
Recommended proxy settings:
| Proxy | Setting |
| --- | --- |
| Pangolin / Traefik-style frontends | Keep response streaming enabled and set idle timeouts above `SSE_HEARTBEAT_INTERVAL_MS`. |
| Nginx | Disable proxy buffering for `/events`, keep `proxy_read_timeout` above the heartbeat interval, and preserve `Connection: keep-alive`. |
| Cloudflare-style proxies | Keep the heartbeat below common idle cutoffs; the default 25s is intentionally conservative. |
If you raise the heartbeat interval, keep it shorter than the lowest idle timeout in the proxy chain.
`/api/metrics` includes network health grouped by host and source/provider. Source modules should use `safeFetch(url, { source: 'SourceName' })`; when omitted, the shared helper infers a stable provider bucket from the URL host instead of grouping normal source traffic under `unknown`. Raw fetch exceptions are documented in [Source Fetch Instrumentation](docs/source-fetch-instrumentation.md).