Commit Graph

749 Commits

Author SHA1 Message Date
Povilas Kirna
68eeb7841c ci: harden description checks — unfilled dropdowns, gameable test plans, non-issue links (#2099)
* ci: harden description checks (dropdown placeholder, how-to-test, link \b)

- issue: flag sections still showing the "-- Please Select --" dropdown
  placeholder (added in #2068) as a single comma-separated line item;
  presence-only checks previously let an un-chosen dropdown pass.
- PR: replace the numbered-step "How to Test" rule with a non-trivial
  content requirement (>=30 chars). The old /\d+\.\s*\S/ rule both
  false-failed prose/code-block test plans and was gamed by an empty
  "1. 2. 3." shell; the message now explains what detail to provide.
- PR: tighten the linked-issue regex to /#\d+\b/ so a hex colour like
  #1a2b3c no longer counts as an issue reference.

---------

Co-authored-by: Povilas Kirna <povilas.kirna@pebble.net>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 08:16:36 +02:00
Nicholai
4dc11cfe6b refactor(memory): canonicalize memory imports (#50) 2026-06-04 05:31:15 +01:00
Yuri
a2e691da2b fix(models): stabilize proxy endpoint refresh behavior
* fix: support large proxy model endpoint refresh

Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline.

Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails.

Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage.

* fix: preserve endpoint ping status semantics
2026-06-04 04:56:11 +01:00
Sushanth Reddy
eee2167502 Stop API key save() from writing other providers' keys as plaintext (#1944)
save() called load(), which DECRYPTS every stored key, then re-encrypted
only the key being saved and wrote the whole dict back. The other
providers' keys were thus persisted in plaintext; on the next load()
Fernet raised InvalidToken on them and they were silently dropped.

Add _load_raw() that returns the still-encrypted on-disk dict (reusing the
existing missing/corrupt-file guards) and have save() build on that, so
untouched providers keep their ciphertext. load() now also goes through
_load_raw(), keeping its behavior identical.

Fixes #1914

Co-authored-by: EkaTantra Dev <dev@ekatantra.com>
2026-06-04 04:47:13 +01:00
Afonso Coutinho
09fe308720 fix(auth): revoke API tokens when deleting users
* fix: revoke API bearer tokens when their owner is deleted

* Re-run CI

* Invalidate bearer-token cache on user delete so warmed cached tokens stop working
2026-06-04 04:44:34 +01:00
Marius Popa
666babfd58 fix(documents): refresh library counters after removal (#1924) 2026-06-04 04:42:23 +01:00
Rudy Wolf
1c43daa564 fix(compare): stop blind mode leaking model identities via session names (#1318)
Blind Compare anonymized the pane headers, but each pane still created a helper chat session named "[CMP] <real-model>" and GET /api/sessions returned the session's model field. So the sidebar and the session-list API let a user map "Model A" back to its real model before voting, defeating the blind test.

- Frontend (static/js/compare/index.js, panes.js): in blind mode, name helper sessions by their neutral slot ("[CMP] Model A") instead of the model, matching the existing blind pane labels.
- Backend GET /api/sessions (routes/session_routes.py): blank the model field for [CMP]-prefixed helper sessions via a new _public_model helper.
- Backend /api/compare/start (routes/compare_routes.py): name blind sessions by slot and withhold model_left/model_right/mapping from the blind response (revealed at /vote).
- Tests: tests/test_blind_compare_redaction.py.

Fixes #1285.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 04:39:01 +01:00
hawktuahs
3d8c364689 [Bash] Fix Windows cookbook background tasks (#676)
* Fix Windows cookbook background tasks

* Add Windows Cookbook reliability follow-ups
2026-06-04 04:30:01 +01:00
Wes Huber
0f7ea7a936 fix: add 'willing to fix' dropdown to bug report issue template (#2063)
* fix: add 'willing to fix' dropdown to bug report issue template

The feature request template has an 'Are you willing to implement
this?' dropdown but the bug report template was missing it, leaving
a plain textarea with a placeholder hint instead. Add a matching
dropdown for consistency.

Fixes #2059

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add '-- Please Select --' default option to match feature_request template

Rebased on #2068 and added the placeholder option for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 04:25:04 +01:00
Paulo Victor Cordeiro
bd4067cf83 fix: guard remaining uid.decode() calls in auto-classify spam path (#1860)
Two more bare uid.decode() calls at lines 889 and 897 crash with
AttributeError when uid is already a string. Applies the same
isinstance guard used everywhere else in this function.
2026-06-04 04:06:10 +01:00
Afonso Coutinho
49c14af5c7 fix(calendar): scope CalDAV event lookup by calendar
* fix: CalDAV sync hijacks another user's event sharing a VEVENT uid

* Seed schema-valid dtstart/dtend in caldav uid-scope test fixture
2026-06-04 04:01:21 +01:00
.bulat
e340674c12 Persist user prefs atomically (#1840) 2026-06-04 03:55:22 +01:00
lekt8
ceb62385f1 Fetch full messages with BODY.PEEK[] so read_email works on iCloud IMAP (#1961) (#1963)
read_email, reply_to_email and download_attachment fetched the full message with
the legacy bare RFC822 item (UID FETCH <uid> (RFC822)). iCloud's IMAP server
silently ignores it — the fetch returns status OK but only (UID <uid>) with no
body tuple, so the parse reports 'Email not found with UID' even though the
message exists and list_emails (which uses RFC822.HEADER) shows it. Gmail honours
(RFC822), which is why it only reproduced on iCloud.

Switch the three full-message fetches to (BODY.PEEK[]), which iCloud and Gmail
both honour and which doesn't set \Seen. Response shape is unchanged (raw bytes
still at msg_data[0][1]), so parsing is unaffected; the RFC822.HEADER (listing)
and (UID) probe fetches are left as-is.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 03:53:14 +01:00
Ocean Bennett
a38df08a31 fix(tests): use current python for rag id stability (#1817) 2026-06-04 03:49:59 +01:00
nubs
37e1d401cf fix(tests): clean agent loop import stubs 2026-06-04 03:44:49 +01:00
Afonso Coutinho
03dbf976a5 fix: image model ranking crashes on a non-string search filter (#1898) 2026-06-04 03:26:35 +01:00
Afonso Coutinho
5043b2924c fix: image model ranking crashes when system is not a dict (#1900) 2026-06-04 03:23:59 +01:00
Alexandre Teixeira
be8f1fac85 fix(tests): add endpoint URLs to remaining session fixtures 2026-06-04 03:14:43 +01:00
Afonso Coutinho
eac354629a fix: model cost/info matches first substring key (gpt-4o-mini billed as gpt-4o) (#1439)
* fix: match model name to the longest known key, not the first substring

* test: model key matching prefers the longest specific key
2026-06-04 03:05:37 +01:00
raf
2efebcc278 fix(tests): allow multiple logout calls when IMAP fallback reconnects (#1976)
_latest_inbox_fallback_uids logs out the broken connection before
reconnecting. The outer finally then logs out the new
connection. Both logouts are correct, the test assertion of == 1
was written before the reconnect logic existed. Changed to >= 1.
2026-06-04 02:56:05 +01:00
ghreprimand
82fcec6bb6 Replace core database utcnow defaults (#1457)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-04 02:50:19 +01:00
Wes Huber
6e66e69451 fix(tests): add endpoint URL to archived session seeds
The sessions table now enforces NOT NULL on endpoint_url, but the
test fixture omitted it when seeding archived sessions, causing
IntegrityError on all three test cases.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-04 02:32:54 +01:00
Vykos
5f58f9a45f fix(ai): scope tool model resolution by owner
* Stabilize full test collection

* Scope AI tool model resolution by owner
2026-06-04 00:37:28 +01:00
Vykos
aaef6b1c49 fix(search): align content URL guards
* Stabilize full test collection

* Align search content URL guards
2026-06-04 00:34:06 +01:00
Vykos
193dc2f085 fix(uploads): bound direct upload reads
* Stabilize full test collection

* Add bounded reads for direct uploads
2026-06-04 00:32:50 +01:00
pewdiepie-archdaemon
48f5182286 Merge branch 'codex-on-main' 2026-06-04 08:27:41 +09:00
Vykos
5869106089 test: stabilize full test collection 2026-06-04 00:27:29 +01:00
pewdiepie-archdaemon
089246614d feat: Claude Agent integration + cookbook reconnect + UI polish
- Claude Agent integration: AGENT_CONFIGS.claude, INTG_TYPES.claude,
  setup_claude_routes + integrations/claude/ skill bundle. Wired in
  app.py alongside the existing Codex integration; same scope-gated
  /api/codex/* backend; agent form has new description so users know
  it's setup for an external CLI, not an agent streamed inside Odysseus.
- Remove mark_email_boundaries action: not good enough yet. Stripped
  from task UI, scheduler defaults, registry, tool schema, clear-cache
  route. Added to RETIRED_HOUSEKEEPING_ACTIONS so existing rows + their
  task_runs auto-purge on startup.
- Cookbook download reliability: "Reconnect" fix button in the crash
  diagnosis runs _reconnectTask after probing has-session. 30s confirm
  window before marking a download "done" — kills the Finished/Downloading
  flicker when tmux briefly drops between captures.
- Mobile UX: tap anywhere on a note card body opens the editor;
  Update button morphs to Archive when no text was edited; bell icon
  accent-colored; chip-trashing notif pills fade so only the icon
  rotates into the trash zone.
- Settings integrations: SVG-per-provider in email + API preset
  dropdowns, custom drop-up-aware menus, accent sub-header icons
  (IMAP/SMTP), consistent card styling between list + edit, contacts
  Edit/Delete icons, agent form description copy.
2026-06-04 08:27:26 +09:00
Mahdi Salmanzade
271489a10c fix(research): owner-scope endpoint resolution
POST /api/research/start (require_privilege "can_use_research" — a normal
user, not admin) resolves an endpoint two ways and feeds the row's *decrypted*
api_key + base_url into research_handler.start_research(llm_endpoint=,
llm_headers=):

  1. body.endpoint_id  -> query(ModelEndpoint).filter(id == endpoint_id,
                          is_enabled == True).first()
  2. no endpoint + nothing configured -> query(ModelEndpoint).filter(
                          is_enabled == True).first()

Neither was owner-scoped. ModelEndpoint is a per-user resource (core/database.py:
non-null owner = private, "the model picker only shows the endpoint to that
user"). So a research-privileged user (or a chat-scoped token) could pass another
user's PRIVATE endpoint_id — or fall through to their first-enabled row — and run
research against that owner's endpoint: spending their API key / quota and
reaching whatever internal base_url they configured (SSRF).

This is the same multi-tenant owner-scoping class already fixed for
companion/models, the /api/v1/chat session gate (#870), and the /api/v1/chat
first-enabled fallback (#1045, _first_enabled_endpoint). These two sinks on the
research path were missed.

Extract `_owned_enabled_endpoint(db, owner, endpoint_id=None)` which scopes via
the shared owner_filter helper (own rows + legacy null-owner shared rows),
matching webhook_routes._first_enabled_endpoint and session_routes._owned_endpoint.
Used for both sinks. A scoped miss on the explicit-id path returns the existing
404 ("Endpoint not found or disabled"), so endpoint existence isn't revealed. A
null/empty owner stays a no-op (single-user / legacy mode).

Add regression tests pinning both lookups (cross-owner rejected, own-row
allowed, legacy shared-row allowed, disabled-skipped, fallback never borrows,
null-owner no-op).
2026-06-03 23:19:28 +01:00
Mahdi Salmanzade
729a30a10e fix(compare): owner-scope endpoint key lookup
POST /api/compare/start (a normal-user route — no admin gate) creates two
caller-owned [CMP] sessions from caller-supplied endpoint URLs (endpoint_a /
endpoint_b), then copies a ModelEndpoint's *decrypted* api_key into each
session's headers by matching on URL:

    ep = db.query(ModelEndpoint).filter(ModelEndpoint.base_url == base).first()

The match was not owner-scoped. ModelEndpoint is per-user (core/database.py:
non-null owner = private, "the model picker only shows the endpoint to that
user"). So a user could pass another user's endpoint base_url, have that owner's
api_key copied into a [CMP] session they own, then drive /api/chat_stream on that
session — spending the victim's API key / quota and reaching whatever base_url
they configured. Same multi-tenant owner-scoping class already fixed for
companion/models, /api/v1/chat (#870, #1045), session create/switch-model
(#1093), and /api/research/start (#1099).

Extract `_owned_endpoint_by_url(db, base_url, owner)` which scopes the match via
the shared owner_filter helper (own rows + legacy null-owner shared rows),
mirroring session_routes._owned_endpoint. A scoped miss copies no key (the
comparison session simply carries no borrowed credential). A null/empty owner
stays a no-op (single-user / legacy mode).

Add regression tests pinning the scoped match (cross-owner rejected, own-row
allowed, legacy shared-row allowed, no-match None, null-owner no-op).
2026-06-03 23:17:12 +01:00
Afonso Coutinho
b6607d219d fix(memory): owner-scope memory route session access 2026-06-03 23:13:56 +01:00
Sushanth Reddy
c58cb067f2 fix(calendar): avoid double-encrypting CalDAV password
cfg is loaded from prefs and already holds the existing, already-encrypted
password. When the edit form was re-submitted without re-typing the
password, the elif branch called encrypt() on that stored ciphertext,
compounding the encryption on every save and eventually breaking sync with
a decrypt error.

Drop the elif branch: the stored value is preserved as-is, and we only
encrypt when a new password is actually supplied.

Fixes #1915

Co-authored-by: EkaTantra Dev <dev@ekatantra.com>
2026-06-03 22:59:40 +01:00
Povilas Kirna
7c7ac1021a ci: enforce issue/PR description completeness for template-bypassing submissions (#1959)
* ci: add issue/PR description completeness checks (#1958)

Two github-script workflows that validate description structure on
issue/PR open/edit/reopen, for submissions that bypass the browser
template (API, gh CLI, agent bulk PRs).

- PR check: Summary, Linked Issue, Type of Change, duplicate-search
  box, How to Test.
- Issue check: body length + per-label bug/enhancement fields, plus a
  bug+enhancement conflict guard.
- Pass deletes any prior bot comment and applies `ready for review`;
  fail posts an in-place comment, fails the check, and applies
  `needs work` (PRs) / `needs more info` (issues).
- References existing labels only — never creates or recolours repo
  labels (checks existence first, warns and skips if absent).
- Safe pull_request_target: checkout pinned to the base ref, sparse
  `.github/scripts` only; PR head never checked out.

Closes #1958
Co-authored-by: Povilas Kirna <povilas.kirna@pebble.net>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 16:58:10 +02:00
Glenn
b5590fd008 feat: add placeholder option for dropdowns in issue templates (#2068) 2026-06-03 16:33:26 +02:00
pewdiepie-archdaemon
67b63e9844 Revert "fix(ui): allow manual prompt bar resize (#1201)"
This reverts commit 258e6fc0d4.
2026-06-03 23:04:28 +09:00
pewdiepie-archdaemon
6e80d0de08 Revert "fix(ui): allow manual prompt bar resize (#1201)"
This reverts commit 258e6fc0d4.
2026-06-03 23:03:58 +09:00
pewdiepie-archdaemon
5939aec69f Codex Agent integration: HTTP surface + plugin bundle + Settings UI
This persists work that had been living only in the cookbook docker
container's writable layer — never committed to the host source. Brought
back to git intact, app.py registration re-applied surgically on top of
current main (not the older container copy, which would have regressed
the Windows MIME fix, asynccontextmanager lifespan, and webhook auth
exempts).

routes/codex_routes.py (new):
- GET  /api/codex/capabilities  — what this Odysseus exposes.
- GET  /api/codex/plugin.zip    — downloads integrations/codex as a zip.
- GET  /api/codex/todos         — scope-gated todos:read|write.
- POST /api/codex/todos         — scope-gated todos:write.
- GET  /api/codex/emails        — scope-gated email:read|draft|send.
- GET  /api/codex/emails/{uid}  — single-message fetch.
- _scope_owner() enforces api_token scopes before touching user data.

routes/api_token_routes.py (+103 lines):
- Adds Codex-token-specific issuance + revocation paths.

integrations/codex/ (new bundle, shipped via /api/codex/plugin.zip):
- README.md                       — install instructions.
- .codex-plugin/plugin.json       — Codex plugin manifest.
- scripts/odysseus_api.py         — Python client used by the skill.
- skills/odysseus/SKILL.md        — Codex skill definition.

static/js/settings.js (+253 lines):
- New "Codex Agent" option in the Integrations dropdown.
- Add / edit panel with plugin-bundle download link + curl-with-token
  install instructions per agent.

app.py:
- 7-line surgical change: capture email_router = setup_email_routes()
  and register setup_codex_routes(email_router=email_router) after the
  email module so the Codex routes can borrow its helpers.
2026-06-03 22:49:09 +09:00
pewdiepie-archdaemon
1f6c5ac66b Revert "Codex Agent integration: HTTP surface + plugin bundle + Settings UI"
This reverts commit 8c2705b42a.
2026-06-03 22:47:00 +09:00
pewdiepie-archdaemon
6861c41580 Reapply "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus"
This reverts commit cc8fe2f6e3.
2026-06-03 22:47:00 +09:00
pewdiepie-archdaemon
cc8fe2f6e3 Revert "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus"
This reverts commit 8161c1253d, reversing
changes made to 8c2705b42a.
2026-06-03 22:46:19 +09:00
pewdiepie-archdaemon
8161c1253d Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus 2026-06-03 22:39:33 +09:00
pewdiepie-archdaemon
8c2705b42a Codex Agent integration: HTTP surface + plugin bundle + Settings UI
This persists work that had been living only in the cookbook docker
container's writable layer — never committed to the host source. Brought
back to git intact, app.py registration re-applied surgically on top of
current main (not the older container copy, which would have regressed
the Windows MIME fix, asynccontextmanager lifespan, and webhook auth
exempts).

routes/codex_routes.py (new):
- GET  /api/codex/capabilities  — what this Odysseus exposes.
- GET  /api/codex/plugin.zip    — downloads integrations/codex as a zip.
- GET  /api/codex/todos         — scope-gated todos:read|write.
- POST /api/codex/todos         — scope-gated todos:write.
- GET  /api/codex/emails        — scope-gated email:read|draft|send.
- GET  /api/codex/emails/{uid}  — single-message fetch.
- _scope_owner() enforces api_token scopes before touching user data.

routes/api_token_routes.py (+103 lines):
- Adds Codex-token-specific issuance + revocation paths.

integrations/codex/ (new bundle, shipped via /api/codex/plugin.zip):
- README.md                       — install instructions.
- .codex-plugin/plugin.json       — Codex plugin manifest.
- scripts/odysseus_api.py         — Python client used by the skill.
- skills/odysseus/SKILL.md        — Codex skill definition.

static/js/settings.js (+253 lines):
- New "Codex Agent" option in the Integrations dropdown.
- Add / edit panel with plugin-bundle download link + curl-with-token
  install instructions per agent.

app.py:
- 7-line surgical change: capture email_router = setup_email_routes()
  and register setup_codex_routes(email_router=email_router) after the
  email module so the Codex routes can borrow its helpers.
2026-06-03 22:38:05 +09:00
Alexandre Teixeira
b1a4ed13b0 Harden API-token chat endpoint selection
Validate only token-supplied direct base_url values for API-token chat requests, while keeping admin-configured endpoints available for local/LAN providers.

Scope configured endpoint fallback selection to the API token owner, fail closed for unknown token owners, and preserve strict session ownership checks when resuming sessions from chat-scoped API tokens.

Add focused regression coverage for direct base_url SSRF rejection, configured endpoint fallback behavior, token-owner scoping, URL validation, and null-owner session/endpoint handling.
2026-06-03 13:05:13 +01:00
Alexandre Teixeira
145f4fd2b4 feat(models): support pinned endpoint model IDs 2026-06-03 13:00:07 +01:00
Alexandre Teixeira
1284b14a13 feat(docker): add standalone GPU compose files for stack UIs 2026-06-03 12:54:35 +01:00
Alexandre Teixeira
a75dd4a231 fix(search): apply recency UTC fix to live ranking module 2026-06-03 12:49:32 +01:00
Alexandre Teixeira
0deeba58ba tests(llm): cover Anthropic temperature clamping 2026-06-03 12:28:53 +01:00
pewdiepie-archdaemon
562bc4dedc Cookbook polish: auto-reconnect, ctx slider fixes, scoring, lots of UI
Backend (services/hwfit + routes):
- VRAM column sort now shows global highest first (was special-cased to
  ascending then truncated top-N, which made "highest VRAM" mathematically
  unreachable). Every column path uses reverse=True for the truncation.
- Hardware probe cache TTL 30min -> 24h so changing filters doesn't keep
  re-probing the rig during a session; Rescan button still forces fresh.
- Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang can't serve them);
  default non-prequantized to BF16 on 2+ GPUs.
- AWQ / AWQ-8bit / GPTQ-8bit get a -1.0 quality penalty so FP8 wins ties.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above M2.5.
- hf_models.json: zai-org/GLM-5.1 added; zai-org/GLM-5 quantization flipped
  Q4_K_M -> BF16. DeepSeek-V4-Flash / -Pro + their -Base variants registered
  with new FP4-MoE-Mixed / FP8-Mixed quant keys (calibrated BPP from the
  actual 156 GB / 284 GB disk footprints).
- New FP4-MoE-Mixed + FP8-Mixed entries in QUANT_BPP / QUANT_SPEED_MULT /
  QUANT_QUALITY_PENALTY / QUANT_BYTES_PER_PARAM / PREQUANTIZED_PREFIXES.

Frontend — Scan/Download:
- Engine + Quant swapped in the toolbar; Quant defaults to "All".
- Ctx (range slider) ported from origin/main: 8k/16k/32k/50k/128k/Max. Drag
  re-sorts by vram ascending (smallest fitting first); back to Max → score.
- Ctx slider rail now visible — was background:transparent in a duplicate
  later-cascade rule. Hardcoded grey + !important.
- Search input moved to the far right of the toolbar.
- Type/Standard default; "Context" not uppercased; Search placeholder dimmed.
- Engine "?" + Quant "?" inline help chips inside their dropdown boxes.
- Fit-column dot toggles fit-only filter; un-toggling re-sorts by VRAM desc.
- Quant column truncates to 9 chars + ellipsis ("FP4-MoE-M..."), full in
  tooltip. Smart title-suffix strips the parts already in the repo name
  (QuantTrio/MiniMax-M2-AWQ + quant AWQ-4bit -> just "(4bit)").
- Conditional warning for safetensors models on non-GPU rigs only.
- Dependency Install / Installed / Installed▾ / N/A all 75.85px wide.
- Rebuild llama.cpp moved into the llama_cpp dep row, styled as a tag.
- Foldable Download admin-card (h2 chevron); line under h2 only when folded.
- HF token save gets a green ✓ + "Saved" flash.
- Cached scan no longer counts stalled rows as downloaded.
- Footer: "Request it →" link with GitHub mark to the public discussion
  (#1962) for model-add requests.

Frontend — Running tab:
- Strict download-finish check (DOWNLOAD_OK or /snapshots/, not bare
  "Download complete"). True overall % for multi-shard downloads:
  ((N-1)+frac)/total instead of hf_transfer's per-shard aggregate.
- ETA in the uptime ticker: "downloading: 12m 34s · ETA 1h 23m".
- Clear button kills the tmux session too; if the output still shows a
  live shard line, the pill is hidden + relabels as "reconnect" + revives
  on click.
- Self-heal: on cookbook open AND every bg-monitor cycle (10s, throttled
  to 8s), scan persisted done/error/crashed downloads and probe their
  tmux session — if alive, flip status back to running and reattach.
- Per-launch zombie probe: clicking Download on a model whose persisted
  state is done but tmux is still alive revives the existing task and
  refuses to start a duplicate.
- Pre-launch GPU probe: vllm / sglang / diffusers serve check
  /api/cookbook/gpus first; warns + confirms if no GPU is visible.
- Server-side state guard: rejects "done" POSTs for downloads lacking
  DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
  shard is N<total — stale tabs can't poison persisted state any more.
- Running count includes tasks whose output looks active even if persisted
  status got stuck. Dir text on the running row, font matched to uptime.

Serve panel:
- Ctx text input always resets to model max on open (default 20000 when
  metadata is missing).
- Max Seqs default 8 -> 4. KV Cache dtype select 32px tall.
- Lightning icon on Launch (same as Action toggle).
- Diagnosis card simplified (no fold/copy/dismiss), suggestion font
  matches body; action buttons get icons on the left (Retry/Copy/Edit/
  Install/Kill/Switch/etc.).
- Incomplete-download serve warning when model status is
  downloading / stalled / has_incomplete.
- MTP "?" tooltip ("supported on a few model families … up to ~3× faster").
2026-06-03 20:25:25 +09:00
pewdiepie-archdaemon
3706d756f3 Merge remote-tracking branch 'origin/main' into visual-pr-playground
# Conflicts:
#	routes/cookbook_routes.py
#	routes/hwfit_routes.py
#	services/hwfit/fit.py
#	services/hwfit/models.py
#	static/js/cookbook-diagnosis.js
#	static/js/cookbook-hwfit.js
#	static/js/cookbook.js
#	static/js/cookbookRunning.js
2026-06-03 16:49:10 +09:00
pewdiepie-archdaemon
eb79b76432 Cookbook: scoring fixes, UI polish, false-finished + stale-state bug fixes
Backend (services/hwfit + routes):
- rank_models picks visible set by REQUESTED column, not always score —
  sorting by Param now shows highest-param models PERIOD (incl. too_tight).
- New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang
  cannot serve them); default non-prequantized to BF16 on 2+ GPUs.
- AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so
  FP8 wins when both fit.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above
  M2.5 on equal composite score; >=100B integers not misread as versions.
- /api/cookbook/hf-latest no longer drops models without an "NB" pattern in
  the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered).
- Cached-model scan: atexit flushes models JSON even if the script is
  killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s.
- KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New
  "stalled" status for shells <1 MB with no .incomplete files.
- /api/cookbook/state POST guard: rejects "done" download tasks lacking
  DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
  shard is N<total — stops stale tabs from poisoning persisted state.
- hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization
  Q4_K_M -> BF16 (it is the native base, not a quant).

Frontend (static/js):
- Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/
  50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score
  on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit
  column header tagged with active budget (RAM / GPU / N GPU).
- Foldable Download admin-card: the Download h2 is the chevron trigger;
  state persists in localStorage.
- Download card surfaces destination dir (Dir: <path>). Same dir on running
  task row, font/color matched to uptime (9px Fira Code muted, opacity .4).
- Serve panel ctx text input always resets to model max on open. Sub-MB
  cached models show with red "download stalled" badge.
- Bulk-select Cancel + Delete reset the Select button label on exit.
- Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/
  required; bare "Download complete" no longer marks the task done after
  the first config file. Clear button now sends tmux kill-session too.
  True overall % for multi-shard downloads: ((N-1)+frac)/total instead of
  hf_transfer per-shard aggregate.
- Diagnosis card simplified: removed fold toggle, copy button, dismiss X.
  Suggestion font matches message body (12px).
- HF token field flashes green check + "Saved" on save.
- Cached scan no longer counts stalled rows as downloaded in Scan/Download.

CSS:
- dep Install button width pinned to 76px to match Installed split.
- task-sub row +1px; task-status badge gets margin-right 8px.
- Ctx slider styled like gallery editor sliders (thin pill rail, red thumb).
- Bulk-select cancel button top -3px -> -5px.
2026-06-03 16:32:20 +09:00