Polish task UI, slash commands, and Ollama serving

2026-06-02 09:36:03 +09:00
parent ab0a480f30
commit 96618b01c0
9 changed files with 155 additions and 45 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -8,25 +8,54 @@ the codebase, you are probably right to stay away.
 ## High Priority

 - SQUASH BUGS
- Fresh Docker install smoke tests on Linux, macOS, and Windows!!
+- Fresh install smoke tests on Linux, macOS, and Windows. Docker, native Python,
+  and WSL all need coverage.

 - Integration audit: do integrations even work? Confirm what works, what needs setup docs, and what should be removed or hidden. 
 - Self-host troubleshooting cookbook. Document the weird 30-second fixes that otherwise become 30-minute searches: Dovecot cleartext auth for local stacks, ntfy Android Instant Delivery for non-ntfy.sh servers, clipboard limits on plain-HTTP Tailscale URLs, Radicale collection URLs, and similar traps.
 - Cookbook reliability on other computers. This is probably the area most likely to need work across different machines, GPUs, drivers, shells, and Python environments.
- Tile/window management correctness. I had to brute force my way a bit here, I'm aware, popups, dropdowns, and fixed-position UI inside transformed modals can land in the wrong place.
- Esc button, it's small but a lot of windows that arent still close on esc and alot of them doesnt. 
- Skill audit, how does your model respond to skill injection, does it follow? Does its parsing miss? 
+- Cookbook SGLang support across platforms. Make sure SGLang setup/serve works
+  predictably on Linux, Windows/WSL, macOS where possible, Docker, and common
+  NVIDIA/AMD hardware paths.
+- Deep Research model presets by hardware. Recommend approved model/parameter
+  profiles for small, medium, and large local setups so people with different
+  hardware can use Deep Research without guessing. Surface this either in Deep
+  Research settings or as a Cookbook scan/dropdown suggestion.
+- Cookbook model scan/download ranking. Prioritize newer architectures and
+  better hardware-fit models instead of scoring everything almost the same.
+  Ranking should account for architecture age, quant format, VRAM/RAM fit,
+  backend support, vision/mmproj requirements, and likely serve reliability.
+- Cookbook error feedback and logging. Failed downloads, dependency installs,
+  preflights, and serve jobs should show the actual command/output/error in the
+  UI, with copyable logs and clear next steps instead of just "crashed".
+- Agent prompt/context bloat. Agent mode is too heavy for smaller local models:
+  tool schemas, skills, memory, documents, and instructions can eat the context
+  before the user request really starts. We need slimmer prompts, better tool
+  selection, smaller default tool sets, and clearer guidance for models with
+  4k/8k/16k context windows.
+- Skill/tool prompt-injection audit. User-editable skills, notes, documents,
+  fetched pages, and memories should be treated as untrusted data. Keep testing
+  whether models follow malicious instructions from those surfaces.
 - Better degraded-state reporting for ChromaDB, SearXNG, email, ntfy, and provider probes.
 - Provider setup/probing audit for Anthropic, Gemini, Groq, xAI, OpenRouter, OpenAI, and DeepSeek.

 ## Refactor Targets
 - CSS cleanup. `static/style.css` basically Calypso's island atm.
 - Tour core helper. The onboarding tours have too much copy-pasted scaffolding; promote a shared `tour-core.js` helper before adding more tours.
+- Modal/window positioning cleanup. Some window controls have improved, but the
+  underlying popup/dropdown/fixed-position behavior is still too fragile.
 - Mobile media override discoverability. A lot of "CSS did not move" bugs are mobile `@media` overrides of the same selector; comments or linting around desktop/mobile paired rules would help.
 - Dead code pass for old routes, stale feature flags, and unused UI states.

 ## Frontend

+- Expand the Editor for quicker, more robust everyday use. Better file/document
+  handling, smoother window behavior, clearer save/export flows, stronger image
+  editing affordances, and fewer brittle edge cases.
+- Better AI integration for Notes and Todos. Notes should be easier for the
+  agent to read, update, summarize, and turn into actions. Todos should be
+  assignable to an agent from the UI, possibly through a button, task action,
+  or dedicated skill/tool flow.
 - Mobile gallery/editor polish. Easier to launch/download inpaint model or any missing pieces.
 - Accessibility pass: keyboard navigation, focus states, contrast, reduced motion.
 - Improve empty states and error messages on fresh installs.
--- a/routes/cookbook_helpers.py
+++ b/routes/cookbook_helpers.py
@@ -434,6 +434,8 @@ def _parse_serve_phase(snapshot: str, task_type: str = "serve") -> dict:
        }
    if "Application startup complete" in flat:
        return {"phase": "ready", "status": "ready"}
+    if re.search(r'Ollama API ready on port\s+\d+', flat, re.I):
+        return {"phase": "ready", "status": "ready"}
    # HTTP access logs (e.g. GET /v1/models 200 OK) mean the server is up and serving
    if re.search(r'(?:GET|POST)\s+/[^\s]*\s+HTTP/[\d.]+"\s*\d{3}', flat):
        return {"phase": "idle", "status": "ready"}
--- a/routes/cookbook_routes.py
+++ b/routes/cookbook_routes.py
@@ -905,6 +905,7 @@ def setup_cookbook_routes() -> APIRouter:
            # Show whether the HF token reached this server (masked) — a gated
            # model vLLM has to download will be denied without it.
            runner_lines.append(_HF_TOKEN_STATUS_SNIPPET)
+            handled_ollama_serve = False
            # Auto-install inference engine if missing
            if "llama_cpp" in req.cmd or "llama-server" in req.cmd:
                # Prefer the NATIVE llama-server binary — its minja templating
@@ -978,17 +979,48 @@ def setup_cookbook_routes() -> APIRouter:
                runner_lines.append('  fi')
                runner_lines.append('fi')
            elif "ollama" in req.cmd:
-                # Ollama manages its own model store and HTTP server. Just make
-                # sure the binary exists and the daemon is up before running the
-                # command (the natural serving engine on Apple Silicon / Metal).
+                handled_ollama_serve = True
+                _ollama_port = "11434"
+                _ollama_match = re.search(r"OLLAMA_HOST=[^\s:]+:(\d+)", req.cmd)
+                if _ollama_match:
+                    _ollama_port = _ollama_match.group(1)
+                # Ollama can be a host binary, a system service, or a Docker
+                # container. If the HTTP API is already reachable, the model is
+                # already served and we should not require a host `ollama` CLI.
+                runner_lines.append(f'ODYSSEUS_OLLAMA_PORT="{_ollama_port}"')
+                runner_lines.append('ODYSSEUS_OLLAMA_URL=""')
+                runner_lines.append('for _ody_ollama_port in "$ODYSSEUS_OLLAMA_PORT" 11434; do')
+                runner_lines.append('  [ -z "$_ody_ollama_port" ] && continue')
+                runner_lines.append('  for _ody_ollama_host in 127.0.0.1 localhost host.docker.internal; do')
+                runner_lines.append('    _ody_ollama_url="http://${_ody_ollama_host}:${_ody_ollama_port}"')
+                runner_lines.append('    if curl -sf "$_ody_ollama_url/api/tags" >/dev/null 2>&1; then')
+                runner_lines.append('      ODYSSEUS_OLLAMA_URL="$_ody_ollama_url"')
+                runner_lines.append('      ODYSSEUS_OLLAMA_PORT="$_ody_ollama_port"')
+                runner_lines.append('      break 2')
+                runner_lines.append('    fi')
+                runner_lines.append('  done')
+                runner_lines.append('done')
+                runner_lines.append('if [ -n "$ODYSSEUS_OLLAMA_URL" ]; then')
+                runner_lines.append('  if [ "$ODYSSEUS_OLLAMA_PORT" != "' + _ollama_port + '" ]; then')
+                runner_lines.append('    echo "[odysseus] Selected Ollama port ' + _ollama_port + ' was not reachable; using running Ollama on port ${ODYSSEUS_OLLAMA_PORT}."')
+                runner_lines.append('  fi')
+                runner_lines.append('  echo "[odysseus] Ollama API ready on port ${ODYSSEUS_OLLAMA_PORT}: ${ODYSSEUS_OLLAMA_URL}"')
+                runner_lines.append('  echo "[odysseus] This task is monitoring an existing Ollama server; stopping it here will not stop an external Docker/system service."')
+                runner_lines.append('  exec bash -i')
+                runner_lines.append('fi')
                runner_lines.append('if ! command -v ollama &>/dev/null; then')
-                runner_lines.append('  echo "ERROR: Ollama not found. Install it (macOS: brew install ollama, or https://ollama.com/download), then launch again."')
-                runner_lines.append('  ODYSSEUS_PREFLIGHT_EXIT=127')
-                runner_lines.append('fi')
-                runner_lines.append('if ! curl -sf http://localhost:11434/api/tags >/dev/null 2>&1; then')
-                runner_lines.append('  echo "Starting ollama server..."; (ollama serve >/dev/null 2>&1 &)')
-                runner_lines.append('  for _ in 1 2 3 4 5 6 7 8 9 10; do curl -sf http://localhost:11434/api/tags >/dev/null 2>&1 && break; sleep 1; done')
+                runner_lines.append('  echo "ERROR: Ollama not found and no Ollama API is reachable on 127.0.0.1, localhost, or host.docker.internal (ports ${ODYSSEUS_OLLAMA_PORT}/11434)."')
+                runner_lines.append('  echo "Install Ollama, start an Ollama service/container on this server, or pick the port where it is already listening."')
+                runner_lines.append('  echo')
+                runner_lines.append('  echo "=== Process exited with code 127 ==="')
+                runner_lines.append('  exec bash -i')
                runner_lines.append('fi')
+                runner_lines.append('echo "Starting ollama server on 0.0.0.0:${ODYSSEUS_OLLAMA_PORT}..."')
+                runner_lines.append('OLLAMA_HOST="0.0.0.0:${ODYSSEUS_OLLAMA_PORT}" ollama serve')
+                runner_lines.append('_ody_exit=$?')
+                runner_lines.append('echo')
+                runner_lines.append('echo "=== Process exited with code ${_ody_exit} ==="')
+                runner_lines.append('exec bash -i')
            elif "vllm serve" in req.cmd:
                # vLLM is CUDA/ROCm-only and does not run on macOS at all.
                runner_lines.append('if [ "$(uname -s)" = "Darwin" ]; then')
@@ -1016,18 +1048,19 @@ def setup_cookbook_routes() -> APIRouter:
                runner_lines.append('  ODYSSEUS_PREFLIGHT_EXIT=127')
                runner_lines.append('fi')

-            _append_serve_preflight_exit_lines(
-                runner_lines,
-                keep_shell_open=not local_windows,
-            )
-            runner_lines.append(req.cmd)
-            if local_windows:
-                # Detached background process — no interactive shell to keep open.
-                # Print the exit marker the status poller looks for, then stop.
-                _append_serve_exit_code_lines(runner_lines, keep_shell_open=False)
-            else:
-                # Keep shell open after exit so user can see errors
-                _append_serve_exit_code_lines(runner_lines, keep_shell_open=True)
+            if not handled_ollama_serve:
+                _append_serve_preflight_exit_lines(
+                    runner_lines,
+                    keep_shell_open=not local_windows,
+                )
+                runner_lines.append(req.cmd)
+                if local_windows:
+                    # Detached background process — no interactive shell to keep open.
+                    # Print the exit marker the status poller looks for, then stop.
+                    _append_serve_exit_code_lines(runner_lines, keep_shell_open=False)
+                else:
+                    # Keep shell open after exit so user can see errors
+                    _append_serve_exit_code_lines(runner_lines, keep_shell_open=True)

            runner_path = TMUX_LOG_DIR / f"{session_id}_run.sh"
            runner_path.write_text("\n".join(runner_lines) + "\n", encoding="utf-8")
--- a/static/js/cookbookRunning.js
+++ b/static/js/cookbookRunning.js
@@ -169,6 +169,9 @@ export function _parseServePhase(snapshot) {
  if (flat.includes('Application startup complete')) {
    return { phase: 'ready', status: 'ready' };
  }
+  if (/Ollama API ready on port\s+\d+/i.test(flat)) {
+    return { phase: 'ready', status: 'ready' };
+  }
  // HTTP access logs (e.g. GET /v1/models 200 OK) mean the server is up
  if (/(?:GET|POST)\s+\/[^\s]*\s+HTTP\/[\d.]+"\s*\d{3}/.test(flat)) {
    return { phase: 'idle', status: 'ready' };
@@ -2295,15 +2298,24 @@ async function _reconnectTask(el, task) {
        if (task.type === 'serve' && !task._endpointAdded && !task._endpointAddInFlight && task._serveReady) {
          task._endpointAddInFlight = true;
          const rawHost = task.remoteHost || 'localhost';
-          const host = rawHost.includes('@') ? rawHost.split('@').pop() : rawHost;
+          let host = rawHost.includes('@') ? rawHost.split('@').pop() : rawHost;
          const portMatch = task.payload?._cmd?.match(/--port[=\s]+(\d+)/)
            || task.payload?._cmd?.match(/(?:^|\s)-p[=\s]+(\d+)/)
            || snapshot.match(/Uvicorn running on\D*?:(\d+)/i)
            || snapshot.match(/running on\D*?:(\d+)/i)
            || snapshot.match(/listening on\D*?:(\d+)/i)
            || snapshot.match(/port[:=\s]+(\d+)/i);
-          const port = portMatch ? portMatch[1] : '8000';
-          const baseUrl = `http://${host}:${port}/v1`;
+          let port = portMatch ? portMatch[1] : '8000';
+          let baseUrl = `http://${host}:${port}/v1`;
+          const ollamaUrlMatch = snapshot.match(/Ollama API ready on port\s+\d+:\s*(http:\/\/[^\s]+)/i);
+          if (ollamaUrlMatch) {
+            try {
+              const u = new URL(ollamaUrlMatch[1]);
+              host = u.hostname || host;
+              port = u.port || '11434';
+              baseUrl = `${u.origin}/v1`;
+            } catch { /* malformed URL: keep the host/port parsed above */ }
+          }
          fetch('/api/model-endpoints', { credentials: 'same-origin' })
            .then(r => r.json())
            .then(async (eps) => {
@@ -2642,10 +2654,21 @@ async function _pollBackgroundStatus() {
      if (localTask && localTask._endpointAdded) continue;

      const rawHost = localTask?.remoteHost || t.remote || 'localhost';
-      const host = rawHost.includes('@') ? rawHost.split('@').pop() : (rawHost === 'local' ? 'localhost' : rawHost);
-      const portMatch = localTask?.payload?._cmd?.match(/--port\s+(\d+)/);
-      const port = portMatch ? portMatch[1] : '8000';
-      const baseUrl = `http://${host}:${port}/v1`;
+      let host = rawHost.includes('@') ? rawHost.split('@').pop() : (rawHost === 'local' ? 'localhost' : rawHost);
+      const portMatch = localTask?.payload?._cmd?.match(/--port\s+(\d+)/)
+        || localTask?.payload?._cmd?.match(/OLLAMA_HOST=[^\s:]+:(\d+)/);
+      let port = portMatch ? portMatch[1] : '8000';
+      let baseUrl = `http://${host}:${port}/v1`;
+      const snapshot = t.output || localTask?.output || '';
+      const ollamaUrlMatch = snapshot.match(/Ollama API ready on port\s+\d+:\s*(http:\/\/[^\s]+)/i);
+      if (ollamaUrlMatch) {
+        try {
+          const u = new URL(ollamaUrlMatch[1]);
+          host = u.hostname || host;
+          port = u.port || '11434';
+          baseUrl = `${u.origin}/v1`;
+        } catch { /* malformed URL: keep the host/port parsed above */ }
+      }
      const _isDiffusion = localTask?.payload?._cmd?.includes('diffusion_server');

      _updateTask(t.session_id, { _serveReady: true, _endpointAdded: true });
--- a/static/js/cookbookServe.js
+++ b/static/js/cookbookServe.js
@@ -391,7 +391,8 @@ function _rerenderCachedModels() {
      panelHtml += `<label>${_l('Backend','Inference engine: vLLM, SGLang, llama.cpp, Ollama, or Diffusers')}<select class="hwfit-sf" data-field="backend">${backendOpts}</select></label>`;
      panelHtml += `<input type="hidden" class="hwfit-sf" data-field="host" value="${esc(_es.remoteHost || '')}" />`;
      panelHtml += `<label>${_l('venv','Path to Python venv or conda env activate script')}<input type="text" class="hwfit-sf hwfit-sf-wide" data-field="venv" value="${esc(sv('venv', _es.envPath || _srvVenv || ''))}" placeholder="~/venv" /></label>`;
-      panelHtml += `<label>${_l('Port','HTTP port for the API server')}<input type="text" class="hwfit-sf" data-field="port" value="${esc(sv('port', _nextAvailablePort()))}" /></label>`;
+      const defaultPort = defaultBackend === 'ollama' ? '11434' : _nextAvailablePort();
+      panelHtml += `<label>${_l('Port','HTTP port for the API server')}<input type="text" class="hwfit-sf" data-field="port" value="${esc(sv('port', defaultPort))}" /></label>`;
      const _activeGpus = (defaultGpus || '').split(',').map(s => s.trim()).filter(Boolean);
      const detectedGpuCount = Number(_getGpuToggleTotal?.() || 0);
      const _gpuMax = Math.max(detectedGpuCount || 8, ...(_activeGpus.map(Number).filter(n => !isNaN(n)).map(n => n + 1)));
--- a/static/js/slashAutocomplete.js
+++ b/static/js/slashAutocomplete.js
@@ -18,7 +18,7 @@ const EXCLUDED = new Set(['flip','roll','8ball','fortune','odyssey','ascii']);
 // are the short forms people will actually type (/new, /clear, /web, etc.)
 // rather than the full /chats new, /toggle web equivalents.
 const PROMOTED_ALIASES = new Set([
-  'new','clear','rename','fork','export','archive','important','star',
+  'new','clear','rename','fork','export','archive','favorite','unfavorite',
  'web','bash','research','doc',
  'memories','forget',
 ]);
--- a/static/js/slashCommands.js
+++ b/static/js/slashCommands.js
@@ -5393,8 +5393,8 @@ const COMMANDS = {
      'delete':      { handler: _cmdSessionDelete,      alias: ['del','rm'],       help: 'Delete chat',                 usage: '/chats delete [id]' },
      'archive':     { handler: _cmdSessionArchive,     alias: ['tar'],            help: 'Archive chat',                usage: '/chats archive [id]' },
      'rename':      { handler: _cmdSessionRename,      alias: ['mv'],             help: 'Rename current chat',         usage: '/chats rename Name' },
-      'important':   { handler: _cmdSessionImportant,   alias: ['pin'],            help: 'Mark as important',           usage: '/chats important' },
-      'unimportant': { handler: _cmdSessionUnimportant, alias: ['unpin'],          help: 'Unmark important',            usage: '/chats unimportant' },
+      'favorite':    { handler: _cmdSessionImportant,   alias: ['pin','important'], help: 'Mark as favorite',          usage: '/chats favorite' },
+      'unfavorite':  { handler: _cmdSessionUnimportant, alias: ['unpin','unimportant'], help: 'Unmark favorite',       usage: '/chats unfavorite' },
      'fork':        { handler: _cmdSessionFork,        alias: ['cp'],             help: 'Fork chat (keep first N msgs)', usage: '/chats fork [N]' },
      'truncate':    { handler: _cmdSessionTruncate,    alias: [],                 help: 'Delete older messages, keep last N', usage: '/chats truncate N' },
      'switch':      { handler: _cmdSessionSwitch,      alias: ['goto','cd'],      help: 'Switch to chat by name/id',    usage: '/chats switch name' },
@@ -5732,10 +5732,12 @@ export const LEGACY_ALIASES = {
  'del':         { parent: 'chats', sub: 'delete' },
  'archive':     { parent: 'chats', sub: 'archive' },
  'rename':      { parent: 'chats', sub: 'rename' },
-  'important':   { parent: 'chats', sub: 'important' },
-  'star':        { parent: 'chats', sub: 'important' },
-  'unimportant': { parent: 'chats', sub: 'unimportant' },
-  'unstar':      { parent: 'chats', sub: 'unimportant' },
+  'favorite':    { parent: 'chats', sub: 'favorite' },
+  'important':   { parent: 'chats', sub: 'favorite' },
+  'star':        { parent: 'chats', sub: 'favorite' },
+  'unfavorite':  { parent: 'chats', sub: 'unfavorite' },
+  'unimportant': { parent: 'chats', sub: 'unfavorite' },
+  'unstar':      { parent: 'chats', sub: 'unfavorite' },
  'fork':        { parent: 'chats', sub: 'fork' },
  'truncate':    { parent: 'chats', sub: 'truncate' },
  'sessions':    { parent: 'chats', sub: 'info' },
--- a/static/js/tasks.js
+++ b/static/js/tasks.js
@@ -349,10 +349,23 @@ function _taskIcon(task) {
  return `<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="opacity:0.4;flex-shrink:0;position:relative;top:-4px;">${path}</svg>`;
 }

+const _MODEL_BACKED_ACTIONS = new Set([
+  'summarize_emails',
+  'draft_email_replies',
+  'extract_email_events',
+  'classify_events',
+  'mark_email_boundaries',
+  'learn_sender_signatures',
+  'check_email_urgency',
+  'test_skills',
+  'audit_skills',
+  'consolidate_memory',
+]);
+
 function _taskAiMark(task) {
  const kind = task?.task_type || task?.kind || '';
  const action = task?.action || '';
-  const aiAction = /(^|_)(ai|summarize|summary|draft|reply|classify|triage|audit|research|brief|skills?)($|_)/i.test(action);
+  const aiAction = _MODEL_BACKED_ACTIONS.has(action);
  if (!(kind === 'llm' || kind === 'research' || task?.model || task?.endpointUrl || aiAction)) return '';
  return '<svg class="task-ai-mark" width="10" height="10" viewBox="0 0 24 24" fill="currentColor" aria-label="Uses model" title="Uses model"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>';
 }
@@ -708,7 +721,7 @@ function _renderList() {
      const runBtn = document.createElement('button');
      runBtn.className = 'task-status-badge task-run-now-badge task-card-run-btn';
      runBtn.title = 'Run now';
-      runBtn.style.cssText = 'position:relative;top:4px;margin-right:4px;';
+      runBtn.style.cssText = 'position:relative;top:1px;margin-right:4px;';
      runBtn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.2" stroke-linecap="round" stroke-linejoin="round"><polyline points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg><span>Run</span>';
      runBtn.addEventListener('click', (e) => { e.stopPropagation(); _doRunNow(task.id); });
      actionsWrap.insertBefore(runBtn, menuBtn);
--- a/static/style.css
+++ b/static/style.css
@@ -10203,6 +10203,12 @@ textarea.memory-add-input {
  height: 20px;
  min-height: 0;
  box-sizing: border-box;
+  position: relative;
+  top: -4px;
+}
+.task-state-badge svg {
+  position: relative;
+  top: -1px;
 }
 .task-status-badge:hover {
  filter: brightness(1.08) saturate(1.15);
@@ -21253,6 +21259,7 @@ a.chat-link[href^="#research-"] {
  }
  .task-card .task-card-run-btn {
    margin-right: 1px !important;
+    top: 0;
  }
 }

@@ -34765,7 +34772,7 @@ body.theme-frosted .modal {
 .slash-autocomplete-popup {
  position: fixed;
  z-index: 9000;
-  background: var(--bg-elev-2, #1a1a1a);
+  background: var(--panel, var(--bg));
  border: 1px solid var(--border, rgba(255,255,255,0.08));
  border-radius: 8px;
  box-shadow: 0 8px 24px rgba(0,0,0,0.35);
@@ -34793,8 +34800,8 @@ body.theme-frosted .modal {
  white-space: nowrap;
  overflow: hidden;
 }
-.slash-ac-row:hover { background: color-mix(in srgb, var(--fg) 6%, transparent); }
-.slash-ac-row-sel  { background: color-mix(in srgb, var(--accent, var(--red)) 14%, transparent); }
+.slash-ac-row:hover { background-color: color-mix(in srgb, var(--accent, var(--red)) 10%, transparent); }
+.slash-ac-row-sel  { background-color: color-mix(in srgb, var(--accent, var(--red)) 14%, transparent); }
 .slash-ac-token {
  font-family: 'Fira Code', ui-monospace, monospace;
  color: var(--accent, var(--red));