Polish task UI slash commands and Ollama serving

This commit is contained in:
pewdiepie-archdaemon
2026-06-02 09:36:03 +09:00
parent ab0a480f30
commit 96618b01c0
9 changed files with 155 additions and 45 deletions

View File

@@ -8,25 +8,54 @@ the codebase, you are probably right to stay away.
## High Priority
- SQUASH BUGS
- Fresh Docker install smoke tests on Linux, macOS, and Windows!!
- Fresh install smoke tests on Linux, macOS, and Windows. Docker, native Python,
and WSL all need coverage.
- Integration audit: do integrations even work? Confirm what works, what needs setup docs, and what should be removed or hidden.
- Self-host troubleshooting cookbook. Document the weird 30-second fixes that otherwise become 30-minute searches: Dovecot cleartext auth for local stacks, ntfy Android Instant Delivery for non-ntfy.sh servers, clipboard limits on plain-HTTP Tailscale URLs, Radicale collection URLs, and similar traps.
- Cookbook reliability on other computers. This is probably the area most likely to need work across different machines, GPUs, drivers, shells, and Python environments.
- Tile/window management correctness. I had to brute force my way a bit here, I'm aware, popups, dropdowns, and fixed-position UI inside transformed modals can land in the wrong place.
- Esc button, it's small but a lot of windows that arent still close on esc and alot of them doesnt.
- Skill audit, how does your model respond to skill injection, does it follow? Does its parsing miss?
- Cookbook SGLang support across platforms. Make sure SGLang setup/serve works
predictably on Linux, Windows/WSL, macOS where possible, Docker, and common
NVIDIA/AMD hardware paths.
- Deep Research model presets by hardware. Recommend approved model/parameter
profiles for small, medium, and large local setups so people with different
hardware can use Deep Research without guessing. Surface this either in Deep
Research settings or as a Cookbook scan/dropdown suggestion.
- Cookbook model scan/download ranking. Prioritize newer architectures and
better hardware-fit models instead of scoring everything almost the same.
Ranking should account for architecture age, quant format, VRAM/RAM fit,
backend support, vision/mmproj requirements, and likely serve reliability.
- Cookbook error feedback and logging. Failed downloads, dependency installs,
preflights, and serve jobs should show the actual command/output/error in the
UI, with copyable logs and clear next steps instead of just "crashed".
- Agent prompt/context bloat. Agent mode is too heavy for smaller local models:
tool schemas, skills, memory, documents, and instructions can eat the context
before the user request really starts. We need slimmer prompts, better tool
selection, smaller default tool sets, and clearer guidance for models with
4k/8k/16k context windows.
- Skill/tool prompt-injection audit. User-editable skills, notes, documents,
fetched pages, and memories should be treated as untrusted data. Keep testing
whether models follow malicious instructions from those surfaces.
- Better degraded-state reporting for ChromaDB, SearXNG, email, ntfy, and provider probes.
- Provider setup/probing audit for Anthropic, Gemini, Groq, xAI, OpenRouter, OpenAI, and DeepSeek.
## Refactor Targets
- CSS cleanup. `static/style.css` basically Calypso's island atm.
- Tour core helper. The onboarding tours have too much copy-pasted scaffolding; promote a shared `tour-core.js` helper before adding more tours.
- Modal/window positioning cleanup. Some window controls have improved, but the
underlying popup/dropdown/fixed-position behavior is still too fragile.
- Mobile media override discoverability. A lot of "CSS did not move" bugs are mobile `@media` overrides of the same selector; comments or linting around desktop/mobile paired rules would help.
- Dead code pass for old routes, stale feature flags, and unused UI states.
## Frontend
- Expand the Editor for quicker, more robust everyday use. Better file/document
handling, smoother window behavior, clearer save/export flows, stronger image
editing affordances, and fewer brittle edge cases.
- Better AI integration for Notes and Todos. Notes should be easier for the
agent to read, update, summarize, and turn into actions. Todos should be
assignable to an agent from the UI, possibly through a button, task action,
or dedicated skill/tool flow.
- Mobile gallery/editor polish. Easier to launch/download inpaint model or any missing pieces.
- Accessibility pass: keyboard navigation, focus states, contrast, reduced motion.
- Improve empty states and error messages on fresh installs.

View File

@@ -434,6 +434,8 @@ def _parse_serve_phase(snapshot: str, task_type: str = "serve") -> dict:
}
if "Application startup complete" in flat:
return {"phase": "ready", "status": "ready"}
if re.search(r'Ollama API ready on port\s+\d+', flat, re.I):
return {"phase": "ready", "status": "ready"}
# HTTP access logs (e.g. GET /v1/models 200 OK) mean the server is up and serving
if re.search(r'(?:GET|POST)\s+/[^\s]*\s+HTTP/[\d.]+"\s*\d{3}', flat):
return {"phase": "idle", "status": "ready"}

View File

@@ -905,6 +905,7 @@ def setup_cookbook_routes() -> APIRouter:
# Show whether the HF token reached this server (masked) — a gated
# model vLLM has to download will be denied without it.
runner_lines.append(_HF_TOKEN_STATUS_SNIPPET)
handled_ollama_serve = False
# Auto-install inference engine if missing
if "llama_cpp" in req.cmd or "llama-server" in req.cmd:
# Prefer the NATIVE llama-server binary — its minja templating
@@ -978,17 +979,48 @@ def setup_cookbook_routes() -> APIRouter:
runner_lines.append(' fi')
runner_lines.append('fi')
elif "ollama" in req.cmd:
# Ollama manages its own model store and HTTP server. Just make
# sure the binary exists and the daemon is up before running the
# command (the natural serving engine on Apple Silicon / Metal).
handled_ollama_serve = True
_ollama_port = "11434"
_ollama_match = re.search(r"OLLAMA_HOST=[^\s:]+:(\d+)", req.cmd)
if _ollama_match:
_ollama_port = _ollama_match.group(1)
# Ollama can be a host binary, a system service, or a Docker
# container. If the HTTP API is already reachable, the model is
# already served and we should not require a host `ollama` CLI.
runner_lines.append(f'ODYSSEUS_OLLAMA_PORT="{_ollama_port}"')
runner_lines.append('ODYSSEUS_OLLAMA_URL=""')
runner_lines.append('for _ody_ollama_port in "$ODYSSEUS_OLLAMA_PORT" 11434; do')
runner_lines.append(' [ -z "$_ody_ollama_port" ] && continue')
runner_lines.append(' for _ody_ollama_host in 127.0.0.1 localhost host.docker.internal; do')
runner_lines.append(' _ody_ollama_url="http://${_ody_ollama_host}:${_ody_ollama_port}"')
runner_lines.append(' if curl -sf "$_ody_ollama_url/api/tags" >/dev/null 2>&1; then')
runner_lines.append(' ODYSSEUS_OLLAMA_URL="$_ody_ollama_url"')
runner_lines.append(' ODYSSEUS_OLLAMA_PORT="$_ody_ollama_port"')
runner_lines.append(' break 2')
runner_lines.append(' fi')
runner_lines.append(' done')
runner_lines.append('done')
runner_lines.append('if [ -n "$ODYSSEUS_OLLAMA_URL" ]; then')
runner_lines.append(' if [ "$ODYSSEUS_OLLAMA_PORT" != "' + _ollama_port + '" ]; then')
runner_lines.append(' echo "[odysseus] Selected Ollama port ' + _ollama_port + ' was not reachable; using running Ollama on port ${ODYSSEUS_OLLAMA_PORT}."')
runner_lines.append(' fi')
runner_lines.append(' echo "[odysseus] Ollama API ready on port ${ODYSSEUS_OLLAMA_PORT}: ${ODYSSEUS_OLLAMA_URL}"')
runner_lines.append(' echo "[odysseus] This task is monitoring an existing Ollama server; stopping it here will not stop an external Docker/system service."')
runner_lines.append(' exec bash -i')
runner_lines.append('fi')
runner_lines.append('if ! command -v ollama &>/dev/null; then')
runner_lines.append(' echo "ERROR: Ollama not found. Install it (macOS: brew install ollama, or https://ollama.com/download), then launch again."')
runner_lines.append(' ODYSSEUS_PREFLIGHT_EXIT=127')
runner_lines.append('fi')
runner_lines.append('if ! curl -sf http://localhost:11434/api/tags >/dev/null 2>&1; then')
runner_lines.append(' echo "Starting ollama server..."; (ollama serve >/dev/null 2>&1 &)')
runner_lines.append(' for _ in 1 2 3 4 5 6 7 8 9 10; do curl -sf http://localhost:11434/api/tags >/dev/null 2>&1 && break; sleep 1; done')
runner_lines.append(' echo "ERROR: Ollama not found and no Ollama API is reachable on 127.0.0.1, localhost, or host.docker.internal (ports ${ODYSSEUS_OLLAMA_PORT}/11434)."')
runner_lines.append(' echo "Install Ollama, start an Ollama service/container on this server, or pick the port where it is already listening."')
runner_lines.append(' echo')
runner_lines.append(' echo "=== Process exited with code 127 ==="')
runner_lines.append(' exec bash -i')
runner_lines.append('fi')
runner_lines.append('echo "Starting ollama server on 0.0.0.0:${ODYSSEUS_OLLAMA_PORT}..."')
runner_lines.append('OLLAMA_HOST="0.0.0.0:${ODYSSEUS_OLLAMA_PORT}" ollama serve')
runner_lines.append('_ody_exit=$?')
runner_lines.append('echo')
runner_lines.append('echo "=== Process exited with code ${_ody_exit} ==="')
runner_lines.append('exec bash -i')
elif "vllm serve" in req.cmd:
# vLLM is CUDA/ROCm-only and does not run on macOS at all.
runner_lines.append('if [ "$(uname -s)" = "Darwin" ]; then')
@@ -1016,18 +1048,19 @@ def setup_cookbook_routes() -> APIRouter:
runner_lines.append(' ODYSSEUS_PREFLIGHT_EXIT=127')
runner_lines.append('fi')
_append_serve_preflight_exit_lines(
runner_lines,
keep_shell_open=not local_windows,
)
runner_lines.append(req.cmd)
if local_windows:
# Detached background process — no interactive shell to keep open.
# Print the exit marker the status poller looks for, then stop.
_append_serve_exit_code_lines(runner_lines, keep_shell_open=False)
else:
# Keep shell open after exit so user can see errors
_append_serve_exit_code_lines(runner_lines, keep_shell_open=True)
if not handled_ollama_serve:
_append_serve_preflight_exit_lines(
runner_lines,
keep_shell_open=not local_windows,
)
runner_lines.append(req.cmd)
if local_windows:
# Detached background process — no interactive shell to keep open.
# Print the exit marker the status poller looks for, then stop.
_append_serve_exit_code_lines(runner_lines, keep_shell_open=False)
else:
# Keep shell open after exit so user can see errors
_append_serve_exit_code_lines(runner_lines, keep_shell_open=True)
runner_path = TMUX_LOG_DIR / f"{session_id}_run.sh"
runner_path.write_text("\n".join(runner_lines) + "\n", encoding="utf-8")

View File

@@ -169,6 +169,9 @@ export function _parseServePhase(snapshot) {
if (flat.includes('Application startup complete')) {
return { phase: 'ready', status: 'ready' };
}
if (/Ollama API ready on port\s+\d+/i.test(flat)) {
return { phase: 'ready', status: 'ready' };
}
// HTTP access logs (e.g. GET /v1/models 200 OK) mean the server is up
if (/(?:GET|POST)\s+\/[^\s]*\s+HTTP\/[\d.]+"\s*\d{3}/.test(flat)) {
return { phase: 'idle', status: 'ready' };
@@ -2295,15 +2298,24 @@ async function _reconnectTask(el, task) {
if (task.type === 'serve' && !task._endpointAdded && !task._endpointAddInFlight && task._serveReady) {
task._endpointAddInFlight = true;
const rawHost = task.remoteHost || 'localhost';
const host = rawHost.includes('@') ? rawHost.split('@').pop() : rawHost;
let host = rawHost.includes('@') ? rawHost.split('@').pop() : rawHost;
const portMatch = task.payload?._cmd?.match(/--port[=\s]+(\d+)/)
|| task.payload?._cmd?.match(/(?:^|\s)-p[=\s]+(\d+)/)
|| snapshot.match(/Uvicorn running on\D*?:(\d+)/i)
|| snapshot.match(/running on\D*?:(\d+)/i)
|| snapshot.match(/listening on\D*?:(\d+)/i)
|| snapshot.match(/port[:=\s]+(\d+)/i);
const port = portMatch ? portMatch[1] : '8000';
const baseUrl = `http://${host}:${port}/v1`;
let port = portMatch ? portMatch[1] : '8000';
let baseUrl = `http://${host}:${port}/v1`;
const ollamaUrlMatch = snapshot.match(/Ollama API ready on port\s+\d+:\s*(http:\/\/[^\s]+)/i);
if (ollamaUrlMatch) {
try {
const u = new URL(ollamaUrlMatch[1]);
host = u.hostname || host;
port = u.port || '11434';
baseUrl = `${u.origin}/v1`;
} catch {}
}
fetch('/api/model-endpoints', { credentials: 'same-origin' })
.then(r => r.json())
.then(async (eps) => {
@@ -2642,10 +2654,21 @@ async function _pollBackgroundStatus() {
if (localTask && localTask._endpointAdded) continue;
const rawHost = localTask?.remoteHost || t.remote || 'localhost';
const host = rawHost.includes('@') ? rawHost.split('@').pop() : (rawHost === 'local' ? 'localhost' : rawHost);
const portMatch = localTask?.payload?._cmd?.match(/--port\s+(\d+)/);
const port = portMatch ? portMatch[1] : '8000';
const baseUrl = `http://${host}:${port}/v1`;
let host = rawHost.includes('@') ? rawHost.split('@').pop() : (rawHost === 'local' ? 'localhost' : rawHost);
const portMatch = localTask?.payload?._cmd?.match(/--port\s+(\d+)/)
|| localTask?.payload?._cmd?.match(/OLLAMA_HOST=[^\s:]+:(\d+)/);
let port = portMatch ? portMatch[1] : '8000';
let baseUrl = `http://${host}:${port}/v1`;
const snapshot = t.output || localTask?.output || '';
const ollamaUrlMatch = snapshot.match(/Ollama API ready on port\s+\d+:\s*(http:\/\/[^\s]+)/i);
if (ollamaUrlMatch) {
try {
const u = new URL(ollamaUrlMatch[1]);
host = u.hostname || host;
port = u.port || '11434';
baseUrl = `${u.origin}/v1`;
} catch {}
}
const _isDiffusion = localTask?.payload?._cmd?.includes('diffusion_server');
_updateTask(t.session_id, { _serveReady: true, _endpointAdded: true });

View File

@@ -391,7 +391,8 @@ function _rerenderCachedModels() {
panelHtml += `<label>${_l('Backend','Inference engine: vLLM, SGLang, llama.cpp, Ollama, or Diffusers')}<select class="hwfit-sf" data-field="backend">${backendOpts}</select></label>`;
panelHtml += `<input type="hidden" class="hwfit-sf" data-field="host" value="${esc(_es.remoteHost || '')}" />`;
panelHtml += `<label>${_l('venv','Path to Python venv or conda env activate script')}<input type="text" class="hwfit-sf hwfit-sf-wide" data-field="venv" value="${esc(sv('venv', _es.envPath || _srvVenv || ''))}" placeholder="~/venv" /></label>`;
panelHtml += `<label>${_l('Port','HTTP port for the API server')}<input type="text" class="hwfit-sf" data-field="port" value="${esc(sv('port', _nextAvailablePort()))}" /></label>`;
const defaultPort = defaultBackend === 'ollama' ? '11434' : _nextAvailablePort();
panelHtml += `<label>${_l('Port','HTTP port for the API server')}<input type="text" class="hwfit-sf" data-field="port" value="${esc(sv('port', defaultPort))}" /></label>`;
const _activeGpus = (defaultGpus || '').split(',').map(s => s.trim()).filter(Boolean);
const detectedGpuCount = Number(_getGpuToggleTotal?.() || 0);
const _gpuMax = Math.max(detectedGpuCount || 8, ...(_activeGpus.map(Number).filter(n => !isNaN(n)).map(n => n + 1)));

View File

@@ -18,7 +18,7 @@ const EXCLUDED = new Set(['flip','roll','8ball','fortune','odyssey','ascii']);
// are the short forms people will actually type (/new, /clear, /web, etc.)
// rather than the full /chats new, /toggle web equivalents.
const PROMOTED_ALIASES = new Set([
'new','clear','rename','fork','export','archive','important','star',
'new','clear','rename','fork','export','archive','favorite','unfavorite',
'web','bash','research','doc',
'memories','forget',
]);

View File

@@ -5393,8 +5393,8 @@ const COMMANDS = {
'delete': { handler: _cmdSessionDelete, alias: ['del','rm'], help: 'Delete chat', usage: '/chats delete [id]' },
'archive': { handler: _cmdSessionArchive, alias: ['tar'], help: 'Archive chat', usage: '/chats archive [id]' },
'rename': { handler: _cmdSessionRename, alias: ['mv'], help: 'Rename current chat', usage: '/chats rename Name' },
'important': { handler: _cmdSessionImportant, alias: ['pin'], help: 'Mark as important', usage: '/chats important' },
'unimportant': { handler: _cmdSessionUnimportant, alias: ['unpin'], help: 'Unmark important', usage: '/chats unimportant' },
'favorite': { handler: _cmdSessionImportant, alias: ['pin','important'], help: 'Mark as favorite', usage: '/chats favorite' },
'unfavorite': { handler: _cmdSessionUnimportant, alias: ['unpin','unimportant'], help: 'Unmark favorite', usage: '/chats unfavorite' },
'fork': { handler: _cmdSessionFork, alias: ['cp'], help: 'Fork chat (keep first N msgs)', usage: '/chats fork [N]' },
'truncate': { handler: _cmdSessionTruncate, alias: [], help: 'Delete older messages, keep last N', usage: '/chats truncate N' },
'switch': { handler: _cmdSessionSwitch, alias: ['goto','cd'], help: 'Switch to chat by name/id', usage: '/chats switch name' },
@@ -5732,10 +5732,12 @@ export const LEGACY_ALIASES = {
'del': { parent: 'chats', sub: 'delete' },
'archive': { parent: 'chats', sub: 'archive' },
'rename': { parent: 'chats', sub: 'rename' },
'important': { parent: 'chats', sub: 'important' },
'star': { parent: 'chats', sub: 'important' },
'unimportant': { parent: 'chats', sub: 'unimportant' },
'unstar': { parent: 'chats', sub: 'unimportant' },
'favorite': { parent: 'chats', sub: 'favorite' },
'important': { parent: 'chats', sub: 'favorite' },
'star': { parent: 'chats', sub: 'favorite' },
'unfavorite': { parent: 'chats', sub: 'unfavorite' },
'unimportant': { parent: 'chats', sub: 'unfavorite' },
'unstar': { parent: 'chats', sub: 'unfavorite' },
'fork': { parent: 'chats', sub: 'fork' },
'truncate': { parent: 'chats', sub: 'truncate' },
'sessions': { parent: 'chats', sub: 'info' },

View File

@@ -349,10 +349,23 @@ function _taskIcon(task) {
return `<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="opacity:0.4;flex-shrink:0;position:relative;top:-4px;">${path}</svg>`;
}
const _MODEL_BACKED_ACTIONS = new Set([
'summarize_emails',
'draft_email_replies',
'extract_email_events',
'classify_events',
'mark_email_boundaries',
'learn_sender_signatures',
'check_email_urgency',
'test_skills',
'audit_skills',
'consolidate_memory',
]);
function _taskAiMark(task) {
const kind = task?.task_type || task?.kind || '';
const action = task?.action || '';
const aiAction = /(^|_)(ai|summarize|summary|draft|reply|classify|triage|audit|research|brief|skills?)($|_)/i.test(action);
const aiAction = _MODEL_BACKED_ACTIONS.has(action);
if (!(kind === 'llm' || kind === 'research' || task?.model || task?.endpointUrl || aiAction)) return '';
return '<svg class="task-ai-mark" width="10" height="10" viewBox="0 0 24 24" fill="currentColor" aria-label="Uses model" title="Uses model"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>';
}
@@ -708,7 +721,7 @@ function _renderList() {
const runBtn = document.createElement('button');
runBtn.className = 'task-status-badge task-run-now-badge task-card-run-btn';
runBtn.title = 'Run now';
runBtn.style.cssText = 'position:relative;top:4px;margin-right:4px;';
runBtn.style.cssText = 'position:relative;top:1px;margin-right:4px;';
runBtn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.2" stroke-linecap="round" stroke-linejoin="round"><polyline points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg><span>Run</span>';
runBtn.addEventListener('click', (e) => { e.stopPropagation(); _doRunNow(task.id); });
actionsWrap.insertBefore(runBtn, menuBtn);

View File

@@ -10203,6 +10203,12 @@ textarea.memory-add-input {
height: 20px;
min-height: 0;
box-sizing: border-box;
position: relative;
top: -4px;
}
.task-state-badge svg {
position: relative;
top: -1px;
}
.task-status-badge:hover {
filter: brightness(1.08) saturate(1.15);
@@ -21253,6 +21259,7 @@ a.chat-link[href^="#research-"] {
}
.task-card .task-card-run-btn {
margin-right: 1px !important;
top: 0;
}
}
@@ -34765,7 +34772,7 @@ body.theme-frosted .modal {
.slash-autocomplete-popup {
position: fixed;
z-index: 9000;
background: var(--bg-elev-2, #1a1a1a);
background: var(--panel, var(--bg));
border: 1px solid var(--border, rgba(255,255,255,0.08));
border-radius: 8px;
box-shadow: 0 8px 24px rgba(0,0,0,0.35);
@@ -34793,8 +34800,8 @@ body.theme-frosted .modal {
white-space: nowrap;
overflow: hidden;
}
.slash-ac-row:hover { background: color-mix(in srgb, var(--fg) 6%, transparent); }
.slash-ac-row-sel { background: color-mix(in srgb, var(--accent, var(--red)) 14%, transparent); }
.slash-ac-row:hover { background-color: color-mix(in srgb, var(--accent, var(--red)) 10%, transparent); }
.slash-ac-row-sel { background-color: color-mix(in srgb, var(--accent, var(--red)) 14%, transparent); }
.slash-ac-token {
font-family: 'Fira Code', ui-monospace, monospace;
color: var(--accent, var(--red));