When the selected model fails before producing output, stream_llm_with_fallback
quietly switches to the next candidate and the reply is shown under the
originally selected model's name, so a misconfigured provider looks like it
works. (Concretely: a Bedrock gateway that 400s every Anthropic/Claude request
appears fine because another model silently answers under the Claude label.)
Emit a `fallback` SSE event ({selected_model, answered_by, reason}) the first
time a non-primary candidate produces output, forward it through the agent loop
and both chat-route paths, stamp the response metrics with the model that
actually answered, and show a notice + relabel the reply in the UI.
Tested: python -m pytest tests/test_llm_core_fallback.py (3 pass);
python -m py_compile src/llm_core.py src/agent_loop.py routes/chat_routes.py;
node --check static/js/chat.js.
Typing / in the chat composer now shows a filtered popup listing all
available commands with their description. Arrow keys or Tab to select,
Enter/Tab to insert, Esc to close, click also works.
- New module: static/js/slashAutocomplete.js
Reads the existing COMMANDS registry (and LEGACY_ALIASES) from
slashCommands.js — no command logic added here, just discovery UI.
Excludes easter-egg commands (flip, roll, 8ball, fortune, odyssey,
ascii). Promotes short legacy aliases (/new, /clear, /web, /compact,
/research, etc.) as first-class rows so users don't have to know the
full /session new form.
- slashCommands.js: export COMMANDS and LEGACY_ALIASES so the new
module can read the registry.
- chat.js: lazy-import slashAutocomplete on init, wire to #message
textarea.
- style.css: popup + row styles using existing CSS variables.