MrSphay/odysseus

Author	SHA1	Message	Date
nsgds	5645cce6d0	Support vLLM 0.20.2 / NIM reasoning-parser output end-to-end (surface + agent context + render) (#602) * fix(stream): read 'reasoning' SSE field for vLLM 0.20.2 / NIM vLLM 0.20.2 / NVIDIA NIM emit reasoning-parser output in the `reasoning` delta field; older builds use `reasoning_content`. stream_llm() read only the latter, so reasoning from models like Nemotron-3-Nano (--reasoning-parser) was silently dropped and never rendered. Accept either field. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): keep reasoning_content only on the latest assistant turn The agent loop echoed each round's reasoning back as `reasoning_content` on every assistant turn, assuming vendors ignore it. Nemotron's chat template re-injects ALL prior reasoning_content as <think> blocks, and the loop is trimmed only once (before it starts), so reasoning accumulated unbounded across rounds, bloating context and feeding the model its own prior reasoning, which reinforced repetition/looping. Strip reasoning_content from earlier assistant turns so only the most recent round carries it (still satisfies DeepSeek's thinking-mode follow-up requirement). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent-ui): wrap each round's reasoning in its own <think> block The streamed think-tag wrapper gated on whole-message substring checks (accumulated.includes('<think>')), which only ever wrapped ONE reasoning block per message. A multi-round agent response has a reasoning phase per round, so once round 1 closed its <think>...</think>, the reasoning from rounds 2+ was emitted unwrapped and leaked into the visible answer. Replace the substring checks with a stateful open/close flag that toggles per think/answer cycle, so each round's reasoning gets its own collapsible block. Single-turn chat is unchanged (one open, one close).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(stream): reasoning/reasoning_content delta surfaces as thinking chunk Covers @pewdiepie-archdaemon's requested regression: a streamed {reasoning: ...} delta emits a thinking chunk while {content: ...} streams as normal content; plus the older reasoning_content field for backward compat. Mirrors the #591 scenario. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 11:48:17 +09:00
James Arslan	6776c7d691	Surface silent model fallback instead of masking it (#868) When the selected model fails before producing output, stream_llm_with_fallback quietly switches to the next candidate and the reply is shown under the originally selected model's name, so a misconfigured provider looks like it works. (Concretely: a Bedrock gateway that 400s every Anthropic/Claude request appears fine because another model silently answers under the Claude label.) Emit a `fallback` SSE event ({selected_model, answered_by, reason}) the first time a non-primary candidate produces output, forward it through the agent loop and both chat-route paths, stamp the response metrics with the model that actually answered, and show a notice + relabel the reply in the UI. Tested: python -m pytest tests/test_llm_core_fallback.py (3 pass); python -m py_compile src/llm_core.py src/agent_loop.py routes/chat_routes.py; node --check static/js/chat.js.	2026-06-02 11:37:25 +09:00
pewdiepie-archdaemon	664acf73ee	Merge branch 'pr-469' into visual-pr-playground	2026-06-02 06:26:31 +09:00
red person	e1102585bf	Fix chat stream recovery and PDF library indexing (#468)	2026-06-01 22:33:35 +09:00
Sirsyorrz	6a2f0d5904	Add slash command autocomplete popup Typing / in the chat composer now shows a filtered popup listing all available commands with their descriptions. Arrow keys or Tab to select, Enter/Tab to insert, Esc to close; clicking also works. - New module: static/js/slashAutocomplete.js Reads the existing COMMANDS registry (and LEGACY_ALIASES) from slashCommands.js; no command logic added here, just discovery UI. Excludes easter-egg commands (flip, roll, 8ball, fortune, odyssey, ascii). Promotes short legacy aliases (/new, /clear, /web, /compact, /research, etc.) as first-class rows so users don't have to know the full /session new form. - slashCommands.js: export COMMANDS and LEGACY_ALIASES so the new module can read the registry. - chat.js: lazy-import slashAutocomplete on init, wire to #message textarea. - style.css: popup + row styles using existing CSS variables.	2026-06-01 21:33:46 +10:00
pewdiepie-archdaemon	be260f43e8	Handle incomplete detached agent streams	2026-06-01 16:54:11 +09:00
pewdiepie-archdaemon	fc7f107b22	Improve Ollama setup and model endpoint handling	2026-06-01 10:00:15 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00
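
The two backend fixes described in commit 5645cce6d0 (accept either reasoning field; keep `reasoning_content` only on the latest assistant turn) could look roughly like the sketch below. This is not the repository's code: the helper names `extract_reasoning` and `strip_old_reasoning` and the message shape (a list of dicts with a `role` key) are assumptions made for illustration.

```python
from typing import Any, Optional


def extract_reasoning(delta: dict[str, Any]) -> Optional[str]:
    """Return reasoning text from a streamed delta, whichever field carries it.

    Hypothetical helper: per the commit message, newer builds emit
    `reasoning` and older builds emit `reasoning_content`.
    """
    for key in ("reasoning", "reasoning_content"):
        value = delta.get(key)
        if value:
            return value
    return None


def strip_old_reasoning(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of the history where only the most recent assistant
    turn still carries `reasoning_content` (hypothetical helper)."""
    # Index of the last assistant message, or -1 if there is none.
    last_assistant = max(
        (i for i, m in enumerate(messages) if m.get("role") == "assistant"),
        default=-1,
    )
    cleaned = []
    for i, message in enumerate(messages):
        if message.get("role") == "assistant" and i != last_assistant:
            # Copy without the reasoning field; the input is not mutated.
            message = {k: v for k, v in message.items() if k != "reasoning_content"}
        cleaned.append(message)
    return cleaned
```

Calling `strip_old_reasoning` before every round, rather than once before the loop starts, is what would keep earlier reasoning from accumulating in the context.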
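
The fallback behaviour described in commit 6776c7d691 can be illustrated with a small generator. This is a minimal sketch under stated assumptions, not the actual `stream_llm_with_fallback`: the function name `stream_with_fallback`, the `stream_fn` callback, and the dict-shaped events are hypothetical; only the `fallback` event fields (`selected_model`, `answered_by`, `reason`) come from the commit message.

```python
from typing import Any, Callable, Iterable, Iterator


def stream_with_fallback(
    models: list[str],
    stream_fn: Callable[[str], Iterable[str]],
) -> Iterator[dict[str, Any]]:
    """Try each candidate in order and announce when a non-primary one answers."""
    if not models:
        raise ValueError("at least one model is required")
    selected = models[0]
    last_error: Exception | None = None
    for model in models:
        produced = False
        try:
            for chunk in stream_fn(model):
                if not produced:
                    produced = True
                    if model != selected:
                        # Emitted once, just before the first chunk from a
                        # candidate other than the selected model.
                        yield {
                            "type": "fallback",
                            "selected_model": selected,
                            "answered_by": model,
                            "reason": str(last_error),
                        }
                yield {"type": "content", "model": model, "text": chunk}
            return
        except Exception as exc:
            if produced:
                # Output already reached the caller; switching models now
                # would splice two answers together, so re-raise instead.
                raise
            last_error = exc
    raise RuntimeError(f"all candidates failed: {last_error}")
```

A consumer can use the `fallback` event to relabel the reply and to record the answering model in its metrics, instead of attributing the output to the selected model.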