odysseus

MrSphay/odysseus

Fork 0

Commit Graph

Author	SHA1	Message	Date
Ernest Hysa	a8a34bd22a	Ollama: pass discovered num_ctx in chat requests _build_ollama_payload sends options.temperature and options.num_predict to /api/chat, but never options.num_ctx. Ollama defaults num_ctx to 2048 when the option is omitted, so prompts going to any Ollama backend are silently truncated there regardless of the model's actual capability. Thread the discovered context length through the three call sites (llm_call, llm_call_async, stream_llm) and emit options.num_ctx when it is known and positive. The builder filters out the DEFAULT_CONTEXT fallback (128000) so we don't lie to Ollama about models whose window we couldn't actually discover. The issue's literal 'when > 2048' heuristic is dropped: a model with a real context smaller than 2048 would OOM if Ollama used its default, so we pass the real value regardless of size. Matches how src/context_compactor.py uses the same helper. Sister fix to PR #753 — that PR teaches the compactor the right budget, this one tells Ollama to actually use that budget on the way in.	2026-06-02 20:27:24 +09:00
James Arslan	a327df6936	Fix native tool-calling follow-up round on Gemini and Ollama (#867 ) The agent's multi-round (tool-result) follow-up request was rejected with HTTP 400 on two providers, so tools ran but the agent never produced an answer: - OpenAI-compatible streaming (Gemini 3) dropped the per-call thought_signature and collided parallel tool calls, which arrive with index=None: they all landed in slot 0, overwriting the first call's name and corrupting its arguments by concatenation, so the follow-up request 400'd. Capture and replay each call's extra_content (thought_signature), and give every parallel call its own accumulator slot (allocated above the max key, so sparse or mixed indices can't collide). - Native Ollama /api/chat expects object tool-call arguments, but Odysseus carries them as a JSON string, which Ollama rejected ("Value looks like object, but can't find closing '}' symbol"). Convert them to objects in the Ollama payload builder. Both compose with the no-prose null-content sanitize fix from #862. Tested: python -m pytest tests/test_llm_core_streaming.py tests/test_llm_core_ollama.py tests/test_agent_loop.py (53 pass), and python -m py_compile src/llm_core.py src/agent_loop.py.	2026-06-02 11:39:40 +09:00
Alexander Kenley	2c4b8b57dd	feat(ai): add OpenRouter and Ollama Cloud providers (#231 ) Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>	2026-06-01 14:26:10 +09:00

Author

SHA1

Message

Date

Ernest Hysa

a8a34bd22a

Ollama: pass discovered num_ctx in chat requests

_build_ollama_payload sends options.temperature and options.num_predict
to /api/chat, but never options.num_ctx. Ollama defaults num_ctx to 2048
when the option is omitted, so prompts going to any Ollama backend are
silently truncated there regardless of the model's actual capability.

Thread the discovered context length through the three call sites
(llm_call, llm_call_async, stream_llm) and emit options.num_ctx when it
is known and positive. The builder filters out the DEFAULT_CONTEXT
fallback (128000) so we don't lie to Ollama about models whose window
we couldn't actually discover. The issue's literal 'when > 2048'
heuristic is dropped: a model with a real context smaller than 2048
would OOM if Ollama used its default, so we pass the real value
regardless of size. Matches how src/context_compactor.py uses the
same helper.

Sister fix to PR #753 — that PR teaches the compactor the right budget,
this one tells Ollama to actually use that budget on the way in.

2026-06-02 20:27:24 +09:00

James Arslan

a327df6936

Fix native tool-calling follow-up round on Gemini and Ollama (#867 )

The agent's multi-round (tool-result) follow-up request was rejected with
HTTP 400 on two providers, so tools ran but the agent never produced an answer:

- OpenAI-compatible streaming (Gemini 3) dropped the per-call thought_signature
  and collided parallel tool calls, which arrive with index=None: they all
  landed in slot 0, overwriting the first call's name and corrupting its
  arguments by concatenation, so the follow-up request 400'd. Capture and replay
  each call's extra_content (thought_signature), and give every parallel call
  its own accumulator slot (allocated above the max key, so sparse or mixed
  indices can't collide).
- Native Ollama /api/chat expects object tool-call arguments, but Odysseus
  carries them as a JSON string, which Ollama rejected ("Value looks like
  object, but can't find closing '}' symbol"). Convert them to objects in the
  Ollama payload builder.

Both compose with the no-prose null-content sanitize fix from #862.

Tested: python -m pytest tests/test_llm_core_streaming.py
tests/test_llm_core_ollama.py tests/test_agent_loop.py (53 pass), and
python -m py_compile src/llm_core.py src/agent_loop.py.

2026-06-02 11:39:40 +09:00

Alexander Kenley

2c4b8b57dd

feat(ai): add OpenRouter and Ollama Cloud providers (#231 )

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>

2026-06-01 14:26:10 +09:00

3 Commits