The agent's multi-round (tool-result) follow-up request was rejected with
HTTP 400 on two providers, so tools ran but the agent never produced an answer:
- OpenAI-compatible streaming (Gemini 3) dropped the per-call thought_signature
and collided parallel tool calls, which arrive with index=None: they all
landed in slot 0, overwriting the first call's name and corrupting its
arguments by concatenation, so the follow-up request 400'd. Capture and replay
each call's extra_content (thought_signature), and give every parallel call
its own accumulator slot (allocated above the max key, so sparse or mixed
indices can't collide).
- Native Ollama /api/chat expects object tool-call arguments, but Odysseus
carries them as a JSON string, which Ollama rejected ("Value looks like
object, but can't find closing '}' symbol"). Convert them to objects in the
Ollama payload builder.
Both compose with the no-prose null-content sanitize fix from #862.
Tested: python -m pytest tests/test_llm_core_streaming.py
tests/test_llm_core_ollama.py tests/test_agent_loop.py (53 pass), and
python -m py_compile src/llm_core.py src/agent_loop.py.
When an agent turn uses native (OpenAI-style) function calling and the model
returns only tool calls with no prose, _append_tool_results built the follow-up
assistant message with content "" (empty string).
Google Gemini's OpenAI-compatible endpoint and Ollama both reject an assistant
message that carries tool_calls alongside an empty-string content with HTTP 400.
Because that message feeds the tool results back to the model, every tool-using
turn on these providers dies at the second round: the tool runs, but the agent
never produces a result.
Use None (JSON null) instead, which is the spec-correct form the OpenAI SDK
itself emits and which OpenAI and Anthropic accept too. Adds tests covering the
native tool-call content shaping.