Fix native tool-calling follow-up round on Gemini and Ollama (#867)

The agent's multi-round (tool-result) follow-up request was rejected with HTTP 400 on two providers, so tools ran but the agent never produced an answer:

- OpenAI-compatible streaming (Gemini 3) dropped the per-call thought_signature and collided parallel tool calls, which arrive with index=None: they all landed in slot 0, overwriting the first call's name and corrupting its arguments by concatenation, so the follow-up request 400'd. Capture and replay each call's extra_content (thought_signature), and give every parallel call its own accumulator slot (allocated above the max key, so sparse or mixed indices can't collide).

- Native Ollama /api/chat expects object tool-call arguments, but Odysseus carries them as a JSON string, which Ollama rejected ("Value looks like object, but can't find closing '}' symbol"). Convert them to objects in the Ollama payload builder.

Both compose with the no-prose null-content sanitize fix from #862.

Tested: python -m pytest tests/test_llm_core_streaming.py tests/test_llm_core_ollama.py tests/test_agent_loop.py (53 pass), and python -m py_compile src/llm_core.py src/agent_loop.py.
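
The two fixes described above can be sketched roughly as follows. This is an illustration only, not the code in src/llm_core.py: the helper names `accumulate_tool_calls` and `arguments_as_object`, the delta dictionary shape, and the assumption that every index=None delta carries one complete call are all hypothetical.

```python
import json


def accumulate_tool_calls(deltas):
    """Merge streamed tool-call deltas into one entry per call.

    Assumption (hypothetical shape): each delta is a dict with optional
    "index", "id", "function" and "extra_content" keys, and a delta whose
    index is None represents a new, separate call.
    """
    slots = {}
    for delta in deltas:
        index = delta.get("index")
        if index is None:
            # Allocate above the highest key in use, so a None-indexed call
            # can never land on a slot owned by an explicitly indexed call.
            index = max(slots, default=-1) + 1
        slot = slots.setdefault(index, {"id": "", "name": "", "arguments": ""})
        if delta.get("id"):
            slot["id"] = delta["id"]
        function = delta.get("function") or {}
        if function.get("name"):
            slot["name"] = function["name"]
        # Argument fragments for the same slot are concatenated in order.
        slot["arguments"] += function.get("arguments") or ""
        if delta.get("extra_content"):
            # Keep the opaque provider payload (e.g. a thought_signature)
            # so it can be replayed unchanged on the follow-up turn.
            slot["extra_content"] = delta["extra_content"]
    return [slots[key] for key in sorted(slots)]


def arguments_as_object(arguments):
    """Return tool-call arguments as a dict, parsing a JSON string if needed."""
    if isinstance(arguments, dict):
        return arguments
    try:
        parsed = json.loads(arguments or "{}")
    except (TypeError, ValueError):
        # Unparseable arguments fall back to an empty object in this sketch.
        return {}
    return parsed if isinstance(parsed, dict) else {}
```

With deltas indexed 5, None, None, the sketch assigns slots 5, 6 and 7, so the three calls keep their own names and arguments. Falling back to an empty object on unparseable arguments is a choice made for this sketch; the actual payload builder may handle that case differently.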
2026-06-02 04:39:40 +02:00
parent 54ac4a74fb
commit a327df6936
5 changed files with 334 additions and 3 deletions
--- a/src/agent_loop.py
+++ b/src/agent_loop.py
@@ -1123,6 +1123,11 @@ def _append_tool_results(
                    "name": tc.get("name", ""),
                    "arguments": tc.get("arguments", "{}"),
                },
+                # Gemini 3 requires the opaque thought_signature it returned with
+                # each function call to be echoed back on the follow-up turn, or
+                # the next request 400s. Replay it when present; other providers
+                # never emit it (their payload builders just ignore the field).
+                **({"extra_content": tc["extra_content"]} if tc.get("extra_content") else {}),
            }
            for j, tc in enumerate(native_tool_calls)
        ]